Distributed Communication Backend
During distributed evaluation, MMEval has two main communication requirements:
All-gather the intermediate results of the metric saved in each process.
Broadcast the metric result computed by the rank 0 process to all processes.
To flexibly support multiple distributed communication libraries, MMEval abstracts these requirements into a distributed communication interface, BaseDistBackend.
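The sketch below shows one way such an interface could be declared with Python's abc module. It is inferred from the member descriptions in this section and is not copied from the MMEval source, so the exact signatures and docstrings are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BaseDistBackend(ABC):
    """Sketch of an abstract distributed communication interface (assumed shape)."""

    @property
    @abstractmethod
    def is_initialized(self) -> bool:
        """Whether the distributed environment has been initialized."""

    @property
    @abstractmethod
    def rank(self) -> int:
        """Rank of the current process in the process group."""

    @property
    @abstractmethod
    def world_size(self) -> int:
        """Number of processes in the process group."""

    @abstractmethod
    def all_gather_object(self, obj: Any) -> List[Any]:
        """Gather one object from every process and return them as a list."""

    @abstractmethod
    def broadcast_object(self, obj: Any, src: int = 0) -> Any:
        """Send the object held by process `src` to every process."""
```

Declaring every member as abstract means Python refuses to instantiate a backend that forgets to implement one of them.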
To implement a distributed communication backend, inherit from BaseDistBackend and implement the following members:
is_initialized: identifies whether the initialization of the distributed communication environment has been completed.
rank: the rank of the current process within the process group.
world_size: the number of processes in the current process group.
all_gather_object: performs the all_gather operation on any serializable Python object.
broadcast_object: broadcasts any serializable Python object from the source process to all processes.
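The two requirements listed at the start of this section can be combined with these members roughly as follows. This is a minimal sketch rather than MMEval's actual evaluation code: the helper gather_compute_broadcast and the single-process stand-in backend are hypothetical, introduced only so the example runs without a distributed environment.

```python
from types import SimpleNamespace
from typing import Any, Callable, List


def gather_compute_broadcast(backend: Any, local_results: List[Any],
                             compute: Callable[[List[Any]], Any]) -> Any:
    """Hypothetical helper: turn per-process results into one shared metric."""
    # Step 1: every process receives the intermediate results of all processes.
    gathered = backend.all_gather_object(local_results)
    # Step 2: only rank 0 merges the results and computes the metric.
    metric = None
    if backend.rank == 0:
        merged = [item for part in gathered for item in part]
        metric = compute(merged)
    # Step 3: rank 0 shares the metric so every process returns the same value.
    return backend.broadcast_object(metric, src=0)


# Stand-in backend for a single process (an assumption made for illustration).
backend = SimpleNamespace(
    rank=0,
    world_size=1,
    all_gather_object=lambda obj: [obj],
    broadcast_object=lambda obj, src=0: obj,
)

accuracy = gather_compute_broadcast(
    backend, [1, 0, 1, 1], lambda hits: sum(hits) / len(hits))
print(accuracy)  # 0.75
```

Because the helper only touches rank, all_gather_object and broadcast_object, any backend that provides those members could be substituted for the stand-in.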
Take the implementation of MPI4PyDist as an example:
    import os
    from typing import Any, List

    from mpi4py import MPI

    # Import path assumed; adjust it to wherever BaseDistBackend is defined.
    from mmeval.core.dist_backends.base_backend import BaseDistBackend


    class MPI4PyDist(BaseDistBackend):
        """A distributed communication backend for mpi4py."""

        @property
        def is_initialized(self) -> bool:
            """Returns True if the distributed environment has been initialized."""
            # Open MPI's launcher sets this variable for every process it starts.
            return 'OMPI_COMM_WORLD_SIZE' in os.environ

        @property
        def rank(self) -> int:
            """Returns the rank of the current process in the process group."""
            comm = MPI.COMM_WORLD
            return comm.Get_rank()

        @property
        def world_size(self) -> int:
            """Returns the world size of the current process group."""
            comm = MPI.COMM_WORLD
            return comm.Get_size()

        def all_gather_object(self, obj: Any) -> List[Any]:
            """All-gather the given object across the current process group and
            return a list containing the object gathered from each process."""
            comm = MPI.COMM_WORLD
            return comm.allgather(obj)

        def broadcast_object(self, obj: Any, src: int = 0) -> Any:
            """Broadcast the given object from the source process to the
            current process group."""
            comm = MPI.COMM_WORLD
            return comm.bcast(obj, root=src)
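To check the expected all-gather and broadcast behaviour without installing MPI, the sketch below uses threads as stand-ins for processes. ThreadDist and run_ranks are hypothetical names that are not part of MMEval, and the class does not inherit from BaseDistBackend so that the block runs on the standard library alone.

```python
import threading
from typing import Any, List, Tuple


class ThreadDist:
    """Hypothetical backend in which threads stand in for processes."""

    def __init__(self, rank: int, slots: List[Any],
                 barrier: threading.Barrier) -> None:
        self._rank = rank
        self._slots = slots      # one shared slot per simulated rank
        self._barrier = barrier  # synchronizes all simulated ranks

    @property
    def is_initialized(self) -> bool:
        return True

    @property
    def rank(self) -> int:
        return self._rank

    @property
    def world_size(self) -> int:
        return len(self._slots)

    def all_gather_object(self, obj: Any) -> List[Any]:
        self._slots[self._rank] = obj
        self._barrier.wait()          # wait until every rank has written
        gathered = list(self._slots)  # copy; entries are ordered by rank
        self._barrier.wait()          # wait until every rank has read
        return gathered

    def broadcast_object(self, obj: Any, src: int = 0) -> Any:
        # Reuse all-gather and keep only the source rank's object.
        return self.all_gather_object(obj)[src]


def run_ranks(world_size: int) -> List[Tuple[List[str], Any]]:
    """Run one worker per simulated rank and collect what each one saw."""
    slots: List[Any] = [None] * world_size
    barrier = threading.Barrier(world_size)
    outputs: List[Any] = [None] * world_size

    def worker(rank: int) -> None:
        dist = ThreadDist(rank, slots, barrier)
        gathered = dist.all_gather_object(f"partial-{rank}")
        # Only rank 0 supplies a value; the other ranks receive it.
        final = dist.broadcast_object(len(gathered) if rank == 0 else None)
        outputs[rank] = (gathered, final)

    threads = [threading.Thread(target=worker, args=(r,))
               for r in range(world_size)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outputs


results = run_ranks(3)
```

Every simulated rank ends up with the same rank-ordered list and the same broadcast value, which is the behaviour a real backend is expected to provide.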
MMEval already implements several distributed communication backends; they are listed in the support matrix.