core#
Package Contents#
DistNetwork | Manage torch.distributed network.
NetworkManager | Abstract class.
- class DistNetwork(address: tuple, world_size: int, rank: int, ethernet: str = None, dist_backend: str = 'gloo')#
Bases: object
Manage torch.distributed network.
- Parameters
address (tuple) – Address of this server in form of (SERVER_ADDR, SERVER_IP).
world_size (int) – The size of this distributed group (including the server).
rank (int) – The rank of this process in the distributed group.
ethernet (str) – The name of the local ethernet interface. Users can check it with the ifconfig command.
dist_backend (str or torch.distributed.Backend) – Backend of torch.distributed. Valid values include mpi, gloo, and nccl. Default: gloo.
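The address tuple together with rank and world_size is what torch.distributed's TCP initialization consumes. A minimal sketch of that mapping, assuming a TCP init method; the helper name is hypothetical and not part of FedLab:

```python
# Hypothetical helper (not part of FedLab): shows how the (host, port) address
# tuple typically maps onto a torch.distributed TCP init_method string.
def tcp_init_method(address: tuple) -> str:
    host, port = address
    return f"tcp://{host}:{port}"

print(tcp_init_method(("127.0.0.1", 3002)))  # tcp://127.0.0.1:3002
```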
- init_network_connection()#
Initialize the torch.distributed communication group.
- close_network_connection()#
Destroy the current torch.distributed process group.
- send(content=None, message_code=None, dst=0, count=True)#
Send a tensor to the process with rank dst.
- recv(src=None, count=True)#
Receive a tensor from the process with rank src.
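The send/recv pair carries a payload together with its message_code; each send on one rank is matched by a recv on the peer. A toy illustration with pure-Python stand-ins (a local queue plays the role of the gloo channel; this is not the FedLab implementation):

```python
import queue

# Illustrative stand-in for the channel between two ranks. The function names
# mirror the API above, but the bodies are toy implementations.
channel = queue.Queue()

def send(content=None, message_code=None, dst=0):
    # Package the payload with its message code, as the sender rank would.
    channel.put((message_code, content))

def recv(src=None):
    # The matching receive on the destination rank.
    return channel.get()

send(content=[1.0, 2.0], message_code=0, dst=1)
code, payload = recv(src=0)
print(code, payload)  # 0 [1.0, 2.0]
```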
- broadcast_send(content=None, message_code=None, dst=None, count=True)#
- broadcast_recv(src=None, count=True)#
- __str__()#
Return str(self).
- class NetworkManager(network: fedlab.core.network.DistNetwork)#
Bases: torch.multiprocessing.Process
Abstract class.
- Parameters
network (DistNetwork) – Object to manage torch.distributed network communication.
- run()#
Main process:
1. Initialization stage.
2. FL communication stage.
3. Shutdown stage: close the network connection.
- setup()#
Initialize the network connection and perform necessary setups.
self._network.init_network_connection() must be called first. Overwrite this method to implement the system setup message communication procedure.
- abstract main_loop()#
Define the actions of the communication stage.
- shutdown()#
Shutdown stage. Close the network connection at the end.
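The three-stage run() lifecycle above can be sketched with a minimal subclass-style manager. DummyNetwork is a hypothetical stand-in for DistNetwork used only to trace calls; this is not the FedLab class, just its control flow:

```python
# DummyNetwork is a hypothetical stand-in for DistNetwork: it records which
# connection methods were called instead of touching torch.distributed.
class DummyNetwork:
    def __init__(self):
        self.trace = []

    def init_network_connection(self):
        self.trace.append("init")

    def close_network_connection(self):
        self.trace.append("close")


class EchoManager:
    """Minimal sketch of the NetworkManager lifecycle (illustrative only)."""

    def __init__(self, network):
        self._network = network

    def run(self):
        self.setup()      # initialization stage
        self.main_loop()  # FL communication stage
        self.shutdown()   # shutdown stage: close network connection

    def setup(self):
        # init_network_connection() is called first, as the docs require.
        self._network.init_network_connection()

    def main_loop(self):
        # A concrete manager would exchange FL messages here.
        self._network.trace.append("communicate")

    def shutdown(self):
        self._network.close_network_connection()


net = DummyNetwork()
EchoManager(net).run()
print(net.trace)  # ['init', 'communicate', 'close']
```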