network

Module Contents

DistNetwork – Manage torch.distributed network.

type2byte
class DistNetwork(address: tuple, world_size: int, rank: int, ethernet: str = None, dist_backend: str = 'gloo')

Bases: object

Manage torch.distributed network.

Parameters:
  • address (tuple) – Address of the server, in the form (SERVER_IP, SERVER_PORT).

  • world_size (int) – Size of the distributed group (including the server).

  • rank (int) – Rank of this process in the distributed group.

  • ethernet (str) – Name of the local network interface. You can look it up with the ifconfig command.

  • dist_backend (str or torch.distributed.Backend) – backend of torch.distributed. Valid values include mpi, gloo, and nccl. Default: gloo.
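As a rough illustration of how these parameters fit together, the sketch below validates them and derives the settings a torch.distributed process group is typically started with. It assumes that address holds the server's IP and port, that the connection uses a tcp:// init method, and that the interface name is handed to the gloo backend through PyTorch's GLOO_SOCKET_IFNAME environment variable. The helper build_dist_config is hypothetical and is not part of this module.

```python
VALID_BACKENDS = ("mpi", "gloo", "nccl")  # values listed in the parameter docs


def build_dist_config(address, world_size, rank, ethernet=None, dist_backend="gloo"):
    """Hypothetical helper (not part of DistNetwork): check the arguments
    and return the settings a process group could be initialized with."""
    ip, port = address  # assumption: (SERVER_IP, SERVER_PORT)
    backend = str(dist_backend).lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unsupported backend: {dist_backend!r}")
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    # Assumption: the interface name is exported for the gloo backend.
    env = {"GLOO_SOCKET_IFNAME": ethernet} if ethernet is not None else {}
    return {
        "backend": backend,
        "init_method": f"tcp://{ip}:{port}",
        "rank": rank,
        "world_size": world_size,
        "env": env,
    }


# Example: the server (rank 0) in a group of one server and two clients.
config = build_dist_config(("127.0.0.1", "3002"), world_size=3, rank=0, ethernet="eth0")
```

Returning the environment variables instead of setting them keeps the helper free of side effects; the caller can apply them with os.environ.update(config["env"]) before the process group is created.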

init_network_connection()

Initialize the torch.distributed communication group.

close_network_connection()

Destroy the current torch.distributed process group.
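Because the connection has to be initialized and later destroyed, one possible usage pattern is to pair the two calls in a context manager so the group is torn down even if an error occurs in between. The sketch below works with any object exposing init_network_connection() and close_network_connection(). Both open_network and FakeNetwork are hypothetical names used only for illustration; FakeNetwork stands in for a real DistNetwork so the example runs without a distributed setup.

```python
from contextlib import contextmanager


@contextmanager
def open_network(network):
    """Hypothetical wrapper: initialize the connection on entry and
    always close it on exit, including when the body raises."""
    network.init_network_connection()
    try:
        yield network
    finally:
        network.close_network_connection()


class FakeNetwork:
    """Hypothetical stand-in for DistNetwork that only records calls."""

    def __init__(self):
        self.events = []

    def init_network_connection(self):
        self.events.append("init")

    def close_network_connection(self):
        self.events.append("close")


network = FakeNetwork()
with open_network(network) as net:
    # send/recv calls on a real DistNetwork would go here
    net.events.append("work")
```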

send(content=None, message_code=None, dst=0, count=True)

Send a tensor to the process with rank dst.

recv(src=None, count=True)

Receive a tensor from the process with rank src.

broadcast_send(content=None, message_code=None, dst=None, count=True)
broadcast_recv(src=None, count=True)
__str__()

Return str(self).