partitioned_mnist

Module Contents

PartitionedMNIST

FedDataset with partitioning preprocess. For detailed partitioning, please check Federated Dataset and DataPartitioner.

class PartitionedMNIST(root, path, num_clients, download=True, preprocess=False, partition='iid', dir_alpha=None, verbose=True, seed=None, transform=None, target_transform=None)

Bases: fedlab.contrib.dataset.basic_dataset.FedDataset

FedDataset with a partitioning preprocess step. For details of the partition schemes, see Federated Dataset and DataPartitioner.

Parameters:
  • root (str) – Path where the raw dataset is downloaded.

  • path (str) – Path where the partitioned subdatasets are saved.

  • num_clients (int) – Number of clients.

  • download (bool) – Whether to download the raw dataset.

  • preprocess (bool) – Whether to preprocess the dataset.

  • partition (str, optional) – Name of the partition scheme. Supported values are "noniid-#label", "noniid-labeldir", "unbalance" and "iid". Defaults to "iid".

  • dir_alpha (float, optional) – Concentration parameter of the Dirichlet distribution used for non-IID partitioning. Only takes effect when partition selects a Dirichlet-based scheme. Defaults to None.

  • verbose (bool, optional) – Whether to print the partition process. Defaults to True.

  • seed (int, optional) – Random seed. Defaults to None.

  • transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version.

  • target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
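The effect of dir_alpha can be illustrated with a minimal, standard-library sketch of a label-wise Dirichlet split. This is not FedLab's implementation: the function names sample_dirichlet and dirichlet_label_partition are hypothetical, and the assumption that a scheme such as "noniid-labeldir" works roughly this way is based only on its name. For each class, client proportions are drawn from a symmetric Dirichlet distribution, so a small dir_alpha concentrates a class on few clients and a large one spreads it almost evenly.

```python
import random
from collections import defaultdict


def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_label_partition(labels, num_clients, dir_alpha, seed=None):
    """Hypothetical helper: split sample indices across clients, class by class."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    client_indices = {cid: [] for cid in range(num_clients)}
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        proportions = sample_dirichlet(dir_alpha, num_clients, rng)

        # Turn the proportions into cumulative cut points over this class.
        start, cumulative = 0, 0.0
        for cid in range(num_clients):
            cumulative += proportions[cid]
            if cid == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(len(indices), round(cumulative * len(indices)))
            client_indices[cid].extend(indices[start:end])
            start = end
    return client_indices


# Toy labels: 1000 samples over 10 classes (illustrative data, not MNIST).
toy_labels = [i % 10 for i in range(1000)]
skewed = dirichlet_label_partition(toy_labels, num_clients=5, dir_alpha=0.1, seed=0)
even = dirichlet_label_partition(toy_labels, num_clients=5, dir_alpha=100.0, seed=0)
```

Every index is assigned to exactly one client; only the per-client class mix changes with dir_alpha.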

preprocess(partition='iid', dir_alpha=None, verbose=True, seed=None, download=True, transform=None, target_transform=None)

Partition the dataset for federated learning and save each client's subset to a data{cid}.pkl file, where cid is the client ID.

For details of partition schemes, please check Federated Dataset and DataPartitioner.
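The save step described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: save_iid_partitions is a hypothetical name, the split is a plain shuffled IID split, the files are written with pickle, and the "train" subdirectory is assumed rather than taken from this page. Only the data{cid}.pkl naming comes from the description.

```python
import os
import pickle
import random


def save_iid_partitions(samples, num_clients, path, seed=None):
    """Hypothetical helper: IID-split samples and write one file per client."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)

    # Assumed layout: <path>/train/data{cid}.pkl
    train_dir = os.path.join(path, "train")
    os.makedirs(train_dir, exist_ok=True)

    written = []
    for cid in range(num_clients):
        # Every num_clients-th shuffled index goes to client cid.
        subset = [samples[i] for i in order[cid::num_clients]]
        file_path = os.path.join(train_dir, f"data{cid}.pkl")
        with open(file_path, "wb") as f:
            pickle.dump(subset, f)
        written.append(file_path)
    return written
```

Calling save_iid_partitions(list(range(10)), 3, some_dir) would write data0.pkl, data1.pkl and data2.pkl under some_dir/train, one per client.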

get_dataset(cid, type='train')

Load the subdataset of the client with ID cid from its local file.

Parameters:
  • cid (int) – Client ID.

  • type (str, optional) – Dataset type, one of "train", "val" or "test". Defaults to "train".

Returns:

Dataset
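A minimal sketch of a per-client lookup, assuming each subset is stored as a pickle file at a path built from the dataset type and the client ID. The function name load_client_subset and the directory layout are hypothetical; only the data{cid}.pkl naming and the three type values come from this page.

```python
import os
import pickle

VALID_TYPES = ("train", "val", "test")


def load_client_subset(path, cid, type="train"):
    """Hypothetical helper: load one client's subset from <path>/<type>/data{cid}.pkl."""
    if type not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}, got {type!r}")
    file_path = os.path.join(path, type, f"data{cid}.pkl")
    with open(file_path, "rb") as f:
        return pickle.load(f)
```

A missing file surfaces as FileNotFoundError, which in this sketch signals that the partition step has not been run for that client or type.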

get_dataloader(cid, batch_size=None, type='train')

Return the DataLoader for the client with ID cid.

Parameters:
  • cid (int) – Client ID.

  • batch_size (int, optional) – Batch size passed to the DataLoader. Defaults to None.

  • type (str, optional) – Dataset type, one of "train", "val" or "test". Defaults to "train".
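The constructor and get_dataloader can be combined as in the sketch below, which uses only the signatures documented on this page. The helper first_client_batch is hypothetical, and the class is passed in as an argument so that the snippet itself does not import FedLab or download anything. The import path and directory names in the trailing comment are assumptions, as is the use of torchvision's ToTensor so that samples can be batched as tensors.

```python
def first_client_batch(dataset_cls, root, path, transform=None,
                       num_clients=10, batch_size=32, seed=0):
    """Hypothetical helper: partition the data, then fetch one batch for client 0."""
    # preprocess=True asks the constructor to run the partition step up front.
    fed_dataset = dataset_cls(
        root=root,
        path=path,
        num_clients=num_clients,
        download=True,
        preprocess=True,
        partition="iid",
        verbose=False,
        seed=seed,
        transform=transform,
    )
    loader = fed_dataset.get_dataloader(cid=0, batch_size=batch_size, type="train")
    return next(iter(loader))


# Intended use with FedLab and torchvision installed (not executed here,
# since it downloads MNIST; import path and directories are assumptions):
#
#   from torchvision import transforms
#   from fedlab.contrib.dataset.partitioned_mnist import PartitionedMNIST
#
#   images, labels = first_client_batch(
#       PartitionedMNIST,
#       root="./datasets/mnist",
#       path="./datasets/mnist_iid",
#       transform=transforms.ToTensor(),
#   )
```

Fixing seed keeps the client split reproducible across runs, which helps when comparing federated algorithms on the same partition.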