partition#

Module Contents#

  • DataPartitioner – Base class for data partition in federated learning.

  • CIFAR10Partitioner – CIFAR10 data partitioner.

  • CIFAR100Partitioner – CIFAR100 data partitioner.

  • BasicPartitioner – Basic data partitioner.

  • VisionPartitioner – Data partitioner for vision data.

  • MNISTPartitioner – Data partitioner for MNIST.

  • FMNISTPartitioner – Data partitioner for FashionMNIST.

  • SVHNPartitioner – Data partitioner for SVHN.

  • FCUBEPartitioner – FCUBE data partitioner.

  • AdultPartitioner – Data partitioner for Adult.

  • RCV1Partitioner – Data partitioner for RCV1.

  • CovtypePartitioner – Data partitioner for Covtype.

class DataPartitioner#

Bases: abc.ABC

Base class for data partition in federated learning.

Examples of DataPartitioner: BasicPartitioner, CIFAR10Partitioner.

For details and tutorials on the different data partition schemes and datasets, please check Federated Dataset and DataPartitioner.

abstract _perform_partition()#
abstract __getitem__(index)#
abstract __len__()#
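
The three abstract methods above define the whole interface a partitioner must provide. The following is a minimal sketch of that contract, not the library's source: the `DataPartitioner` stand-in is rebuilt from the method names listed here, and `RoundRobinPartitioner`, `num_samples` and `client_dict` are hypothetical names introduced only for illustration.

```python
from abc import ABC, abstractmethod


class DataPartitioner(ABC):
    """Stand-in base class exposing the three abstract methods listed above."""

    @abstractmethod
    def _perform_partition(self):
        raise NotImplementedError

    @abstractmethod
    def __getitem__(self, index):
        raise NotImplementedError

    @abstractmethod
    def __len__(self):
        raise NotImplementedError


class RoundRobinPartitioner(DataPartitioner):
    """Hypothetical subclass: sample i is assigned to client i % num_clients."""

    def __init__(self, num_samples, num_clients):
        self.num_samples = num_samples
        self.num_clients = num_clients
        self.client_dict = self._perform_partition()

    def _perform_partition(self):
        # Map each client ID to the list of sample indices it owns.
        return {
            cid: list(range(cid, self.num_samples, self.num_clients))
            for cid in range(self.num_clients)
        }

    def __getitem__(self, index):
        # Sample indices for the client with ID `index`.
        return self.client_dict[index]

    def __len__(self):
        # Number of clients in the partition.
        return len(self.client_dict)


partitioner = RoundRobinPartitioner(num_samples=10, num_clients=3)
print(len(partitioner), partitioner[0])
```

Indexing by client ID and taking `len()` is all that downstream code needs, which is why every concrete partitioner on this page exposes the same three methods.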
class CIFAR10Partitioner(targets, num_clients, balance=True, partition='iid', unbalance_sgm=0, num_shards=None, dir_alpha=None, verbose=True, min_require_size=None, seed=None)#

Bases: DataPartitioner

CIFAR10 data partitioner.

Partitions CIFAR10 across a given number of clients. Six partition schemes are currently supported, selected by passing different combinations of parameters at initialization.

For detailed usage, please check Federated Dataset and DataPartitioner.

Parameters:
  • targets (list or numpy.ndarray) – Targets of the dataset to partition. Each element is an integer class label in {0, 1, …, 9}.

  • num_clients (int) – Number of clients for data partition.

  • balance (bool, optional) – Whether to use a balanced partition over all clients. Defaults to True.

  • partition (str, optional) – Partition type. Only "iid", "shards", and "dirichlet" are supported. Defaults to "iid".

  • unbalance_sgm (float, optional) – Log-normal distribution variance for unbalanced data partition over clients. Defaults to 0, which gives a balanced partition.

  • num_shards (int, optional) – Number of shards in the non-iid "shards" partition. Only works if partition="shards". Defaults to None.

  • dir_alpha (float, optional) – Dirichlet distribution parameter for non-iid partition. Only works if partition="dirichlet". Defaults to None.

  • verbose (bool, optional) – Whether to print the partition process. Defaults to True.

  • min_require_size (int, optional) – Minimum required number of samples for each client. If None, it is set to num_classes. Only works if partition="noniid-labeldir".

  • seed (int, optional) – Random seed. Defaults to None.
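
To make the role of unbalance_sgm more concrete, here is a rough, dependency-free sketch of how log-normally distributed client sizes could be drawn. It is an illustration under stated assumptions, not the library's code: the function name `lognormal_client_sizes` is hypothetical, and `sigma` is passed to `random.lognormvariate` as the log-space standard deviation, which may or may not match how the partitioner interprets unbalance_sgm.

```python
import math
import random


def lognormal_client_sizes(num_samples, num_clients, sigma, seed=None):
    """Draw per-client sample counts that sum exactly to num_samples."""
    rng = random.Random(seed)
    mean_size = num_samples / num_clients
    if sigma == 0:
        # No spread requested: every client gets the same share.
        raw = [mean_size] * num_clients
    else:
        # Centre the log-normal on the balanced size (assumed convention).
        raw = [
            rng.lognormvariate(math.log(mean_size), sigma)
            for _ in range(num_clients)
        ]
    total = sum(raw)
    sizes = [int(r / total * num_samples) for r in raw]
    # Flooring can leave a few samples unassigned; hand them out one by one.
    for i in range(num_samples - sum(sizes)):
        sizes[i % num_clients] += 1
    return sizes


sizes = lognormal_client_sizes(num_samples=50000, num_clients=10, sigma=0.3, seed=42)
print(sizes, sum(sizes))
```

Larger values of `sigma` spread the client sizes further apart, while `sigma=0` reduces to the balanced case.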

num_classes = 10#
_perform_partition()#
__getitem__(index)#

Obtain sample indices for client index.

Parameters:

index (int) – Client ID.

Returns:

List of sample indices for client ID index.

Return type:

list

__len__()#

Usually equal to the number of clients.
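
The partition="shards" option is not spelled out on this page. One common reading, assumed for this sketch, is the label-sorted shard scheme from the FedAvg paper: sort samples by label, cut them into num_shards equal shards, and give every client the same number of shards. The function `shards_partition` below is a hypothetical stand-alone illustration and not the library's implementation.

```python
import random


def shards_partition(targets, num_clients, num_shards, seed=None):
    """Sort samples by label, cut them into shards, and deal shards to clients."""
    if num_shards % num_clients != 0:
        raise ValueError("this sketch needs num_shards to be a multiple of num_clients")
    rng = random.Random(seed)
    # Sorting by label means each shard covers very few classes.
    sorted_idx = sorted(range(len(targets)), key=lambda i: targets[i])
    # Samples beyond shard_size * num_shards are dropped in this sketch.
    shard_size = len(targets) // num_shards
    shard_ids = list(range(num_shards))
    rng.shuffle(shard_ids)
    shards_per_client = num_shards // num_clients
    client_dict = {}
    for cid in range(num_clients):
        owned = shard_ids[cid * shards_per_client:(cid + 1) * shards_per_client]
        client_dict[cid] = [
            idx
            for s in owned
            for idx in sorted_idx[s * shard_size:(s + 1) * shard_size]
        ]
    return client_dict


# Toy labels: 10 classes with 20 samples each.
targets = [label for label in range(10) for _ in range(20)]
client_dict = shards_partition(targets, num_clients=10, num_shards=20, seed=0)
print({cid: sorted({targets[i] for i in idx}) for cid, idx in client_dict.items()})
```

With two shards per client and each shard holding a single class, every client ends up with at most two classes, which is what makes this scheme strongly non-IID.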

class CIFAR100Partitioner(targets, num_clients, balance=True, partition='iid', unbalance_sgm=0, num_shards=None, dir_alpha=None, verbose=True, min_require_size=None, seed=None)#

Bases: CIFAR10Partitioner

CIFAR100 data partitioner.

This is a subclass of the CIFAR10Partitioner. For details, please check Federated Dataset and DataPartitioner.

num_classes = 100#
class BasicPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, min_require_size=None, seed=None)#

Bases: DataPartitioner

Basic data partitioner.

Supported partition schemes:

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

For more details, please check Federated Learning on Non-IID Data Silos: An Experimental Study and Federated Dataset and DataPartitioner.
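
As a rough illustration of the quantity-skew (Dirichlet) scheme listed above, the sketch below shuffles all sample indices and then cuts them into client chunks whose relative sizes follow a Dirichlet draw. This is a simplified, standard-library-only reading of the idea, not the library's code; `sample_dirichlet` and `quantity_skew_partition` are hypothetical helper names.

```python
import random


def sample_dirichlet(rng, alpha, k):
    """Sample a k-dimensional symmetric Dirichlet(alpha) vector via gamma draws."""
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(draws)
        if total > 0:  # guard against all draws underflowing to zero
            return [d / total for d in draws]


def quantity_skew_partition(num_samples, num_clients, alpha, seed=None):
    """Give each client a Dirichlet-distributed share of shuffled samples."""
    rng = random.Random(seed)
    idx = list(range(num_samples))
    rng.shuffle(idx)  # shuffling keeps each client's label mix roughly IID
    props = sample_dirichlet(rng, alpha, num_clients)
    client_dict, start, cum = {}, 0, 0.0
    for cid, p in enumerate(props):
        cum += p
        # The last client takes whatever remains so no sample is lost.
        end = num_samples if cid == num_clients - 1 else int(cum * num_samples)
        client_dict[cid] = idx[start:end]
        start = end
    return client_dict


client_dict = quantity_skew_partition(num_samples=1000, num_clients=5, alpha=1.0, seed=0)
print({cid: len(v) for cid, v in client_dict.items()})
```

Smaller values of `alpha` make the client sizes more uneven; only the amount of data per client is skewed here, not the label distribution.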

Parameters:
  • targets (list or numpy.ndarray) – Sample targets. Unshuffled preferred.

  • num_clients (int) – Number of clients for partition.

  • partition (str) – Partition name. Only the "noniid-#label", "noniid-labeldir", "unbalance" and "iid" partition schemes are supported.

  • dir_alpha (float) – Parameter alpha for the Dirichlet distribution. Only works if partition="noniid-labeldir".

  • major_classes_num (int) – Number of major classes for each client. Only works if partition="noniid-#label".

  • verbose (bool) – Whether to output intermediate information. Defaults to True.

  • min_require_size (int, optional) – Minimum required number of samples for each client. If None, it is set to num_classes. Only works if partition="noniid-labeldir".

  • seed (int) – Random seed. Defaults to None.

Returns:

{ client_id: indices}.

Return type:

dict
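
The "noniid-labeldir" scheme, together with dir_alpha and min_require_size, can be pictured with the sketch below. For each class it draws client proportions from a Dirichlet distribution and splits that class's samples accordingly, redrawing the whole partition until every client holds at least min_require_size samples. It is a simplified, standard-library-only illustration and not the library's implementation; `sample_dirichlet` and `label_dirichlet_partition` are hypothetical names.

```python
import random


def sample_dirichlet(rng, alpha, k):
    """Sample a k-dimensional symmetric Dirichlet(alpha) vector via gamma draws."""
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
        total = sum(draws)
        if total > 0:  # guard against all draws underflowing to zero
            return [d / total for d in draws]


def label_dirichlet_partition(targets, num_clients, alpha, min_require_size, seed=None):
    """Split each class across clients with Dirichlet-distributed proportions."""
    rng = random.Random(seed)
    classes = sorted(set(targets))
    while True:
        client_dict = {cid: [] for cid in range(num_clients)}
        for c in classes:
            idx = [i for i, t in enumerate(targets) if t == c]
            rng.shuffle(idx)
            props = sample_dirichlet(rng, alpha, num_clients)
            # Turn cumulative proportions into cut points over this class.
            start, cum = 0, 0.0
            for cid, p in enumerate(props):
                cum += p
                end = len(idx) if cid == num_clients - 1 else int(cum * len(idx))
                client_dict[cid].extend(idx[start:end])
                start = end
        # Redraw if any client ended up with too few samples.
        if min(len(v) for v in client_dict.values()) >= min_require_size:
            return client_dict


# Toy labels: 10 classes with 100 samples each.
targets = [c for c in range(10) for _ in range(100)]
client_dict = label_dirichlet_partition(
    targets, num_clients=5, alpha=0.5, min_require_size=10, seed=0
)
print({cid: len(v) for cid, v in client_dict.items()})
```

The result has the `{client_id: indices}` shape described above. A small `alpha` concentrates each class on a few clients, while a large `alpha` approaches an even split.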

num_classes = 2#
_perform_partition()#
__getitem__(index)#
__len__()#
class VisionPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: BasicPartitioner

Data partitioner for vision data.

Supported partition schemes for vision data:

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

For more details, please check Federated Learning on Non-IID Data Silos: An Experimental Study.

Parameters:
  • targets (list or numpy.ndarray) – Sample targets. Unshuffled preferred.

  • num_clients (int) – Number of clients for partition.

  • partition (str) – Partition name. Only the "noniid-#label", "noniid-labeldir", "unbalance" and "iid" partition schemes are supported.

  • dir_alpha (float) – Parameter alpha for the Dirichlet distribution. Only works if partition="noniid-labeldir".

  • major_classes_num (int) – Number of major classes for each client. Only works if partition="noniid-#label".

  • verbose (bool) – Whether to output intermediate information. Defaults to True.

  • seed (int) – Random seed. Defaults to None.

Returns:

{ client_id: indices}.

Return type:

dict
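
For the quantity-based label skew selected with partition="noniid-#label", each client holds samples from only major_classes_num classes. The sketch below is one simple way to realise that idea with the standard library. It is an assumption-laden illustration rather than the library's code, and `label_quantity_partition` is a hypothetical name.

```python
import random


def label_quantity_partition(targets, num_clients, num_classes, major_classes_num, seed=None):
    """Give every client samples from exactly major_classes_num classes."""
    rng = random.Random(seed)
    # Choose each client's classes. Starting from cid % num_classes helps
    # every class get picked when num_clients >= num_classes.
    client_classes = []
    for cid in range(num_clients):
        first = cid % num_classes
        others = [c for c in range(num_classes) if c != first]
        client_classes.append([first] + rng.sample(others, major_classes_num - 1))
    client_dict = {cid: [] for cid in range(num_clients)}
    for c in range(num_classes):
        owners = [cid for cid in range(num_clients) if c in client_classes[cid]]
        if not owners:
            continue  # no client picked this class, so its samples stay unused
        idx = [i for i, t in enumerate(targets) if t == c]
        rng.shuffle(idx)
        # Deal this class's samples round-robin to the clients that own it.
        for pos, owner in enumerate(owners):
            client_dict[owner].extend(idx[pos::len(owners)])
    return client_dict


# Toy labels: 10 classes with 50 samples each.
targets = [c for c in range(10) for _ in range(50)]
client_dict = label_quantity_partition(
    targets, num_clients=10, num_classes=10, major_classes_num=2, seed=0
)
print({cid: sorted({targets[i] for i in idx}) for cid, idx in client_dict.items()})
```

Each client's index list contains exactly two distinct labels in this toy run, which is the behaviour the major_classes_num parameter is meant to control.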

num_classes = 10#
class MNISTPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

Data partitioner for MNIST.

For details, please check VisionPartitioner and Federated Dataset and DataPartitioner.

num_features = 784#
class FMNISTPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

Data partitioner for FashionMNIST.

For details, please check VisionPartitioner and Federated Dataset and DataPartitioner.

num_features = 784#
class SVHNPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

Data partitioner for SVHN.

For details, please check VisionPartitioner and Federated Dataset and DataPartitioner.

num_features = 1024#
class FCUBEPartitioner(data, partition)#

Bases: DataPartitioner

FCUBE data partitioner.

FCUBE is a synthetic dataset for research on non-IID scenarios with feature imbalance. The dataset and its partition methods were proposed in Federated Learning on Non-IID Data Silos: An Experimental Study.

Supported partition methods for FCUBE:

  • feature-distribution-skew:synthetic

  • IID

For more details, please refer to Section (IV-B-b) of the original paper. For detailed usage, please check Federated Dataset and DataPartitioner.

Parameters:
  • data (numpy.ndarray) – Data of dataset FCUBE.

  • partition (str) – Partition type. Only "synthetic" and "iid" are supported.

num_classes = 2#
num_clients = 4#
_perform_partition()#
__getitem__(index)#
__len__()#
class AdultPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, min_require_size=None, seed=None)#

Bases: BasicPartitioner

Data partitioner for Adult.

For details, please check BasicPartitioner and Federated Dataset and DataPartitioner.

num_features = 123#
num_classes = 2#
class RCV1Partitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, min_require_size=None, seed=None)#

Bases: BasicPartitioner

Data partitioner for RCV1.

For details, please check BasicPartitioner and Federated Dataset and DataPartitioner.

num_features = 47236#
num_classes = 2#
class CovtypePartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, min_require_size=None, seed=None)#

Bases: BasicPartitioner

Data partitioner for Covtype.

For details, please check BasicPartitioner and Federated Dataset and DataPartitioner.

num_features = 54#
num_classes = 2#