partition#

Module Contents#

DataPartitioner

Base class for data partition in federated learning.

CIFAR10Partitioner

CIFAR10 data partitioner.

CIFAR100Partitioner

CIFAR100 data partitioner.

BasicPartitioner

Basic data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

VisionPartitioner

Data partitioner for vision datasets, supporting label-distribution-skew, quantity-skew, and IID schemes.

MNISTPartitioner

MNIST data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

FMNISTPartitioner

FMNIST data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

SVHNPartitioner

SVHN data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

FCUBEPartitioner

FCUBE data partitioner.

AdultPartitioner

Adult data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

RCV1Partitioner

RCV1 data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

CovtypePartitioner

Covtype data partitioner supporting label-distribution-skew, quantity-skew, and IID schemes.

class DataPartitioner#

Bases: abc.ABC

Base class for data partition in federated learning.

abstract _perform_partition(self)#
abstract __getitem__(self, index)#
abstract __len__(self)#
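
The three abstract methods above describe a dict-like contract: a concrete partitioner computes a mapping from client ID to sample indices, `__getitem__` looks up one client, and `__len__` reports how many clients there are. The sketch below illustrates that contract under stated assumptions. `PartitionerSketch` and `ContiguousPartitioner` are hypothetical stand-ins written for this example so that it runs without the library installed; they are not part of this module.

```python
from abc import ABC, abstractmethod


class PartitionerSketch(ABC):
    """Hypothetical stand-in mirroring the abstract interface of DataPartitioner."""

    @abstractmethod
    def _perform_partition(self):
        ...

    @abstractmethod
    def __getitem__(self, index):
        ...

    @abstractmethod
    def __len__(self):
        ...


class ContiguousPartitioner(PartitionerSketch):
    """Toy subclass: splits sample indices into contiguous, near-equal chunks."""

    def __init__(self, num_samples, num_clients):
        self.num_samples = num_samples
        self.num_clients = num_clients
        self.client_dict = self._perform_partition()

    def _perform_partition(self):
        # The first `extra` clients receive one additional sample each.
        base, extra = divmod(self.num_samples, self.num_clients)
        client_dict, start = {}, 0
        for cid in range(self.num_clients):
            size = base + (1 if cid < extra else 0)
            client_dict[cid] = list(range(start, start + size))
            start += size
        return client_dict

    def __getitem__(self, index):
        # Sample indices owned by client `index`.
        return self.client_dict[index]

    def __len__(self):
        # Number of clients.
        return len(self.client_dict)


part = ContiguousPartitioner(num_samples=10, num_clients=3)
```

Here `part[0]` holds the indices of the first client and `len(part)` is the client count, which is the same access pattern the concrete partitioners below document.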
class CIFAR10Partitioner(targets, num_clients, balance=True, partition='iid', unbalance_sgm=0, num_shards=None, dir_alpha=None, verbose=True, seed=None)#

Bases: DataPartitioner

CIFAR10 data partitioner.

Partition CIFAR10 for a given number of clients. Six partition schemes are currently supported; each is selected by passing a different combination of parameters at initialization.

Parameters
  • targets (list or numpy.ndarray) – Targets of dataset for partition. Each element is in range of [0, 1, …, 9].

  • num_clients (int) – Number of clients for data partition.

  • balance (bool, optional) – Balanced partition over all clients or not. Default as True.

  • partition (str, optional) – Partition type. Only "iid", "shards", and "dirichlet" are supported. Default as "iid".

  • unbalance_sgm (float, optional) – Log-normal distribution variance for unbalanced data partition over clients. Default as 0 for balanced partition.

  • num_shards (int, optional) – Number of shards in non-iid "shards" partition. Only works if partition="shards". Default as None.

  • dir_alpha (float, optional) – Dirichlet distribution parameter for non-iid partition. Only works if partition="dirichlet". Default as None.

  • verbose (bool, optional) – Whether to print partition process. Default as True.

  • seed (int, optional) – Random seed. Default as None.
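
To make the `"shards"` and `"dirichlet"` options more concrete, here is a minimal standard-library sketch of the two ideas as they are commonly described in the federated learning literature. It is an illustration, not this class's implementation: the function names `shards_partition` and `dirichlet_partition` are hypothetical, and details such as how leftover samples are handled are assumptions.

```python
import random


def shards_partition(targets, num_clients, num_shards, seed=None):
    """Sort indices by label, cut them into shards, deal shards to clients."""
    rng = random.Random(seed)
    # Sorting by label means each shard spans only a few classes.
    sorted_idx = sorted(range(len(targets)), key=lambda i: targets[i])
    # Assumption: samples that do not fill a whole shard are dropped.
    shard_size = len(targets) // num_shards
    shards = [sorted_idx[s * shard_size:(s + 1) * shard_size]
              for s in range(num_shards)]
    rng.shuffle(shards)
    per_client = num_shards // num_clients
    return {
        cid: sorted(i for shard in shards[cid * per_client:(cid + 1) * per_client]
                    for i in shard)
        for cid in range(num_clients)
    }


def dirichlet_partition(targets, num_clients, num_classes, dir_alpha, seed=None):
    """For every class, split its samples across clients with Dirichlet weights."""
    rng = random.Random(seed)
    client_dict = {cid: [] for cid in range(num_clients)}
    for cls in range(num_classes):
        idx = [i for i, t in enumerate(targets) if t == cls]
        rng.shuffle(idx)
        # A Dirichlet draw is a vector of Gamma(alpha, 1) draws, normalized.
        gammas = [rng.gammavariate(dir_alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas) or 1.0  # guard against an all-zero draw
        # Turn cumulative proportions into cut points inside this class.
        cuts, acc = [0], 0.0
        for g in gammas[:-1]:
            acc += g / total
            cuts.append(min(int(acc * len(idx)), len(idx)))
        cuts.append(len(idx))
        for cid in range(num_clients):
            client_dict[cid].extend(idx[cuts[cid]:cuts[cid + 1]])
    return client_dict


targets = [i % 10 for i in range(1000)]
shard_parts = shards_partition(targets, num_clients=10, num_shards=20, seed=0)
dir_parts = dirichlet_partition(targets, num_clients=10, num_classes=10,
                                dir_alpha=0.3, seed=0)
```

In this sketch a smaller `dir_alpha` concentrates each class on fewer clients, while a larger value approaches an even split.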

num_classes = 10#
_perform_partition(self)#
__getitem__(self, index)#

Obtain sample indices for client index.

Parameters

index (int) – Client ID.

Returns

List of sample indices for client ID index.

Return type

list

__len__(self)#

Usually equal to the number of clients.
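
Since `__getitem__` returns a list of sample indices for a client and `__len__` gives the number of clients, downstream code can iterate over clients with nothing more than `len()` and integer indexing. The following sketch counts labels per client that way. The dictionary `client_indices` is a hypothetical stand-in for a fitted partitioner, and `label_histograms` is a helper invented for this example.

```python
from collections import Counter

# Stand-in for a fitted partitioner: client ID -> list of sample indices.
client_indices = {0: [0, 1, 4], 1: [2, 3, 5]}
targets = [0, 0, 1, 1, 0, 2]


def label_histograms(partition, targets):
    """Count labels per client using only len() and integer indexing."""
    return {cid: Counter(targets[i] for i in partition[cid])
            for cid in range(len(partition))}


hists = label_histograms(client_indices, targets)
```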

class CIFAR100Partitioner(targets, num_clients, balance=True, partition='iid', unbalance_sgm=0, num_shards=None, dir_alpha=None, verbose=True, seed=None)#

Bases: CIFAR10Partitioner

CIFAR100 data partitioner.

This is a subclass of the CIFAR10Partitioner.

num_classes = 100#
class BasicPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, seed=None)#

Bases: DataPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID
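
As an illustration of the quantity-based label-distribution-skew scheme in the list above, the sketch below gives each client a fixed number of classes and then divides each class among the clients that own it. This follows the general idea described in the non-IID benchmarking literature and is not taken from this class's source; the function name `label_quantity_partition` is hypothetical, and the choice to leave unowned classes unassigned is an assumption.

```python
import random


def label_quantity_partition(targets, num_clients, num_classes,
                             major_classes_num, seed=None):
    """Give each client `major_classes_num` classes, then split each class."""
    rng = random.Random(seed)
    # Step 1: pick the distinct classes each client is allowed to hold.
    client_classes = [rng.sample(range(num_classes), major_classes_num)
                      for _ in range(num_clients)]
    client_dict = {cid: [] for cid in range(num_clients)}
    # Step 2: deal the samples of each class round-robin to its owners.
    for cls in range(num_classes):
        owners = [cid for cid, classes in enumerate(client_classes)
                  if cls in classes]
        if not owners:
            continue  # no client drew this class, so its samples stay unassigned
        idx = [i for i, t in enumerate(targets) if t == cls]
        rng.shuffle(idx)
        for pos, sample in enumerate(idx):
            client_dict[owners[pos % len(owners)]].append(sample)
    return client_dict


targets = [i % 10 for i in range(1000)]
parts = label_quantity_partition(targets, num_clients=5, num_classes=10,
                                 major_classes_num=2, seed=0)
```

With `major_classes_num=1` every client sees a single class, which is the most extreme form of label skew this sketch can produce.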

Parameters
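  • targets (list or numpy.ndarray) – Targets of dataset for partition.

  • num_clients (int) – Number of clients for data partition.

  • partition (str, optional) – Partition scheme. Default as "iid".

  • dir_alpha (float, optional) – Dirichlet distribution parameter, used by the Dirichlet-based schemes. Default as None.

  • major_classes_num (int, optional) – Number of major classes per client, used by the quantity-based label-distribution-skew scheme. Default as 1.

  • verbose (bool, optional) – Whether to print partition process. Default as True.

  • seed (int, optional) – Random seed. Default as None.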

num_classes = 2#
_perform_partition(self)#
__getitem__(self, index)#
__len__(self)#
class VisionPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: BasicPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID
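
The quantity-skew (Dirichlet) scheme in the list above varies how many samples each client holds rather than which labels it holds. A minimal sketch of that idea follows: client shares are drawn from a Dirichlet distribution and a shuffled index list is cut accordingly. The function name `quantity_skew_partition` is hypothetical, and this is an illustration under those assumptions rather than the library's implementation.

```python
import random


def quantity_skew_partition(num_samples, num_clients, dir_alpha, seed=None):
    """Cut a shuffled index list into client chunks with Dirichlet-sized shares."""
    rng = random.Random(seed)
    idx = list(range(num_samples))
    rng.shuffle(idx)  # shuffling keeps the label mix of each chunk roughly IID
    # A Dirichlet draw is a vector of Gamma(alpha, 1) draws, normalized.
    gammas = [rng.gammavariate(dir_alpha, 1.0) for _ in range(num_clients)]
    total = sum(gammas) or 1.0  # guard against an all-zero draw
    cuts, acc = [0], 0.0
    for g in gammas[:-1]:
        acc += g / total
        cuts.append(min(int(acc * num_samples), num_samples))
    cuts.append(num_samples)
    return {cid: idx[cuts[cid]:cuts[cid + 1]] for cid in range(num_clients)}


size_parts = quantity_skew_partition(num_samples=1000, num_clients=8,
                                     dir_alpha=0.5, seed=0)
sizes = [len(v) for v in size_parts.values()]
```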

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_classes = 10#
class MNISTPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 784#
class FMNISTPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 784#
class SVHNPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=None, verbose=True, seed=None)#

Bases: VisionPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 1024#
class FCUBEPartitioner(data, partition)#

Bases: DataPartitioner

FCUBE data partitioner.

FCUBE is a synthetic dataset for research on non-IID scenarios with feature imbalance. The dataset and its partition methods were proposed in Federated Learning on Non-IID Data Silos: An Experimental Study.

Supported partition methods for FCUBE:

  • feature-distribution-skew:synthetic

  • IID

For more details, please refer to Section (IV-B-b) of the original paper.

Parameters

data (numpy.ndarray) – Data of dataset FCUBE.
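
The synthetic feature-distribution-skew scheme can be pictured as follows, based on one reading of the cited paper: points live in a 3-D cube, the cube is cut into eight octants by the coordinate planes, and each of the four clients receives a pair of octants that mirror each other through the origin. The sketch below implements that reading. The function name `fcube_synthetic_partition` is hypothetical, the mapping from octant pair to client ID is an arbitrary choice, and none of this is taken from the class's source.

```python
import random


def fcube_synthetic_partition(data):
    """Assign each 3-D point to one of 4 clients by its octant.

    Octants that mirror each other through the origin share a client,
    so every client sees both signs of the first feature.
    """
    client_dict = {cid: [] for cid in range(4)}
    for i, (x1, x2, x3) in enumerate(data):
        # Flip the point into the x1 >= 0 half-space; a point and its mirror
        # image through the origin then share the same (x2, x3) sign pattern.
        if x1 < 0:
            x2, x3 = -x2, -x3
        cid = 2 * int(x2 >= 0) + int(x3 >= 0)
        client_dict[cid].append(i)
    return client_dict


rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
          for _ in range(400)]
fcube_parts = fcube_synthetic_partition(points)

# A point and its mirror image through the origin land on the same client.
mirror_parts = fcube_synthetic_partition(
    [(0.5, 0.5, 0.5), (-0.5, -0.5, -0.5), (0.5, -0.5, 0.5), (-0.5, 0.5, -0.5)]
)
```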

num_classes = 2#
num_clients = 4#
_perform_partition(self)#
__getitem__(self, index)#
__len__(self)#
class AdultPartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, seed=None)#

Bases: BasicPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 123#
num_classes = 2#
class RCV1Partitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, seed=None)#

Bases: BasicPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 47236#
num_classes = 2#
class CovtypePartitioner(targets, num_clients, partition='iid', dir_alpha=None, major_classes_num=1, verbose=True, seed=None)#

Bases: BasicPartitioner

  • label-distribution-skew:quantity-based

  • label-distribution-skew:distribution-based (Dirichlet)

  • quantity-skew (Dirichlet)

  • IID

Parameters
  • targets

  • num_clients

  • partition

  • dir_alpha

  • major_classes_num

  • verbose

  • seed

num_features = 54#
num_classes = 2#