functional
Module Contents
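split_indices(num_cumsum, rand_perm) |
balance_split(num_clients, num_samples) | Assign the same sample number to each client.
lognormal_unbalance_split(num_clients, num_samples, unbalance_sgm) | Assign a different sample number to each client using a Log-Normal distribution.
dirichlet_unbalance_split(num_clients, num_samples, alpha) | Assign a different sample number to each client using a Dirichlet distribution.
homo_partition(client_sample_nums, num_samples) | Partition data indices in an IID way, given the sample number of each client.
hetero_dir_partition(targets, num_clients, num_classes, dir_alpha, min_require_size=None) | Non-iid partition based on Dirichlet distribution ("hetero-dir" partition).
shards_partition(targets, num_clients, num_shards) | Non-iid partition used in the FedAvg paper.
client_inner_dirichlet_partition(targets, num_clients, num_classes, dir_alpha, client_sample_nums, verbose=True) | Non-iid Dirichlet partition.
label_skew_quantity_based_partition(targets, num_clients, num_classes, major_classes_num) |
fcube_synthetic_partition(data) | Feature-distribution-skew: synthetic partition.
samples_num_count(client_dict, num_clients) |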
- split_indices(num_cumsum, rand_perm)
- balance_split(num_clients, num_samples)
Assign the same sample number to each client.
- Parameters
- Returns
A numpy array consisting of num_clients integer elements, each representing the sample number of the corresponding client.
- Return type
numpy.ndarray
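A balanced split can be sketched in a few lines of NumPy. This is an illustration only, not the module's implementation: the name `balance_split_sketch` is hypothetical, and dropping the remainder when `num_samples` is not divisible by `num_clients` is an assumption.

```python
import numpy as np

def balance_split_sketch(num_clients, num_samples):
    """Give every client the same sample count.

    Assumption: any remainder of num_samples / num_clients is dropped.
    """
    per_client = num_samples // num_clients
    return np.full(num_clients, per_client, dtype=int)

balanced_counts = balance_split_sketch(num_clients=4, num_samples=10)
```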
- lognormal_unbalance_split(num_clients, num_samples, unbalance_sgm)
Assign a different sample number to each client using a Log-Normal distribution.
The sample numbers for the clients are drawn from a Log-Normal distribution.
- Parameters
- Returns
A numpy array consisting of num_clients integer elements, each representing the sample number of the corresponding client.
- Return type
numpy.ndarray
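One way to draw unbalanced sample numbers from a Log-Normal distribution is sketched below. It is not the module's implementation: `lognormal_split_sketch`, the `seed` argument, centring the distribution on the average client size, and the final rescaling step are all assumptions made for illustration.

```python
import numpy as np

def lognormal_split_sketch(num_clients, num_samples, sigma, seed=0):
    """Draw per-client sample counts from a Log-Normal distribution."""
    rng = np.random.default_rng(seed)
    mean_size = num_samples / num_clients
    if sigma == 0:
        # Assumption: zero spread degenerates to a balanced split.
        return np.full(num_clients, int(mean_size), dtype=int)
    raw = rng.lognormal(mean=np.log(mean_size), sigma=sigma, size=num_clients)
    # Rescale so the counts add up to at most num_samples, then truncate.
    return (raw / raw.sum() * num_samples).astype(int)

lognormal_counts = lognormal_split_sketch(num_clients=5, num_samples=1000, sigma=0.5)
```

Because the counts are truncated to integers, their sum can fall slightly short of `num_samples`.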
- dirichlet_unbalance_split(num_clients, num_samples, alpha)
Assign a different sample number to each client using a Dirichlet distribution.
The sample proportions for the clients are drawn from a Dirichlet distribution.
- Parameters
- Returns
A numpy array consisting of num_clients integer elements, each representing the sample number of the corresponding client.
- Return type
numpy.ndarray
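A Dirichlet-based split of sample numbers might look like the sketch below. It is not the module's implementation: `dirichlet_split_sketch` and `seed` are hypothetical names, and using a symmetric Dirichlet with the same `alpha` for every client is an assumption.

```python
import numpy as np

def dirichlet_split_sketch(num_clients, num_samples, alpha, seed=0):
    """Draw client proportions from a symmetric Dirichlet and scale them."""
    rng = np.random.default_rng(seed)
    # One proportion per client; the proportions sum to 1.
    proportions = rng.dirichlet(np.repeat(alpha, num_clients))
    # Truncation means the counts may sum to slightly less than num_samples.
    return (proportions * num_samples).astype(int)

dirichlet_counts = dirichlet_split_sketch(num_clients=5, num_samples=1000, alpha=0.5)
```

Smaller values of `alpha` concentrate the mass on fewer clients, so the resulting counts become more unbalanced.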
- homo_partition(client_sample_nums, num_samples)
Partition data indices in an IID way, given the sample number of each client.
- Parameters
client_sample_nums (numpy.ndarray) – Sample number of each client.
num_samples (int) – Number of samples.
- Returns
{ client_id: indices}.
- Return type
dict
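The IID partition can be illustrated by cutting a random permutation of the indices at the cumulative sample counts. The sketch below is not the module's implementation: `homo_partition_sketch` and `seed` are hypothetical, and it assumes the counts sum to at most `num_samples`.

```python
import numpy as np

def homo_partition_sketch(client_sample_nums, num_samples, seed=0):
    """Split a random permutation of the indices at cumulative boundaries."""
    rng = np.random.default_rng(seed)
    rand_perm = rng.permutation(num_samples)
    num_cumsum = np.cumsum(client_sample_nums)
    # np.split returns one extra trailing piece holding any leftover indices.
    pieces = np.split(rand_perm, num_cumsum)[:-1]
    return {cid: idx for cid, idx in enumerate(pieces)}

iid_dict = homo_partition_sketch(np.array([3, 3, 4]), num_samples=10)
```

Each client receives a disjoint, uniformly random subset of the indices, so the label distribution of every client matches the global one in expectation.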
- hetero_dir_partition(targets, num_clients, num_classes, dir_alpha, min_require_size=None)
Non-iid partition based on Dirichlet distribution. The method is the “hetero-dir” partition from Bayesian Nonparametric Federated Learning of Neural Networks and Federated Learning with Matched Averaging.
This method simulates a heterogeneous partition in which both the number of data points and the class proportions are unbalanced. Samples are partitioned into \(J\) clients by sampling \(p_k \sim \text{Dir}_{J}(\alpha)\) and allocating a \(p_{k,j}\) proportion of the samples of class \(k\) to local client \(j\).
The sample number for each client is decided inside this function.
- Parameters
targets (list or numpy.ndarray) – Sample targets. Unshuffled preferred.
num_clients (int) – Number of clients for partition.
num_classes (int) – Number of classes in samples.
dir_alpha (float) – Parameter alpha for Dirichlet distribution.
min_require_size (int, optional) – Minimum required sample number for each client. If set to None, it defaults to num_classes.
- Returns
{ client_id: indices}.
- Return type
dict
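The per-class allocation described above can be sketched as follows. This is a simplified illustration, not the module's implementation: `hetero_dir_sketch` and `seed` are hypothetical, and the sketch leaves out the `min_require_size` check, so a client may end up with very few samples.

```python
import numpy as np

def hetero_dir_sketch(targets, num_clients, num_classes, dir_alpha, seed=0):
    """For each class k, draw p_k ~ Dir_J(alpha) and split that class by p_k."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    client_indices = [[] for _ in range(num_clients)]
    for k in range(num_classes):
        idx_k = np.flatnonzero(targets == k)
        rng.shuffle(idx_k)
        p_k = rng.dirichlet(np.repeat(dir_alpha, num_clients))
        # Cut points so that client j receives roughly p_{k,j} of class k.
        cuts = (np.cumsum(p_k) * len(idx_k)).astype(int)[:-1]
        for j, part in enumerate(np.split(idx_k, cuts)):
            client_indices[j].extend(part.tolist())
    return {j: np.array(ix, dtype=int) for j, ix in enumerate(client_indices)}

toy_targets = np.repeat(np.arange(3), 20)  # 3 classes, 20 samples each
hetero_dict = hetero_dir_sketch(toy_targets, num_clients=4, num_classes=3, dir_alpha=0.5)
```

Both the class mix and the total size of each client vary here, which is the behaviour the description attributes to this partition.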
- shards_partition(targets, num_clients, num_shards)
Non-iid partition used in the FedAvg paper.
- Parameters
targets (list or numpy.ndarray) – Sample targets. Unshuffled preferred.
num_clients (int) – Number of clients for partition.
num_shards (int) – Number of shards in partition.
- Returns
{ client_id: indices}.
- Return type
dict
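The shard-based scheme from the FedAvg paper sorts samples by label, cuts them into shards, and hands each client a few shards. The sketch below is an illustration under stated assumptions, not the module's implementation: `shards_partition_sketch` and `seed` are hypothetical, and `num_shards` is assumed to be a multiple of `num_clients`.

```python
import numpy as np

def shards_partition_sketch(targets, num_clients, num_shards, seed=0):
    """Sort by label, cut into shards, give each client an equal number of shards."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    sorted_idx = np.argsort(targets, kind="stable")
    shards = np.array_split(sorted_idx, num_shards)
    shard_order = rng.permutation(num_shards)
    per_client = num_shards // num_clients  # assumes an exact division
    return {
        cid: np.concatenate(
            [shards[s] for s in shard_order[cid * per_client:(cid + 1) * per_client]]
        )
        for cid in range(num_clients)
    }

shard_targets = np.repeat(np.arange(4), 5)  # 4 classes, 5 samples each
shard_dict = shards_partition_sketch(shard_targets, num_clients=2, num_shards=4)
```

In this toy setup every shard contains a single class, so each client sees only two of the four classes.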
- client_inner_dirichlet_partition(targets, num_clients, num_classes, dir_alpha, client_sample_nums, verbose=True)
Non-iid Dirichlet partition.
The method is from the paper Federated Learning Based on Dynamic Regularization. Unlike hetero_dir_partition(), this function takes a specific sample number for every client, given in client_sample_nums.
- Parameters
targets (list or numpy.ndarray) – Sample targets.
num_clients (int) – Number of clients for partition.
num_classes (int) – Number of classes in samples.
dir_alpha (float) – Parameter alpha for Dirichlet distribution.
client_sample_nums (numpy.ndarray) – A numpy array consisting of num_clients integer elements, each representing the sample number of the corresponding client.
verbose (bool, optional) – Whether to print the partition process. Defaults to True.
- Returns
{ client_id: indices}.
- Return type
dict
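The idea of fixing each client's sample number first and then drawing its class mix from a Dirichlet prior can be sketched as below. This is a simplified illustration, not the module's implementation or the paper's exact procedure: `client_inner_dirichlet_sketch` and `seed` are hypothetical, clients are filled one after another, and the sketch assumes the requested sample numbers sum to at most `len(targets)`.

```python
import numpy as np

def client_inner_dirichlet_sketch(targets, num_clients, num_classes, dir_alpha,
                                  client_sample_nums, seed=0):
    """Fill each client to its requested size using a per-client class prior."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    # Pool of not-yet-assigned indices for each class, in random order.
    pools = [list(rng.permutation(np.flatnonzero(targets == k)))
             for k in range(num_classes)]
    # One class-probability vector per client.
    priors = rng.dirichlet(np.repeat(dir_alpha, num_classes), size=num_clients)
    result = {}
    for cid in range(num_clients):
        chosen = []
        while len(chosen) < client_sample_nums[cid]:
            # Only classes that still have unassigned samples can be drawn.
            avail = np.array([len(p) > 0 for p in pools], dtype=float)
            weights = priors[cid] * avail
            if weights.sum() == 0:
                weights = avail
            k = rng.choice(num_classes, p=weights / weights.sum())
            chosen.append(int(pools[k].pop()))
        result[cid] = np.array(chosen, dtype=int)
    return result

inner_targets = np.repeat(np.arange(3), 10)  # 3 classes, 10 samples each
inner_dict = client_inner_dirichlet_sketch(
    inner_targets, num_clients=3, num_classes=3, dir_alpha=0.5,
    client_sample_nums=np.array([5, 10, 15]),
)
```

The client sizes are exactly those requested, while the label distribution inside each client is skewed by its own Dirichlet draw.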
- label_skew_quantity_based_partition(targets, num_clients, num_classes, major_classes_num)
- fcube_synthetic_partition(data)
Feature-distribution-skew: synthetic partition.
Synthetic partition for the FCUBE dataset. This partition is from Federated Learning on Non-IID Data Silos: An Experimental Study.
- Parameters
data (np.ndarray) – Data of the FCUBE dataset.
- Returns
{ client_id: indices}.
- Return type
dict
- samples_num_count(client_dict, num_clients)