Who Should Read This Document?
This document is for machine learning practitioners, researchers, and engineers interested in exploring custom aggregation schemes in federated learning. It is particularly useful for those who want to design, test, and analyze new aggregation methods in real, distributed environments.
Introduction
Machine Learning (ML) continues to drive innovation across diverse domains such as healthcare, finance, and defense. However, while ML models rely heavily on large volumes of centralized high-quality data, concerns around data privacy, ownership, and regulatory compliance remain significant barriers.
Federated Learning (FL) addresses these challenges by enabling model training across distributed datasets, allowing data to remain decentralized while still contributing to a shared global model. This decentralized approach makes FL especially valuable in sensitive domains where centralizing data is impractical or restricted.
To fully leverage FL, understanding and designing custom aggregation schemes is important. Effective aggregation not only influences model accuracy and robustness but also handles issues such as data heterogeneity, client reliability, and adversarial behavior. Analyzing and building tailored aggregation methods is, therefore, a key step toward optimizing performance and ensuring fairness in federated settings.
In my previous article, we explored the cyberattack landscape in federated learning and how it differs from traditional centralized ML. That discussion introduced Scaleout’s open-source attack simulation toolkit, which helps researchers and practitioners analyze the impact of various attack types and evaluate the limitations of existing mitigation schemes, especially under challenging conditions such as extreme data imbalance, non-IID distributions, and the presence of stragglers or late joiners.
However, that first article did not cover how to design, implement, and test your own schemes in live, distributed environments.
This may sound complex, but this article offers a quick, step-by-step guide covering the basic concepts needed to design and develop custom aggregation schemes for real-world federated settings.
What are Server Functions?
Custom aggregators are functions that run on the server side of a federated learning setup. They define how model updates from clients are processed (for example, aggregated, validated, or filtered) before producing the next global model. Server Function is the term used in the Scaleout Edge AI platform to refer to custom aggregators.
Beyond simple aggregation, Server Functions enable researchers and developers to design new schemes to include privacy preservation, fairness, robustness, and convergence efficiency. By customizing this layer, one can also experiment with innovative defense mechanisms, adaptive aggregation rules, or optimization methods tailored to specific data and deployment scenarios.
Server Functions are the mechanism in the Scaleout Edge AI platform for implementing custom aggregation schemes. In addition to this flexibility, the platform includes four built-in aggregators: FedAvg (default), FedAdam (FedOpt), FedYogi (FedOpt), and FedAdaGrad (FedOpt). The implementation of the FedOpt family follows the approach described in the paper "Adaptive Federated Optimization" (Reddi et al.). For more details, please refer to the platform's documentation.
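To make the concept concrete, the general shape of a server-side aggregator can be sketched as a class that holds round state and implements an aggregation step. The class and method names below are illustrative assumptions for exposition, not the platform's exact interface; see the Server Functions documentation for the real API.

```python
import numpy as np

class ExampleServerFunctions:
    """Illustrative skeleton of a server-side custom aggregator.

    The class and method names are assumptions for exposition;
    consult the Scaleout Server Functions guide for the actual interface.
    """

    def __init__(self):
        self.round = 0

    def aggregate(self, previous_global, client_updates):
        """Weighted FedAvg over all received client updates.

        previous_global: list of np.ndarray (the current global model)
        client_updates: dict of client_id -> (parameters, metadata)
        """
        weighted_sum = [np.zeros_like(p) for p in previous_global]
        total_weight = 0
        for client_id, (params, metadata) in client_updates.items():
            n = metadata.get("num_examples", 1)  # weight by local dataset size
            total_weight += n
            for j, p in enumerate(params):
                weighted_sum[j] += p * n
        if total_weight == 0:
            return previous_global  # nothing usable received this round
        self.round += 1
        return [p / total_weight for p in weighted_sum]
```

This skeleton performs plain weighted averaging; the scheme developed later in this post extends exactly this aggregation step with similarity-based filtering.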
Building A Custom Aggregation Scheme
In this blog post, I will demonstrate how you can build and test your own attack mitigation scheme using Scaleout’s Edge AI platform. One of the key enablers for this is the platform’s Server Function feature.
Within the scope of this post, the focus will be on how to develop and execute a custom aggregation process in a geographically distributed setting. I will skip the basic setup and assume you are already familiar with the Scaleout platform. If you’re just getting started, here are some useful resources:
Quickstart FL Example: https://docs.scaleoutsystems.com/en/stable/quickstart.html
Framework Architecture: https://docs.scaleoutsystems.com/en/stable/architecture.html
Client API Reference: https://docs.scaleoutsystems.com/en/stable/apiclient.html
Server Function Guide: https://docs.scaleoutsystems.com/en/stable/serverfunctions.html
Example: Label-Flipping Attack
Let’s consider a concrete attack, Label-Flipping. In this scenario, a malicious client intentionally alters the labels in its local dataset (for example, flipping “cat” to “dog”). When these poisoned updates are sent to the server, the global model learns incorrect patterns, leading to reduced accuracy and reliability.
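A label-flipping client can be simulated with a simple label remapping applied before local training. A minimal sketch (the shift-by-one remapping is an assumption; any consistent relabeling, such as swapping two classes, works the same way):

```python
import numpy as np

def flip_labels(labels, num_classes=10):
    """Remap every label to a different class by shifting modulo num_classes.

    Shifting by one guarantees that no label keeps its original value,
    which is one simple way to poison a client's local dataset.
    """
    labels = np.asarray(labels)
    return (labels + 1) % num_classes

y = np.array([0, 3, 9])
y_poisoned = flip_labels(y)  # every label changed
```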
For this experiment, we used the open MNIST dataset, divided into six partitions, each assigned to a separate client. client-6 acts as a malicious participant by flipping labels in its local data, training on the poisoned data, and sending the resulting local model to the aggregator.

| Malicious Client | ID |
|---|---|
| client-6 | 07505d04-08a5-4453-ad55-d541e9e4ef57 |
Here you can see an example of a data point (a grayscale image) from client-6, where the labels have been intentionally flipped. By altering the labels, this client attempts to corrupt the global model during the aggregation process.

Task
Prevent client-6 from interfering with the development of the global model.
Mitigation with Cosine Similarity
To counter this, we use a cosine similarity-based approach as an example. The idea is straightforward: compute each client’s update delta relative to the previous global model (Step 1), then compare the clients’ deltas pairwise using cosine similarity (Steps 2–3). If a client’s average similarity score falls below a predefined threshold (s̄ᵢ < 𝓣), the update is likely from a malicious client and is flagged (Step 4). In the final step (Step 5), that client’s contribution is excluded from the aggregation, which in this case uses weighted federated averaging (FedAvg).
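In symbols, writing Δᵢ for the delta of client i relative to the previous global model and n for the number of clients, the scheme computes:

```latex
s_{ij} = \frac{\langle \Delta_i, \Delta_j \rangle}{\lVert \Delta_i \rVert \, \lVert \Delta_j \rVert},
\qquad
\bar{s}_i = \frac{1}{n-1} \sum_{j \neq i} s_{ij},
\qquad
\text{exclude client } i \ \text{ if } \ \bar{s}_i < \mathcal{T}.
```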

Note: This is a straightforward attack mitigation scheme, primarily intended to demonstrate the flexibility in supporting custom aggregation in realistic federated learning scenarios. The same scheme may not be effective for highly challenging cases, such as those with imbalanced datasets or non-IID conditions.
The Python implementation below illustrates the custom aggregator workflow: computing each client’s delta relative to the previous global model (Step 1), computing cosine-similarity scores (Steps 2–3), excluding the malicious client (Step 4), and performing federated weighted averaging (Step 5).
Step 1
```python
# --- Compute deltas (client - previous_global) ---
prev_flat = self._flatten_params(previous_global)
flat_deltas, norms = [], []
for params in client_params:
    flat = self._flatten_params(params)
    delta = flat - prev_flat
    flat_deltas.append(delta)
    norms.append(np.linalg.norm(delta))
```
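The helper `_flatten_params` is not shown in the snippet; a minimal sketch, assuming a model is represented as a list of NumPy arrays, could look like this:

```python
import numpy as np

def flatten_params(params):
    """Concatenate a list of parameter arrays into one 1-D vector,
    so that two model deltas can be compared with a single dot product."""
    return np.concatenate([np.asarray(p).ravel() for p in params])

# Example: a 2x2 layer and a 1-element bias flatten into a 5-element vector.
vec = flatten_params([np.ones((2, 2)), np.zeros(1)])
```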
Step 2 and 3
```python
# --- Cosine similarity matrix ---
similarity_matrix = np.zeros((num_clients, num_clients), dtype=float)
for i in range(num_clients):
    for j in range(i + 1, num_clients):
        denom = norms[i] * norms[j]
        sim = float(np.dot(flat_deltas[i], flat_deltas[j]) / denom) if denom > 0 else 0.0
        similarity_matrix[i, j] = sim
        similarity_matrix[j, i] = sim

# --- Average similarity per client ---
avg_sim = np.zeros(num_clients, dtype=float)
if num_clients > 1:
    avg_sim = np.sum(similarity_matrix, axis=1) / (num_clients - 1)
for cid, s in zip(client_ids, avg_sim):
    logger.info(f"Round {self.round}: Avg delta-cosine similarity for {cid}: {s:.4f}")
```
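As a side note, the double loop is quadratic in interpreted Python; the same average scores can be obtained with one matrix product over row-normalized deltas. A vectorized sketch (the function name is mine):

```python
import numpy as np

def avg_pairwise_cosine(flat_deltas):
    """Average cosine similarity of each delta to all other deltas."""
    D = np.stack(flat_deltas)                        # shape (n, d)
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    U = D / np.where(norms > 0, norms, 1.0)          # row-normalized deltas
    S = U @ U.T                                      # full cosine matrix
    np.fill_diagonal(S, 0.0)                         # drop self-similarity
    n = len(flat_deltas)
    return S.sum(axis=1) / max(n - 1, 1)
```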
Step 4
```python
# --- Mark suspicious clients ---
suspicious_indices = [i for i, s in enumerate(avg_sim) if s < self.similarity_threshold]
suspicious_clients = [client_ids[i] for i in suspicious_indices]
if suspicious_clients:
    logger.warning(f"Round {self.round}: Excluding suspicious clients: {suspicious_clients}")

# --- Keep only non-suspicious clients ---
keep_indices = [i for i in range(num_clients) if i not in suspicious_indices]
if len(keep_indices) < 3:  # safeguard against excluding too many clients
    logger.warning("Too many exclusions, falling back to ALL clients.")
    keep_indices = list(range(num_clients))

kept_client_ids = [client_ids[i] for i in keep_indices]
logger.info(f"Round {self.round}: Clients used for FedAvg: {kept_client_ids}")
```
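To see the filtering logic in isolation, here is the same Step 4 selection run on illustrative scores shaped like the ones reported in the logs later in this post (five clients near 0.75 and one clear outlier; the threshold value 0.3 is an assumption, the article's actual 𝓣 is set in server_functions.py):

```python
# Step 4's selection logic on illustrative average-similarity scores.
avg_sim = [0.7498, 0.7531, -0.1346, 0.7528, 0.7475, 0.7460]
similarity_threshold = 0.3  # assumed value of the threshold T

suspicious_indices = [i for i, s in enumerate(avg_sim) if s < similarity_threshold]
keep_indices = [i for i in range(len(avg_sim)) if i not in suspicious_indices]
if len(keep_indices) < 3:  # same safeguard as in Step 4
    keep_indices = list(range(len(avg_sim)))
```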
Step 5
```python
# === Weighted FedAvg ===
weighted_sum = [np.zeros_like(param) for param in previous_global]
total_weight = 0
for i in keep_indices:
    client_id = client_ids[i]
    client_parameters, metadata = client_updates[client_id]
    num_examples = metadata.get("num_examples", 1)
    total_weight += num_examples
    for j, param in enumerate(client_parameters):
        weighted_sum[j] += param * num_examples

if total_weight == 0:
    logger.error("Aggregation failed: total_weight = 0.")
    return previous_global

new_global = [param / total_weight for param in weighted_sum]
logger.info("Models aggregated with FedAvg + cosine similarity filtering")
return new_global
```
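Putting Steps 1–5 together, a self-contained sanity check (equal client weights and a threshold of 0.3 are assumed for brevity) confirms that an update pointing opposite to the benign majority is dropped before averaging:

```python
import numpy as np

# Five benign clients move the model in (roughly) the same direction;
# one poisoned client pushes in the opposite direction.
prev = [np.zeros(4)]
updates = [[prev[0] + 1.0 + 0.01 * i] for i in range(5)] + [[prev[0] - 1.0]]

def flatten(model):
    return np.concatenate([p.ravel() for p in model])

deltas = [flatten(u) - flatten(prev) for u in updates]
norms = [np.linalg.norm(d) for d in deltas]

n = len(deltas)
S = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        denom = norms[i] * norms[j]
        S[i, j] = S[j, i] = float(deltas[i] @ deltas[j] / denom) if denom > 0 else 0.0
avg_sim = S.sum(axis=1) / (n - 1)

keep = [i for i, s in enumerate(avg_sim) if s >= 0.3]     # assumed threshold
new_global = sum(updates[i][0] for i in keep) / len(keep)  # equal weights
```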
Full code is available on GitHub: https://github.com/sztoor/server_functions_cosine_similarity_example.git
Step-by-Step Activation of Server Functions
- Log in to Scaleout Studio and create a new project.
- Upload the Compute Package and Seed Model to the project.
- Connect clients to Studio by installing Scaleout’s client-side library via pip (`pip install fedn`). I have connected six clients.
Note: For Steps 1–3, please refer to the Quick Start Guide for detailed instructions.
- Clone the GitHub repository from the provided link. This will give you two Python files:
- server_functions.py – contains the cosine similarity–based aggregation scheme.
- scaleout_start_session.py – connects to the Studio, pushes the local server_function, and starts a training session based on predefined settings (e.g., 5 rounds, 180-second round timeout).
- Execute the scaleout_start_session.py script.
For full details about these steps, refer to the official Server Functions Guide.
Result and Discussion
The steps involved when running scaleout_start_session.py are outlined below.
In this setup, six clients were connected to the studio, with client-6 using a deliberately flipped dataset to simulate malicious behavior. The logs from Scaleout Studio provide a detailed view of the Server Function execution and how the system responds during aggregation. In Section 1, the logs show that the server successfully received models from all clients, confirming that communication and model upload were functioning as expected. Section 2 presents the similarity scores for each client. These scores quantify how closely each client’s update aligns with the overall trend, providing a metric for detecting anomalies.
Section 1
2025-10-20 11:35:08 [INFO] Received client selection request.
2025-10-20 11:35:08 [INFO] Clients selected: ['eef3e17f-d498-474c-aafe-f7fa7203e9a9', 'e578482e-86b0-42fc-8e56-e4499e6ca553', '7b4b5238-ff67-4f03-9561-4e16ccd9eee7', '69f6c936-c784-4ab9-afb2-f8ccffe15733', '6ca55527-0fec-4c98-be94-ef3ffb09c872', '07505d04-08a5-4453-ad55-d541e9e4ef57']
2025-10-20 11:35:14 [INFO] Received previous global model
2025-10-20 11:35:14 [INFO] Received metadata
2025-10-20 11:35:14 [INFO] Received client model from client eef3e17f-d498-474c-aafe-f7fa7203e9a9
2025-10-20 11:35:14 [INFO] Received metadata
2025-10-20 11:35:15 [INFO] Received client model from client e578482e-86b0-42fc-8e56-e4499e6ca553
2025-10-20 11:35:15 [INFO] Received metadata
2025-10-20 11:35:15 [INFO] Received client model from client 07505d04-08a5-4453-ad55-d541e9e4ef57
2025-10-20 11:35:15 [INFO] Received metadata
2025-10-20 11:35:15 [INFO] Received client model from client 69f6c936-c784-4ab9-afb2-f8ccffe15733
2025-10-20 11:35:15 [INFO] Received metadata
2025-10-20 11:35:15 [INFO] Received client model from client 6ca55527-0fec-4c98-be94-ef3ffb09c872
2025-10-20 11:35:16 [INFO] Received metadata
2025-10-20 11:35:16 [INFO] Received client model from client 7b4b5238-ff67-4f03-9561-4e16ccd9eee7
Section 2
2025-10-20 11:35:16 [INFO] Receieved aggregation request: aggregate
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for eef3e17f-d498-474c-aafe-f7fa7203e9a9: 0.7498
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for e578482e-86b0-42fc-8e56-e4499e6ca553: 0.7531
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for 07505d04-08a5-4453-ad55-d541e9e4ef57: -0.1346
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for 69f6c936-c784-4ab9-afb2-f8ccffe15733: 0.7528
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for 6ca55527-0fec-4c98-be94-ef3ffb09c872: 0.7475
2025-10-20 11:35:16 [INFO] Round 0: Avg delta-cosine similarity for 7b4b5238-ff67-4f03-9561-4e16ccd9eee7: 0.7460
Section 3
2025-10-20 11:35:16 ⚠️ [WARNING] Round 0: Excluding suspicious clients: ['07505d04-08a5-4453-ad55-d541e9e4ef57']
Section 4
2025-10-20 11:35:16 [INFO] Round 0: Clients used for FedAvg: ['eef3e17f-d498-474c-aafe-f7fa7203e9a9', 'e578482e-86b0-42fc-8e56-e4499e6ca553', '69f6c936-c784-4ab9-afb2-f8ccffe15733', '6ca55527-0fec-4c98-be94-ef3ffb09c872', '7b4b5238-ff67-4f03-9561-4e16ccd9eee7']
2025-10-20 11:35:16 [INFO] Models aggregated with FedAvg + cosine similarity filtering
Section 3 highlights that client-6 (ID …e4ef57) was identified as malicious. The warning message indicates that this client would be excluded from the aggregation process. This exclusion is justified by its extremely low similarity score of –0.1346, the lowest among all clients, and by the threshold value (𝓣) set in the server function. By removing client-6, the aggregation step ensures that the global model is not adversely affected by outlier updates. Section 4 then lists the client IDs included in the aggregation, confirming that only the trusted clients contributed to the updated global model.
Task Accomplished
client-6 has been successfully excluded from the aggregation process using the custom aggregation scheme.
Beyond the log messages, the platform provides rich, per-client insights that enhance understanding of client behavior. These include per-client training loss and accuracy, global model performance metrics, and training times for each client. Another key feature is the model trail, which tracks model updates over time and enables a detailed analysis of how individual clients impact the global model. Taken together, the log sections and supplementary data provide a comprehensive view of client contributions, clearly illustrating the distinction between normal clients and the malicious client-6 (ID …e4ef57), and demonstrating the effectiveness of the server function in mitigating the attack.

(image created by the author).


The client-side training accuracy and loss plots highlight the performance of each client. They show that client-6 deviates significantly from the others. Including it in the aggregation process would harm the global model’s performance.
Summary
This blog post highlights the importance of custom aggregators in addressing specialized needs, enabling the development of robust, efficient, and highly accurate models in federated machine learning settings. It also emphasizes the value of leveraging existing building blocks rather than starting entirely from scratch and spending significant time on components that are not central to the core idea.
Federated Learning offers a powerful way to unlock datasets that are difficult or even impossible to centralize in a single location. Studies like this demonstrate that while the approach is highly promising, it also requires a careful understanding of the underlying systems. Fortunately, there are ready-to-use platforms that make it easier to explore real-world use cases, experiment with different aggregation schemes, and design solutions tailored to specific needs.
Author’s Details:
Salman Toor
CTO and Co-founder, Scaleout.
Associate Professor
Department of Information Technology,
Uppsala University, Sweden
LinkedIn Google Scholar



