Flower is a framework-agnostic federated learning framework: it works with any machine learning framework (Keras, TensorFlow, PyTorch, MXNet, ...), is independent of the programming language (Python, Java, C++, Swift, ...), and runs on any operating system (iOS, Android, Linux, Windows, macOS). The Google Summer of Code program is a great way to get started with federated learning, Flower, and open source work in general! This blog post proposes Google Summer of Code project ideas that will improve the usability of Flower on different platforms, add new federated learning algorithms to the core framework, and help to make federated learning more accessible in general.
Possible mentors for Google Summer of Code project ideas are:
- Nicholas Lane, PhD, Senior Lecturer (Assoc. Professor), Dept. of Computer Science & Tech. University of Cambridge and Director at Samsung AI Center
- Pedro Porto Buarque de Gusmão, PhD, Senior Research Associate, University of Cambridge
- Javier Fernandez-Marques, DPhil candidate in Computer Science, University of Oxford
- Dr. Maria Börner, Program-Manager at Adap GmbH
- Daniel J. Beutel, Co-Creator of Flower and CEO at Adap GmbH
- Taner Topal, Co-Creator of Flower and COO at Adap GmbH
The following list shows some suitable project ideas, but students can also suggest their own ideas.
JAX + Flower Federated Learning Quickstart Example
Description
JAX is a high-performance machine learning framework based on Python and NumPy. In this project, you will develop a federated learning example with JAX and Flower. The example can be based on one of the centralized JAX ML training projects and is intended to showcase how Flower can be used to federate such projects. First, you choose the model (sequential, CNN) and the training dataset (MNIST, CIFAR-10, ...). You then create a training and evaluation process that runs centralized. Afterwards, you reuse the dataset, model, training, and evaluation process from the centralized example to run the JAX example federated with Flower. You also need to write Flower-based code that extracts and updates the model weights of the previously created model.
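The last step, moving weights out of and back into the model, might look roughly like the sketch below. It uses a plain nested dict as a stand-in for a JAX parameter pytree, and the helper names `get_weights` and `set_weights` are hypothetical. In an actual JAX project, `jax.tree_util.tree_flatten` and `jax.tree_util.tree_unflatten` would likely do this job instead.

```python
from typing import Any, Dict, List

# Stand-in for a JAX parameter pytree: nested dict of layer name -> arrays.
Params = Dict[str, Any]


def get_weights(params: Params) -> List[Any]:
    """Flatten nested parameters into a flat list, in sorted-key order."""
    leaves: List[Any] = []
    for key in sorted(params):
        value = params[key]
        if isinstance(value, dict):
            leaves.extend(get_weights(value))
        else:
            leaves.append(value)
    return leaves


def set_weights(params: Params, leaves: List[Any]) -> Params:
    """Return a copy of `params` whose leaves are replaced by `leaves`."""
    remaining = list(leaves)

    def rebuild(node: Params) -> Params:
        out: Params = {}
        for key in sorted(node):
            value = node[key]
            out[key] = rebuild(value) if isinstance(value, dict) else remaining.pop(0)
        return out

    return rebuild(params)


model = {"dense": {"kernel": [[0.1, 0.2]], "bias": [0.0]}, "out": {"kernel": [[0.3]]}}
weights = get_weights(model)  # what a client would send to the server
updated = set_weights(model, [[1.0], [[2.0, 3.0]], [[4.0]]])  # what it loads back
```

Both helpers visit keys in the same sorted order, so a list produced by `get_weights` can be fed back into `set_weights` without bookkeeping.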
Expected Outcome:
- A JAX example that runs federated
- Documentation and a README for the GitHub repository
- A blog post about the new example
Required Skills:
- Intermediate knowledge of Python
- Basic knowledge of Git
- Basic understanding of JAX is an advantage but not necessary
Mentor: Dr. Maria Börner
Difficulty: easy
Monitoring tools for FL/Flower
Description
When training ML models, tracking training statistics such as accuracy/loss curves or ranges of distributions can be very insightful: it helps to choose better hyperparameters, to introduce learning rate decay schedulers, etc. For FL, especially with large pools of clients, it can become challenging to monitor and track such statistics when each client runs on a different machine. This project involves building a centralized monitoring system to track training statistics from all clients participating in the experiment. The design should be framework-agnostic, so that it works with existing tools such as TensorBoard or W&B if the user wants to rely on those.
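To make the idea more concrete, here is a minimal sketch of what optional client-side monitoring callbacks could look like. Everything in it is an assumption made for illustration: `MonitorCallback`, `MetricsCollector`, and `MonitoredClient` are hypothetical names and not part of the Flower API, and the metrics are passed in directly instead of coming from real training.

```python
from collections import defaultdict
from typing import Callable, Dict, Sequence

# A callback receives the client id, the round number, and a dict of metrics.
MonitorCallback = Callable[[str, int, Dict[str, float]], None]


class MetricsCollector:
    """Hypothetical central store: round -> client id -> metrics."""

    def __init__(self) -> None:
        self.rounds: Dict[int, Dict[str, Dict[str, float]]] = defaultdict(dict)

    def __call__(self, client_id: str, rnd: int, metrics: Dict[str, float]) -> None:
        self.rounds[rnd][client_id] = dict(metrics)

    def mean(self, rnd: int, name: str) -> float:
        """Average one metric over all clients that reported it in a round."""
        values = [m[name] for m in self.rounds[rnd].values() if name in m]
        return sum(values) / len(values)


class MonitoredClient:
    """Toy client that reports metrics through optional callbacks."""

    def __init__(self, client_id: str, callbacks: Sequence[MonitorCallback] = ()) -> None:
        self.client_id = client_id
        self.callbacks = list(callbacks)

    def fit(self, rnd: int, loss: float, accuracy: float) -> None:
        # Real local training would happen here; the metrics are passed in
        # directly to keep the sketch short.
        for callback in self.callbacks:
            callback(self.client_id, rnd, {"loss": loss, "accuracy": accuracy})


collector = MetricsCollector()
clients = [MonitoredClient(f"client-{i}", [collector]) for i in range(2)]
clients[0].fit(1, loss=0.5, accuracy=0.75)
clients[1].fit(1, loss=1.0, accuracy=0.25)
mean_loss = collector.mean(1, "loss")
```

Because a callback is just a callable with a fixed signature, an adapter for TensorBoard or W&B could be plugged in next to the collector. In a real deployment, the collector would sit behind a network endpoint instead of living in the same process.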
Expected Outcome:
- Add support to clients for (optional) user-defined monitoring callbacks.
- Create an example demonstrating support for a variety of scenarios.
- A blog post describing the example above.
Required Skills:
- Intermediate knowledge of Python
- Basic experience training ML models with PyTorch and/or TensorFlow
- Experience with tools for monitoring training: TensorBoard, W&B, etc.
Mentor: Javier Fernandez-Marques
Difficulty: medium
Federated Analytics
Description
Flower’s main focus is Federated Learning, but the framework is general enough to implement related approaches, such as Federated Analytics. Federated learning exchanges ML models among a set of connected devices, each holding its own data partition. Federated analytics is similar, but instead of model updates, each client sends analytical results computed on its local data partition. In this project, you will implement a prototype for Federated Analytics using Flower. This prototype will help to shape future API changes to make Federated Analytics a first-class citizen in the Flower ecosystem.
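As a rough illustration of the difference, the sketch below computes a global histogram from per-client histograms. The function names and the toy data are assumptions made for this example. A real prototype would send these results through Flower's client/server messages instead of plain function calls.

```python
from collections import Counter
from typing import Iterable, List


def local_histogram(values: Iterable[str]) -> Counter:
    """Client side: summarize the local partition; raw values stay on the device."""
    return Counter(values)


def aggregate(histograms: List[Counter]) -> Counter:
    """Server side: merge the per-client histograms into a global one."""
    total: Counter = Counter()
    for hist in histograms:
        total.update(hist)
    return total


# Three toy clients, each holding its own data partition.
partitions = [["cat", "dog", "cat"], ["dog", "bird"], ["cat"]]
global_hist = aggregate([local_histogram(p) for p in partitions])
```

The server only ever sees counts, never the individual records. Whether counts alone are private enough depends on the use case and is one of the questions the prototype would have to address.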
Expected Outcome:
- Create a prototype for federated analytics in Flower
- Define the general approach
- Create a code example
- Implement changes to the underlying Flower core framework if necessary
- Document the final approach
- Write a blog post about the approach
Required Skills:
- Knowledge of Python programming
- Knowledge of analytics tools in the Python ecosystem
- Basic understanding of federated analytics
Mentor: Daniel Beutel
Difficulty: hard
Secure Aggregation
Description
Federated learning allows us to train a model over a set of connected devices, each holding its own data partition. Each connected device trains the model on its local dataset and then sends the updated model to a central server. Secure Aggregation protects these model updates from being analysed by the server: the server only gets to see the model parameters after aggregation. This prevents the server from being able to “peek” into the model update from a single device. In this project, you will implement Secure Aggregation in the Flower federated learning framework and create a code example that demonstrates the usage of Secure Aggregation with Flower.
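One common building block for this is pairwise additive masking, sketched below in heavily simplified form. The sketch assumes that updates are already quantized to integers, that no client drops out, and that every pair of clients somehow shares a seed. Here the seeds are generated in one place purely for illustration; a real protocol would use a key agreement and would only hand each client its own seeds. All function names are hypothetical.

```python
import random
from itertools import combinations
from typing import Dict, List, Tuple

MODULUS = 2**32  # assumption: updates are quantized to integers modulo this value


def pairwise_seeds(client_ids: List[int], rng: random.Random) -> Dict[Tuple[int, int], int]:
    """One shared seed per client pair (stand-in for a key agreement)."""
    return {pair: rng.getrandbits(64) for pair in combinations(sorted(client_ids), 2)}


def mask_update(cid: int, update: List[int], seeds: Dict[Tuple[int, int], int]) -> List[int]:
    """Add the pair mask if `cid` is the smaller id of the pair, subtract it otherwise."""
    masked = list(update)
    for (a, b), seed in seeds.items():
        if cid not in (a, b):
            continue
        mask_rng = random.Random(seed)  # both clients of a pair derive the same mask
        sign = 1 if cid == a else -1
        masked = [(v + sign * mask_rng.randrange(MODULUS)) % MODULUS for v in masked]
    return masked


def aggregate(masked_updates: List[List[int]]) -> List[int]:
    """Server side: the pairwise masks cancel out in the sum."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


updates = {0: [1, 2, 3], 1: [10, 20, 30], 2: [100, 200, 300]}
seeds = pairwise_seeds(list(updates), random.Random(0))
masked = [mask_update(cid, update, seeds) for cid, update in updates.items()]
total = aggregate(masked)  # equals the element-wise sum of the unmasked updates
```

Each masked update on its own looks random to the server, but the sum is exact. Handling dropouts and establishing the seeds securely are the hard parts, and they are what the project proposal would need to work out.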
Expected Outcome:
- Define an implementation proposal for Secure Aggregation
- Implement the required changes to the message protocol (i.e. messages exchanged between server and clients)
- Implement the required server-side and client-side logic
- Create a code example using the new functionality
- Documentation on how to use the new feature
- Optional: a blog post about the new feature
Required Skills:
- Knowledge of Python programming
- Good understanding of security principles (encryption/decryption)
- Basic understanding of machine learning
Mentors: Daniel Beutel and Nicholas Lane, PhD
Difficulty: hard
Differential Privacy using Opacus
Description
Federated learning offers increased data privacy since only the weights of ML models are shared with a server, not the underlying data used to train these models. However, this paper (https://arxiv.org/abs/1906.08935) shows that it is possible to recover private data from the shared updates. An additional layer of protection can therefore help to improve data privacy. In this project, you will implement a Flower example that demonstrates how Differential Privacy can be used on the client.
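The core mechanism behind differentially private training is to clip each per-sample gradient and add Gaussian noise before the update is applied. The sketch below shows that mechanism in plain Python. It is not the Opacus API, the function names are hypothetical, and it leaves out the privacy accounting that a library such as Opacus takes care of.

```python
import math
import random
from typing import List

Vector = List[float]


def clip(grad: Vector, max_norm: float) -> Vector:
    """Scale a gradient down so that its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(
    per_sample_grads: List[Vector],
    max_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> Vector:
    """Clip each per-sample gradient, sum them, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    summed = [sum(column) for column in zip(*clipped)]
    sigma = noise_multiplier * max_norm  # noise scale is tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(per_sample_grads) for n in noisy]


# Toy per-sample gradients: the first one exceeds the clipping bound, the second does not.
grads = [[3.0, 4.0], [0.3, 0.4]]
private = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=random.Random(0))
```

Clipping bounds how much any single training example can influence the update, and the noise hides what remains. The values of `max_norm` and `noise_multiplier` here are arbitrary and would need tuning against the desired privacy budget.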
Expected Outcome:
- Create a PyTorch-based code example that implements Differential Privacy using Opacus (https://opacus.ai/).
- Test and document the new example.
- Write a blog post describing the new code example.
Required Skills:
- Basic understanding of differential privacy
- Experience with PyTorch
- Experience with Python
Mentor: Daniel Beutel
Difficulty: medium
FedProx
Description
Implement the FedProx federated learning Strategy for Flower. Federated learning involves training on thousands of devices with different amounts of data and different data distributions. In this project, you will implement FedProx, an established aggregation Strategy that helps to mitigate the problem of heterogeneity through a generalization and re-parametrization of FedAvg.
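The key ingredient of FedProx is a proximal term, (mu / 2) * ||w - w_global||^2, that is added to each client's local objective and keeps local models from drifting too far from the global model. Below is a minimal sketch of the local update on a toy problem. The function name, the toy loss, and the hyperparameter values are assumptions made for illustration, and the server-side averaging is omitted.

```python
from typing import Callable, List

Vector = List[float]


def fedprox_local_update(
    global_w: Vector,
    grad_fn: Callable[[Vector], Vector],
    mu: float,
    lr: float,
    steps: int,
) -> Vector:
    """Gradient descent on: local_loss(w) + (mu / 2) * ||w - global_w||^2."""
    w = list(global_w)
    for _ in range(steps):
        grad = grad_fn(w)
        # mu * (w - global_w) is the gradient of the proximal term.
        w = [wi - lr * (gi + mu * (wi - gw)) for wi, gi, gw in zip(w, grad, global_w)]
    return w


def grad_fn(w: Vector) -> Vector:
    """Toy client whose local loss 0.5 * (w - 4)^2 pulls the model towards 4."""
    return [w[0] - 4.0]


plain = fedprox_local_update([0.0], grad_fn, mu=0.0, lr=0.1, steps=200)  # plain local training
prox = fedprox_local_update([0.0], grad_fn, mu=1.0, lr=0.1, steps=200)   # with proximal term
```

With `mu=0.0` the client converges to its own optimum at 4, which is what local training in FedAvg would do. With `mu=1.0` it settles at 2, halfway between the global model and the local optimum.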
Expected outcomes:
- Source code for the FedProx implementation
- Unit test for the code above
- A Flower example that uses the Strategy
- A blog post describing the Strategy
Skills required:
- Intermediate knowledge of Python
- Basic knowledge of Git
Mentor: Pedro Porto Buarque de Gusmão, PhD
Sources: https://arxiv.org/pdf/1905.10497.pdf
per-FedAvg
Description
Implement a Personalized Federated Learning Strategy. Federated Learning trains on thousands of clients with different data distributions. Is it really possible to find one model that fits them all? In this project, you will develop a new aggregation strategy called Personalized FedAvg (per-FedAvg), which tries to find an initial shared model that current or new users can easily adapt to their local datasets.
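To give a feeling for the idea, the sketch below runs a first-order approximation of such a strategy on a toy problem with two clients. It drops the second-order (Hessian) term of the full method, and the function names, toy losses, and step sizes `alpha` and `beta` are assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def make_grad(target: float) -> GradFn:
    """Gradient of the toy local loss 0.5 * (w - target)^2."""
    return lambda w: [w[0] - target]


def adapt(w: Vector, grad_fn: GradFn, alpha: float) -> Vector:
    """Personalization: one gradient step on the client's local data."""
    return [wi - alpha * gi for wi, gi in zip(w, grad_fn(w))]


def per_fedavg_fo_step(w: Vector, grad_fn: GradFn, alpha: float, beta: float) -> Vector:
    """First-order local step: update w with the gradient taken at the adapted model."""
    adapted = adapt(w, grad_fn, alpha)
    return [wi - beta * gi for wi, gi in zip(w, grad_fn(adapted))]


client_grads = [make_grad(-1.0), make_grad(3.0)]  # two clients with different optima
shared = [0.0]
for _ in range(100):
    local = [per_fedavg_fo_step(shared, g, alpha=0.5, beta=0.5) for g in client_grads]
    shared = [sum(ws) / len(local) for ws in zip(*local)]  # server averages the local models

personal = [adapt(shared, g, alpha=0.5) for g in client_grads]
```

On this toy problem the shared model settles near 1.0, between the two client optima. A single adaptation step then moves each client towards its own optimum, to roughly 0.0 and 2.0.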
Expected outcomes:
- Source code for the per-FedAvg implementation.
- Unit test for the code above.
- A Flower example that uses the Strategy.
- A blog post describing the Strategy.
Skills required:
- Intermediate knowledge of Python
- Basic knowledge of Git
Mentor: Pedro Porto Buarque de Gusmão, PhD
Sources: https://proceedings.neurips.cc/paper/2020/file/24389bfe4fe2eba8bf9aa9203a44cdad-Paper.pdf
Reinforcement Learning
Description
Reinforcement Learning (RL) helps us find solutions to problems where some notion of cumulative reward needs to be maximized, e.g. video games. Given the vast amounts of data being generated on mobile devices, FL offers a good alternative to centralized training, since the agents can now be trained directly on the edge. In this project, you will develop a Flower example that trains an RL agent using Federated Learning.
Expected Outcome:
- Code for a Flower example that shows how to train a Reinforcement Learning model using Federated Learning
- A blog post describing the example above
Required Skills:
- Basic understanding of Reinforcement Learning
- Intermediate knowledge of Python
- Acquaintance with PyTorch or TensorFlow
Mentor: Pedro Porto Buarque de Gusmão, PhD
Difficulty: hard
Sources:
- https://arxiv.org/pdf/1901.08277.pdf
- https://pytorch.org/tutorials/intermediate/mario_rl_tutorial.html
Dart/Flutter SDK
Description
One of the use cases for federated learning and Flower is to connect a fleet of devices (server, phone, edge devices, ...) and train AI models on them. Flutter can be used to easily build mobile applications on multiple platforms. In this project, you will build a Flutter SDK for Flower. The SDK will use gRPC to communicate with the server.
Expected Outcome:
- Set up gRPC compilation for Dart/Flutter
- Define the user-facing API of the Dart/Flutter SDK
- Implement the API
- Test and document the new module
- Build a Dart/Flutter library and publish it on https://pub.dev
- Build a code example using the SDK
- Write a blog post about the new feature
Required Skills:
- Good understanding of Dart
- Basic understanding of Flutter
- Interest in gRPC
Mentor: Taner Topal
Difficulty: medium
Java/Android SDK
Description
Android is one of the two important mobile platforms for federated learning, and mobile app users will benefit considerably from protecting their data. An Android SDK allows for easy integration of federated learning in mobile apps. In this project, you will build an Android SDK for Flower, publish the resulting library, and build a usage example demonstrating how to use this library.
Expected Outcome:
- Set up gRPC compilation for Java/Android
- Define the user-facing API of the Java/Android SDK
- Implement the API
- Test and document the new SDK
- Build a Java/Android library and publish it
- Build a code example using the SDK
- Write a blog post about the new feature
Required Skills:
- Android programming
- Basic understanding of gRPC
- Interest in machine learning / federated learning
Mentors: Daniel Beutel and Taner Topal
Difficulty: medium
Swift/iOS SDK
Description
iOS is one of the two important mobile platforms for federated learning, and mobile app users will benefit considerably from protecting their data. An iOS SDK allows for easy integration of federated learning in mobile apps. In this project, you will build a Swift/iOS SDK for Flower, publish the resulting library, and build a usage example demonstrating how to use this library.
Expected Outcome:
- Set up gRPC compilation for Swift/iOS
- Define the user-facing API of the Swift/iOS SDK
- Implement the API
- Test and document the new SDK
- Build a Swift/iOS library and publish it
- Build a code example using the SDK
- Write a blog post about the new feature
Required Skills:
- iOS programming
- Basic understanding of gRPC
- Interest in machine learning / federated learning
Mentors: Daniel Beutel and Taner Topal
Difficulty: medium
C++ SDK
Description
C++ is one of the most defining programming languages of our time. It is used in many critical applications and is the go-to language for performance-sensitive domains such as robotics or automotive. Federated Learning can enable entirely new platforms in these domains, and we thus want to support C++ by providing a Flower C++ SDK. Flower communicates between the server and the client using gRPC. At the moment, every C++ user needs to build their own integration with the gRPC message protocol to run Flower. In this project, you will create a Flower SDK for C++.
Expected Outcome:
- Set up gRPC compilation for C++
- Define the user-facing API of the C++ SDK
- Implement the API
- Test and document the new SDK
- Build a C++ library and publish it
- Build a code example using the C++ SDK and libtorch (PyTorch C++ API)
- Write a blog post about the new feature
Required Skills:
- Strong experience with C++
- Interest in gRPC
- Basic understanding of machine learning
- Optional: Basic libtorch (PyTorch C++ API) understanding
Mentor: Taner Topal
Difficulty: medium