This post was originally published by PIYUSH PATHAK at Medium [AI]
Federated Learning: Resilient, Low-impact and Secure
The field of machine learning is constantly evolving, sometimes slowly, and at other times we experience the tech equivalent of the Cambrian Explosion with rapid advance that makes a good many data scientists experience a serious case of imposter syndrome. It has only been 8 years since the modern era of deep learning began at the 2012 ImageNet competition.
What will the next generation of artificial intelligence look like? Which novel AI approaches will unlock currently unimaginable possibilities in technology and business?
This article highlights emerging areas within AI that are poised to redefine the field — and society — in the years ahead.
Unsupervised learning more closely mirrors the way that humans learn about the world: through open-ended exploration and inference, without a need for the “training wheels” of supervised learning. One of its fundamental advantages is that there will always be far more unlabeled data than labeled data in the world (and the former is much easier to come by).
In a nutshell, the system learns about some parts of the world based on other parts of the world. By observing the behavior of, patterns among, and relationships between entities — for example, words in a text or people in a video — the system bootstraps an overall understanding of its environment. Some researchers sum this up with the phrase
“predicting everything from everything else.”
In the words of LeCun, who prefers the closely related term “self-supervised learning”: “In self-supervised learning, a portion of the input is used as a supervisory signal to predict the remaining portion of the input….More knowledge about the structure of the world can be learned through self-supervised learning than from [other AI paradigms], because the data is unlimited and the amount of feedback provided by each example is huge.”
But still there is requirement of something new….!!!!
Recently, a new approach has considered for models trained from user interaction with mobile called Federated Learning which distributes ML process over to the edge and enables mobile to collaboratively learn a shared model using training data on device and keeping data on device. It decouples the need for doing machine learning with the need to store the data in the cloud.
To train a machine learning model, traditional machine learning adopts a centralized approach which requires the training data to be aggregated on a single machine or in a datacenter. This is practically what giant AI companies such as Google, Facebook, and Amazon have been doing over the years. These companies have been collecting a gigantic amount of data and store these data in their datacenters where machine learning models are trained. This centralized training approach, however, is privacy-intrusive, especially for mobile phone users. This is because mobile phones may contain the owners’ privacy-sensitive data. To train or obtain a better machine learning model under such a centralized training approach, mobile phone users have to trade their privacy by sending their personal data stored inside phones to the clouds owned by the AI companies.
Federated learning opens up a brand new research field in AI. Today, gigantic amounts of data are generated by consumer devices such as mobile phones on a daily basis. These data contain valuable information about users and their personal preferences: what websites they mostly visited, what social media apps they mostly used, what types of videos they mostly watched, etc. With such valuable information, these data become the key to building better and personalized machine learning models to deliver personalized services to maximally enhance user experiences. Federated learning provides a unique way to build such personalized models without intruding users’ privacy. Such a unique advantage is the key motivation to attract researchers in the AI community to work on this new research direction.
Federated learning also opens up a brand new computing paradigm for AI. As compute resources inside end devices such as mobile phones are becoming increasingly powerful, especially with the emergence of AI chipsets, AI is moving from clouds and datacenters to end devices.
One of the overarching challenges of the digital era is data privacy. Because data is the lifeblood of modern artificial intelligence, data privacy issues play a significant (and often limiting) role in AI’s trajectory.
Privacy-preserving artificial intelligence — methods that enable AI models to learn from datasets without compromising their privacy — is thus becoming an increasingly important pursuit. Perhaps the most promising approach to privacy-preserving AI is federated learning.
However, when it comes to training machine learning models where personally identifiable data is involved (on-device, or in industries with particularly sensitive data like healthcare), this approach becomes unsuitable.
Training models on a centralized server also means that you need enormous amounts of storage space, as well as world-class security to avoid data breaches. But imagine if you were able to train your models with data that’s locally stored on a user’s device…
Introducing Federated learning.
The concept of federated learning was first formulated by researchers at Google in early 2017.
Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. This goes beyond the use of local models that make predictions on mobile devices (like the Mobile Vision API and On-Device Smart Reply) by bringing model training to the device as well.
It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.
In a nut shell, Federated learning is a model training technique that enables devices to learn collaboratively from a shared model. The shared model is first trained on a server using proxy data. Each device then downloads the model and improves it using data — federated data — from the device.
The device trains the model with the locally available data. The changes made to the model are summarized as an update that is then sent to the cloud. The training data and individual updates remain on the device. In order to ensure faster uploads of theses updates, the model is compressed using random rotations and quantization. When the devices send their specific models to the server, the models are averaged to obtain a single combined model. This is done for several iterations until a high-quality model is obtained.
Compared to centralized machine learning, federated learning has a couple of specific advantages:
- Ensuring privacy, since the data remains on the user’s device.
- Lower latency, because the updated model can be used to make predictions on the user’s device.
- Smarter models, given the collaborative training process.
- Less power consumption, as models are trained on a user’s device.
To make Federated Learning possible, we had to overcome many algorithmic and technical challenges. In a typical machine learning system, an optimization algorithm like Stochastic Gradient Descent (SGD) runs on a large dataset partitioned homogeneously across servers in the cloud. Such highly iterative algorithms require low-latency, high-throughput connections to the training data. But in the Federated Learning setting, the data is distributed across millions of devices in a highly uneven fashion. In addition, these devices have significantly higher-latency, lower-throughput connections and are only intermittently available for training.
Compared to the centralized training approach, federated learning is a decentralized training approach which enables mobile phones located at different geographical locations to collaboratively learn a machine learning model while keeping all the personal data that may contain private information on device. In such a case, mobile phone users can benefit from obtaining a well-trained machine learning model without sending their privacy-sensitive personal data to the cloud.
Federated Learning allows for faster deployment and testing of smarter models, lower latency, and less power consumption, all while ensuring privacy. Also, in addition to providing an update to the shared model, the improved (local) model on your phone can also be used immediately, powering experiences personalized by the way you use your phone.
Best example of this is Gboard. When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard’s query suggestion model.
- FL enables devices like mobile phones to collaboratively learn a shared prediction model while keeping the training data on the device instead of requiring the data to be uploaded and stored on a central server.
- Moves model training to the edge, namely devices such as smartphones, tablets, IoT, or even “organizations” like hospitals that are required to operate under strict privacy constraints. Having personal data remain local is a strong security benefit.
- Makes real-time prediction possible, since prediction happens on the device itself. FL reduces the time lag that occurs due to transmitting raw data back to a central server and then shipping the results back to the device.
- Since the models reside on the device, the prediction process works even when there is no internet connectivity.
- FL reduces the amount of hardware infrastructure required. FL uses minimal hardware and what is available in mobile devices is more than enough to run the FL models.
- More recently, healthcare has emerged as a particularly promising field for the application of federated learning.
It is easy to see why. On one hand, there are an enormous number of valuable AI use cases in healthcare. On the other hand, healthcare data, especially patients’ personally identifiable information, is extremely sensitive; a thicket of regulations like HIPAA restrict its use and movement. Federated learning could enable researchers to develop life-saving healthcare AI tools without ever moving sensitive health records from their source or exposing them to privacy breaches.
Beyond healthcare, federated learning may one day play a central role in the development of any AI application that involves sensitive data: from financial services to autonomous vehicles, from government use cases to consumer products of all kinds
2. Best example of this is Gboard. When Gboard shows a suggested query, your phone locally stores information about the current context and whether you clicked the suggestion. Federated Learning processes that history on-device to suggest improvements to the next iteration of Gboard’s query suggestion model.
Federated Learning can’t solve all machine learning problems (for example, learning to recognize different dog breeds by training on carefully labeled examples), and for many other models the necessary training data is already stored in the cloud (like training spam filters for Gmail). So Google will continue to advance the state-of-the-art for cloud-based ML, but we are also committed to ongoing research to expand the range of problems we can solve with Federated Learning. Beyond Gboard query suggestions, for example, we hope to improve the language models that power your keyboard based on what you actually type on your phone (which can have a style all its own) and photo rankings based on what kinds of photos people look at, share, or delete.
There are a number of core challenges associated with FL. First, communication is a critical bottleneck in FL networks where data generated on each device remain local. In order to train a model using data generated by the devices in the network, it is necessary to develop communication-efficient methods that reduce the total number of communication rounds, and also iteratively send small model updates as part of the training process, as opposed to sending the entire data set.
Additionally, FL methods must: anticipate low levels of device participation, i.e. only a small fraction of the devices being active at once; tolerate variability in hardware that affects storage, computational, and communication capabilities of each device in a federated network; and be able to handle dropped devices in the network.
Finally, FL helps to protect data generated on a device by sharing model updates such as gradient data instead of raw data. But communicating model updates throughout the training process can still reveal sensitive information, either to a third party, or to the central server.
In this article, I’ve introduced a new setting for distributed machine learning (optimization problems), which is called federated learning. This setting is motivated by the methodology where users do not send the data they generate locally to central servers at all, but rather provide part of their computational power to be used to perform local machine learning training. This comes with a unique set of challenges however FL researchers are actively engaged with bringing this new technology forward.
Here are several new available FL resources:
There is also a paper that describes a scalable production system for FL for mobile devices, “Towards Federated Learning at Scale: System Design” (Mar. 2019) which includes the resulting high-level design, overview of new challenges with solutions, and also some open problems with future directions.
This post was originally published by PIYUSH PATHAK at Medium [AI]