Google's Federated Learning Paper: Federated Learning for the Future
Security in machine learning has recently attracted growing interest. This is largely due to the success of big data combined with deep learning, and to the drive to create and process ever larger data sets for data mining. As machine learning becomes part of day-to-day life and makes use of our data, special measures must be taken to protect privacy.
In federated learning, a model is learned by multiple clients in a decentralized fashion. Learning is shifted to the clients, and only the learned parameters are centralized by a trusted curator. The curator then distributes the aggregated model back to the clients. Because it takes both computational power and privacy into account, federated learning is well suited to mobile applications.
Figure: Sharing a model within certain users
When a model is learned in the conventional way, its parameters reveal information about the data used during training. To address this problem, differential privacy has been applied to learning algorithms. The goal is to ensure that the learned model does not reveal whether a given client participated in decentralized training, so that the client's data set is protected from attacks by other clients.
1. Introduction
Federated learning is the problem of training a shared global model, under the coordination of a central server, from a federation of participating devices that keep control of their own data. Standard machine learning approaches require centralizing the training data on one machine or in a data center. Federated learning instead enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on the device.
Data is often created on edge devices, such as smartphones or IoT sensors attached to industrial equipment, or is controlled by entities such as hospitals. Normally, when we train machine learning models, we move this data to the servers in our data center. But the owners of these smartphones and sensors, or these hospitals, often cannot or will not share the data with us because of privacy concerns, bandwidth challenges, or both. Federated learning is an algorithmic solution to this problem: it allows you to build a model while keeping the data at its source. In federated learning, each device or entity trains its own model locally, and it is that model that it shares with the servers in the data center. The server combines the models into a single federated model and never has direct access to the training data. In this way we help to preserve privacy and reduce communication costs in the cloud era. These topics are discussed in the later sections of the review.
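The train-locally, combine-on-the-server loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular system: the linear model, the function names (`local_train`, `federated_average`, `make_client_data`), the synthetic data, and weighting each client by its number of examples are all assumptions made for the example.

```python
import random

def local_train(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one client's private data.

    Model: y = w * x + b, with weights = [w, b]. Only the updated
    weights are returned; the raw data never leaves this function.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return [w, b]

def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by example count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

def make_client_data(rng, n):
    """Hypothetical private data set drawn from y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client_data(rng, n) for n in (20, 50, 30)]

global_weights = [0.0, 0.0]
for _ in range(100):
    # Each client starts from the current global model and trains locally
    updates = [local_train(global_weights, data) for data in clients]
    # The server sees only the weights, never the (x, y) pairs
    global_weights = federated_average(updates, [len(d) for d in clients])

print(global_weights)
```

The server-side function only ever receives weight vectors and example counts, which is the property the paragraph above relies on.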
2. Why Federated Learning
In most scenarios, people send their private data to be classified for many purposes, and this data is often sensitive. With federated learning, the data is not uploaded to the cloud server, so its privacy can be protected. Training on data that stays on your own device is also preferable to uploading private data to unknown server storage. Consider image classification: a user might want to predict which image types will be most viewed or trending in the future, and a shared model trained on many users' data can classify images accordingly. In communication scenarios such as language modeling, algorithms like next-word prediction can be improved in the same way.
3. Data Privacy in Federated Learning
There are two main privacy aspects in federated learning. The first is what an attacker can do with the data and which model parameters the attacker can target. Because the model is trained on a large amount of data from many clients, the model parameters vary considerably, which makes such attacks comparatively difficult. The second, and one of the most important approaches, is differential privacy, which is used for highly sensitive data. It is discussed in detail in the later sections of the review.
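One common way to bring differential privacy into federated training is to bound each client's influence by clipping its update and then to add random noise to the aggregate. The sketch below shows that general idea only. The function names, the Gaussian noise, and the `clip_norm` and `noise_multiplier` parameters are assumptions for illustration, and turning a noise level into a formal privacy guarantee requires a privacy accounting step that is not shown here.

```python
import math
import random

def clip_update(update, clip_norm):
    """Scale an update down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates and add Gaussian noise per coordinate."""
    clipped = [clip_update(u, clip_norm) for u in updates]
    n = len(clipped)
    dim = len(clipped[0])
    # After clipping, each client contributes at most clip_norm / n to the
    # average, so the noise level is set relative to that bound.
    sigma = noise_multiplier * clip_norm / n
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]

rng = random.Random(1)
client_updates = [[3.0, 4.0], [0.3, 0.4], [-0.2, 0.1]]  # hypothetical updates
print(private_average(client_updates, clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

Clipping keeps any single client from dominating the aggregate, and the noise makes it harder to tell from the result whether a particular client took part.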
4. Challenges in Federated Learning
Federated learning has some drawbacks, since the technology depends mostly on distributed data that can be sensitive. The two examples discussed in the previous sections can involve privacy-sensitive data. In addition, the prerequisites for using the service can be disrupted in several ways.
Communication limits: Sometimes only a few devices are online to provide data for training the model. So few rounds of communication with devices make the training unreliable.
Unbalanced data: Most devices hold only a limited number of examples, while some devices hold many more.
Highly non-IID data: The data on one device always reflects the pattern of a single user, so it can be similar across many training examples and is not representative of the overall population.
Unreliable compute nodes: Most devices may be offline when a model needs to be trained, and devices can also go offline during training. This is one source of unreliability in federated learning.
Attacks on training data: Backdoor attacks on the training data are possible and can change the model's behavior.
Massively distributed data: The data is collected from many users in many locations and on many devices. As the number of devices increases, the data becomes even more widely distributed.
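Several of the challenges listed above (unbalanced data, non-IID data, and unreliable nodes) can be reproduced in a small simulation, which is a useful way to test a federated setup before deploying it. The sketch below is one possible way to do this, under assumed names and parameters: `partition_non_iid` sorts examples by label and cuts them at random points, so each client sees only a narrow range of labels and client sizes differ, and `sample_online` models devices that are reachable only with some probability.

```python
import random
from collections import Counter

def partition_non_iid(labels, num_clients, rng):
    """Split example indices into label-sorted, unevenly sized client shards."""
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    # Distinct random cut points give non-empty shards of different sizes
    cuts = sorted(rng.sample(range(1, len(order)), num_clients - 1))
    bounds = [0] + cuts + [len(order)]
    return [order[bounds[k]:bounds[k + 1]] for k in range(num_clients)]

def sample_online(client_ids, online_prob, rng):
    """Return the clients that happen to be reachable in this round."""
    return [c for c in client_ids if rng.random() < online_prob]

rng = random.Random(42)
labels = [rng.randrange(10) for _ in range(1000)]  # hypothetical 10-class data
clients = partition_non_iid(labels, num_clients=8, rng=rng)

for cid, idx in enumerate(clients):
    # Each client typically holds only a few of the ten labels
    print(cid, len(idx), dict(Counter(labels[i] for i in idx)))

online = sample_online(range(len(clients)), online_prob=0.3, rng=rng)
print("online this round:", online)
```

Running a training loop over only the `online` clients in each round shows how a model behaves when few devices respond and each one holds a skewed slice of the data.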
Conclusion
As we can see, federated learning avoids these complexities by enabling models to be trained on the device itself. The trained models are then sent to a central server, where they are aggregated, and one consolidated model is sent back to the devices. In federated learning, communication between the curator and the clients might be limited. The challenge of federated optimization is to learn a model with minimal communication overhead between the clients and the curator, even though the data might be unbalanced and massively distributed. Even so, many apps already use federated learning, for example language modeling for mobile keyboards, voice recognition, and image classification to predict which photos people will share. The main advantage of federated learning is that clients never share data, only model parameters.
Studying the contribution of information technology to a modern field such as federated learning can be adapted to numerous scenarios in the future. A major problem for digital users, the misuse of unprotected personal data by third parties, can be reduced by optimizing federated learning for machine learning applications that use the internet. The computational power required can also be reduced by using cloud-integrated learning models and neural networks.
To get a better understanding of federated learning, refer to the comic below.
Resources
https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
Translated from: https://medium.com/better-programming/federated-learning-for-the-future-5253d80c8e9d