Federated Learning. Yang Liu

Читать онлайн.
Название Federated Learning
Автор произведения Yang Liu
Жанр Компьютеры: прочее
Серия Synthesis Lectures on Artificial Intelligence and Machine Learning
Издательство Компьютеры: прочее
Год выпуска 0
isbn 9781681737188



Скачать книгу

can be used to compute any arithmetic operation. Each party is in charge of one stage and the circuit is garbled in each stage, so that any of them cannot get information from the other one, but they can still achieve the result according to the circuit. GC consists of an OT protocol and a block cipher. The complexity of the circuit grows at least linearly with the input size. Soon after GC was proposed, GMW [Goldreich et al., 1987] extended GC to the multi-party setting against malicious adversaries. For more detailed survey of GC, readers can refer to Yakoubov [2017].

      OT Extension. Impagliazzo and Rudich [1989] showed that OT provably requires “public-key” type of assumptions (such as factoring, discrete log, etc.). However, Beaver [1996] pointed out that OT can be “extended” in the sense that it is enough to generate a few “seed” OTs based on public-key cryptography, which can then be extended to any number of OTs using symmetric-key cryptosystems only. OT extension is now widely applied in MPC protocols [Keller et al., 2016, Mohassel and Zhang, 2017, Demmler et al., 2015] to improve efficiency.

       Secret Sharing

      Secret sharing is a concept of hiding a secret value by splitting it into random parts and distributing these parts (a.k.a. shares) to different parties, so that each party has only one share and thus only one piece of the secret [Shamir, 1979, Beimel, 2011]. Depending on the specific secret sharing schemes used, all or a known threshold of shares are needed to reconstruct the original secret value [Shamir, 1979, Tutdere and Uzunko, 2015]. For example, Shamir’s Secret Sharing is constructed based on polynomial equations and provides information-theoretic security, and it is also efficient using matrix calculation speedup [Shamir, 1979]. There are several types of secret sharing, mainly including arithmetic secret sharing [Damård et al., 2011], Shamir’s secret sharing [Shamir, 1979], and binary secret sharing [Wang et al., 2007]. As arithmetic secret sharing is mostly adopted by existing SMPC-based PPML approaches and binary secret sharing are closely related to OT which is discussed in Section 2.4.1, here we focus on arithmetic secret sharing.

      Consider that a party Pi wants to share a secret S among n parties Image in a finite field Fq. To share S, the party Pi randomly samples n – 1 values Image from Zq and set Image mod q. Then, Pi distributes sk to party Pk, for ki. We denote the shared S as Image.

      The arithmetic addition operation is carried out locally at each party. The secure multiplication is performed by using Beaver triples [Beaver, 1991]. The Beaver triples can be generated in an offline phase. The offline phase (i.e., preprocessing) serves as a trusted dealer who generates Beaver triples {(〈a〉, 〈b〉, 〈c〉)|ab = c} and distributes the shares among the n parties.

      To compute Image first computes 〈e〉 = 〈x〉 – 〈a〉, 〈f〉 = 〈y〉 – 〈b〉. Then, e and f are reconstructed. Finally, Pi computes 〈z〉 = 〈c〉 + ex〉 + fy〉 locally, and a random party Pj adds its share into ef. We denote element-wise multiplication of vectors as 〈·〉 ⨀ 〈·〉.

      Secure multiplication can also be performed by leveraging the Gilboa’s protocol [Gilboa, 1999], in which n-bit arithmetic multiplication can be conducted via n 1-out-of-2 OTs. Suppose that Party A holds x and Party B holds y. Now we show Gilboa’s protocol, which results in A holding 〈zA and B holding 〈zB such that z = x · y. Let l be the maximum length of the binary representation of the numbers involved in our protocol. Denote the m× 1-out-of-2 OT for l -bit strings as Image. Denote the i th bit of x as x[i]. The secure 2-party multiplication via OT can be conducted as follows.

      1. A represents x in binary format.

      2. B builds Image. For the i th OT, randomly pick ai,0 and compute ai,1 = 2i yai,0. Use (–ai,0, ai,1) as the input for the i th OT.

      3. A inputs X[i] as the choice bit in the i th OT and obtains x[i] × 2i y × – ai,0.

      4. A computes Image B computes Image.

      The offline phase can be carried out efficiently with the help of a semi-honest dealer who generates Beaver triples and distributes them among all the parties. To perform such a preprocessing step without a semi-honest dealer, there are several protocols available, such as SPDZ [Damård et al., 2011], SPDZ-2 [Damård et al., 2012], MASCOT [Keller et al., 2016], and HighGear [Keller et al., 2018].

      • SPDZ is an offline protocol in the preprocessing model based on somewhat homomorphic encryption (SHE) in the form of BGV, first described in Damård et al. [2011].

      • SPDZ-2 [Damård et al., 2012] is a protocol based on threshold SHE cryptography (with a shared decryption key).

      • MASCOT is an oblivious-transfer-based protocol, proposed in Keller et al. [2016]. It is far more computationally efficient than SPDZ and SPDZ-2.

      • In 2018, Keller et al. [2018] developed a BGV-based SHE protocol, called the HighGear protocol, which achieves better performance than the MASCOT protocol.

       Application in PPML

      Various MPC-based approaches have been designed and implemented for PPML in the past. Most MPC-based PPML approaches leverage a two-phase architecture, comprising of an offline phase and an online phase. The majority of cryptographic operations are conducted in the offline phase, where multiplication triples are generated. The ML model is then trained in the online phase using the multiplication triples generated in the offline phase. The DeepSecure [Rouhani et al., 2017] is a GC-based framework for secure neural network inference, where the inference function has to be represented as a Boolean circuit. The computation and communication cost in GC only depend on the total number of AND gates in the circuit.

      SecureML [Mohassel and Zhang, 2017] is another two-party framework for PPML employing two-phase architecture. Parties in federated learning distributes arithmetic shared of their data among two non-colluding servers, who run secure two-party model training protocols. Both Linearly HE (LHE)-based and OT-based protocols are proposed for multiplication triples generation in offline phase. The online phase is based on arithmetic secret sharing and division GC. Therefore, only linear operations are allowed in model training, and various approximations are done to nonlinear functions.

      The Chameleon framework is another hybrid MPC framework based on ABY for neural network model inference [Demmler et al., 2015]. Arithmetic secret sharing is used to conduct linear operations, and GC as well as GMW [Goldreich et al., 1987] are used for nonlinear operations. Conversion protocols are also implemented to convert data representations among different protocols.

      Privacy-preserving ID3 learning based on OT is addressed in Lindell and Pinkas [2002]. Shamir’s threshold secret sharing is used for secure model aggregation for PPML with security against both honest-but-curious and malicious adversaries [Bonawitz et al., 2017], where a group of clients do MPC to evaluate the average of their private input models, and disclose the average to the parameter server for