In cut-and-choose protocols for two-party secure computation (2PC) the main overhead is the number of garbled circuits that must be sent. Recent work (Lindell, Riva; Huang et al., Crypto 2014) has shown that in a batched setting, when the parties plan to evaluate the same function N times, the number of garbled circuits per execution can be reduced by a O(logN) factor compared to the single-execution setting. This improvement is significant in practice: an order of magnitude for N as low as one thousand. % Besides the number of garbled circuits, communication round trips are another significant performance bottleneck. Afshar et al. (Eurocrypt 2014) proposed an efficient cut-and-choose 2PC that is round-optimal (one message from each party), but in the single-execution setting.
In this work we present new malicious-secure 2PC protocols that are round-optimal and also take advantage of batching to reduce cost. Our contributions include: \begin{itemize} \item A 2-message protocol for batch secure computation (N instances of the same function). The number of garbled circuits is reduced by a O(logN) factor over the single-execution case. However, other aspects of the protocol that depend on the input/output size of the function do not benefit from the same O(logN)-factor savings. \item A 2-message protocol for batch secure computation, in the random oracle model. All aspects of this protocol benefit from the O(logN)-factor improvement, except for small terms that do not depend on the function being evaluated. \item A protocol in the offline/online setting. After an offline preprocessing phase that depends only on the function f and N, the parties can securely evaluate f, N times (not necessarily all at once). Our protocol’s online phase is only 2 messages, and the total online communication is only ℓ+O(κ) bits, where ℓ is the input length of f and κ is a computational security parameter. This is only O(κ) bits more than the information-theoretic lower bound for malicious 2PC.
Secure two-party computation has witnessed significant efficiency improvements in the recent years. Current implementations of protocols with security against passive adversaries generate and process data much faster than it can be sent over the network, even with a single thread. This paper introduces novel methods to further reduce the communication bottleneck and round complexity of semi-honest secure two-party computation. Our new methodology creates a trade-off between communication and computation, and we show that the added computing cost for each party is still feasible and practicable in light of the new communication savings. We first improve communication for Boolean circuits with 2-input gates by factor 1.9x when evaluated with the protocol of Goldreich-Micali-Wigderson (GMW). As a further step, we change the conventional Boolean circuit representation from 2-input gates to multi-input/multi-output lookup tables (LUTs) which can be programmed to realize arbitrary functions. We construct two protocols for evaluating LUTs offering a trade-off between online communication and total communication. Our most efficient LUT-based protocol reduces the communication and round complexity by a factor 2-4x for several basic and complex operations. Our proposed scheme results in a significant overall runtime decrease of up to a factor of 3x on several benchmark functions.
We propose a new protocol for two-party computation, secure against malicious adversaries, that is significantly faster than prior work in the single-execution setting (i.e., non-amortized and with no pre-processing). In particular, for computational security parameter κ and statistical security parameter ρ, our protocol uses only ρ garbled circuits and O(κ) public-key operations, whereas previous work with the same number of garbled circuits required either O(ρn+κ) public-key operations (where n is the input/output length) or a second execution of a secure-computation sub-protocol. Our protocol can be based on the decisional Diffie-Hellman assumption in the standard model.
We implement our protocol to evaluate its performance. With ρ=40, our implementation securely computes an AES evaluation in 65 ms over a local-area network using a single thread without any pre-computation, 22x faster than the best prior work in the non-amortized setting. The relative performance of our protocol is even better for functions with larger input/output lengths.
We propose a new protocol, nicknamed TinyTable, for maliciously secure 2-party computation in the preprocessing model. One version of the protocol is useful in practice and allows, for instance, secure AES encryption with latency about 1ms and amortized time about 0.5 μs per AES block on a fast cloud set-up. Another version is interesting from a theoretical point of view: we achieve a maliciously and unconditionally secure 2-party protocol in the preprocessing model for computing a Boolean circuit, where both the communication complexity and preprocessed data size needed is O(s) where s is the circuit size, while the computational complexity is O(k^\eps*s) where k is the statistical security parameter and \eps<1 is a constant. For general circuits with no assumption on their structure, this is the best asymptotic performance achieved so far in this model.
We design and implement a Distributed Oblivious Random Access Memory (DORAM) data structure that is optimized for use in two-party secure computation protocols. We improve upon the access time of previous constructions by a factor of up to ten, their memory overhead by a factor of one hundred or more, and their initialization time by a factor of thousands. We are able to instantiate ORAMs that hold 234 bytes, and perform operations on them in seconds, which was not previously feasible with any implemented scheme.
Unlike prior ORAM constructions based on hierarchical hashing, permutation, or trees, our Distributed ORAM is derived from the new Function Secret Sharing scheme introduced by Boyle, Gilboa and Ishai. This significantly reduces the amount of secure computation required to implement an ORAM access, albeit at the cost of O(n) efficient local memory operations.
We implement our construction and find that, despite its poor O(n) asymptotic complexity, it still outperforms the fastest previously known constructions, Circuit ORAM and Square-root ORAM, for datasets that are 32 KiB or larger, and outperforms prior work on applications such as stable matching or binary search by factors of two to ten.
We propose a simple and efficient framework for obtaining efficient constant-round protocols for maliciously secure two-party computation. Our framework uses a function-independent preprocessing phase to generate authenticated information for the two parties; this information is then used to construct a single“authenticated” garbled circuit which is transmitted and evaluated.We also show how to efficiently instantiate the preprocessing phase by designing a highly optimized version of the TinyOT protocol by Nielsen et al. Our overall protocol outperforms existing work in both the single-execution and amortized settings, with or without preprocessing: In the single-execution setting, our protocol evaluates an AES circuit with malicious security in37 ms with an online time of just 1 ms. Previous work with the best online time (also 1 ms)requires 124 ms in total; previous work with the best total time requires 62 ms (with 14 ms online time). If we amortize the computation over 1024 executions, each AES computation requires just 6.7 ms with roughly the same online time as above. The best previous work in the amortized setting has roughly the same total time but does not support function-independent preprocessing.Our work shows that the performance penalty for maliciously secure two-party computation (as compared to semi-honest security) is much smaller than previously believed.
Cut-and-choose (CC) is the standard approach to making Yao’s garbled circuit two-party computation (2PC) protocol secure against malicious adversaries. Traditional cut-and-choose operates at the level of entire circuits, whereas the LEGO paradigm (Nielsen & Orlandi, TCC 2009) achieves asymptotic improvements by performing cut-and-choose at the level of individual gates. In this work we propose a unified approach called DUPLO that spans the entire continuum between these two extremes. The cut-and-choose step in our protocol operates on the level of arbitrary circuit “components,” which can range in size from a single gate to the entire circuit itself.
With this entire continuum of parameter values at our disposal, we find that the best way to scale 2PC to computations of realistic size is to use CC components of intermediate size, and not at the extremes. On computations requiring several millions of gates or more, our more general approach to CC gives between 4-7x improvement over existing approaches.
In addition to our technical contributions of modifying and optimizing previous protocol techniques to work with general CC components, we also provide an extension of the recent Frigate circuit compiler (Mood et al, Euro S&P 2016) to effectively express any C-style program in terms of components which can be processed efficiently using our protocol.
Recently, there has been a wealth of effort devoted to the design of secure protocols for machine learning tasks. Much of this is aimed at enabling secure prediction from highly-accurate Deep Neural Networks (DNNs). However, as DNNs are trained on data, a key question is how such models can be also trained securely. The few prior works on secure DNN training have focused either on designing custom protocols for existing training algorithms, or on developing tailored training algorithms and then applying generic secure protocols. In this work, we investigate the advantages of designing training algorithms alongside a novel secure protocol, incorporating optimizations on both fronts. We present QUOTIENT, a new method for discretized training of DNNs, along with a customized secure two-party protocol for it. QUOTIENT incorporates key components of state-of-the-art DNN training such as layer normalization and adaptive gradient methods, and improves upon the state-of-the-art in DNN training in two-party computation. Compared to prior work, we obtain an improvement of 50X in WAN time and 6% in absolute accuracy.
We study the problem of secure two-party computation of arithmetic circuits in the presence of active (“malicious”) parties. This problem is motivated by privacy-preserving numerical computations, such as ones arising in the context of machine learning training and classification, as well as in threshold cryptographic schemes. In this work, we design, optimize, and implement anactively secure protocol for secure two-party arithmetic computation. A distinctive feature of our protocol is that it can make a fully modular black-box use of any passively secure implementation of oblivious linear function evaluation (OLE). OLE is a commonly used primitive for secure arithmetic computation, analogously to the role of oblivious transfer in secure computation for Boolean circuits. For typical (large but not-too-narrow) circuits, our protocol requires roughly 4 invocations of passively secure OLE per multiplication gate. This significantly improves over the recent TinyOLE protocol (Döttling et al., ACM CCS 2017), which requires 22 invocations of actively secure OLE in general, or 44 invocations of a specific code-based passively secure OLE. Our protocol follows the high level approach of the IPS compiler (Ishai et al., CRYPTO 2008, TCC 2009), optimizing it in several ways. In particular, we adapt optimization ideas that were used in the context of the practical zero-knowledge argument system Ligero (Ames et al., ACM CCS 2017) to the more general setting of secure computation, and explore the possibility of boosting efficiency by employing a “leaky” passively secure OLE protocol. The latter is motivated by recent (passively secure) lattice-based OLE implementations in which allowing such leakage enables better efficiency. We showcase the performance of our protocol by applying its implementation to several useful instances of secure arithmetic computation. On “wide” circuits, such as ones computing a fixed function on many different inputs, our protocol is 5x faster and transmits 4x less data than the state-of-the-art Overdrive (Keller et al., Eurocrypt 2018). Our benchmarks include a general passive-to-active OLE compiler, authenticated generation of “Beaver triples”, and a system for securely outsourcing neural network classification. The latter is the first actively secure implementation of its kind, strengthening the passive security provided by recent related works (Mohassel and Zhang, IEEE S&P 2017; Juvekar et al., USENIX 2018).
Machine learning is widely used in practice to produce predictive models for applications such as image processing, speech and text recognition. These models are more accurate when trained on large amount of data collected from different sources. However, the massive data collection raises privacy concerns. In this paper, we present new and efficient protocols for privacy preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method. Our protocols fall in the two-server model where data owners distribute their private data among two non-colluding servers who train various models on the joint data using secure two-party computation (2PC). We develop new techniques to support secure arithmetic operations on shared decimal numbers, and propose MPC-friendly alternatives to non-linear functions such as sigmoid and softmax that are superior to prior work. We implement our system in C++. Our experiments validate that our protocols are several orders of magnitude faster than the state of the art implementations for privacy preserving linear and logistic regressions, and scale to millions of data samples with thousands of features. We also implement the first privacy preserving system for training neural networks.
Secure two-party computation (S2PC) allows two parties to compute a function on their joint inputs while leaking only the output of the function. At TCC 2009 Orlandi and Nielsen proposed the LEGO protocol for maliciously secure 2PC based on cut-and-choose of Yao’s garbled circuits at the gate level and showed that this is asymptotically more efficient than on the circuit level. Since then the LEGO approach has been improved upon in several theoretical works, but never implemented. In this paper we describe further concrete improvements and provide the first implementation of a protocol from the LEGO family. Our protocol is optimized for the offline/online setting and supports function-independent preprocessing using only a constant number of rounds. We have benchmarked our prototype and find that our protocol can compete with all existing implementations and that it is often more efficient. As an example, in a LAN setting we can evaluate an AES-128 with online latency down to 1.13 ms, while if evaluating 128 AES-128 in parallel the amortized cost is 0.09 ms per AES-128. This online performance does not come at the price of offline inefficiency as we achieve comparable performance to previous, less general protocols, and significantly better if we ignore the cost of the function-independent preprocessing. Also, as our protocol has an optimal 2-round online phase it is significantly more efficient than previous protocols’ when considering a high latency network.