Side-channel attacks exploiting (EC)DSA nonce leakage easily lead to full key recovery. Although (EC)DSA implementations have already been hardened against side-channel leakage using the constant-time paradigm, the long-standing cat-and-mouse-game of attacks and patches continues. In particular, current code review is prone to miss less obvious side channels hidden deeply in the call stack. To solve this problem, a systematic study of nonce leakage is necessary. We present a systematic analysis of nonce leakage in cryptographic implementations. In particular, we expand DATA, an open-source side-channel analysis framework, to detect nonce leakage. Our analysis identified multiple unknown nonce leakage vulnerabilities across all essential computation steps involving nonces. Among others, we uncover inherent problems in Bignumber implementations that break claimed constant-time guarantees of (EC)DSA implementations if secrets are close to a word boundary. We found that lazy resizing of Bignumbers in OpenSSL and LibreSSL yields a highly accurate and easily exploitable side channel, which has been acknowledged with two CVEs. Surprisingly, we also found a tiny but expressive leakage in the constant-time scalar multiplication of OpenSSL and BoringSSL. Moreover, in the process of reporting and patching, we identified newly introduced leakage with the support of our tool, thus preventing another attack-patch cycle. We open-source our tool, together with an intuitive graphical user interface we developed.
Private set intersection (PSI) allows two parties, who each hold a set of items, to compute the intersection of those sets without revealing anything about other items. Recent advances in PSI have significantly improved its performance for the case of semi-honest security, making semi-honest PSI a practical alternative to insecure methods for computing intersections. However, the semi-honest security model is not always a good fit for real-world problems.
In this work we introduce a new PSI protocol that is secure in the presence of malicious adversaries. Our protocol is based entirely on fast symmetric-key primitives and inherits important techniques from state-of-the-art protocols in the semi-honest setting. Our novel technique to strengthen the protocol for malicious adversaries is inspired by the dual execution technique of Mohassel & Franklin (PKC 2006). Our protocol is optimized for the random-oracle model, but can also be realized (with a performance penalty) in the standard model.
We demonstrate our protocol’s practicality with a prototype implementation. To securely compute the intersection of two sets of size 220 requires only 13 seconds with our protocol, which is ~12x faster than the previous best malicious-secure protocol (Rindal & Rosulek, Eurocrypt 2017), and only 3x slower than the best semi-honest protocol (Kolesnikov et al., CCS 2016).
A memory-hard function (MHF) ƒn with parameter n can be computed in sequential time and space n. Simultaneously, a high amortized parallel area-time complexity (aAT) is incurred per evaluation. In practice, MHFs are used to limit the rate at which an adversary (using a custom computational device) can evaluate a security sensitive function that still occasionally needs to be evaluated by honest users (using an off-the-shelf general purpose device). The most prevalent examples of such sensitive functions are Key Derivation Functions (KDFs) and password hashing algorithms where rate limits help mitigate off-line dictionary attacks. As the honest users’ inputs to these functions are often (low-entropy) passwords special attention is given to a class of side-channel resistant MHFs called iMHFs.
Essentially all iMHFs can be viewed as some mode of operation (making n calls to some round function) given by a directed acyclic graph (DAG) with very low indegree. Recently, a combinatorial property of a DAG has been identified (called “depth-robustness”) which results in good provable security for an iMHF based on that DAG. Depth-robust DAGs have also proven useful in other cryptographic applications. Unfortunately, up till now, all known very depth-robust DAGs are impractically complicated and little is known about their exact (i.e. non-asymptotic) depth-robustness both in theory and in practice.
In this work we build and analyze (both formally and empirically) several exceedingly simple and efficient to navigate practical DAGs for use in iMHFs and other applications. For each DAG we:
Prove that their depth-robustness is asymptotically maximal.
Prove bounds of at least 3 orders of magnitude better on their exact depth-robustness compared to known bounds for other practical iMHF.
Implement and empirically evaluate their depth-robustness and aAT against a variety of state-of-the art (and several new) depth-reduction and low aAT attacks. We find that, against all attacks, the new DAGs perform significantly better in practice than Argon2i, the most widely deployed iMHF in practice.
Along the way we also improve the best known empirical attacks on the aAT of Argon2i by implementing and testing several heuristic versions of a (hitherto purely theoretical) depth-reduction attack. Finally, we demonstrate practicality of our constructions by modifying the Argon2i code base to use one of the new high aAT DAGs. Experimental benchmarks on a standard off-the-shelf CPU show that the new modifications do not adversely affect the impressive throughput of Argon2i (despite seemingly enjoying significantly higher aAT).
Time-based one-time password (TOTP) systems in use today require storing secrets on both the client and the server. As a result, an attack on the server can expose all second factors for all users in the system. We present T/Key, a time-based one-time password system that requires no secrets on the server. Our work modernizes the classic S/Key system and addresses the challenges in making such a system secure and practical. At the heart of our construction is a new lower bound analyzing the hardness of inverting hash chains composed of independent random functions, which formalizes the security of this widely used primitive. Additionally, we develop a near-optimal algorithm for quickly generating the required elements in a hash chain with little memory on the client. We report on our implementation of T/Key as an Android application. T/Key can be used as a replacement for current TOTP systems, and it remains secure in the event of a server-side compromise. The cost, as with S/Key, is that one-time passwords are longer than the standard six characters used in TOTP.
We implemented (a simplified version of) the branching-program obfuscator due to Gentry et al. (GGH15), which is itself a variation of the first obfuscation candidate by Garg et al. (GGHRSW13). To keep within the realm of feasibility, we had to give up on some aspects of the construction, specifically the “multiplicative bundling” factors that protect against mixed-input attacks. Hence our implementation can only support read-once branching programs.
To be able to handle anything more than just toy problems, we developed a host of algorithmic and code-level optimizations. These include new variants of discrete Gaussian sampler and lattice trapdoor sampler, efficient matrix-manipulation routines, and many tradeoffs. We expect that these optimizations will find other uses in lattice-based cryptography beyond just obfuscation.
Our implementation is the first obfuscation attempt using the GGH15 graded encoding scheme, offering performance advantages over other graded encoding methods when obfuscating finite-state machines with many states. In out most demanding setting, we were able to obfuscate programs with input length of 20 nibbles (80 bits) and over 100 states, which seems out of reach for prior implementations. Although further optimizations are surely possible, we do not expect any implementation of current schemes to be able to handle much larger parameters.
Program obfuscation is a powerful security primitive with many applications. White-box cryptography studies a particular subset of program obfuscation targeting keyed pseudorandom functions (PRFs), a core component of systems such as mobile payment and digital rights management. Although the white-box obfuscators currently used in practice do not come with security proofs and are thus routinely broken, recent years have seen an explosion of cryptographic techniques for obfuscation, with the goal of avoiding this build-and-break cycle.
In this work, we explore in detail cryptographic program obfuscation and the related primitive of multi-input functional encryption (MIFE). In particular, we extend the 5Gen framework (CCS 2016) to support circuit-based MIFE and program obfuscation, implementing both existing and new constructions. We then evaluate and compare the efficiency of these constructions in the context of PRF obfuscation.
As part of this work we (1) introduce a novel instantiation of MIFE that works directly on functions represented as arithmetic circuits, (2) use a known transformation from MIFE to obfuscation to give us an obfuscator that performs better than all prior constructions, and (3) develop a compiler for generating circuits optimized for our schemes. Finally, we provide detailed experiments, demonstrating, among other things, the ability to obfuscate a PRF with a 64-bit key and 12 bits of input (containing 62k gates) in under 4 hours, with evaluation taking around 1 hour. This is by far the most complex function obfuscated to date.
Blockchains and more general distributed ledgers are becoming increasingly popular as efficient, reliable, and persistent records of data and transactions. Unfortunately, they ensure reliability and correctness by making all data public, raising confidentiality concerns that eliminate many potential uses.
In this paper we present Solidus, a protocol for confidential transactions on public blockchains, such as those required for asset transfers with on-chain settlement. Solidus operates in a framework based on real-world financial institutions: a modest number of banks each maintain a large number of user accounts. Within this framework, Solidus hides both transaction values and the transaction graph (i.e., the identities of transacting entities) while maintaining the public verifiability that makes blockchains so appealing. To achieve strong confidentiality of this kind, we introduce the concept of a Publicly-Verifiable Oblivious RAM Machine (PVORM). We present a set of formal security definitions for both PVORM and Solidus and show that our constructions are secure. Finally, we implement Solidus and present a set of benchmarks indicating that the system is efficient in practice.
Time and again, attribute-based encryption has been shown to be the natural cryptographic tool for building various types of conditional access systems with far-reaching applications, but the deployment of such systems has been very slow. A central issue is the lack of an encryption scheme that can operate on sensitive data very efficiently and, at the same time, provides features that are important in practice.
This paper proposes the first fully secure ciphertext-policy and key-policy ABE schemes based on a standard assumption on Type-III pairing groups, which do not put any restriction on policy type or attributes. We implement our schemes along with several other prominent ones using the Charm library, and demonstrate that they perform better on almost all parameters of interest.
We design and implement a Distributed Oblivious Random Access Memory (DORAM) data structure that is optimized for use in two-party secure computation protocols. We improve upon the access time of previous constructions by a factor of up to ten, their memory overhead by a factor of one hundred or more, and their initialization time by a factor of thousands. We are able to instantiate ORAMs that hold 234 bytes, and perform operations on them in seconds, which was not previously feasible with any implemented scheme.
Unlike prior ORAM constructions based on hierarchical hashing, permutation, or trees, our Distributed ORAM is derived from the new Function Secret Sharing scheme introduced by Boyle, Gilboa and Ishai. This significantly reduces the amount of secure computation required to implement an ORAM access, albeit at the cost of O(n) efficient local memory operations.
We implement our construction and find that, despite its poor O(n) asymptotic complexity, it still outperforms the fastest previously known constructions, Circuit ORAM and Square-root ORAM, for datasets that are 32 KiB or larger, and outperforms prior work on applications such as stable matching or binary search by factors of two to ten.
Write-Only Oblivious RAM (WoORAM) protocols provide privacy by encrypting the contents of data and also hiding the pattern of write operations over that data. WoORAMs provide better privacy than plain encryption and better performance than more general ORAM schemes (which hide both writing and reading access patterns), and the write-oblivious setting has been applied to important applications of cloud storage synchronization and encrypted hidden volumes. In this paper, we introduce an entirely new technique for Write-Only ORAM, called DetWoORAM. Unlike previous solutions, DetWoORAM uses a deterministic, sequential writing pattern without the need for any “stashing” of blocks in local state when writes fail. Our protocol, while conceptually simple, provides substantial improvement over prior solutions, both asymptotically and experimentally. In particular, under typical settings the DetWoORAM writes only 2 blocks (sequentially) to backend memory for each block written to the device, which is optimal. We have implemented our solution using the BUSE (block device in user-space) module and tested DetWoORAM against both an encryption only baseline of dm-crypt and prior, randomized WoORAM solutions, measuring only a 3x-14x slowdown compared to an encryption-only baseline and around 6x-19x speedup compared to prior work.
Oblivious Random Access Machine (ORAM) enables a client to access her data without leaking her access patterns. Existing client-efficient ORAMs either achieve O(log N) client-server communication blowup without heavy computation, or O(1) blowup but with expensive homomorphic encryptions. It has been shown that O(log N) bandwidth blowup might not be practical for certain applications, while schemes with O(1) communication blowup incur even more delay due to costly homomorphic operations.
In this paper, we propose a new distributed ORAM scheme referred to as Shamir Secret Sharing ORAM (S3ORAM), which achieves O(1) client-server bandwidth blowup and O(1) blocks of client storage without relying on costly partial homomorphic encryptions. S3ORAM harnesses Shamir Secret Sharing, tree-based ORAM structure and a secure multi-party multiplication protocol to eliminate costly homomorphic operations and, therefore, achieves O(1) client-server bandwidth blowup with a high computational efficiency. We conducted comprehensive experiments to assess the performance of S3ORAM and its counterparts on actual cloud environments, and showed that S3ORAM achieves three orders of magnitude lower end-to-end delay compared to alternatives with O(1) client communication blowup (Onion-ORAM), while it is one order of magnitude faster than Path-ORAM for a network with a moderate bandwidth quality. We have released the implementation of S3ORAM for further improvement and adaptation.
Bitcoin owes its success to the fact that transactions are transparently recorded in the blockchain, a global public ledger that removes the need for trusted parties. Unfortunately, recording every transaction in the blockchain causes privacy, latency, and scalability issues. Building on recent proposals for “micropayment channels” — two party associations that use the ledger only for dispute resolution — we introduce techniques for constructing anonymous payment channels. Our proposals allow for secure, instantaneous and private payments that substantially reduce the storage burden on the payment network. Specifically, we introduce three channel proposals, including a technique that allows payments via untrusted intermediaries. We build a concrete implementation of our scheme and show that it can be deployed via a soft fork to existing anonymous currencies such as ZCash.
Password checking systems traditionally allow login only if the correct password is submitted. Recent work on typo-tolerant password checking suggests that usability can be improved, with negligible security loss, by allowing a small number of typographical errors. Existing systems, however, can only correct a handful of errors, such as accidentally leaving caps lock on or incorrect capitalization of the first letter in a password. This leaves out numerous kinds of typos made by users, such as transposition errors, substitutions, or capitalization errors elsewhere in a password. Some users therefore receive no benefit from existing typo-tolerance mechanisms.
We introduce personalized typo-tolerant password checking. In our approach, the authentication system learns over time the typos made by a specific user. In experiments using Mechanical Turk, we show that 45% of users would benefit from personalization. Therefore, we design a system, called TypTop, that securely implements personalized typo-tolerance. Underlying TypTop is a new stateful password-based encryption scheme that can be used to store recent failed login attempts. Our formal analysis shows that security in the face of an attacker that obtains the state of the system reduces to the difficulty of a brute-force dictionary attack against the real password. We implement TypTop for Linux and Mac OS login and report on a proof-of-concept deployment.
While the feasibility of constant-round and actively secure MPC has been known for over two decades, the last few years have witnessed a flurry of designs and implementations that make its deployment a palpable reality. To our knowledge, however, existing concretely efficient MPC constructions are only for up to three parties.
In this paper we design and implement a new actively secure 5PC protocol tolerating two corruptions that requires 8 rounds of interaction, only uses fast symmetric-key operations, and incurs 60% less communication than the passively secure state-of-the-art solution from the work of Ben-Efraim, Lindell, and Omri [CCS 2016]. For example, securely evaluating the AES circuit when the parties are in different regions of the U.S. and Europe only takes 1.8s which is 2.6x faster than the passively secure 5PC in the same environment.
Instrumental for our efficiency gains (less interaction, only symmetric key primitives) is a new 4-party primitive we call Attested OT, which in addition to Sender and Receiver involves two additional “assistant parties” who will attest to the respective inputs of both parties, and which might be of broader applicability in practically relevant MPC scenarios. Finally, we also show how to generalize our construction to n parties with similar efficiency properties where the corruption threshold is t ≈ √n, and propose a combinatorial problem which, if solved optimally, can yield even better corruption thresholds for the same cost.
This paper considers the problem of running a long-term on-demand service for executing actively-secure computations. We examined state-of-the-art tools and implementations for actively-secure computation and identified a set of key features indispensable to offer meaningful service like this. Since no satisfactory tools exist for the purpose, we developed Pool, a new tool for building and executing actively-secure computation protocols at extreme scales with nearly zero offline delay. With Pool, we are able to obliviously execute, for the first time, reactive computations like ORAM in the malicious threat model. Many technical benefits of Pool can be attributed to the concept of pool-based cut-and-choose. We show with experiments that this idea has significantly improved the scalability and usability of JIMU, a state-of-the-art LEGO protocol.
Zero Knowledge Contingent Payment (ZKCP) protocols allow fair exchange of sold goods and payments over the Bitcoin network. In this paper we point out two main shortcomings of current proposals for ZKCP, and propose ways to address them.
First we show an attack that allows a buyer to learn partial information about the digital good being sold, without paying for it. This break in the zero-knowledge condition of ZKCP is due to the fact that in the protocols we attack, the buyer is allowed to choose common parameters that normally should be selected by a trusted third party. We implemented and tested this attack: we present code that learns, without paying, the value of a Sudoku cell in the “Pay-to-Sudoku” ZKCP implementation. We also present ways to fix this attack that do not require a trusted third party.
Second, we show that ZKCP are not suited for the purchase of digital services rather than goods. Current constructions of ZKCP do not allow a seller to receive payments after proving that a certain service has been rendered, but only for the sale of a specific digital good. We define the notion of Zero-Knowledge Contingent Service Payment (ZKCSP) protocols and construct two new protocols, for either public or private verification. We implemented our ZKCSP protocols for Proofs of Retrievability, where a client pays the server for providing a proof that the client’s data is correctly stored by the server.We also implement a secure ZKCP protocol for “Pay-to-Sudoku” via our ZKCSP protocol, which does not require a trusted third party.
A side product of our implementation effort is a new optimized circuit for SHA256 with less than a quarter than the number of AND gates of the best previously publicly available one. Our new SHA256 circuit may be of independent use for circuit-based MPC and FHE protocols that require SHA256 circuits.
We propose a new, constant-round protocol for multi-party computation of boolean circuits that is secure against an arbitrary number of malicious corruptions. At a high level, we extend and generalize recent work of Wang et al. in the two-party setting. Namely, we design an efficient preprocessing phase that allows the parties to generate authenticated information; we then show how to use this information to distributively construct a single “authenticated” garbled circuit that is evaluated by one party.
Our resulting protocol improves upon the state-of-the-art both asymptotically and concretely. We validate these claims via several experiments demonstrating both the efficiency and scalability of our protocol:
Efficiency: For three-party computation over a LAN, our protocol requires only 95 ms to evaluate AES. This is roughly a 700X improvement over the best prior work, and only 2.5X slower than the best known result in the two-party setting. In general, for n-party computation our protocol improves upon prior work (which was never implemented) by a factor of more than 230n, e.g., an improvement of 3 orders of magnitude for 5-party computation.
Scalability: We successfully executed our protocol with a large number of parties located all over the world, computing (for example) AES with 128 parties across 5 continents in under 3 minutes. Our work represents the largest-scale demonstration of secure computation to date.
We propose a simple and efficient framework for obtaining efficient constant-round protocols for maliciously secure two-party computation. Our framework uses a function-independent preprocessing phase to generate authenticated information for the two parties; this information is then used to construct a single“authenticated” garbled circuit which is transmitted and evaluated.We also show how to efficiently instantiate the preprocessing phase by designing a highly optimized version of the TinyOT protocol by Nielsen et al. Our overall protocol outperforms existing work in both the single-execution and amortized settings, with or without preprocessing: In the single-execution setting, our protocol evaluates an AES circuit with malicious security in37 ms with an online time of just 1 ms. Previous work with the best online time (also 1 ms)requires 124 ms in total; previous work with the best total time requires 62 ms (with 14 ms online time). If we amortize the computation over 1024 executions, each AES computation requires just 6.7 ms with roughly the same online time as above. The best previous work in the amortized setting has roughly the same total time but does not support function-independent preprocessing.Our work shows that the performance penalty for maliciously secure two-party computation (as compared to semi-honest security) is much smaller than previously believed.
High-performance cryptographic code often relies on complex hand-tuned assembly language that is customized for individual hardware platforms. Such code is difficult to understand or analyze. We introduce a new programming language and tool called Vale that supports flexible, automated verification of high-performance assembly code. The Vale tool transforms annotated assembly language into an abstract syntax tree (AST), while also generating proofs about the AST that are verified via an SMT solver. Since the AST is a first-class proof term, it can be further analyzed and manipulated by proven correct code before being extracted into standard assembly. For example, we have developed a novel, proven-correct taint-analysis engine that verifies the code’s freedom from digital side channels. Using these tools, we verify the correctness, safety, and security of implementations of SHA-256 on x86 and ARM, Poly1305 on x64, and hardware-accelerated AES-CBC on x86. Several implementations meet or beat the performance of unverified, state-of-the-art cryptographic libraries.
Cryptographic libraries that implement Transport Layer Security (TLS) have a responsibility to delete cryptographic keys once they’re no longer in use. Any key that’s left in memory can potentially be recovered through the actions of an attacker, up to and including the physical capture and forensic analysis of a device’s memory. This paper describes an analysis of the TLS library stack used in recent Android distributions, combining a C language core (BoringSSL) with multiple layers of Java code (Conscrypt, OkHttp, and Java Secure Sockets). We first conducted a black-box analysis of virtual machine images, allowing us to discover keys that might remain recoverable. After identifying several such keys, we subsequently pinpointed undesirable interactions across these layers, where the higherlevel use of BoringSSL’s reference counting features, from Java code, prevented BoringSSL from cleaning up its keys. This interaction poses a threat to all Android applications built on standard HTTPS libraries, exposing master secrets to memory disclosure attacks. We found all versions we investigated from Android 4 to the latest Android 8 are vulnerable, showing that this problem has been long overlooked. The Android Chrome application is proven to be particularly problematic. We suggest modest changes to the Android codebase to mitigate these issues, and have reported these to Google to help them patch the vulnerability in future Android systems.