Current formal approaches have been successfully used to find design flaws in many security protocols. However, it is still challenging to automatically analyze protocols due to their large or infinite state spaces. In this paper, we propose SmartVerif, a novel and general framework that pushes the limit of automation capability of state-of-the-art verification approaches. The primary technical contribution is the dynamic strategy inside SmartVerif, which can be used to smartly search proof paths. Different from the non-trivial and error-prone design of existing static strategies, the design of our dynamic strategy is simple and flexible: it can automatically optimize itself according to the security protocols without any human intervention. With the optimized strategy, SmartVerif can localize and prove supporting lemmata, which leads to higher probability of success in verification. The insight of designing the strategy is that the node representing a supporting lemma is on an incorrect proof path with lower probability, when a random strategy is given. Hence, we implement the strategy around the insight by introducing a reinforcement learning algorithm. We also propose several methods to deal with other technical problems in implementing SmartVerif. Experimental results show that SmartVerif can automatically verify all security protocols studied in this paper. The case studies also validate the efficiency of our dynamic strategy.
Side-channel attacks exploiting (EC)DSA nonce leakage easily lead to full key recovery. Although (EC)DSA implementations have already been hardened against side-channel leakage using the constant-time paradigm, the long-standing cat-and-mouse-game of attacks and patches continues. In particular, current code review is prone to miss less obvious side channels hidden deeply in the call stack. To solve this problem, a systematic study of nonce leakage is necessary. We present a systematic analysis of nonce leakage in cryptographic implementations. In particular, we expand DATA, an open-source side-channel analysis framework, to detect nonce leakage. Our analysis identified multiple unknown nonce leakage vulnerabilities across all essential computation steps involving nonces. Among others, we uncover inherent problems in Bignumber implementations that break claimed constant-time guarantees of (EC)DSA implementations if secrets are close to a word boundary. We found that lazy resizing of Bignumbers in OpenSSL and LibreSSL yields a highly accurate and easily exploitable side channel, which has been acknowledged with two CVEs. Surprisingly, we also found a tiny but expressive leakage in the constant-time scalar multiplication of OpenSSL and BoringSSL. Moreover, in the process of reporting and patching, we identified newly introduced leakage with the support of our tool, thus preventing another attack-patch cycle. We open-source our tool, together with an intuitive graphical user interface we developed.
Trusted Platform Module (TPM) serves as a hardware-based root of trust that protects cryptographic keys from privileged system and physical adversaries. In this work, we per-form a black-box timing analysis of TPM 2.0 devices deployed on commodity computers. Our analysis reveals thatsome of these devices feature secret-dependent execution times during signature generation based on elliptic curves. In particular, we discovered timing leakage on an Intel firmware-based TPM as well as a hardware TPM. We show how this information allows an attacker to apply lattice techniques torecover 256-bit private keys for ECDSA and EC Schnorr signatures. On Intel fTPM, our key recovery succeeds after about1,300 observations and in less than two minutes. Similarly, weextract the private ECDSA key from a hardware TPM manufactured by STMicroelectronics, which is certified at Common Criteria (CC) EAL 4+, after fewer than 40,000 observations.We further highlight the impact of these vulnerabilities by demonstrating a remote attack against a StrongSwan IPsecVPN that uses a TPM to generate the digital signatures for authentication. In this attack, the remote client recovers the server’s private authentication key by timing only 45,000authentication handshakes via a network connection.The vulnerabilities we have uncovered emphasize the difficulty of correctly implementing known constant-time techniques, and show the importance of evolutionary testing and transparent evaluation of cryptographic implementations.Even certified devices that claim resistance against attacks require additional scrutiny by the community and industry, as we learn more about these attacks.
We revisit the problem of performing secure computation of graph-parallel algorithms, focusing on the applications of securely outsourcing matrix factorization, and histograms. Leveraging recent results in low-communication secure multi-party computation, and a security relaxation that allows the computation servers to learn some differentially private leakage about user inputs, we construct a new protocol that reduces overall runtime by 320X, reduces the number of AES calls by 750X , and reduces the total communication by 200X . Our system can securely compute histograms over 300 million items in about 4 minutes, and it can perform sparse matrix factorization, which is commonly used in recommendation systems, on 20 million records in about 6 minutes. Furthermore, in contrast to prior work, our system is secure against a malicious adversary that corrupts one of the computing servers.
In Proof-of-Stake (PoS) and permissioned blockchains, a committee of verifiers agrees and sign every new block of transactions. These blocks are validated, propagated, and stored by all users in the network. However, posterior corruptions pose a common threat to these designs, because the adversary can corrupt committee verifiers after they certified a block and use their signing keys to certify a different block. Designing efficient and secure digital signatures for use in PoS blockchains can substantially reduce bandwidth, storage and computing requirements from nodes, thereby enabling more efficient applications.
We present Pixel, a pairing-based forward-secure multi-signature scheme optimized for use in blockchains, that achieves substantial savings in bandwidth, storage requirements, and verification effort. Pixel signatures consist of two group elements, regardless of the number of signers, can be verified using three pairings and one exponentiation, and support non-interactive aggregation of individual signatures into a multi-signature. Pixel signatures are also forward-secure and let signers evolve their keys over time, such that new keys cannot be used to sign on old blocks, protecting against posterior corruptions attacks on blockchains. We show how to integrate Pixel into any PoS blockchain. Next, we evaluate Pixel in a real-world PoS blockchain implementation, showing that it yields notable savings in storage, bandwidth, and block verification time. In particular, Pixel signatures reduce the size of blocks with 1500 transactions by 35% and reduce block verification time by 38%.
Recent results have shown that some post-quantum cryptographic systems have encryption and decryption performance comparable to fast elliptic-curve cryptography (ECC) or even better. However, this performance metric is considering only CPU time and ignoring bandwidth and storage. High-confidence post-quantum encryption systems have much larger keys than ECC. For example, the code-based cryptosystem recommended by the PQCRYPTO project uses public keys of 1MB.
Fast key erasure (to provide “forward secrecy”) requires new public keys to be constantly transmitted. Either the server needs to constantly generate, store, and transmit large keys, or it needs to receive, store, and use large keys from the clients. This is not necessarily a problem for overall bandwidth, but it is a problem for storage and computation time on tiny network servers. All straightforward approaches allow easy denial-of-service attacks.
This paper describes a protocol, suitable for today’s networks and tiny servers, in which clients transmit their code-based one-time public keys to servers. Servers never store full client public keys but work on parts provided by the clients, without having to maintain any per-client state. Intermediate results are stored on the client side in the form of encrypted cookies and are eventually combined by the server to obtain the ciphertext. Requirements on the server side are very small: storage of one long-term private key, which is much smaller than a public key, and a few small symmetric cookie keys, which are updated regularly and erased after use. The protocol is highly parallel, requiring only a few round trips, and involves total bandwidth not much larger than a single public key. The total number of packets sent by each side is 971, each fitting into one IPv6 packet of less than 1280 bytes.
The protocol makes use of the structure of encryption in code-based cryptography and benefits from small ciphertexts in code-based cryptography.
At the core of Apple’s iMessage is a signcryption scheme that involves symmetric encryption of a message under a key that is derived from the message itself. This motivates us to formalize a primitive we call Encryption under Message-Derived Keys (EMDK). We prove security of the EMDK scheme underlying iMessage. We use this to prove security of the signcryption scheme itself, with respect to definitions of signcryption we give that enhance prior ones to cover issues peculiar to messaging protocols. Our provable-security results are quantitative, and we discuss the practical implications for iMessage.
We present a 2-party private set intersection (PSI) protocol which provides security against malicious participants, yet is almost as fast as the fastest known semi-honest PSI protocol of Kolesnikov et al. (CCS 2016).
Our protocol is based on a new approach for two-party PSI, which can be instantiated to provide security against either malicious or semi-honest adversaries. The protocol is unique in that the only difference between the semi-honest and malicious versions is an instantiation with different parameters for a linear error-correction code. It is also the first PSI protocol which is concretely efficient while having linear communication and security against malicious adversaries, while running in the OT-hybrid model (assuming a non-programmable random oracle).
State of the art semi-honest PSI protocols take advantage of cuckoo hashing, but it has proven a challenge to use cuckoo hashing for malicious security. Our protocol is the first to use cuckoo hashing for malicious-secure PSI. We do so via a new data structure, called a probe-and-XOR of strings (PaXoS), which may be of independent interest. This abstraction captures important properties of previous data structures, most notably garbled Bloom filters. While an encoding by a garbled Bloom filter is larger by a factor of $O(\lambda)$ than the original data, we describe a significantly improved PaXoS based on cuckoo hashing that achieves constant rate while being no worse in other relevant efficiency measures.
Zero-knowledge (ZK) proofs (ZKP) have received wide attention, focusing on non-interactivity, short proof size, and fast verification time. We focus on the fastest total proof time, in particular for large Boolean circuits. Under this metric, Garbled Circuit (GC)-based ZKP (Jawurek et al., [JKO], CCS 2013) remained the state-of-the-art technique due to the low-constant linear scaling of computing the garbling. We improve GC-ZKP for proof statements with conditional clauses. Our communication is proportional to the longest branch rather than to the entire proof statement. This is most useful when the number m of branches is large, resulting in up to factor $m\times$ improvement over JKO. In our proof-of-concept illustrative application, prover P demonstrates knowledge of a bug in a codebase consisting of any number of snippets of actual C code. Our computation cost is linear in the size of the codebase and communication is constant in the number of snippets. That is, we require only enough communication for a single largest snippet! Our conceptual contribution is stacked garbling for ZK, a privacy-free circuit garbling scheme that can be used with the JKO GC-ZKP protocol to construct more efficient ZKP. Given a Boolean circuit C and computational security parameter $\kappa$, our garbling is $L \cdot \kappa$ bits long, where $L$ is the length of the longest execution path in C. All prior concretely efficient garbling schemes produce garblings of size $|C| \cdot \kappa$. The computational cost of our scheme is not increased over prior state-of-the-art. We implement our GC-ZKP and demonstrate significantly improved ($m\times$ over JKO) ZK performance for functions with branching factor $m$. Compared with recent ZKP (STARK, Libra, KKW, Ligero, Aurora, Bulletproofs), our scheme offers much better proof times for larger circuits ($35-1000\times$ or more, depending on circuit size and compared scheme). For our illustrative application, we consider four C code snippets, each of about 30-50 LOC; one snippet allows an invalid memory dereference. The entire proof takes 0.15 seconds and communication is 1.5 MB.
Recent foundational work on leakage-abuse attacks on encrypted databases has broadened our understanding of what an adversary can accomplish with a standard leakage profile. Nevertheless, all known value reconstruction attacks succeed under strong assumptions that may not hold in the real world. The most prevalent assumption is that queries are issued uniformly at random by the client. We present the first value reconstruction attacks that succeed without any knowledge about the query or data distribution. Our approach uses the search-pattern leakage, which exists in all known structured encryption schemes but has not been fully exploited so far. At the core of our method lies a support size estimator, a technique that utilizes the repetition of search tokens with the same response to estimate distances between encrypted values without any assumptions about the underlying distribution. We develop distribution-agnostic reconstruction attacks for both range queries and k-nearest neighbor (k-NN) queries based on information extracted from the search-pattern leakage. Our new range attack follows a different algorithmic approach than state-of-the-art attacks, which are fine-tuned to succeed under the uniformly distributed queries. Instead, we reconstruct plaintext values under a variety of skewed query distributions and even outperform the accuracy of previous approaches under the uniform query distribution. Our new k-NN attack succeeds with far fewer samples than previous attacks and scales to much larger values of k. We demonstrate the effectiveness of our attacks by experimentally testing them on a wide range of query distributions and database densities, both unknown to the adversary.
This paper focuses on protecting the cellular paging protocol — which balances between the quality-of-service and battery consumption of a device— against security and privacy attacks. Attacks against this protocol can have severe repercussions, for instance,allowing attacker to infer a victim’s location, leak a victim’s IMSI, and inject fabricated emergency alerts.To secure the protocol, we first identify the underlying design weaknesses enabling such attacks and then pro-pose efficient and backward-compatible approaches to address these weaknesses. We also demonstrate the deployment feasibility of our enhanced paging protocol by implementing it on an open-source cellular protocol library and commodity hardware. Our evaluation demonstrates that the enhanced protocol can thwart attacks without incurring substantial overhead.
Apple Continuity protocols are the underlying network component of Apple Continuity services which allow seamless nearby applications such as activity and file transfer, device pairing and sharing a network connection. Those protocols rely on Bluetooth Low Energy (BLE) to exchange information between devices: Apple Continuity messages are embedded in the pay-load of BLE advertisement packets that are periodically broadcasted by devices. Recently, Martin et al. identified  a number of privacy issues associated with Apple Continuity protocols; we show that this was just the tip of the iceberg and that Apple Continuity protocols leak a wide range of personal information. In this work, we present a thorough reverse engineering of Apple Continuity protocols that we use to uncover a collection of privacy leaks. We introduce new artifacts, including identifiers, counters and battery levels, that can be used for passive tracking, and describe a novel active tracking attack based on Handoff messages. Beyond tracking issues, we shed light on severe privacy flaws. First, in addition to the trivial exposure of device characteristics and status, we found that HomeKit accessories betray human activities in a smarthome. Then, we demonstrate that AirDrop and Nearby Action protocols can be leveraged by passive observers to recover e-mail addresses and phone numbers of users. Finally, we exploit passive observations on the advertising traffic to infer Siri voice commands of a user.
Encrypting data before sending it to the cloud protects it against hackers and malicious insiders, but requires the cloud to compute on encrypted data. Trusted (hardware) modules, e.g., secure enclaves like Intel’s SGX, can very efficiently run entire programs in encrypted memory. However, it already has been demonstrated that software vulnerabilities give an attacker ample opportunity to insert arbitrary code into the program. This code can then modify the data flow of the program and leak any secret in the program to an observer in the cloud via SGX side-channels. Since any larger program is rife with software vulnerabilities, it is not a good idea to outsource entire programs to an SGX enclave. A secure alternative with a small trusted code base would be fully homomorphic encryption (FHE) – the holy grail of encrypted computation. However, due to its high computational complexity it is unlikely to be adopted in the near future. As a result researchers have made several proposals for transforming programs to perform encrypted computations on less powerful encryption schemes. Yet, current approaches fail on programs that make control-flow decisions based on encrypted data. In this paper, we introduce the concept of data flow authentication (DFAuth). DFAuth prevents an adversary from arbitrarily deviating from the data flow of a program. Hence, an attacker cannot perform an attack as outlined before on SGX. This enables that all programs, even those including operations on control-flow decision variables, can be computed on encrypted data. We implemented DFAuth using a novel authenticated homomorphic encryption scheme, a Java bytecode-to-bytecode compiler producing fully executable programs, and SGX enclaves. A transformed neural network that performs machine learning on sensitive medical data can be evaluated on encrypted inputs and encrypted weights in 0.86 seconds.
Modern multi-core processors share cache resources for maximum cache utilization and performance gains. However, this leaves the cache vulnerable to side-channel attacks, where inherent timing differences in shared cache behavior are exploited to infer information on the victim’s execution patterns, ultimately leaking private information such as a secret key. The root cause for these attacks is mutually distrusting processes sharing the cache entries and accessing them in a deterministic and consistent manner. Various defenses against cache side-channel attacks have been proposed. However, they suffer from serious shortcomings: they either degrade performance significantly, impose impractical restrictions, or can only defeat certain classes of these attacks. More importantly, they assume that side-channel-resilient caches are required for the entire execution workload and do not allow the possibility to selectively enable the mitigation only for the security-critical portion of the workload.
We present a generic mechanism for a flexible and soft partitioning of set-associative caches and propose a hybrid cache architecture, called HybCache. HybCache can be configured to selectively apply side-channel-resilient cache behavior only for isolated execution domains, while providing the non-isolated execution with conventional cache behavior, capacity and performance. An isolation domain can include one or more processes, specific portions of code, or a Trusted Execution Environment (e.g., SGX or TrustZone). We show that, with minimal hardware modifications and kernel support, HybCache can provide side-channel-resilient cache only for isolated execution with a performance overhead of 3.5–5%, while incurring no performance overhead for the remaining execution workload. We provide a simulator-based and hardware implementation of HybCache to evaluate the performance and area overheads, and show how HybCache mitigates typical access-based and contention-based cache attacks
Many companies provide neural network prediction services to users for a wide range of applications. However, current prediction systems compromise one party’s privacy: either the user has to send sensitive inputs to the service provider for classification, or the service provider must store its proprietary neural networks on the user’s device. The former harms the personal privacy of the user, while the latter reveals the service provider’s proprietary model.
We design, implement, and evaluate Delphi, a secure prediction system that allows two parties to run a neural network inference without revealing either party’s data. Delphi approaches the problem by simultaneously co-designing cryptography and machine learning. We first design a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work. Second, we develop a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of our hybrid protocol. Together, these techniques allow us to achieve a 22x improvement in prediction latency compared to the state-of-the-art prior work.
Hardware enclaves are designed to execute small pieces of sensitive code or to operate on sensitive data, in isolation from larger, less trusted systems. Partitioning a large, legacy application requires significant effort. Partitioning an application written in a managed language, such as Java, is more challenging because of mutable language characteristics, extensive code reachability in class libraries, and the inevitability of using a heavyweight runtime.
Civet is a framework for partitioning Java applications into enclaves. Civet reduces the number of lines of code in the enclave and uses language-level defenses, including deep type checks and dynamic taint-tracking, to harden the enclave interface. Civet also contributes a partitioned Java runtime design, including a garbage collection design optimized for the peculiarities of enclaves. Civet is efficient for data-intensive workloads; partitioning a Hadoop mapper reduces the enclave overhead from 10× to 16–22% without taint-tracking or 70–80% with taint-tracking.