In an academic paper published this week on the preprint server arXiv.org, a team of researchers from Princeton, Microsoft, the nonprofit Algorand Foundation, and Technion proposes Falcon, an end-to-end framework for secure computation of AI models on distributed systems. They claim that it’s the first secure C++ framework to support high-capacity AI models and batch normalization, a technique for improving both the speed and stability of models. Moreover, they say that Falcon automatically aborts when it detects the presence of malicious attackers, and that it can outperform existing solutions by up to a factor of 200.

The claims are lofty, but if there’s truth to them, Falcon could be a step toward a pipeline tailored to domains where privacy and security are table stakes, like health care. Despite the emergence of techniques like federated learning and homomorphic encryption, running machine learning models in a privacy-preserving fashion without computational trade-offs remains an unsolved challenge.

In essence, Falcon assumes that there are two types of users in a distributed AI usage scenario: data holders, who own the training data sets, and query users, who query the system post-learning. The data holders securely share their data among a set of servers, which use the shared data to securely train the machine learning model of interest. Query users can then submit queries to the system and receive answers based on the newly trained model. In this way, the data holders’ inputs stay private from the computing servers, and the queries are kept secret as well.
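To make the idea of sharing data securely among servers concrete, here is a minimal sketch of additive secret sharing, one common building block in this area. It is an illustration only, not Falcon's actual protocol: the ring size `MODULUS`, the three-server setup, and the function names are all assumptions made for the example.

```python
import secrets

# Hypothetical ring size, chosen only for illustration.
MODULUS = 2**32

def share(value, n_servers=3):
    """Split an integer into n shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def reconstruct(shares):
    """Recombine all shares to recover the original value."""
    return sum(shares) % MODULUS

def add_shared(a_shares, b_shares):
    """Each server adds its own two shares locally, with no communication."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]

# A data holder splits two private values across three servers.
x_shares = share(42)
y_shares = share(100)

# The servers compute on shares; only the recombined result is revealed.
total = reconstruct(add_shared(x_shares, y_shares))
print(total)  # 142
```

Any single share is a uniformly random number, so a server looking at its own share alone learns nothing about the underlying value. Additions come for free in this scheme, while multiplications and comparisons need extra interaction between servers, which is where much of the cost of secure computation tends to arise.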

Falcon leverages new protocols for the computation of nonlinear functions, like rectified linear units (ReLU), a type of activation function. AI models contain neurons (mathematical functions) arranged in layers that transmit signals from input data, and training adjusts the strength (weight) of each connection between them. That’s how they extract features and learn to make predictions; a node’s activation function defines the node’s output given inputs, taking into account the weights and sources of error.
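For readers unfamiliar with activation functions, the following sketch shows a single neuron with a ReLU activation in plain, unsecured Python. The function names and sample numbers are illustrative assumptions, and the bias term stands in for the offset a neuron typically adds to its weighted sum.

```python
def relu(x):
    """Rectified linear unit: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through ReLU."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(weighted_sum)

# Weighted sum is 0.5 + 2.0 + 0.25 = 2.75, which ReLU leaves unchanged.
print(neuron_output([1.0, 2.0], [0.5, 1.0], 0.25))   # 2.75

# Weighted sum is 0.5 - 2.0 + 0.25 = -1.25, which ReLU clamps to zero.
print(neuron_output([1.0, 2.0], [0.5, -1.0], 0.25))  # 0.0
```

The weighted sum uses only additions and multiplications, but ReLU hinges on a comparison with zero. Comparisons are generally much harder to carry out on hidden data than arithmetic, which helps explain why dedicated protocols for nonlinear functions matter in frameworks of this kind.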

Falcon also uses semi-honest protocols, where parties have to follow prespecified rules exactly and can’t change their inputs or outputs, and malicious protocols, where corrupted parties can deviate from the rules by switching inputs and outputs or ignoring those rules altogether. In addition, it incorporates existing techniques to operate on smaller data types, reducing communication complexity by up to a factor of 2.

To evaluate Falcon’s performance, the team ran it atop six different models, ranging from 3-layer networks with 118,000 parameters (configuration variables internal to the models that are required when making predictions) to 16-layer networks with 138 million parameters, all of which were trained on the corpora MNIST, CIFAR-10, and Tiny ImageNet as appropriate based on the networks’ sizes. They tested a WAN setup with servers in different geographic regions and a LAN setup, and in both cases, they relied on Amazon Elastic Compute Cloud (EC2) instances for computation (with 16-core Intel processors and 64GB of RAM).
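To give a feel for what a parameter count like 118,000 means, the sketch below tallies the weights and biases of a small fully connected network. The layer sizes are an assumption picked for illustration (784 inputs for a 28-by-28 MNIST image, two hidden layers of 128 neurons, and 10 outputs); the article does not state the architectures of the networks that were tested.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    Each layer contributes n_in * n_out weights plus n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical 3-layer network: 784 -> 128 -> 128 -> 10.
small_net = [784, 128, 128, 10]
print(count_parameters(small_net))  # 118282
```

Every one of those values has to be stored, and in a secure setting shared and updated across servers, so the jump from about a hundred thousand parameters to 138 million is a large step up in cost.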

According to the coauthors, Falcon was substantially faster at inference, running 32 times, 16 times, and 8 times faster than the baselines Gazelle, XONN, and SecureNN, respectively. In private training, it was 4.4 times faster than ABY and 6 times faster than SecureNN.

The researchers furthermore assert that Falcon, which contains about 12,300 lines of code, doesn’t incur much of a performance penalty compared with plaintext, unsecured model execution. In one test using a single epoch (i.e., a complete pass through the training data) for the image classification model AlexNet, Falcon took 2,300 seconds on a CPU versus the plaintext’s 570 seconds, a slowdown of roughly a factor of 4 on those figures.

The team attributes much of Falcon’s performance gains to its support for batch normalization, which they say speeds up model training by allowing higher learning rates; prevents extreme values of activations; and reduces overfitting (a phenomenon that occurs when a model learns a data set too well) by providing a regularization effect that improves training stability.
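As a rough illustration of what batch normalization does, the following sketch normalizes one feature across a mini-batch in plain, unsecured Python. It covers only the training-time forward pass; the function name, the default `gamma`, `beta`, and `eps` values, and the sample activations are assumptions for the example, not details from the paper.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature's activations across a mini-batch.

    gamma and beta are the learnable scale and shift; eps guards
    against division by zero when the batch variance is tiny.
    """
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Activations with one extreme value get pulled onto a common scale.
activations = [0.5, 120.0, -3.0, 40.0]
normalized = batch_norm(activations)
print(normalized)
```

After normalization the values have a mean of about zero and a variance of about one, which is what keeps extreme activations in check. The division and square root are nonlinear operations, which suggests why supporting this layer in a secure framework takes more than simple arithmetic on shared data.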

“[T]he sensitive nature of [certain] data demands deep learning frameworks that allow training on data aggregated from multiple entities while ensuring strong privacy and confidentiality guarantees,” conclude the researchers. “A synergistic combination of secure computing primitives with deep learning algorithms would enable sensitive applications to benefit from the high prediction accuracies of neural networks.”