Google today released the open source version of the differential privacy library used in some of its core products, such as Google Maps. Any organization or developer can now check out the library on GitHub.

Differential privacy is a mathematical framework for publishing aggregate information about a statistical database while limiting how much any single record can influence, or be inferred from, the result. Whether you are a city planner, small business owner, or software developer, chances are you want to gain insights from the data of your citizens, customers, or users. But you don’t want to lose their trust in the process. Differentially private data analysis enables organizations to learn from the majority of their data without allowing any single individual’s data to be distinguished or re-identified.

“If you are a health researcher, you may want to compare the average amount of time patients remain admitted across various hospitals in order to determine if there are differences in care,” Google product manager Miguel Guevara explains. “Differential privacy is a high-assurance, analytic means of ensuring that use cases like this are addressed in a privacy-preserving manner.”
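To make the idea concrete, here is a minimal sketch of one common differential privacy mechanism, written from scratch rather than with Google’s library: each patient’s length of stay is clamped to a plausible range, and Laplace noise calibrated to the privacy parameter epsilon is added to the sum and the count before the average is computed. The function name, parameters, and sample values are illustrative assumptions, not part of the library.

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=np.random.default_rng()):
    """Differentially private mean via clamping plus Laplace noise.

    Illustrative sketch only; not the API of Google's library.
    """
    clamped = np.clip(values, lower, upper)  # bound each record's contribution
    # Adding or removing one record changes the clamped sum by at most
    # max(|lower|, |upper|) and the count by at most 1; split epsilon between them.
    sum_scale = max(abs(lower), abs(upper)) / (epsilon / 2)
    count_scale = 1.0 / (epsilon / 2)
    noisy_sum = clamped.sum() + rng.laplace(0.0, sum_scale)
    noisy_count = len(values) + rng.laplace(0.0, count_scale)
    return noisy_sum / max(noisy_count, 1.0)

# Hypothetical lengths of stay (in days) at one hospital.
stays = [3.0, 5.5, 2.0, 14.0, 7.5, 1.0]
print(dp_mean(stays, lower=0.0, upper=30.0, epsilon=1.0))
```

The noise means the reported average is slightly off from the true value, but it masks the presence or absence of any single patient in the data.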

Differential privacy library features

Google promises its library is easy for developers to deploy. It can help you perform functions that are difficult to execute from scratch, “like automatically calculating bounds on user contributions,” Guevara says. Key features include the following, with a rough sketch of two of them after the list:

  • Statistical functions: Most common data science operations (counts, sums, averages, medians, and percentiles) are supported.
  • Rigorous testing: Besides an extensive test suite, an extensible “Stochastic Differential Privacy Model Checker library” helps prevent mistakes.
  • Ready to use: A PostgreSQL extension, along with common recipes, is included.
  • Modular: The library can be extended to include other functionalities, such as additional mechanisms, aggregation functions, or privacy budget management.
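
Two of the ideas above, bounding per-user contributions and managing a privacy budget, can be illustrated with a short hand-rolled sketch; the class and function names below are hypothetical and are not taken from Google’s library.

```python
import numpy as np
from collections import defaultdict

class PrivacyBudget:
    """Toy budget tracker: rejects queries once the total epsilon is spent."""
    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def spend(self, epsilon):
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon

def bound_contributions(records, max_per_user):
    """Keep at most max_per_user records per user so no one dominates the result."""
    kept = defaultdict(list)
    for user_id, value in records:
        if len(kept[user_id]) < max_per_user:
            kept[user_id].append(value)
    return [v for values in kept.values() for v in values]

def dp_count(records, max_per_user, epsilon, budget, rng=np.random.default_rng()):
    budget.spend(epsilon)  # fail loudly if the analyst has used up the budget
    bounded = bound_contributions(records, max_per_user)
    # Removing one user now changes the count by at most max_per_user.
    return len(bounded) + rng.laplace(0.0, max_per_user / epsilon)

budget = PrivacyBudget(total_epsilon=1.0)
visits = [("alice", 1), ("alice", 1), ("alice", 1), ("bob", 1), ("carol", 1)]
print(dp_count(visits, max_per_user=2, epsilon=0.5, budget=budget))
```

Capping each user’s records keeps the noise needed for a given privacy guarantee manageable, and the budget tracker prevents an analyst from eroding the guarantee by repeating queries.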

This isn’t Google’s first differential privacy rodeo, but the company has been particularly busy in the space this year. In March, Google released TensorFlow Privacy and TensorFlow Federated. In June, the company open-sourced Private Join and Compute, a secure multi-party computation protocol that lets organizations compute aggregate insights over their combined data sets without revealing the underlying records to each other. To learn more about Google’s approach, check out the team’s technical paper on arXiv.