Facebook AI Research today introduced ReAgent, a reinforcement learning toolkit for building decision-making AI that can learn from feedback. ReAgent can assign scores to user actions and treat user feedback, such as clicks on recommended content, as training data.
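As a rough illustration of that feedback loop, the sketch below scores each recommendation by its observed click rate and uses clicks as the training signal. It is a minimal epsilon-greedy example written for this article, not ReAgent's API; the `ClickBandit` class and its method names are hypothetical.

```python
import random


class ClickBandit:
    """Hypothetical epsilon-greedy recommender; illustrative only, not ReAgent code."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon          # probability of exploring a random action
        self.rng = random.Random(seed)  # seeded for reproducible runs
        self.shows = {a: 0 for a in self.actions}
        self.clicks = {a: 0 for a in self.actions}

    def score(self, action):
        # Estimated click-through rate; 0.0 for actions never shown.
        shown = self.shows[action]
        return self.clicks[action] / shown if shown else 0.0

    def choose(self):
        # Explore with probability epsilon, otherwise pick the top-scoring action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=self.score)

    def record(self, action, clicked):
        # Each impression and click becomes training data for later choices.
        self.shows[action] += 1
        self.clicks[action] += int(clicked)
```

In use, an application would call `choose()` to pick what to show, then `record()` with whether the user clicked, so the scores drift toward the content people actually engage with.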

ReAgent is a small C++ library, available on GitHub, that is designed to be embedded in any application. The toolkit comes with a set of decision-making AI models to get started, an offline module for assessing model performance, and a platform for deploying AI into production using the TorchScript library in PyTorch.
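To give a sense of what offline assessment can involve, the sketch below estimates how a new policy would have performed using only logged interactions. It uses inverse propensity scoring, one common estimator for this task, chosen here as an assumption rather than a description of ReAgent's internals; the function name and log format are hypothetical.

```python
def ips_estimate(logged, target_prob):
    """Estimate a target policy's average reward from logged data.

    logged: iterable of (context, action, reward, logging_prob) tuples, where
        logging_prob is the probability the logging policy chose that action.
    target_prob: function (context, action) -> probability that the policy
        being evaluated would choose that action.
    """
    total = 0.0
    count = 0
    for context, action, reward, logging_prob in logged:
        if logging_prob <= 0:
            raise ValueError("logging probability must be positive")
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        weight = target_prob(context, action) / logging_prob
        total += weight * reward
        count += 1
    if count == 0:
        raise ValueError("no logged interactions to evaluate")
    return total / count
```

The appeal of this kind of estimator is that a candidate policy can be compared against the current one before it ever serves live traffic, though the estimate becomes noisy when the logging policy rarely took the actions the new policy prefers.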

Horizon, the reinforcement learning platform for deploying large-scale models in production that Facebook open-sourced in November 2018, is now part of ReAgent.

ReAgent currently personalizes billions of decisions a day at Facebook, such as user notifications for Facebook and Instagram, head of applied research Srinivas Narayanan said today at Facebook’s @Scale conference. The company also uses it in robotics research on teaching machines to walk.

“It’s the most comprehensive and modular open source platform for creating AI-based reasoning systems, and it’s the first to include policy evaluation that incorporates offline feedback to improve models,” Facebook said in a blog post. “By making it easier to build models that make decisions in real time and at scale, ReAgent democratizes both the creation and evaluation of policies in research projects as well as production applications.”

To continue to improve ReAgent, Facebook released documentation on how to deploy to cloud services like Microsoft’s Azure. Microsoft’s Azure Cognitive Services launched its own reinforcement learning service earlier this year.

The news comes a week after Facebook’s PyTorch Developer Conference, where the company introduced Captum, a tool for machine learning explainability.

In a conversation at VentureBeat’s Transform conference this summer, OpenAI CTO Greg Brockman and chief scientist Ilya Sutskever argued that reasoning and explainability should be core to future AI models.