Anthropic releases AI constitution to promote ethical behavior and development

Anthropic, a leading artificial intelligence company founded by former OpenAI engineers, has taken a novel approach to addressing the ethical and social challenges posed by increasingly powerful AI systems: giving them a constitution.

On Tuesday, the company publicly released its official constitution for Claude, its latest conversational AI model that can generate text, images and code. The constitution outlines a set of values and principles that Claude must follow when interacting with users, such as being helpful, harmless and honest. It also specifies how Claude should handle sensitive topics, respect user privacy and avoid illegal behavior.

“We are sharing Claude’s current constitution in the spirit of transparency,” said Jared Kaplan, Anthropic cofounder, in an interview with VentureBeat. “We hope this research helps the AI community build more beneficial models and make their values more clear. We are also sharing this as a starting point — we expect to continuously revise Claude’s constitution, and part of our hope in sharing this post is that it will spark more research and discussion around constitution design.”

The constitution draws from sources like the UN Declaration of Human Rights, AI ethics research and platform content policies. It is the result of months of collaboration between Anthropic’s researchers, policy experts and operational leaders, who have been testing and refining Claude’s behavior and performance.

By making its constitution public, Anthropic hopes to foster more trust and transparency in the field of AI, which has been plagued by controversies over bias, misinformation and manipulation. The company also hopes to inspire other AI developers and stakeholders to adopt similar practices and standards.

The announcement highlights growing concern over how to ensure AI systems behave ethically as they become more advanced and autonomous. Just last week, the former leader of Google’s AI research division, Geoffrey Hinton, resigned from his position at the tech giant, citing growing concerns about the ethical implications of the technology he helped create. Large language models (LLMs), which generate text from massive datasets, have been shown to reflect and even amplify the biases in their training data.

Building AI systems to combat bias and harm

Anthropic is one of the few startups that specialize in developing general AI systems and language models, which aim to perform a wide range of tasks across different domains. The company, which was launched in 2021 with a $124 million series A funding round, has a mission to ensure that transformative AI helps people and society flourish.

Claude is Anthropic’s flagship product, which it plans to deploy for various applications such as education, entertainment and social good. Claude can generate content such as poems, stories, code, essays, songs, celebrity parodies and more. It can also help users with rewriting, improving or optimizing their content. Anthropic claims that Claude is one of the most reliable and steerable AI systems in the market, thanks to its constitution and its ability to learn from human feedback.

“We chose principles like those in the UN Declaration of Human Rights that enjoy broad agreement and were created in a participatory way,” Kaplan told VentureBeat. “To supplement these, we included principles inspired by best practices in Terms of Service for digital platforms to help handle more contemporary issues. We also included principles that we discovered worked well via a process of trial and error in our research. The principles were collected and chosen by researchers at Anthropic. We are exploring ways to more democratically produce a constitution for Claude, and also exploring offering customizable constitutions for specific use cases.”

The unveiling of Anthropic's constitution highlights the AI community's growing concern over system values and ethics — and demand for new techniques to address them. With increasingly advanced AI deployed by companies around the globe, researchers argue models must be grounded and constrained by human ethics and morals, not just optimized for narrow tasks like generating catchy text. Constitutional AI offers one promising path toward achieving that ideal.

Constitution to evolve with AI progress

One key aspect of Anthropic's constitution is its adaptability. Anthropic acknowledges that the current version is neither finalized nor likely the best it can be, and it welcomes research and feedback to refine and improve upon the constitution. This openness to change demonstrates the company's commitment to ensuring that AI systems remain up-to-date and relevant as new ethical concerns and societal norms emerge.

"We will have more to share on constitution customization later," said Kaplan. "But to be clear: all uses of our model need to fall within our Acceptable Use Policy. This provides guardrails on any customization. Our AUP screens off harmful uses of our model, and will continue to do this."

While AI constitutions are not a panacea, they do represent a proactive approach to addressing the complex ethical questions that arise as AI systems continue to advance. By making the value systems of AI models more explicit and easily modifiable, the AI community can work together to build more beneficial models that truly serve the needs of society.

"We are excited about more people weighing in on constitution design," Kaplan said. "Anthropic invented the method for Constitutional AI, but we don’t believe that it is the role of a private company to dictate what values should ultimately guide AI. We did our best to find principles that were in line with our goal to create a Helpful, Harmless, and Honest AI system, but ultimately we want more voices to weigh in on what values should be in our systems. Our constitution is living — we will continue to update and iterate on it. We want this blog post to spark research and discussion, and we will continue exploring ways to collect more input on our constitutions."

Building AI systems to combat bias and harm

Constitution to evolve with AI progress

More