research

Anthropic Safety

San Francisco, United States

Founded 2021

100+ employees

About

Anthropic develops advanced large language models, most notably the Claude family of AI assistants, with a core focus on safety and interpretability. Its key innovations include “Constitutional AI,” a technique for aligning AI behavior with a set of principles, and “circuit tracing,” which lets researchers visualize and understand the internal reasoning processes of Claude models. Anthropic’s research and technology are aimed at developers and enterprises seeking reliable, steerable AI. Examples of this work include “Project Vend,” which explores real-world AI task completion, and ongoing efforts to enable AI introspection and cross-lingual reasoning.