𝜏-Bench: benchmarking AI agents for the real world

Sierra’s AI research team is on a mission to advance the frontier of conversational AI agents. In this research paper, we present a new benchmark for evaluating AI agents' performance and reliability in real-world settings, with dynamic user and tool interaction.
