Ben Shi

Ben is on the Research team at Sierra. Previously, he was at Princeton University and Meta.

Posts by Ben Shi

Text "τ-knowledge" in white, with a glowing 'e', against a grainy, abstract background of shifting pink, purple, and green tones.

𝜏-knowledge: benchmarking agents on real-world knowledge

𝜏-Knowledge measures how well agents can work through messy, evolving knowledge bases to complete complex, multi-step tasks. While models are improving, they still struggle to reliably use this information in practice, leaving a large gap to real-world performance.

May 13, 2026

The text "τ³-Bench" on a blurry, grainy background of green and brown.

𝜏³-Bench: Advancing agent benchmarking to knowledge and voice

𝜏³-Bench is here. We've expanded agent evaluation to two new frontiers: knowledge retrieval and voice.

March 18, 2026

Gradient image with voice and model icons

Improving voice performance with post-training

Post-training helps our customers' voice agents achieve shorter, clearer, and more human-like conversations.

November 12, 2025

Gradient mixed colors, t-bench and leaderboard icons

𝜏-Bench leaderboard: compare, explore, and understand agent performance

Introducing the 𝜏-Bench leaderboard — a community-driven platform where researchers can submit, verify, and compare results while exploring model behavior through interactive tools.

October 13, 2025
