Pedram Razavi
Pedram Razavi is a software engineer on the Knowledge team at Sierra. Previously, he was an engineer at Cocoon and Quip. He studied Computer Science and Mathematics at MIT and earned an M.S. in Symbolic Systems from Stanford.

𝜏-knowledge: benchmarking agents on real-world knowledge
𝜏-Knowledge measures how well agents can work through messy, evolving knowledge bases to complete complex, multi-step tasks. While models are improving, they still struggle to reliably use this information in practice, leaving a large gap to real-world performance.
May 13, 2026

Golden articles: Evaluating and improving search
Search evaluation shouldn’t be static; it should reflect what actually helps resolve real customer issues. By measuring performance daily against production conversations and feeding those signals back into the system, we've built a continuously improving system for resolving customers’ needs.
April 14, 2026

Meet Linnaeus and Darwin: Search models that drive higher resolution rates
Our purpose-built retrieval and reranking models outperform off-the-shelf ones, improving resolution rates by up to 16 percentage points.
April 3, 2026

𝜏³-Bench: Advancing agent benchmarking to knowledge and voice
𝜏³-Bench is here. We've expanded agent evaluation to two new frontiers: knowledge retrieval and voice.
March 18, 2026