Kimberly Patron

Kimberly is an Infrastructure Engineer at Sierra. Prior to Sierra, she was a Staff Infrastructure Engineer at Slack, led core infrastructure at Quip, and worked on production systems at Meta and its planet-scale data platform.

Posts by Kimberly Patron

A more reliable inference layer for foundation models

Foundation models are still less reliable than traditional web services, with more downtime and response times measured in seconds, not milliseconds. And as adoption grows, they’re accessed through a burgeoning ecosystem of providers and applications, which have similar functionality but different levels of reliability. Last year, we turned this bug into a feature, developing an adaptive routing client that dynamically selects providers to maximize uptime, minimize latency, and improve the overall experience.

June 26, 2025