
AI Compute: The Bottleneck for Startup Scaling

Baseten CEO Tuhin Srivastava discusses the critical role of AI inference infrastructure and the challenges startups face in scaling AI models.

May 1 at 4:02 PM · 5 min read

Sarah Guo and Tuhin Srivastava discussing AI inference on No Priors (Image credit: No Priors)

In the rapidly evolving world of artificial intelligence, the focus is increasingly shifting from model training to inference. As AI models become more sophisticated and widely adopted, the ability to run them efficiently and cost-effectively at scale is paramount. This conversation delves into the critical topic of AI inference, exploring the challenges and opportunities within this burgeoning market.

Sarah Guo: Host and Venture Capitalist

Sarah Guo is a prominent figure in the tech and venture capital community. As the host of 'No Priors,' she brings a sharp, insightful perspective to discussions about startups and emerging technologies. Guo is the founder of Conviction, a venture capital firm focused on AI, and was previously a General Partner at Greylock Partners, where she focused on enterprise software and AI investments. Her ability to distill complex topics and identify key trends makes her an invaluable voice in the industry.

Tuhin Srivastava: CEO and Co-Founder of Baseten

Tuhin Srivastava is the CEO and co-founder of Baseten, a company focused on providing AI inference infrastructure. Baseten aims to democratize AI by making it more accessible and affordable for businesses to deploy AI models. Srivastava's background in building and scaling technology companies provides him with a deep understanding of the practical challenges faced by AI developers and businesses.

The AI inference bottleneck

The discussion centers on AI inference as the new bottleneck in the AI lifecycle. While model training has historically drawn most of the attention, Srivastava argues that deploying and running these models, that is, inference, is where the real scaling challenges lie. He notes that as AI is integrated into more applications, demand for efficient inference is skyrocketing.

Srivastava explains that the nature of AI workloads is changing. The shift from general-purpose models to more specialized ones, coupled with the rise of multi-chip architectures, presents new computational demands. This evolution necessitates a rethinking of how AI inference is handled, moving beyond traditional cloud-based solutions.

"We are in one of the craziest markets for AI inference," Srivastava states, emphasizing the rapid growth and intense competition. He elaborates on how Baseten has experienced exponential growth, projecting over a billion dollars in revenue this year. This growth is driven by the fundamental shift in how companies are approaching AI deployment.

The Rise of Open-Source and Custom Models

A significant portion of the conversation focuses on the growing importance of open-source models and the trend of companies building their own custom models. Srivastava points out that while large, proprietary models from companies like OpenAI and Google have been influential, the market is increasingly looking towards more accessible and adaptable solutions.

He explains that open-source models, while offering flexibility, often come with their own set of challenges, particularly in terms of performance and deployment. This is where companies like Baseten aim to bridge the gap, providing the infrastructure and tools to optimize these models for real-world applications.

"We've seen a lot of companies that are either building models in-house or leveraging open-source models, and they're realizing that the inference layer is where they can really differentiate," Srivastava says. He highlights that this differentiation often comes down to factors like data privacy, cost efficiency, and the ability to tailor models to specific use cases.

The Importance of Inference Infrastructure

The discussion underscores the critical role of robust inference infrastructure. Srivastava emphasizes that simply having a powerful AI model is not enough; companies need the underlying technology to run these models effectively. This includes optimizing for latency, throughput, and cost, all while ensuring reliability and scalability.

He elaborates on the concept of 'inference count,' suggesting that as AI becomes more pervasive, the number of inference requests will continue to surge. This makes the efficiency and cost-effectiveness of the inference process a key business driver.

"The ability to run inference efficiently is what unlocks the value of AI for so many companies," Srivastava explains. He notes that this is particularly true for companies that are building AI-native products or integrating AI into existing workflows.

Navigating the Compute Landscape

The conversation touches upon the broader landscape of AI compute, including the role of specialized hardware and the ongoing race for better performance. Srivastava acknowledges the importance of companies like Nvidia (NASDAQ:NVDA) in providing the foundational hardware, but stresses that the software and infrastructure layers are equally critical.

He explains that Baseten's strategy is to abstract away the complexities of hardware management, allowing businesses to focus on their core AI development. This involves optimizing the entire inference pipeline, from model deployment to real-time execution.

"We see ourselves as enabling a whole new set of applications that might not have been feasible before because of the compute constraints," Srivastava states. He emphasizes that by providing a more efficient and accessible inference solution, Baseten is helping to democratize AI and accelerate its adoption across various industries.

The Future of AI Inference

Looking ahead, both Guo and Srivastava agree that the AI inference market is poised for significant growth and innovation. The increasing demand for AI-powered applications, coupled with the ongoing advancements in model architectures and hardware, will continue to drive the need for optimized inference solutions.

Srivastava expresses optimism about the future, highlighting the potential for AI to transform industries and create new opportunities. He believes that companies that can effectively navigate the complexities of AI inference will be well-positioned to lead in this new era of intelligent technology.

The conversation concludes with a reflection on the rapid pace of change in the AI field, emphasizing the importance of staying adaptable and innovative to meet the evolving demands of the market.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.