# AI Inference

21 articles with this tag

Superlinked's Filip Makraduli on Small Model Inference Infrastructure
Artificial Intelligence

Filip Makraduli of Superlinked discusses the critical need for robust small model inference infrastructure, highlighting Superlinked's open-source solution.

about 14 hours ago
Intel's AI Chip Demand: A Boon for Semiconductor Stocks?
Semiconductors

Intel CEO Pat Gelsinger discusses the surging demand for Intel CPUs in AI inference and the company's strategy to leverage its integrated hardware offerings and partnerships for growth.

12 days ago
Orbital aims for space AI data centers
Funding Round

Orbital plans its first test mission for space-based AI data centers in 2027, aiming to overcome Earth's power constraints.

22 days ago
llm-d Enters CNCF Sandbox
Artificial Intelligence

The llm-d project's entry into the CNCF Sandbox marks a pivotal moment for cloud-native AI inference and open infrastructure.

about 1 month ago
OpenAI Cerebras Deal Targets Real Time AI Speed
AI Research

OpenAI's Cerebras partnership prioritizes reducing AI inference latency, aiming for real-time interactions to drive deeper user engagement with deployed models.

4 months ago
Google TPU Ironwood: Inference Powerhouse Arrives
AI Research

5 months ago
Google Cloud's AI Storage Strategy: Optimizing Performance and Cost
AI Video

6 months ago
vLLM Solves the AI Model Serving Conundrum at Scale
AI Video

6 months ago
Google Cloud Unveils Blueprint for Reliable, Scalable AI Inference
AI Video

6 months ago
NVIDIA Dynamo AI Inference Scales Data Center AI
AI Research

6 months ago
Impala AI Targets LLM Inference Costs with $11M Seed
Funding Round

6 months ago
Fireworks AI raises $250M to advance its AI inference platform
Funding Round

6 months ago
Tensormesh exits stealth with $4.5M to slash AI inference caching costs
AI Research

The generative AI gold rush has an expensive secret: running the models costs a fortune.

6 months ago
Qualcomm's Bold AI Inference Play Challenges NVIDIA Dominance
AI Video

6 months ago
Blackwell AI Inference: NVIDIA's Extreme-Scale Bet
AI Research

8 months ago
Groq Secures $750M Investment to Expand the American AI Stack
Funding Round

8 months ago
NVIDIA Details SMART Framework for AI Inference at Scale
AI Research

NVIDIA has outlined its comprehensive strategy for optimizing AI inference performance at scale, introducing the "Think SMART" framework as a guide for enterprises building and operating "AI factories."

9 months ago
NVIDIA Dynamo Redefines AI Inference Economics
AI Video

9 months ago
Chalk Secures $50M Series A to Revolutionize AI Inference
Funding Round

11 months ago
Making Machine Learning Inference Meet Real-World Performance Demands
Interview

FPGAs offer the configurability needed for real-time machine learning inference, with the flexibility to adapt to future workloads. Making these advantages accessible to data scientists and developers calls for tools that are both comprehensive and easy to use. (Daniel Eaton, Sr. Manager, Strategic Marketing Development, Xilinx)

about 7 years ago