Developers can now deploy and run virtually any model from Hugging Face thanks to Together AI's new Dedicated Container Inference (DCI) offering. This significantly lowers the barrier to entry for experimenting with cutting-edge AI models by abstracting away the complexities of inference server configuration and container setup.
The rapid pace of AI model releases, such as Netflix's recent void-model, often creates a lag between a model's discovery and its practical application. Traditionally, integrating a new model has required substantial effort in environment setup, dependency management, and inference server configuration. Together AI's DCI aims to eliminate this delay.
Using agents like Goose and Together's dedicated containers skill, developers can reportedly go from identifying a model to having a running inference container in a single session. For instance, deploying Netflix's void-model involved installing the skill, issuing a single prompt to the agent, and receiving a complete, runnable setup for Together's infrastructure.
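Once such a container is running, the resulting endpoint can presumably be queried like any OpenAI-compatible chat-completions API. The sketch below constructs such a request; the endpoint URL, model name, and API key are placeholders for illustration, not details from the source, and actually sending the request is left to the caller.

```python
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat-completions request for a
    dedicated inference endpoint. All identifiers passed in by the
    caller (endpoint, model name, key) are hypothetical examples."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{endpoint}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; a real deployment would supply its own
# endpoint URL, key, and model identifier.
req = build_chat_request(
    "https://example-endpoint.invalid", "API_KEY", "void-model", "Hello"
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` (or an equivalent HTTP client) would return the model's completion as JSON, assuming the deployed container speaks the OpenAI-compatible protocol.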
