#LLM Efficiency

2 articles with this tag

AdaCodec: Efficient Video MLLM Encoding

AdaCodec applies predictive visual coding to video MLLMs, cutting tokenization cost and latency while reporting better performance at a fraction of the budget.

17 days ago

AI Research

DMax: Parallel Decoding for Diffusion LLMs

DMax applies Soft Parallel Decoding to diffusion language models, significantly raising tokens per forward pass (TPF) while preserving accuracy and reaching 1,338 tokens per second (TPS).

2 months ago