Code2LoRA: Repository Context without Overhead

Large language models for code need repository-level context: an understanding of a project's imports, APIs, and conventions. The usual approaches, retrieval-augmented generation (RAG) and per-repository fine-tuning, impose significant computational costs and struggle with codebases that keep changing. Code2LoRA, a hypernetwork framework, is designed to address this limitation.

Visual TL;DR: Code LLM context gap -> Costly traditional methods -> Code2LoRA framework -> Injects repo knowledge -> Zero inference overhead -> Adapts to evolving code. The Code2LoRA framework also introduces RepoPeftBench.

Code LLM context gap: large language models for code struggle with repository-level context
Costly traditional methods: RAG or per-repo fine-tuning impose significant computational costs
Code2LoRA framework: novel hypernetwork framework generating dynamic LoRA adapters
Injects repo knowledge: dynamically generates repository-specific LoRA adapters on the fly
Zero inference overhead: no increase in inference-time token consumption
Adapts to evolving code: dynamically maintains and updates adapters via GRU hidden state
RepoPeftBench: a new standard for evaluating parameter-efficient code adaptation


Injecting Repository Knowledge with Zero Inference Overhead

Code2LoRA generates repository-specific LoRA adapters on the fly. Because the repository knowledge is carried in the adapter weights rather than in the prompt, it adds no inference-time tokens, unlike prior methods that either lengthen input sequences or require costly fine-tuning. The framework offers two modes. Code2LoRA-Static targets static code snapshots and suits stable projects. Code2LoRA-Evo maintains and updates adapters through a GRU hidden state in response to code differences (diffs), which suits projects under active development.
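
To make the idea concrete, here is a minimal sketch in plain Python of how a hypernetwork might turn a repository embedding into LoRA factors, and how a GRU-style cell might fold diff embeddings into a hidden state before regenerating the adapter. It is an illustration under stated assumptions, not the Code2LoRA implementation: the class names (ToyHypernetwork, ToyGRUCell), the dimensions, the random weights, the bias-free GRU, and the made-up embedding values are all hypothetical, and the source does not say how repositories or diffs are embedded.

```python
import math
import random


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random matrix as a list of rows (stand-in for trained weights)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


class ToyHypernetwork:
    """Hypothetical hypernetwork: repository embedding -> LoRA factors A, B."""

    def __init__(self, embed_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        # Two linear heads, one per low-rank factor.
        self.to_a = rand_matrix(rank * d_in, embed_dim, rng)
        self.to_b = rand_matrix(d_out * rank, embed_dim, rng)

    def generate(self, repo_embedding):
        flat_a = matvec(self.to_a, repo_embedding)
        flat_b = matvec(self.to_b, repo_embedding)
        # Reshape the flat outputs: A is rank x d_in, B is d_out x rank.
        a = [flat_a[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        b = [flat_b[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return a, b


def adapted_forward(w, a, b, x):
    """Apply y = W x + B (A x); the input x (and its length) is unchanged."""
    base = matvec(w, x)
    delta = matvec(b, matvec(a, x))
    return [p + q for p, q in zip(base, delta)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ToyGRUCell:
    """Bias-free GRU cell that folds one diff embedding into the hidden state."""

    def __init__(self, input_dim, hidden_dim, seed=1):
        rng = random.Random(seed)
        cols = input_dim + hidden_dim
        self.w_update = rand_matrix(hidden_dim, cols, rng)
        self.w_reset = rand_matrix(hidden_dim, cols, rng)
        self.w_cand = rand_matrix(hidden_dim, cols, rng)

    def step(self, diff_embedding, h):
        xh = list(diff_embedding) + list(h)
        z = [sigmoid(v) for v in matvec(self.w_update, xh)]
        r = [sigmoid(v) for v in matvec(self.w_reset, xh)]
        gated = list(diff_embedding) + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.w_cand, gated)]
        # Interpolate between the old state and the candidate state.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


# Static mode: one embedding of a repository snapshot -> one adapter.
hyper = ToyHypernetwork(embed_dim=4, d_in=3, d_out=2, rank=2)
snapshot_embedding = [0.5, -0.2, 0.1, 0.9]  # made-up values
a, b = hyper.generate(snapshot_embedding)

# Evolving mode: fold each diff into the hidden state, then regenerate.
gru = ToyGRUCell(input_dim=4, hidden_dim=4)
state = [0.0, 0.0, 0.0, 0.0]
for diff_embedding in ([0.1, 0.0, -0.3, 0.2], [0.0, 0.4, 0.1, -0.1]):
    state = gru.step(diff_embedding, state)
a_evo, b_evo = hyper.generate(state)

w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen base weight, d_out x d_in
y = adapted_forward(w, a_evo, b_evo, [1.0, 2.0, 3.0])
```

In this sketch the adapter only changes the weights applied to the input vector, so the prompt stays the same length. Updating for a new commit costs one GRU step plus one hypernetwork pass rather than a fine-tuning run.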

RepoPeftBench: A New Standard for Evaluating Parameter-Efficient Code Adaptation

To assess Code2LoRA, the researchers introduced RepoPeftBench. The benchmark comprises 604 Python repositories and two tracks: a static track with 40,000 training and 12,000 test assertion-completion tasks, and an evolution track with 215,000 training and 87,000 test tasks derived from commits. On the static track, Code2LoRA-Static achieved 63.8% cross-repo and 66.2% in-repo exact match, effectively matching the performance ceiling of per-repository LoRA. On the evolution track, Code2LoRA-Evo achieved 60.3% cross-repo exact match, a 5.2 percentage point improvement over a single shared LoRA model.
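
For readers unfamiliar with the metric, the sketch below shows one plausible way to compute an exact-match percentage over assertion completions. The function name exact_match_rate, the whitespace stripping, and the sample assertions are assumptions made for illustration; the source does not describe how RepoPeftBench normalizes predictions before comparing them.

```python
def exact_match_rate(predictions, references):
    """Percentage of predictions identical to their reference completion."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: ignore leading/trailing whitespace, compare everything else.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Made-up completions, not benchmark data.
predictions = [
    "assert add(2, 3) == 5",
    "assert len(items) == 3 ",
    "assert user.name == 'bob'",
    "assert result is None",
]
references = [
    "assert add(2, 3) == 5",
    "assert len(items) == 3",
    "assert user.name == 'alice'",
    "assert result is None",
]
score = exact_match_rate(predictions, references)
```

Because the score is a percentage, the gap between two systems is reported in percentage points, as with the 5.2-point difference quoted above.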
