cs.CL

What makes a task vector actually work for faster inference?

Jihoon Kwon, Jiwon Choi, Jy-yong Sohn

May 20, 2026

In-context learning lets LLMs adapt to new tasks from examples in the prompt, but longer contexts slow inference. Task vectors compress those demonstrations into hidden states, yet existing methods are evaluated only on whether they work, not why. The authors introduce a metric that directly measures how closely a task vector's predictions match the output distribution of in-context learning, then use it to design Linear Task Vector (LTV), which minimizes this gap via closed-form regression. Across eight benchmarks and five LLMs, LTV improves accuracy by 9.2% while cutting latency, and task vectors from larger models boost smaller models' performance by 6.4%.
Published as "Distributional Alignment as a Criterion for Designing Task Vectors in In-Context Learning", arXiv:2605.20730
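
To make the two ideas in the summary concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation. The summary does not say which divergence the metric uses or what the regression's inputs and targets are, so the sketch assumes a mean KL divergence between output distributions and a ridge-regularized least-squares map from hidden states without demonstrations to hidden states with demonstrations. All names (`mean_kl`, `fit_linear_map`, `U`, `ridge`) and all arrays are hypothetical stand-ins.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mean_kl(p_logits, q_logits, eps=1e-12):
    """Mean KL(p || q) over examples (assumed form of the alignment metric)."""
    p = softmax(p_logits)
    q = softmax(q_logits)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


def fit_linear_map(X, Y, ridge=1e-3):
    """Closed-form ridge regression: W = (X^T X + ridge * I)^(-1) X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)


# Synthetic stand-ins; real hidden states would come from an LLM.
rng = np.random.default_rng(0)
n, d, vocab = 200, 16, 10
X = rng.normal(size=(n, d))                       # states without demonstrations (assumed)
A_true = rng.normal(size=(d, d))
Y = X @ A_true + 0.01 * rng.normal(size=(n, d))   # states with demonstrations (assumed)
U = rng.normal(size=(d, vocab))                   # stand-in unembedding matrix

W = fit_linear_map(X, Y)

# Compare each candidate's output distribution against the "ICL" distribution.
kl_before = mean_kl(Y @ U, X @ U)
kl_after = mean_kl(Y @ U, (X @ W) @ U)
print(f"KL before: {kl_before:.4f}, KL after: {kl_after:.6f}")
```

In this toy setup the fitted map sharply reduces the divergence because the synthetic targets are linear in the inputs by construction. Real hidden states carry no such guarantee, which is why a metric of this kind is useful as a check.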