Geometry of Efficient Fine-Tuning: LoRA, Intrinsic Dimension & Subspace Learning

Submitted May 30, 2025

Choose the topic your submission falls under: Next generation architectures
I am submitting for: Speaking at the Fifth Elephant 2025 Annual Conference
Type of submission: 30 mins talk

Abstract

Large pre-trained models are now the norm, making Parameter-Efficient Fine-Tuning techniques like LoRA essential to reduce computational and storage costs. But why do these methods work so well? This talk explores the theory of Intrinsic Dimension (ID)—the idea that neural networks often need far fewer effective directions to learn a task than their total parameters suggest.

We’ll estimate a task’s ID via random subspace training on an MLP for MNIST, reproducing results from foundational papers. Then we’ll compare LoRA with subspace training on compute, training time, and accuracy, clarifying the key design trade-offs. The argument is that LoRA succeeds not only through engineering but by exploiting the low-dimensional structure that ID reveals.
We also highlight the PyTorch internals that enable flexible subspace training. This talk builds on a four-part blog series bridging theory and engineering.

Motivation & Background

The rise of PEFT in the large model era: Why full fine-tuning is wasteful.
The role of LoRA as a scalable, resource-friendly alternative.
Introducing Intrinsic Dimension (ID): Why models often don’t need all their parameters.

Subspace Training for Measuring ID

Intrinsic Dimension of a task and its implications.
Hands-on walkthrough: Measuring ID using random subspace training on an MLP for MNIST.
Reproducing results from original ID literature.
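
The random subspace setup can be written down compactly. The block below is a sketch of the usual formulation in the intrinsic-dimension literature, assuming the talk follows it; the symbols (D for the full parameter count, d for the subspace dimension, P for the fixed random basis) and the 90% threshold are assumptions taken from that convention, not from this proposal.

```latex
% Only \theta^{(d)} is trained; \theta_0^{(D)} (random init) and P stay frozen.
\theta^{(D)} = \theta_0^{(D)} + P\,\theta^{(d)},
\qquad P \in \mathbb{R}^{D \times d},\quad d \ll D,\quad \theta^{(d)}_{\text{init}} = 0

% Intrinsic dimension at the 90% level: the smallest subspace size whose
% trained accuracy reaches 90% of the full-model baseline.
d_{\text{int}90} = \min \left\{ d \;:\; \mathrm{acc}(d) \ge 0.9 \cdot \mathrm{acc}_{\text{baseline}} \right\}
```

Measuring ID then amounts to sweeping d, training only the d subspace coordinates for each value, and reading off where the accuracy curve crosses the threshold.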

LoRA as a Subspace Method

Matching LoRA’s parameter budget to subspace dimensions.
Comparing LoRA vs. subspace training on FLOPs, convergence speed, and accuracy.
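
To make the budget matching concrete, here is a minimal sketch of the arithmetic. It assumes LoRA is applied at the same rank to every linear layer and ignores biases; the layer shapes and the subspace dimension are hypothetical values chosen for illustration, not figures from the talk.

```python
def lora_param_count(layer_shapes, rank):
    """Trainable parameters added by LoRA at the given rank.

    Each (n_in, n_out) linear layer gets two factors, A (rank x n_in) and
    B (n_out x rank), so it contributes rank * (n_in + n_out) parameters.
    """
    return sum(rank * (n_in + n_out) for n_in, n_out in layer_shapes)


def max_rank_within_budget(layer_shapes, budget):
    """Largest uniform LoRA rank whose parameter count stays within `budget`."""
    params_per_unit_rank = lora_param_count(layer_shapes, 1)
    return budget // params_per_unit_rank


# Hypothetical MNIST MLP: 784 -> 200 -> 200 -> 10 (biases ignored).
mlp_shapes = [(784, 200), (200, 200), (200, 10)]
subspace_dim = 5000  # hypothetical subspace dimension d to match

rank = max_rank_within_budget(mlp_shapes, subspace_dim)
print(rank, lora_param_count(mlp_shapes, rank))  # 3 4782
```

Because the LoRA count grows in steps of (n_in + n_out) per layer, the match to a given d is only approximate, and for a small enough d even rank 1 can exceed the budget.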

PyTorch Internals

How nn.Parameter allows selective fine-tuning.
Using register_buffer for fixed subspace bases.
Custom forward() logic to implement dynamic re-parameterization.
Why such flexibility is hard in static-graph frameworks.
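
The three PyTorch pieces above can be combined in a single layer. The following is a minimal sketch, not the implementation from the talk or the blog series: the class name SubspaceLinear, its arguments, and the dense random basis are all assumptions made for illustration.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class SubspaceLinear(nn.Module):
    """Linear layer whose weights are re-parameterized as theta_0 + P @ theta_d."""

    def __init__(self, in_features, out_features, subspace_dim, seed=0):
        super().__init__()
        gen = torch.Generator().manual_seed(seed)
        full_dim = out_features * in_features + out_features  # D: weights + biases

        # Frozen starting point theta_0. Buffers follow .to(device) and appear
        # in state_dict, but are not parameters and receive no gradient.
        w0 = torch.randn(out_features, in_features, generator=gen)
        self.register_buffer("weight0", w0 / math.sqrt(in_features))
        self.register_buffer("bias0", torch.zeros(out_features))

        # Fixed random basis P of shape (D, d), columns scaled to roughly unit norm.
        basis = torch.randn(full_dim, subspace_dim, generator=gen)
        self.register_buffer("basis", basis / math.sqrt(full_dim))

        # The only trainable tensor: d coordinates in the subspace, starting at zero.
        self.theta_d = nn.Parameter(torch.zeros(subspace_dim))

    def forward(self, x):
        # Rebuild the full weights on every call so autograd routes the
        # gradient through P back to theta_d.
        delta = self.basis @ self.theta_d
        n_weight = self.weight0.numel()
        weight = self.weight0 + delta[:n_weight].view_as(self.weight0)
        bias = self.bias0 + delta[n_weight:]
        return F.linear(x, weight, bias)


layer = SubspaceLinear(784, 10, subspace_dim=32)
trainable = [name for name, p in layer.named_parameters() if p.requires_grad]
out = layer(torch.randn(4, 784))
out.sum().backward()
print(trainable, tuple(out.shape), tuple(layer.theta_d.grad.shape))
```

A dense basis costs D x d memory, which is fine for a small MLP but would need a sparse or structured projection at larger scale.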

Key Takeaways

ID gives a principled lens to understand PEFT.
LoRA can be seen as a constrained subspace optimizer.
Subspace experiments offer a tool to reason about model capacity and fine-tuning efficiency.

Additional Resources

This talk builds on the four-part blog series on LoRA and Intrinsic Dimension. The posts gained good visibility, receiving positive traction on /r/MachineLearning and ranking 7th on Hacker News (front page for a day).

The first two blogs, on LoRA, became the basis of talks at PyCon India 2024 and at the Fifth Elephant Open Source AI meet (April 2025). This submission is based on the later two blogs, which dive deeper into intrinsic dimension and measuring model complexity and offer new perspectives.

The Fifth Elephant 2025 Annual Conference CfP