Personalizing Large Language Models typically relies on static retrieval or one-time adaptation, assuming user preferences remain invariant over time. Real-world interactions, however, are dynamic: user interests continuously evolve, and models must adapt to preference drift without catastrophic forgetting. Standard continual learning approaches often struggle in this setting because they update indiscriminately on noisy interaction streams, failing to distinguish genuine preference shifts from transient context.
To address this, we introduce SPRInG, a novel semi-parametric framework designed for effective continual personalization.
During training, SPRInG employs drift-driven selective adaptation, which utilizes a likelihood-based scoring function to identify high-novelty interactions. This allows the model to selectively update the user-specific adapter on drift signals while preserving hard-to-learn residuals in a replay buffer. During inference, we apply strict relevance gating and fuse parametric knowledge with retrieved history via logit interpolation.
Experiments on the long-form personalized generation benchmark demonstrate that SPRInG outperforms existing baselines, validating its robustness for real-world continual personalization.
Existing LLM personalization methods primarily rely on a time-invariant view of the user. Fixed histories are repeatedly reused for retrieval, or parametric models are frozen after initial adaptation, so neither accounts for continual preference drift.
Non-parametric Strategies
Leveraging retrieved interaction histories or user profiles during prompting.
❌ Weak recency signals
❌ Costs of storage/retrieval
Parametric Strategies
Encoding user preferences through direct parameter adaptation (e.g., LoRA).
❌ Expensive updates
❌ Catastrophic forgetting
While Continual Learning (CL) offers a path to sequential updates, standard approaches are suboptimal because user interaction streams are inherently continuous and noisy.
🚧
No Clear Boundaries
Preferences drift gradually rather than exhibiting discrete task shifts or abrupt distribution changes.
🌪️
Transient Noise
Because transient noise also produces high loss and uncertainty, standard metrics can mistake it for informative drift.
Our key insight is to jointly utilize parametric and non-parametric approaches for distinct yet complementary roles, capturing genuine drifts while preserving historical context.
SPRInG updates the LoRA adapter exclusively on high-drift interactions, ensuring the model adapts to genuine preference evolution while filtering out transient noise.
Drift Score
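As a rough illustration only (not SPRInG's exact procedure), the sketch below scores each incoming interaction by the gap in negative log-likelihood between the current user adapter and the base model, and routes only the top-scoring fraction to an adapter update. The names `drift_score`, `select_for_update`, and `top_frac` are hypothetical, and sending all remaining interactions to the buffer is a simplification of the paper's hard-to-learn residual buffer.

```python
from typing import List, Sequence, Tuple

def drift_score(nll_adapter: float, nll_base: float) -> float:
    # Hypothetical likelihood-based novelty score: an interaction that the
    # personalized adapter predicts much worse than the base model is
    # treated as a candidate preference drift.
    return nll_adapter - nll_base

def select_for_update(
    interactions: Sequence[str],
    nlls_adapter: Sequence[float],   # per-interaction NLL under the user adapter
    nlls_base: Sequence[float],      # per-interaction NLL under the base model
    top_frac: float = 0.3,           # mirrors the Top-30% setting reported below
) -> Tuple[List[str], List[str]]:
    scored = sorted(
        zip(interactions, nlls_adapter, nlls_base),
        key=lambda t: drift_score(t[1], t[2]),
        reverse=True,
    )
    k = max(1, int(len(scored) * top_frac))
    high_drift = [x for x, _, _ in scored[:k]]  # update the user-specific LoRA on these
    buffered = [x for x, _, _ in scored[k:]]    # keep as non-parametric context for retrieval
    return high_drift, buffered
```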
SPRInG dynamically fuses internalized parametric knowledge of stable patterns with explicit non-parametric context from the replay buffer at the logit level, so responses reflect both long-term traits and specific historical details.
Logit Interpolation
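A minimal sketch of the fusion step, assuming PyTorch and two forward passes per decoding step (one using only the adapter's parametric knowledge, one conditioned on retrieved history); the function name `interpolate_logits` and the single weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def interpolate_logits(parametric_logits: torch.Tensor,
                       retrieval_logits: torch.Tensor,
                       alpha: float = 0.5) -> torch.Tensor:
    # parametric_logits: next-token logits from the user-adapted model alone.
    # retrieval_logits:  logits from the same model conditioned on retrieved history.
    # alpha = 0.5 corresponds to the balanced setting highlighted in the results below.
    return alpha * parametric_logits + (1.0 - alpha) * retrieval_logits

# Toy usage: fuse two (batch, vocab) logit tensors and pick the next token.
vocab_size = 32000
fused = interpolate_logits(torch.randn(1, vocab_size), torch.randn(1, vocab_size))
next_token = fused.argmax(dim=-1)
```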
SPRInG consistently outperforms all baselines in both Abstract Generation and Review Writing tasks. It achieves relative improvements of up to 15.85% in ROUGE-L over the strongest continual learning baselines. While standard RAG and PAG methods often struggle with noisy contexts, SPRInG effectively balances plasticity and stability by selectively leveraging beneficial history.
Our Top-30% strategy outperforms training on 100% of data, confirming that focusing on high-drift interactions filters out redundant noise and enhances adaptation efficiency.
Balanced logit interpolation yields the best results by combining stable parametric traits with specific historical details from the residual buffer.
SPRInG demonstrates robust scalability across varying interaction volumes.
@article{kim2026spring,
title = {SPRInG: Continual LLM Personalization via Selective Parametric Adaptation and Retrieval-Interpolated Generation},
author = {Kim, Seoyeon and Kim, Jaehyung},
journal = {arXiv preprint arXiv:2601.09974},
year = {2026}
}