Large Language Models (LLMs) have achieved remarkable success across a wide range of generative tasks. However, users often desire explicit control over the presence and extent of specific \textit{concepts} in the generated text; for example, controlling how \emph{humorous} or \emph{persuasive} a passage should be. While prior work in prompt engineering and representation-based concept steering has enabled coarse directional control, these methods rarely address the need for \textit{fine-grained} specification, such as explicitly setting a concept's intensity on a continuous scale. The challenge is amplified when controlling multiple concepts simultaneously, where interactions between concepts may interfere with precise control. In this work, we introduce an evaluation framework to systematically measure the fine-grained controllability of LLMs in both single- and dual-concept settings. Our findings reveal that while simple prompt-based approaches show promise for single-concept fine-grained control, performance degrades substantially in the more challenging two-concept scenario. These results suggest that current prompting strategies are insufficient for robust multi-concept control. We encourage future work to explicitly develop methods for fine-grained control that remain effective from the single-concept to the multi-concept setting.