Can Large Language Models Predict Audio Effects Parameters from Natural Language?

Seungheon Doh

Junghyun Koo*

Marco A. Martínez-Ramírez

Wei-Hsiang Liao

Juhan Nam

Yuki Mitsufuji

* External authors

WASPAA-25

2025

Abstract

In music production, manipulating audio effects (Fx) parameters through natural language has the potential to reduce technical barriers for non-experts. We present LLM2Fx, a framework leveraging Large Language Models (LLMs) to predict Fx parameters directly from textual descriptions without requiring task-specific training or fine-tuning. Our approach addresses the text-to-effect parameter prediction (Text2Fx) task by mapping natural language descriptions to the corresponding Fx parameters for equalization and reverberation. We demonstrate that LLMs can generate Fx parameters in a zero-shot manner, elucidating the relationship between timbre semantics and audio effects in music production. To enhance performance, we introduce three types of in-context examples: audio Digital Signal Processing (DSP) features, DSP function code, and few-shot examples. Our results show that LLM-based Fx parameter generation outperforms previous optimization approaches, offering competitive performance in translating natural language descriptions into appropriate Fx settings. Furthermore, LLMs can serve as text-driven interfaces for audio production, paving the way for more intuitive and accessible music production tools.

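The abstract outlines the zero-shot Text2Fx setup at a high level. As a rough illustration only, and not the authors' released code, the sketch below prompts a generic chat-completion API for equalizer settings given a timbre description; the model name, prompt wording, and five-band parameter schema are all assumptions made for this example, and the paper's in-context variants (DSP features, DSP function code, few-shot examples) are not shown.

```python
# Minimal sketch of zero-shot text-to-EQ prediction with a chat LLM.
# The prompt, model choice, and parameter schema are illustrative
# assumptions, not the LLM2Fx implementation from the paper.
import json
from openai import OpenAI  # any chat-completion API would work similarly

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = """You are an audio engineer. Map the description "{desc}" to
parameters of a 5-band parametric equalizer. Respond with JSON only:
{{"bands": [{{"center_hz": float, "gain_db": float, "q": float}}, ...]}}"""

def text2fx_eq(description: str) -> dict:
    """Ask the LLM for EQ settings matching a natural-language description."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the paper evaluates several LLMs
        messages=[{"role": "user", "content": PROMPT.format(desc=description)}],
        response_format={"type": "json_object"},  # constrain output to JSON
    )
    return json.loads(resp.choices[0].message.content)

params = text2fx_eq("warm and mellow")
print(params["bands"])
```

The returned parameter dictionary could then be applied with any parametric EQ implementation; constraining the model to a JSON schema is one simple way to make the free-form LLM output machine-readable.
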
Related Publications

Large-Scale Training Data Attribution for Music Generative Models via Unlearning

ICML, 2025
Woosung Choi, Junghyun Koo*, Kin Wai Cheuk, Joan Serrà, Marco A. Martínez-Ramírez, Yukara Ikemiya, Naoki Murata, Yuhta Takida, Wei-Hsiang Liao, Yuki Mitsufuji

This paper explores the use of unlearning methods for training data attribution (TDA) in music generative models trained on large-scale datasets. TDA aims to identify which specific training data points contributed to the generation of a particular output from a specific mod…

Fx-Encoder++: Extracting Instrument-Wise Audio Effects Representations from Mixtures

ISMIR, 2025
Yen-Tung Yeh, Junghyun Koo*, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Yi-Hsuan Yang, Yuki Mitsufuji

General-purpose audio representations have proven effective across diverse music information retrieval applications, yet their utility in intelligent music production remains limited by insufficient understanding of audio effects (Fx). Although previous approaches have empha…

ITO-Master: Inference-Time Optimization for Audio Effects Modeling of Music Mastering Processors

ISMIR, 2025
Junghyun Koo*, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Giorgio Fabbro*, Michele Mancusi, Yuki Mitsufuji

Music mastering style transfer aims to model and apply the mastering characteristics of a reference track to a target track, simulating the professional mastering process. However, existing methods apply fixed processing based on a reference track, limiting users' ability to…
