Discogs-VINet-MIREX
Xavier Serra
R.O. Araz
J. Serrà
D. Bogdanov
MIREX 2024
2025
Abstract
This technical report presents our submission to the cover song identification task of the 2024 edition of the Music Information Retrieval Evaluation eXchange (MIREX). For this submission, we enhanced our Discogs-VINet model in three ways: we changed the definition of an epoch, incorporated automatic mixed precision (AMP) during both training and inference, and sampled four versions per clique during triplet mining, which became possible with AMP. Given this enhanced model’s performance on the Discogs-VI test set, we trained a new model from scratch on the entire Discogs-VI dataset rather than only the training partition used for Discogs-VINet, a 45% increase in the number of versions. We name this enhanced and retrained model Discogs-VINet-MIREX.
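To illustrate what sampling four versions per clique during triplet mining can look like, the following is a minimal sketch in plain Python. It is not the Discogs-VINet implementation: the function names `sample_batch` and `enumerate_triplets`, the dictionary layout of `cliques`, and the choice to enumerate every valid triplet (rather than apply a specific mining strategy) are all assumptions made for illustration.

```python
import random
from itertools import permutations


def sample_batch(cliques, cliques_per_batch, versions_per_clique=4, rng=None):
    """Pick cliques at random, then up to `versions_per_clique` versions of each.

    `cliques` maps a clique id to a list of version ids (hypothetical layout).
    Cliques with fewer than two versions cannot form an anchor-positive pair
    and are skipped.
    """
    rng = rng or random.Random()
    eligible = [cid for cid, versions in cliques.items() if len(versions) >= 2]
    chosen = rng.sample(eligible, cliques_per_batch)
    batch = []
    for cid in chosen:
        versions = cliques[cid]
        k = min(versions_per_clique, len(versions))
        batch.extend((cid, v) for v in rng.sample(versions, k))
    return batch


def enumerate_triplets(batch):
    """Return all (anchor, positive, negative) index triples in the batch.

    Anchor and positive share a clique; the negative comes from another one.
    """
    triplets = []
    for a, p in permutations(range(len(batch)), 2):
        if batch[a][0] != batch[p][0]:
            continue
        for n, (cid, _) in enumerate(batch):
            if cid != batch[a][0]:
                triplets.append((a, p, n))
    return triplets
```

With two cliques of four versions each, every anchor has three positives and four negatives, so the batch of eight items yields 8 × 3 × 4 = 96 candidate triplets. Larger per-clique samples grow the batch, which is consistent with the abstract's remark that AMP made this setting possible.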