H Wierstorf, D Ward, EM Grais, MD Plumbley, R Mason, C Hummersone, "Perceptual Evaluation of Source Separation for Remixing Music," in 143rd Convention of the Audio Engineering Society, Paper 9880 (2017).

Bibtex

@inproceedings{Wierstorf2017d,
    title     = {Perceptual Evaluation of Source Separation for Remixing
                 Music},
    author    = {Wierstorf, Hagen and Ward, Dominic and Grais, Emad M.
                 and Plumbley, Mark D. and Mason, Russell
                 and Hummersone, Chris},
    booktitle = {143rd Convention of the Audio Engineering Society},
    address   = {New York, NY},
    pages     = {Paper 9880},
    month     = {October},
    year      = {2017},
    url       = {http://www.aes.org/e-lib/browse.cfm?elib=19277}
}

Abstract

Music remixing is difficult when the original multitrack recording is not available. One solution is to estimate the elements of a mixture using source separation. However, existing techniques suffer from imperfect separation and perceptible artifacts on single separated sources. To investigate their influence on a remix, five state-of-the-art source separation algorithms were used to remix six songs by increasing the level of the vocals. A listening test was conducted to assess the remixes in terms of loudness balance and sound quality. The results show that some source separation algorithms are able to increase the level of the vocals by up to 6 dB at the cost of introducing a small but perceptible degradation in sound quality.
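As a rough illustration of the remixing operation described in the abstract, the sketch below scales a vocal stem by a gain given in dB and sums it with the accompaniment. It is a minimal example under stated assumptions, not the processing used in the paper: the function names `db_to_gain` and `remix`, the two-stem split, and the toy sample values are all hypothetical. A level change of 6 dB corresponds to a linear amplitude factor of about 2.

```python
def db_to_gain(level_db: float) -> float:
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def remix(vocals, accompaniment, vocal_gain_db):
    """Scale the vocal stem, then sum both stems sample by sample.

    Assumes both stems are equal-length sequences of float samples.
    """
    if len(vocals) != len(accompaniment):
        raise ValueError("stems must have the same number of samples")
    gain = db_to_gain(vocal_gain_db)
    return [gain * v + a for v, a in zip(vocals, accompaniment)]


# Toy signals standing in for separated stems (hypothetical values).
vocals = [0.1, -0.2, 0.05]
accompaniment = [0.3, 0.1, -0.4]

# Raise the vocals by 6 dB relative to the accompaniment.
remixed = remix(vocals, accompaniment, vocal_gain_db=6.0)
```

With stems estimated by source separation rather than taken from the original multitrack, each stem may also contain residue of the other sources and separation artifacts, which is why the paper evaluates the resulting remixes perceptually.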

Supplementary material

The results from the listening test presented in this paper, together with the test procedure, are available at 10.5281/zenodo.835191. The statistical analysis and the code to reproduce the figures are available at 10.5281/zenodo.835196. The stimuli used in the test are available at 10.5281/zenodo.835182. The presentation, with the sound files included, can be downloaded from 10.5281/zenodo.1034175.