Python NMF/HPSS Results

For this part of the project, we used librosa to analyze the bass and drums stems of an audio file called Music Delta - Disco. We summed the individual bass and drums files into a single mixture and computed its spectrogram and Mel spectrogram. The spectrogram, Mel spectrogram, original bass file, and original drums file are shown below:

bass (02:04)
drums (02:04)

We then applied several separation techniques to each of the two spectrograms.

NMF

In performing NMF, we used the following parameters:

  • n_components: 12

  • init: nndsvda

  • solver: mu

  • beta_loss: kullback-leibler

  • max_iter: 500

  • random_state: 0

  • l1_ratio: 0.1
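
As a rough sketch, the parameters listed above map directly onto the constructor of scikit-learn's `NMF` class. That the project used this class is an assumption based on the parameter names. The input matrix here is a synthetic stand-in for a spectrogram, and the grouping of components into a source (`k_idx`) is hypothetical, since the text does not say how components were assigned to bass or drums.

```python
import numpy as np
from sklearn.decomposition import NMF

# Stand-in for a magnitude spectrogram (frequency bins x frames); the real
# input would be the mixture spectrogram or Mel spectrogram.
rng = np.random.default_rng(0)
V = rng.random((257, 12)) @ rng.random((12, 200)) + 1e-6

model = NMF(
    n_components=12,
    init="nndsvda",
    solver="mu",
    beta_loss="kullback-leibler",
    max_iter=500,
    random_state=0,
    l1_ratio=0.1,  # only matters when a regularization strength is also set
)
W = model.fit_transform(V)  # spectral templates, shape (257, 12)
H = model.components_       # activations over time, shape (12, 200)

# Component k contributes the rank-one spectrogram W[:, k] * H[k, :].
# Summing a group of components and dividing by the full model gives a
# soft mask that can be applied to the mixture.
k_idx = [0, 1, 2]  # hypothetical grouping, for illustration only
V_hat = W @ H
mask = (W[:, k_idx] @ H[k_idx, :]) / (V_hat + 1e-10)
source = mask * V
```

Because every factor is non-negative, the mask values stay between 0 and 1, so the masked source never exceeds the mixture in any time-frequency bin.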

After reconstruction, we obtained the following spectrograms:

We obtained the following audio files:

bass_nmf (00:09)
drums_nmf (00:09)
bass_nmf_mel (00:09)
drums_nmf_mel (00:09)

NMF separated the drums from the mixture well. However, drum components were still clearly audible in the bass output, which captured some of the lowest- and highest-frequency drum sounds.

HPSS

In performing HPSS, we used soft masks with a power of 1.5. After reconstruction, we obtained the following spectrograms:

We obtained the following audio files:

bass_hpss (00:09)
drums_hpss (00:09)
bass_hpss_mel (00:09)
drums_hpss_mel (00:09)

HPSS also separated the drums well, and it was better than NMF at assigning the lower-frequency drum components to the drums output. The higher-frequency snare sounds were still present in the bass output.

HPSS and NMF

To try to improve our results, we first performed HPSS and then performed NMF on its output. After reconstruction, we obtained the following spectrograms:

We obtained the following audio files:

bass_hpss_nmf (00:09)
drums_hpss_nmf (00:09)
bass_hpss_nmf_mel (00:09)
drums_hpss_nmf_mel (00:09)

The bass and drums outputs computed from the regular spectrograms sounded well separated. However, the bass output was very quiet, even with a gain applied. Additionally, the Mel spectrogram outputs had significant distortion.

Spleeter

Spleeter was used to separate the mixed audio as well. We obtained the following audio files:

bass_spleeter (00:10)
drums_spleeter (00:10)

As expected, Spleeter's bass and drums outputs closely matched the original bass and drums inputs. The bass output was again quieter than the original bass input, which suggests that bass is difficult to separate cleanly.
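
The loudness difference noted above can be put into numbers by comparing RMS levels in decibels. The helper below is a minimal sketch: `rms_db` is a hypothetical function name, and the two signals are synthetic stand-ins rather than the actual Spleeter output and original stem.

```python
import numpy as np

def rms_db(y):
    """Root-mean-square level of a signal in decibels, relative to 1.0."""
    rms = np.sqrt(np.mean(np.square(y)))
    return 20.0 * np.log10(max(rms, 1e-12))  # floor avoids log of zero

# Hypothetical signals: a reference stem and a separated estimate that is
# the same waveform at half the amplitude.
sr = 22050
t = np.arange(sr) / sr
reference = 0.5 * np.sin(2 * np.pi * 55.0 * t)
estimate = 0.5 * reference

# Negative values mean the estimate is quieter than the reference
level_diff = rms_db(estimate) - rms_db(reference)  # about -6.02 dB
```

Halving the amplitude lowers the level by about 6 dB, so a measured difference of that size would mean the separated bass is at roughly half the original amplitude.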
