MATLAB NMF Results
We ran NMF with different parameters and audio inputs. The parameters we varied were:
- Window Length: Length of the window that the Short-Time Fourier Transform (STFT) is computed on
- Window Shape: Which window is used in the STFT, such as Hamming, Hann, Kaiser, Triangular, or Rectangular
- Noverlap: Number of points that overlap from one window to the next
- Nfft: Number of points in the FFT
- Rank of the factorization: How many sources we want NMF to produce
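To see how the first four parameters interact, here is a minimal Python sketch (not the MATLAB code used for these experiments) that computes the size of the STFT grid they imply. The function name `stft_shape`, the no-edge-padding assumption, and the example values are all hypothetical, chosen only for illustration.

```python
def stft_shape(num_samples, fs, win_length, noverlap, nfft):
    """Return basic STFT grid properties for a real-valued signal.

    Assumes no padding at the signal edges, so only full windows are counted.
    """
    if not 0 <= noverlap < win_length:
        raise ValueError("noverlap must be in [0, win_length)")
    if nfft < win_length:
        raise ValueError("nfft must be at least win_length")
    hop = win_length - noverlap  # samples between consecutive window starts
    if num_samples < win_length:
        num_frames = 0
    else:
        num_frames = 1 + (num_samples - win_length) // hop
    return {
        "hop": hop,
        "num_frames": num_frames,            # columns of the spectrogram
        "num_bins": nfft // 2 + 1,           # rows (one-sided spectrum)
        "bin_spacing_hz": fs / nfft,         # distance between frequency bins
        "window_ms": 1000.0 * win_length / fs,  # time span covered by one window
    }


# Hypothetical example values, not the settings used in the experiments.
info = stft_shape(num_samples=20 * 44100, fs=44100,
                  win_length=1024, noverlap=512, nfft=2048)
print(info)
```

A longer window covers more time per frame (coarser timing), more overlap produces more frames for the same audio, and a larger Nfft places the frequency bins closer together.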
We tested several audio inputs: two-note instruments, a 20-second drum clip, a conversation between a boy and a man, and various mixes of drums, vocals, guitar, and bass.
Two-note instruments
We made two instruments in Logic (music software), each playing 2 separate notes. We expected NMF to work really well because 1) the instruments differ greatly in frequency range, 2) each instrument plays one note at a time, so the frequencies are easy to tell apart, and 3) the instruments play their notes at different times.
Results
All the parameter combinations we tried gave very similar results. You can clearly hear that when the bass note stops or drops in volume, so does the higher-frequency component. This illustrates how NMF works.
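For readers who want the mechanics behind that observation: NMF approximates a non-negative magnitude spectrogram V (frequency bins by time frames) as a product W·H, where each column of W is a spectrum and each row of H is that spectrum's gain over time. The sketch below uses the standard multiplicative update rules for the squared-error cost in dependency-free Python. It is not the MATLAB code used for these experiments, and the toy matrix is made up.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rwh in zip(v, wh) for x, y in zip(rv, rwh))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (bins x frames) into W (bins x rank), H (rank x frames)."""
    rng = random.Random(seed)
    n_bins, n_frames = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_bins)]
    h = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_frames)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n_bins)]
    return w, h


# Toy "spectrogram" (made-up numbers): a low note in frames 0-1, a high note in frames 2-3.
V = [
    [1.0, 1.0, 0.0, 0.0],  # low-frequency bin
    [0.5, 0.5, 0.0, 0.0],  # harmonic of the low note
    [0.0, 0.0, 1.0, 1.0],  # high-frequency bin
]
W, H = nmf(V, rank=2)
print("reconstruction error:", frob_error(V, W, H))
```

Because every bin assigned to a component shares that component's single gain curve in H, all of its frequencies rise and fall together, which matches what we heard when the bass note faded.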
Best Results
20-second drums
We made a shorter, drums-only clip based on the drum track of a song.
Results
We suspect the "stuttering" effect in the output comes from the STFT window length.
Best Results
Boy-man conversation
We found an audio clip online of a boy and an old man having a conversation. We thought this would separate better because they talk at different times, but we were very wrong.
There is noise between the speakers' turns, which could negatively affect the NMF algorithm.
There's a weird modulation effect when the window length is small. Humans can detect voices and words easily, so the bar for separating speakers is lower.
Best Results
"Nobody" stitch up bass & vocal
The source is a song called "Nobody". We combined only the bass and vocals since, in theory, their frequency ranges wouldn't overlap. It did better than expected.
Best Results
"Nobody" stitch up drums & vocal
The source is a song called "Nobody". In theory, NMF is really good at drums, so we combined only the drums and vocals to see how it handles the two together.
The snare sits directly in the same frequency range as the vocal. When one separated source has really good vocals, the other source sounds bad.
Best Result
Other Findings
- Impact of Factorization Rank: A small rank results in overly coarse separation, where each component retains a mix of various elements, leading to an indistinct separation. Conversely, a high rank causes the gain of each component to diminish significantly, amplifying artifacts and reducing the clarity of separation.
- Window Shape Comparison: After testing several window shapes, the Hamming window consistently delivered the best performance. Its smoothing properties appear to enhance the quality of the separation.
- Effect of Input Audio Composition: Mixed audio that lacks vocals generally yields better separation results, likely due to reduced complexity in the signal.
- Influence of FFT Points and Overlap: Our experiments revealed a general trend: more FFT points and greater overlap (Noverlap) lead to improved separation results. More FFT points give a finer frequency grid, and more overlap gives smoother continuity from one frame to the next, which likely contributes to the higher-quality outputs.
These findings show the importance of carefully balancing parameters and choosing suitable preprocessing methods to achieve optimal results in audio separation using NMF.
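As a reference for the window shape comparison, the sketch below writes out the textbook symmetric definitions of three of the windows we tried, in plain Python. It only illustrates how the shapes taper toward the frame edges; it is not the code used in the experiments, and the length of 9 is an arbitrary small value picked for readability.

```python
import math


def hamming(n):
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def hann(n):
    """Symmetric Hann window: 0.5 - 0.5*cos(2*pi*k/(n-1))."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def rectangular(n):
    """Rectangular window: no tapering at all."""
    return [1.0] * n


n = 9  # tiny length so the taper is easy to read; real STFT windows are much longer
for name, win in [("hamming", hamming(n)), ("hann", hann(n)),
                  ("rectangular", rectangular(n))]:
    print(f"{name:12s}", " ".join(f"{x:.2f}" for x in win))
```

The Hamming window tapers each frame toward its edges without reaching zero, the Hann window tapers all the way to zero, and the rectangular window applies no tapering.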