Updates to cepstral features (MFCC and GFCC)
30 Dec 2016

Working towards the next Essentia release, we have updated our cepstral features. The updates include:
- Support for extracting MFCCs ‘the HTK way’ (Python example).
  - The literature has two common MFCC ‘standards’, which differ in some parameters and in the mel-scale computation itself: the Slaney way (Auditory Toolbox) and the HTK way (chapter 5.4 of the HTK Book).
  - See a Python notebook for a comparison with MFCCs extracted with librosa and with HTK.
- Support for inverting the computed MFCCs back to the spectral (mel) domain (Python example).
  - The first MFCCs are a standard descriptor of singing voice timbre. The MFCC vector itself, however, is hard to interpret visually, so it is common practice to invert the first 12-15 coefficients back to the mel-band domain for visualization. We have ported invmelfcc.m as explained here.
- Support for the cent scale.
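
To make the difference in mel-scale computation concrete, here is a minimal sketch of the two frequency warpings in plain Python. It is not Essentia code. The function names are made up for illustration, and the Slaney-style constants follow the form commonly used in open-source implementations such as librosa, which is an assumption about what ‘the Slaney way’ refers to here.

```python
import math


def hz_to_mel_htk(hz):
    """HTK-style mel scale: one logarithmic formula for all frequencies."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def hz_to_mel_slaney(hz):
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above.

    Constants are assumed from common implementations of this scale.
    """
    linear_step = 200.0 / 3.0            # Hz per mel unit in the linear region
    break_hz = 1000.0                    # where the scale switches to logarithmic
    break_mel = break_hz / linear_step   # mel value at the break point
    log_step = math.log(6.4) / 27.0      # log spacing above the break point
    if hz < break_hz:
        return hz / linear_step
    return break_mel + math.log(hz / break_hz) / log_step


# Print both warpings side by side for a few frequencies.
for hz in (100.0, 500.0, 1000.0, 4000.0, 8000.0):
    print(hz, round(hz_to_mel_htk(hz), 2), round(hz_to_mel_slaney(hz), 2))
```

The two scales use different units, so the absolute values are not comparable. What changes the resulting MFCCs is the shape of the warping, which determines where the mel filter centres fall.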
You can start using these features before the official release by building Essentia from the master branch.
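
For readers curious about what inverting MFCCs to mel bands involves, the idea can be sketched in plain Python. This is only an illustration, not the Essentia port of invmelfcc.m. It assumes MFCCs are an orthonormal DCT-II of natural-log mel energies with no liftering, and all function names are hypothetical.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(coeffs, n):
    """Inverse of the orthonormal DCT-II; coeffs are zero-padded to length n."""
    c = list(coeffs) + [0.0] * (n - len(coeffs))
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                 for k in range(1, n))
        out.append(s)
    return out


def mfcc_from_mel(mel_energies, num_coeffs):
    """Log-compress the mel energies, apply the DCT, keep the first coefficients."""
    log_mel = [math.log(e) for e in mel_energies]
    return dct2(log_mel)[:num_coeffs]


def mel_from_mfcc(mfcc, num_bands):
    """Zero-pad the MFCCs, apply the inverse DCT, undo the log compression."""
    return [math.exp(v) for v in idct2(mfcc, num_bands)]


mel = [1.0, 2.0, 4.0, 3.0, 1.5, 0.5]          # toy mel-band energies
smoothed = mel_from_mfcc(mfcc_from_mel(mel, 3), len(mel))
print([round(v, 3) for v in smoothed])
```

Keeping only the first few coefficients discards the fine detail, so the reconstructed mel bands form a smoothed spectral envelope. That smoothed envelope is the reason the inverted representation is easier to read in a plot than the raw coefficient vector.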