Contact: {merlijn.blaauw,jordi.bonada}@upf.edu
Not published.
Comparison of different systems. In all cases, the input mel-spectrogram/WORLD features are obtained by analysis of a recording.
GT = Ground truth (reference recording)
NW = Neural WORLD (our proposed system)
NW-NoAdv = Neural WORLD (our proposed system) without adversarial training
E-PWG = Excited Parallel WaveGAN (non-autoregressive baseline)
WORLD = WORLD vocoder (signal-processing-based)
AR-WNV = Autoregressive WaveNet vocoder