Datasets#

Datasets are key to data-driven computational research on a music tradition. Carefully designed collections of data that capture the most relevant aspects of a musical repertoire can open the door to solving several research problems. For that reason, considerable effort has been made within the scope of this tutorial and compiam to (1) boost the visibility of, and access to, Carnatic and Hindustani music datasets, and (2) provide standardized tools to obtain and use these datasets.

mirdata#

mirdata is an open-source and pip-installable Python library that provides tools for working with common Music Information Retrieval (MIR) datasets [BFR+19]. Given the crucial importance of such software for data- and corpus-driven research, we have put great effort into integrating several IAM-centered datasets into mirdata. To date, the following datasets can be found in the latest mirdata release:

  • Carnatic collection of Saraga [SGRS20]

  • Hindustani collection of Saraga [SGRS20]

  • Carnatic Music Rhythm [SS14a]

  • Hindustani Music Rhythm [SHCS16]

  • Indian Art Music Tonic Dataset [SGS12]

  • Indian Art Music Raga Dataset [GSG+16]

  • Mridangam Stroke Dataset [ABKM14]

  • Four-Way Tabla Dataset (ISMIR 2021) [RBR21]

  • Carnatic Varnam Dataset [KISS14]

  • Saraga Carnatic Melody Synth [PRNP+23]

compiam provides access to these datasets through the mirdata loaders. Make sure to check the mirdata documentation to learn about the functionality of the loaders.

Note

In our library, compiam.load_dataset() is an alias of the mirdata method .initialize(). Use this wrapper to access the mirdata loaders for Indian Art Music datasets directly from compiam.
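
To make the relationship concrete, here is a minimal sketch of the equivalence (assuming mirdata is importable in the environment; compiam itself is installed in the next cell). Both calls are expected to return an equivalent dataset loader.

## Sketch: the two calls below should yield equivalent dataset loaders
import mirdata
import compiam

mridangam_via_mirdata = mirdata.initialize("mridangam_stroke")
mridangam_via_compiam = compiam.load_dataset("mridangam_stroke")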

## Install compiam (if it is not already installed) and import it into the project
import importlib.util
if importlib.util.find_spec('compiam') is None:
    ## Bear in mind this will only run in a Jupyter notebook / Colab session
    %pip install compiam
import compiam

# Suppress warnings to keep the tutorial clean
import warnings
warnings.filterwarnings('ignore')
mridangam_stroke = compiam.load_dataset("mridangam_stroke")
mridangam_stroke.download()
mridangam_stroke.validate()

This snippet of code downloads and validates the dataset, making sure that the parsed version is canonical and not corrupted. Let’s see what a random track from the dataloader looks like.

Tip

Run compiam.list_datasets() to list the available datasets to use.
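
For instance, here is a quick, illustrative check that the Mridangam Stroke Dataset is among the available loaders, assuming list_datasets() returns a list of dataset identifiers:

## Illustrative: list the available dataset identifiers
available_datasets = compiam.list_datasets()
print(available_datasets)

## The identifier used above should appear in the list (assumption: list of strings)
assert "mridangam_stroke" in available_datasets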

## Let's get a random track from the dataset!
track = mridangam_stroke.choice_track()
track
Track(
  audio_path="...me/runner/mir_datasets/mridangam_stroke/mridangam_stroke_1.5/D#/229472__akshaylaya__tham-dsh-001.wav",
  stroke_name="tham",
  tonic="D#",
  track_id="229472",
  audio: The track's audio

        Returns,
)

By displaying a track randomly drawn from the dataset, we can see which annotations are available for us to access and use. Accessing the tonic or stroke annotation for this track is as easy as:

print(track.tonic)
D#
print(track.stroke_name)
tham

Tip

The .choice_track() method returns a random track from the dataloader.

Why mirdata loaders?#

Accessing the datasets through mirdata brings numerous advantages and allows for a more standardized and straightforward integration of these datasets into our pipelines. See:

import numpy as np

## Loading all tracks from the dataset
mridangam_tracks = mridangam_stroke.load_tracks()

## Get available strokes
available_strokes = np.unique([mridangam_tracks[x].stroke_name \
    for x in mridangam_stroke.track_ids])
available_strokes
array(['bheem', 'cha', 'dheem', 'dhin', 'num', 'ta', 'tha', 'tham', 'thi',
       'thom'], dtype='<U5')

mirdata loaders help us load and organize the data without having to write such functions ourselves. In the example below, we create a dictionary in which stroke names are keys, and each key maps to the list of audio recordings containing that stroke.

stroke_dict = {item: [] for item in available_strokes}
for i in mridangam_stroke.track_ids:
    stroke_dict[mridangam_tracks[i].stroke_name].append(mridangam_tracks[i].audio_path)

stroke_dict['bheem'][0]
'/home/runner/mir_datasets/mridangam_stroke/mridangam_stroke_1.5/B/224030__akshaylaya__bheem-b-001.wav'
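
With the data organized this way, simple per-class statistics come for free. As an illustrative example, we can count how many recordings are available for each stroke:

## Illustrative: count the number of recordings per stroke class
for stroke in available_strokes:
    print("{}: {} recordings".format(stroke, len(stroke_dict[stroke])))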

Let’s listen to one of these examples! Audio (and annotations too!) can be easily loaded from each track.

# Let's first get a random track
random_track = mridangam_stroke.choice_track()

track_id = random_track.track_id  # Getting id of track
stroke_name = random_track.stroke_name  # Getting stroke label
tonic = random_track.tonic  # Getting tonic of stroke

import IPython.display as ipd
print("Play recording of id: {}, including stroke '{}' and tonic {}"\
    .format(track_id, stroke_name, tonic))
    
ipd.Audio(
    random_track.audio[0],  # .audio returns a tuple (audio array, sampling rate)
    rate=random_track.audio[1]
)
Play recording of id: 224976, including stroke 'thi' and tonic B

The relevance of mirdata and the advantages of using its dataloaders will become apparent multiple times throughout this tutorial.

Contributing a dataloader#

As already mentioned in the compiam contributing guidelines, if you are interested in including your Indian Art Music dataset in mirdata (and subsequently in compiam), feel free to follow the mirdata contribution guidelines to get your dataset integrated into mirdata, and then open an issue in compiam so that your dataset is considered there as well.

Note

Not all datasets compiled within the CompMusic corpora have been integrated into mirdata (and therefore into compiam). You may also help us (and especially the community!) by writing dataloaders for the datasets in CompMusic that are yet to be integrated into mirdata.