Tonic identification

As mentioned in the introduction to the tambūrā drone, the sa played by the tambūrā is very important from a computational analysis standpoint: it provides a reference to locate the sa of the melody (and, from it, the rest of the svāras in the rāga), and it is also very useful for normalising melodic lines for better processing and understanding.
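
To make this concrete, here is a minimal sketch (not part of compiam; hz_to_cents is a hypothetical helper) of how a pitch track in Hz can be normalised to cents relative to the tonic, assuming unvoiced frames are marked with zeros:

import numpy as np

def hz_to_cents(pitch_hz, tonic_hz):
    ## Hypothetical helper: express a pitch track in cents relative to the tonic
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    cents = np.full_like(pitch_hz, np.nan)
    voiced = pitch_hz > 0  # zeros are assumed to mark unvoiced frames
    cents[voiced] = 1200 * np.log2(pitch_hz[voiced] / tonic_hz)
    return cents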

Several works aiming at automatically identifying the tonic from Carnatic and Hindustani recordings have been proposed [BIS12, GSS12, RAS11, SGS12]. All of these methods are knowledge-based and operate on diverse features extracted from pitch curves, which are typically extracted automatically from the music signals (see the pitch extraction walkthrough for further detail). More recently, a DL-based approach for this task has been proposed [SB21].
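
To give a rough flavour of how such knowledge-based methods operate, the sketch below accumulates a histogram over a pitch curve and picks the most salient bin within a plausible tonic range. This is an intentionally simplified illustration, not a reimplementation of any of the cited methods (which, for instance, typically rely on multi-pitch features to capture the drone):

import numpy as np

def naive_tonic_estimate(pitch_hz, fmin=100.0, fmax=250.0, bins_per_octave=120):
    ## Hypothetical example: keep voiced frames within a plausible tonic range
    candidates = pitch_hz[(pitch_hz >= fmin) & (pitch_hz <= fmax)]
    ## Accumulate a log-frequency (cents) histogram over the recording
    cents = 1200 * np.log2(candidates / fmin)
    n_bins = int(np.log2(fmax / fmin) * bins_per_octave)
    counts, edges = np.histogram(cents, bins=n_bins)
    ## Return the center of the most salient bin, converted back to Hz
    peak_cents = (edges[np.argmax(counts)] + edges[np.argmax(counts) + 1]) / 2
    return fmin * 2 ** (peak_cents / 1200)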

Note

Although the term “tonic” does not originate in Indian Art Music theory, the sa (ṣaḍja) pitch position is often referred to as the “tonic” in MIR literature, due to the similarities between the two concepts.

Let us start by installing and importing the latest released version of compiam.

## Installing (if not) and importing compiam to the project
import importlib.util
if importlib.util.find_spec('compiam') is None:
    ## Bear in mind this will only run in a Jupyter notebook / Colab session
    %pip install compiam
import compiam

# Import extras and suppress warnings to keep the tutorial clean
import os
import numpy as np
from pprint import pprint
import warnings
warnings.filterwarnings('ignore')

Important

You need to have compiam installed to execute the walkthroughs of this tutorial on your machine or in the cloud using, for instance, Google Colab. Make sure you install compiam by running: pip install compiam. You can run command-line functions from a notebook by prepending % or ! to the command, e.g. %pip install compiam.

Let’s list the available tools in compiam to perform tonic identification.

compiam.melody.tonic_identification.list_tools()
['TonicIndianMultiPitch']

Multi-pitch tonic identification

Tip

Make sure you take a look at the documentation of the tool you intend to use, in case it requires an optional dependency or has some relevant particularity.

In this case, the documentation of the tonic identification approach available in compiam shows that it requires essentia, which is an optional dependency. Let’s install it before moving on.

Note

Initializing a tool without a required optional dependency will throw an error providing the user with instructions to easily install it.
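
For illustration, such a failure could be handled programmatically as in the sketch below (the exact exception type raised by compiam may differ):

try:
    from compiam.melody.tonic_identification import TonicIndianMultiPitch
    TonicIndianMultiPitch()
except ImportError as err:
    print(err)  # the message includes instructions to install the dependency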

%pip install essentia
import essentia
# Importing the tool
from compiam.melody.tonic_identification import TonicIndianMultiPitch

# We first initialize the tool we have just imported
tonic_multipitch = TonicIndianMultiPitch()
# Let's first see the specific attributes of this tool
attributes = [x for x in dir(tonic_multipitch) if "__" not in x]
pprint(attributes)
['bin_resolution',
 'extract',
 'frame_size',
 'harmonic_weight',
 'hop_size',
 'magnitude_compression',
 'magnitude_threshold',
 'max_tonic_frequency',
 'min_tonic_frequency',
 'num_harmonics',
 'ref_frequency',
 'sample_rate']

We observe a long list of attributes for this tool. That is because this is an extractor. Within the context of this tutorial, we use this concept to refer to heuristic-based tools that extract or compute a particular representation from a music signal, in this case, the tonic of the input music recording. Heuristic-based approaches are commonly governed by a number of parameters, which can be tuned to improve performance in particular cases.
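
As a quick sanity check, we can print the current value of every parameter at once, reusing the attribute list computed above:

# Mapping each non-callable attribute to its current value
params = {
    attr: getattr(tonic_multipitch, attr)
    for attr in attributes
    if not callable(getattr(tonic_multipitch, attr))
}
pprint(params)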

# We can print out the default value of a particular parameter
tonic_multipitch.min_tonic_frequency
100

Parameters can be updated by simply assigning a new value to the corresponding attribute.

# Updating the value of a particular parameter
tonic_multipitch.min_tonic_frequency = 80
tonic_multipitch.min_tonic_frequency
80
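
If at some point you want to go back to the default configuration, you can simply re-initialize the tool:

# Re-initializing the tool restores the default parameter values
tonic_multipitch = TonicIndianMultiPitch()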

Let’s now load the CompMusic Indian Art Music Tonic dataset to evaluate the performance of this approach.

# Loading Tonic dataset using the mirdata loader
tonic_dataset = compiam.load_dataset(
    "compmusic_indian_tonic",
    data_home=os.path.join("..", "audio", "mir_datasets"),
)

Unfortunately, the audio tracks of the Indian Art Music Tonic dataset are not openly available, but only shared upon explicit request. However, that is not a problem! The workflow to get the entire dataset is as simple as this:

tonic_dataset = compiam.load_dataset("compmusic_indian_tonic")
tonic_dataset.download()
### Request audio in https://zenodo.org/record/7342372
### Download audio, unzip, and arrange folders as specified in docs
tonic_dataset.validate()
### You are ready to go!
tonic_dataset
The compmusic_indian_tonic dataset
----------------------------------------------------------------------------------------------------


Call the .cite method for bibtex citations.
----------------------------------------------------------------------------------------------------


CompMusic Tonic Dataset track class

    Args:
        track_id (str): track id of the track
        data_home (str): Local path where the dataset is stored.

    Attributes:
        track_id (str): track id
        audio_path (str): audio path

    Cached Properties:
        tonic (float): tonic annotation
        artist (str): performing artist
        gender (str): gender of the recording artists
        mbid (str): MusicBrainz ID of the piece (if available)
        type (str): type of piece (vocal, instrumental, etc.)
        tradition (str): tradition of the piece (Carnatic or Hindustani)

    ----------------------------------------------------------------------------------------------------

Otherwise, if available, you may get the audio from the Dunya database. Accessing the data in Dunya requires a unique and non-shareable access token. For that reason, we cannot provide here an interactive walkthrough of how to parse audio examples from Dunya.

Not a problem though! We list here an example code block that may be run to parse audio from Dunya, and we provide, within the tutorial materials, a couple of audio excerpts to walk you through the available tools. We have observed in the cell output above that the tracks in the Indian Art Music Tonic dataloader have an mbid attribute. Having a MusicBrainz ID at hand, we can run a code snippet like the example below to get the audio from the Dunya database.

import compiam
carnatic_corpora = compiam.load_corpora("carnatic", cc=True, token="<your-token>")

# Print out the available recordings in the database
print(carnatic_corpora.get_collection())

# Print out available data for specific track
print(carnatic_corpora.get_recording("<mbid>"))

# Download and save mp3 audio for particular track
carnatic_corpora.download_mp3("<mbid>", "<path/to/save>")
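
Combining the dataloader with the Dunya API, one could then fetch the audio for every track exposing a MusicBrainz ID, as in the sketch below (bear in mind that a Carnatic corpora object only serves Carnatic recordings):

# Illustrative sketch: download audio for all tracks with an available mbid
for track_id, track in tonic_dataset.load_tracks().items():
    if track.mbid:  # skip tracks without a MusicBrainz ID
        carnatic_corpora.download_mp3(track.mbid, "<path/to/save>")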

Important

Please keep in mind that not all recordings in Dunya can be downloaded with a regular token. You may need to request access through the Dunya website in order to get your token upgraded and gain access to the restricted recordings (see the section about accessing the Dunya corpora).

Assuming we have downloaded the audio for two examples, let’s extract the tonic from these. The two example tracks we have selected for this tutorial have the following IDs.

  • 0a6ebaa4-87cc-452d-a7af-a2006e96f16a_0-180

  • 01-varnam-nayaki

Note

Tracks in mirdata loaders have a unique ID. Sometimes, this ID may not be very intuitive. However, the idea behind the dataloader is that you can load, filter, and use the tracks programmatically, avoiding as much manual work as possible. Make sure you make the most out of the mirdata loaders. Relevant examples are given throughout this webbook.
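
For instance, instead of typing identifiers by hand, we can inspect the ones available in the loader programmatically (mirdata datasets expose a track_ids attribute):

# Peek at the identifiers available in the dataset
all_ids = tonic_dataset.track_ids
print(len(all_ids), "tracks available")
print(all_ids[:5])  # first few track IDs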

tonic_tracks = tonic_dataset.load_tracks()
track_1 = tonic_tracks["0a6ebaa4-87cc-452d-a7af-a2006e96f16a_0-180"]
track_2 = tonic_tracks["01-varnam-nayaki"]
track_1
Track(
  audio_path="...udio/mir_datasets/indian_art_music_tonic_1.0/CM/audio/0a6ebaa4-87cc-452d-a7af-a2006e96f16a_0-180.mp3",
  track_id="0a6ebaa4-87cc-452d-a7af-a2006e96f16a_0-180",
  artist: ,
  audio: The track's audio

        Returns,
  gender: ,
  mbid: ,
  tonic: ,
  tradition: ,
  type: ,
)

Let’s just print out some of the relevant metadata and annotations for this track.

print("mbid:", track_1.mbid)
print("Tonic:", track_1.tonic)
print("Artist:", track_1.artist)
mbid: 0a6ebaa4-87cc-452d-a7af-a2006e96f16a
Tonic: 131.436
Artist: T. N. Seshagopalan

As you already know, we can listen to the actual recording.

import IPython.display as ipd

# Remember: track.audio returns a tuple (audio, sr)!
audio, sr = track_1.audio
ipd.Audio(
    data=audio[:60*sr],  # Getting the first minute
    rate=sr,
)

Let’s now use the tonic identification approach in compiam to extract the tonic from these recordings. The extract method takes an audio path as input. The location of the audio file for each track is also easily parsed from the loader.

tonic_1 = tonic_multipitch.extract(track_1.audio_path)
tonic_2 = tonic_multipitch.extract(track_2.audio_path)
print("Track id:", track_1.track_id)
print("Annotated tonic:", track_1.tonic)
print("Extracted tonic:", tonic_1)
Track id: 0a6ebaa4-87cc-452d-a7af-a2006e96f16a_0-180
Annotated tonic: 131.436
Extracted tonic: 131.7633056640625
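
The estimate is very close to the annotation. To quantify how close, we can measure the deviation in cents (100 cents correspond to one equal-tempered semitone); a small sketch:

# Deviation of the estimated tonic from the annotation, in cents
def cents_deviation(reference_hz, estimated_hz):
    return 1200 * np.log2(estimated_hz / reference_hz)

print("Deviation (cents):", cents_deviation(track_1.tonic, tonic_1))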

For the second example, we synthesize a drone at the estimated tonic, aiming at evaluating the accuracy of this particular extraction by active listening.

print("Track id:", track_2.track_id)
print("Annotated tonic:", track_2.tonic)
print("Extracted tonic:", tonic_2)
Track id: 01-varnam-nayaki
Annotated tonic: 142.0
Extracted tonic: 138.8064422607422
# Let's get the audio for the track
audio, sr = track_2.audio

# Let's synthesize a simple tambura-like drone: a sine wave at the
# estimated tonic plus a few harmonics with different weights
time_axis = np.arange(0, len(audio)//sr, 1/sr)
synthesized_tambura = np.zeros_like(time_axis)
for harmonic, weight in enumerate([0.75, 0.25, 0.5, 0.125], start=1):
    synthesized_tambura += weight*np.sin(
        2*np.pi*float(tonic_2)*harmonic*time_axis
    )

# We take just a minute of music (60 sec * 44100)
audio_tonic = audio[:60*44100] + synthesized_tambura[:60*44100]
# And we play it!
ipd.Audio(
    data=audio_tonic[None],
    rate=sr,
)