Rhythm analysis
Percussion Transcription
Mnemonic Transcription
Note
REQUIRES: torch
- class compiam.rhythm.transcription.mnemonic_transcription.MnemonicTranscription(syllables, feature_kwargs={'hop_length': 256, 'n_mfcc': 13, 'win_length': 1024}, model_kwargs={'algorithm': 'viterbi', 'n_components': 7, 'n_iter': 100, 'n_mix': 3, 'params': 'mcw'}, sr=44100)[source]
Bōl or solkattu transcription from audio. Based on the model presented in [1].
[1] Gupta, S., Srinivasamurthy, A., Kumar, M., Murthy, H., & Serra, X. (2015, October). Discovery of Syllabic Percussion Patterns in Tabla Solo Recordings. In Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015) (pp. 385–391). Malaga, Spain.
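Example (a minimal instantiation sketch; the syllable vocabulary below is a hypothetical placeholder, so substitute the bōl/solkattu labels used in your annotations):
from compiam.rhythm.transcription.mnemonic_transcription import MnemonicTranscription

# Hypothetical bōl vocabulary; replace with the syllables in your annotations
syllables = ["dha", "dhin", "ta", "tin", "na"]
mt = MnemonicTranscription(syllables, sr=44100)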
- extract_features(audio, sr=None)[source]
Convert input audio to MFCC features
- Parameters:
audio (np.array) – time series representation of audio
sr (int) – sampling rate of <audio> (default <self.sr>)
- Returns:
array of features
- Return type:
np.array
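Example (a usage sketch, assuming an instance <mt> as above; the file path is a hypothetical placeholder):
import librosa

# Load a recording; any mono audio loadable by librosa works
audio, sr = librosa.load("tabla_solo.wav", sr=44100)
features = mt.extract_features(audio, sr=sr)  # array of MFCC features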
- get_sample_ix(annotations, audio, syl)[source]
Convert input onset annotations to a list of in/out points for a specific bōl/solkattu syllable, <syl>
- Parameters:
annotations (list/iterable) – onset annotations of the form [(timestamp in seconds, bōl/solkattu),… ]
audio (np.array) – time series representation of audio
syl (str) – bōl/solkattu syllable to extract
- Returns:
list of [(t1, t2), …] where t1 and t2 correspond to the in and out points of single bōls/solkattus
- Return type:
list
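Example (a sketch assuming <audio> as loaded above and <annotations> loaded via load_annotations, documented below; the syllable "dha" is a hypothetical label):
samples = mt.get_sample_ix(annotations, audio, "dha")  # [(t1, t2), ...]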
- load_annotations(annotation_path)[source]
Load onset annotations from <annotation_path>
- Parameters:
annotation_path (str) – path to onset annotations for one recording of the form (timestamp in seconds, bōl/solkattu syllable)
- Returns:
list of onset annotations (timestamp in seconds, bōl/solkattu syllable)
- Return type:
list
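Example (the file path and row values are hypothetical placeholders):
# Expected file contents: headerless CSV of (timestamp in seconds, syllable), e.g.
#   0.51,dha
#   0.87,dhin
annotations = mt.load_annotations("solo_01.csv")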
- map(a)[source]
Map input bōl/solkattu syllable <a> to the reduced bōl/solkattu vocabulary
- Parameters:
a (str) – bōl/solkattu string (must exist in self.mapping)
- Returns:
mapped bōl/solkattu label
- Return type:
str
- predict(file_paths, onsets=None, sr=None)[source]
Predict bōl/solkattu transcription for a list of input audios at <file_paths>.
- Parameters:
file_paths (list or str) – either one file path or a list of file paths to audios to predict on
onsets (list or None) – list representing onsets in the audios. If None, compiam.rhythm.akshara_pulse_tracker is used to automatically identify bōl/solkattu onsets. If passed, it should be a list of onset annotations, each being a list of bōl/solkattu onsets in seconds; <onsets> should contain one set of onset annotations for each file path in <file_paths>
sr (int) – sampling rate of the input audio (default <self.sr>)
- Returns:
if <file_paths> is a list, a list of transcriptions is returned, each of the form [(timestamp in seconds, bōl/solkattu), …]; if <file_paths> is a single file path string, a single transcription is returned
- Return type:
list
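Example (a batch-prediction sketch on a trained instance; file paths are hypothetical placeholders):
# With onsets=None, onsets are detected automatically
transcriptions = mt.predict(["solo_01.wav", "solo_02.wav"])
for timestamp, syllable in transcriptions[0]:
    print(timestamp, syllable)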
- predict_sample(sample)[source]
Predict one sample using the internal models. One sample should correspond to one bōl/solkattu.
- Parameters:
sample (np.array) – numpy array of features corresponding to <sample> (extracted using self.extract_features)
- Returns:
bōl/solkattu label
- Return type:
str
- predict_single(file_path, onsets=None, sr=None)[source]
Predict bōl/solkattu transcription for a single audio file at <file_path>.
- Parameters:
file_path (str) – File path to audio to analyze
onsets (list or None) – if None, compiam.rhythm.akshara_pulse_tracker is used to automatically identify bōl/solkattu onsets. If passed, <onsets> should be a list of bōl/solkattu onsets in seconds
sr (int) – sampling rate of the input audio (default <self.sr>)
- Returns:
bōl/solkattu transcription of the form [(time in seconds, syllable), …]
- Return type:
list
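Example (a single-file sketch with manually supplied onsets; the path and onset times are hypothetical placeholders):
transcription = mt.predict_single("solo_01.wav", onsets=[0.51, 0.87, 1.24])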
- save(model_path)[source]
Save model to <model_path> as a .pkl file
- Parameters:
model_path (str) – path to save the model to
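Example (the output path is a hypothetical placeholder):
mt.save("models/bol_transcriber.pkl")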
- train(file_paths_audio, file_paths_annotation, sr=None)[source]
Train one Gaussian mixture model hidden Markov model (GMM-HMM) for each syllable passed at initialisation, using the audios and annotations passed via <file_paths_audio> and <file_paths_annotation>. Training hyperparameters are configured upon initialisation and can be accessed/changed via self.model_kwargs.
- Parameters:
file_paths_audio (list) – List of file_paths to audios to train on
file_paths_annotation (list) – list of file paths to annotations to train on. Annotations should be in CSV format with no header, each row of the form (timestamp in seconds, <syllable>). Annotated syllables that do not correspond to syllables passed at initialisation are ignored. One annotation path should be passed for each audio path
sr (int) – sampling rate of audio to train on (default <self.sr>)
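Example (a training sketch; the paths are hypothetical placeholders, with one annotation CSV per audio file, aligned by index):
audio_paths = ["solo_01.wav", "solo_02.wav"]
annotation_paths = ["solo_01.csv", "solo_02.csv"]
mt.train(audio_paths, annotation_paths, sr=44100)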
Meter tracking
Akshara Pulse Tracker
- class compiam.rhythm.meter.akshara_pulse_tracker.AksharaPulseTracker(Nfft=4096, frmSize=1024, Fs=44100, hop=512, fBands=array([[10, 110], [110, 500], [500, 3000], [3000, 5000], [5000, 10000], [0, 22000]]), songLenMin=600, octCorrectParam=0.25, tempoWindow=8, stepSizeTempogram=0.5, BPM=array([40., 40.5, 41., ..., 599., 599.5, 600.]), minBPM=120, octTol=20, theta=0.005, delta=1000000, maxLen=0.6, binWidth=0.01, thres=0.05, ignoreTooClose=0.6, decayCoeff=15, backSearch=[5.0, 0.5], alphaDP=3, smoothTime=2560, pwtol=0.2)[source]
Akshara onset detection. CompMusic Rhythm Extractor.
- extract(input_data, input_sr=44100, verbose=True)[source]
Run extraction of akshara pulses from an input audio file or signal
- Parameters:
input_data (str or np.array) – path to an audio file or a numpy array containing the audio signal
input_sr (int) – sampling rate of the input array (only relevant if <input_data> is an array rather than a file path; default 44100)
verbose (bool) – whether to print progress information
- Returns:
dict containing estimations for sections, matra period, akshara pulses, and tempo curve
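Example (a usage sketch; the file path is a hypothetical placeholder):
from compiam.rhythm.meter.akshara_pulse_tracker import AksharaPulseTracker

apt = AksharaPulseTracker()
pulses = apt.extract("tabla_solo.wav")
# pulses is a dict with estimated sections, matra period, akshara pulses,
# and tempo curve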