About the Classifier
The classifier is a convolutional neural network trained on over 10,000 hours of Carnatic audio sourced from this incredible YouTube collection.
Key Features:
- Can identify the 150 ragas most commonly found on YouTube
- Does not require any information about the shruthi (tonic pitch) of the recording
- Compatible with male/female vocal or instrumental recordings
For those who are interested, the inference code and model checkpoints are available under the 'Files' tab in the header.
Interpreting the Classifier:
We can gain an intuitive sense of what the classifier has learned. Here is a t-SNE projection of the hidden activations averaged per ragam. Each point is a ragam, and the relative distances between points indicate how similar the classifier considers the ragas to be. Each ragam is color-coded by the melakartha chakra it belongs to. We observe that the classifier has learned a representation that roughly corresponds to these chakras!
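For readers curious how a plot like this might be prepared, here is a minimal sketch of the "averaged per ragam" step. It is not the code used for this Space: the function name `mean_activation_per_ragam`, the toy 2-dimensional activations, and the two example labels are all assumptions made for illustration.

```python
from collections import defaultdict

def mean_activation_per_ragam(activations, labels):
    """Average hidden-activation vectors over all clips sharing a ragam label.

    Hypothetical helper: `activations` is a list of equal-length float lists,
    `labels` is a parallel list of ragam names.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, ragam in zip(activations, labels):
        if ragam not in sums:
            sums[ragam] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[ragam][i] += value
        counts[ragam] += 1
    # Divide each accumulated sum by the number of clips for that ragam.
    return {r: [s / counts[r] for s in total] for r, total in sums.items()}

# Toy data (made up): 4 clips with 2-dimensional "hidden activations".
activations = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [2.0, 6.0]]
labels = ["Kalyani", "Kalyani", "Todi", "Todi"]
centroids = mean_activation_per_ragam(activations, labels)
```

The resulting one-vector-per-ragam table could then be handed to any t-SNE implementation, for example scikit-learn's `sklearn.manifold.TSNE` with `n_components=2`, to obtain 2-D coordinates for plotting.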