---
title: Frugal AI Challenge Submission
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---
# π Audio classification
## Strategy for solving the problem
To minimize energy consumption, we deliberately chose not to use deep learning techniques such as CNN-based spectrogram analysis, LSTMs on raw audio signals, or transformer models, which are generally more computationally intensive.
Instead, a more lightweight approach was adopted:
- Feature extraction from the audio signal (MFCCs and spectral contrast)
- Training a simple machine learning model (decision tree) on these extracted features
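A minimal sketch of this pipeline, assuming librosa for feature extraction and scikit-learn for the decision tree; `audio_paths` and `labels` are hypothetical placeholders, and the actual training code lives in the notebook referenced below:

```python
import librosa
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def extract_features(path: str) -> np.ndarray:
    # librosa resamples to 22,050 Hz mono by default
    y, sr = librosa.load(path)
    # Average MFCCs and spectral contrast over time for a fixed-size vector
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr).mean(axis=1)
    return np.concatenate([mfccs, contrast])

# audio_paths and labels are hypothetical; substitute the challenge data
X = np.stack([extract_features(p) for p in audio_paths])
clf = DecisionTreeClassifier(random_state=0).fit(X, labels)
```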
## Potential Improvements (Not Yet Tested)
- Hyperparameter tuning for better performance
- Exploring alternative lightweight ML models, such as logistic regression or k-nearest neighbors
- Feature extraction without librosa, using NumPy directly to compute basic signal properties, further reducing dependencies and overhead
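As a rough illustration of that last point, a NumPy-only extractor might compute simple time- and frequency-domain statistics. This is an untested sketch, not part of the current submission:

```python
import numpy as np

def basic_features(y: np.ndarray, sr: int) -> np.ndarray:
    # Time-domain statistics: RMS energy and zero-crossing rate
    rms = np.sqrt(np.mean(y ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(y)))) / 2
    # Frequency-domain statistic: spectral centroid of the magnitude spectrum
    mag = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=1.0 / sr)
    centroid = np.sum(freqs * mag) / np.sum(mag)
    return np.array([rms, zcr, centroid])
```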
The model is exported from the notebook `notebooks/Audio_Challenge.ipynb` and saved as `model_audio.pkl`.
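Assuming the `.pkl` file was written with Python's pickle module (the exact serialization is defined in the notebook), loading and using the exported model might look like:

```python
import pickle

# Load the exported decision tree and classify one clip;
# extract_features is the same routine used at training time
with open("model_audio.pkl", "rb") as f:
    model = pickle.load(f)
prediction = model.predict(extract_features("sample.wav").reshape(1, -1))
```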
# π Text classification
## Evaluate locally
To evaluate the model locally, you can use the following command:

```bash
python main.py --config config_evaluation_{model_name}.json
```

where `{model_name}` is either `distilBERT` or `embeddingML`.
## Models Description
### DistilBERT Model
The model uses the `distilbert-base-uncased` model from the Hugging Face Transformers library, fine-tuned on the training dataset (see below).
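A condensed sketch of such fine-tuning with the Transformers `Trainer` API; the label count, training arguments, and `train_dataset` below are illustrative assumptions, not the submission's actual configuration:

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# The dataset must be tokenized with this tokenizer before training
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=8,  # assumed label count, set to match the dataset
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="checkpoints", num_train_epochs=3),
    train_dataset=train_dataset,  # assumed pre-tokenized training split
)
trainer.train()
```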
### Embedding + ML Model
The model uses a lightweight embedding step followed by a classic ML classifier. Currently, the embedding is a simple TF-IDF vectorizer and the classifier is a logistic regression.
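Such a pairing can be expressed as a scikit-learn pipeline; the training variables below are hypothetical stand-ins for the challenge data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# TF-IDF "embedding" followed by a logistic regression classifier
model = make_pipeline(
    TfidfVectorizer(),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)  # hypothetical training data
predictions = model.predict(["an example quote to classify"])
```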