---
title: Frugal AI Challenge Submission
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---

## 🔊 Audio classification

### Strategy for solving the problem

To minimize energy consumption, we deliberately **chose not to use deep learning techniques** such as CNN-based spectrogram analysis, LSTMs on raw audio signals, or transformer models, which are generally **more computationally intensive**.

Instead, a more **lightweight approach** was adopted:
- Feature extraction from the audio signal (MFCCs and spectral contrast)
- Training a simple machine learning model (decision tree) on these extracted features

### Potential improvements (not yet tested)
- Hyperparameter tuning for better performance
- Exploring alternative lightweight ML models, such as logistic regression or k-nearest neighbors
- Feature extraction without Librosa, using NumPy directly to compute basic signal properties, further reducing dependencies and overhead.
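
As a rough illustration of the NumPy-only idea in the last bullet, the sketch below computes three basic per-clip properties: RMS energy, zero-crossing rate, and spectral centroid. The function name `basic_features`, the frame sizes, and the choice of these three features are assumptions made for illustration; none of this has been tested in the challenge, and it is not a drop-in replacement for MFCCs or spectral contrast.

```python
import numpy as np


def basic_features(y, sr, frame_length=1024, hop_length=512):
    """Return [mean RMS, mean zero-crossing rate, mean spectral centroid (Hz)].

    Hypothetical helper: frame sizes and feature choice are illustrative only.
    """
    y = np.asarray(y, dtype=np.float64)
    if len(y) < frame_length:
        # Zero-pad very short clips so that at least one frame exists.
        y = np.pad(y, (0, frame_length - len(y)))

    # Slice the signal into overlapping frames: shape (n_frames, frame_length).
    n_frames = 1 + (len(y) - frame_length) // hop_length
    idx = np.arange(frame_length)[None, :] + hop_length * np.arange(n_frames)[:, None]
    frames = y[idx]

    # Root-mean-square energy per frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # Fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    # Magnitude-weighted mean frequency of each Hann-windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frame_length), axis=1))
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sr)
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

    return np.array([rms.mean(), zcr.mean(), centroid.mean()])


if __name__ == "__main__":
    sr = 12000  # assumed sample rate for this demo only
    t = np.arange(sr) / sr
    tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz sine
    print(basic_features(tone, sr))
```

Each clip is reduced to a short fixed-length vector, so the result can be fed to the same kind of lightweight classifier as the Librosa-based features.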

The model is exported from the notebook `notebooks\Audio_Challenge.ipynb` and saved as `model_audio.pkl`.
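
For readers without the notebook at hand, here is a minimal sketch of what training and exporting such a model might look like. It assumes scikit-learn's `DecisionTreeClassifier` and the standard `pickle` module; the random feature matrix, the 20-dimensional feature size, the labels, and `max_depth=5` are placeholders, not the values used in the notebook.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 200 clips x 20 features (e.g. 13 MFCC means plus
# 7 spectral-contrast means). Real features come from the audio files.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic binary labels

# A shallow tree keeps both training and inference cheap.
model = DecisionTreeClassifier(max_depth=5, random_state=0)
model.fit(X, y)

# Written to a temporary directory here; the repository stores the
# file as model_audio.pkl.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "model_audio.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Reload to check that the exported file is usable for inference.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.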

## 📚 Text classification

### Evaluate locally

To evaluate a model locally, run the following command:

```bash
python main.py --config config_evaluation_{model_name}.json
```

where `{model_name}` is either `distilBERT` or `embeddingML`.


### Models Description

#### DistilBERT Model

The model uses the `distilbert-base-uncased` model from the Hugging Face Transformers library, fine-tuned on the 
training dataset (see below).

#### Embedding + ML Model

The model turns each text into a feature vector and then applies a classic ML classifier. Currently, the vectorizer is a
simple TF-IDF vectorizer, and the classifier is a logistic regression.
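
A TF-IDF plus logistic regression model of this kind can be sketched with scikit-learn as follows. This is an assumption about how the two pieces might be wired together, not the repository's actual code; the toy texts, the binary labels, and `max_iter=1000` are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus with made-up binary labels, for illustration only.
texts = [
    "global temperatures have risen over the last century",
    "sea levels are rising because of melting ice",
    "carbon dioxide traps heat in the atmosphere",
    "the football match ended in a draw",
    "the new phone has a better camera",
    "the recipe needs two eggs and some flour",
]
labels = [1, 1, 1, 0, 0, 0]

# The vectorizer maps text to sparse TF-IDF features; the classifier is
# fitted on those features. A pipeline keeps both steps together.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

predictions = clf.predict(["ice sheets are melting faster"])
probabilities = clf.predict_proba(texts)
```

Wrapping both steps in one pipeline means the same fitted vocabulary is reused at inference time, and the whole object can be saved as a single file.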