---
title: Frugal AI Challenge Submission
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
---
# Models for Climate Disinformation Classification
## Evaluate locally
To evaluate a model locally, run:
```bash
python main.py --config config_evaluation_{model_name}.json
```
where `{model_name}` is either `distilBERT` or `embeddingML`.
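As a small illustration of how the placeholder expands, the sketch below builds the command line for each of the two model names. The helper `evaluation_command` is hypothetical and not part of this repository; it only assembles the arguments shown above and does not run anything.

```python
import shlex

MODEL_NAMES = ("distilBERT", "embeddingML")  # the two values named above


def evaluation_command(model_name):
    """Build the evaluation command for one model (hypothetical helper)."""
    if model_name not in MODEL_NAMES:
        raise ValueError(f"unknown model name: {model_name!r}")
    config_file = f"config_evaluation_{model_name}.json"
    return ["python", "main.py", "--config", config_file]


for name in MODEL_NAMES:
    # Print each command as a single shell-ready string.
    print(shlex.join(evaluation_command(name)))
```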
## Models Description
### DistilBERT Model
This model is `distilbert-base-uncased` from the Hugging Face Transformers library, fine-tuned on the
training dataset (see below).
### Embedding + ML Model
This model feeds a simple text embedding into a classic ML classifier. Currently, the embedding is a
TF-IDF vectorizer and the classifier is a logistic regression.
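To make the TF-IDF step concrete, here is a minimal pure-Python sketch of the textbook weighting (term frequency times the log of inverse document frequency). It is an illustration only, not this repository's implementation: library vectorizers such as scikit-learn's `TfidfVectorizer` apply IDF smoothing and normalization by default, so their numbers will differ. The function name `tfidf` and the example sentences are made up, and the logistic regression step is not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (textbook TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # tf = relative frequency in this document, idf = log(N / df).
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sentences, for illustration only.
docs = ["the climate is changing", "the climate is fine", "fossil fuels are needed"]
for vec in tfidf(docs):
    print(vec)
```

Terms that appear in fewer documents receive larger weights, which is what lets a linear classifier such as logistic regression pick up on category-specific vocabulary.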
## Training Data
Both models use the [`QuotaClimat/frugalaichallenge-text-train`](https://huggingface.co/datasets/QuotaClimat/frugalaichallenge-text-train) dataset:
- Size: ~6000 examples
- Split: 80% train, 20% test
- 8 labels: 7 categories of climate disinformation claims, plus one label for text with no relevant claim
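The 80/20 split above can be sketched as a shuffled index split. This is a minimal sketch under assumptions: the seed, the shuffling, and the function name `train_test_split_indices` are illustrative, and the README does not say whether the actual split is stratified by label.

```python
import random


def train_test_split_indices(n_examples, test_fraction=0.2, seed=0):
    """Shuffle example indices and split them into (train, test) lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_examples * test_fraction)
    return indices[n_test:], indices[:n_test]


# With roughly 6000 examples, an 80/20 split gives 4800 train and 1200 test.
train_idx, test_idx = train_test_split_indices(6000)
print(len(train_idx), len(test_idx))  # 4800 1200
```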
### Labels
0. No relevant claim detected
1. Global warming is not happening
2. Not caused by humans
3. Not bad or beneficial
4. Solutions harmful/unnecessary
5. Science is unreliable
6. Proponents are biased
7. Fossil fuels are needed
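For reference, the label list above can be written as a pair of lookup tables. The names `ID2LABEL` and `LABEL2ID` are illustrative, and the strings are copied from this README; the dataset's own label identifiers may be spelled differently.

```python
# Label ids and descriptions as listed in this README.
ID2LABEL = {
    0: "No relevant claim detected",
    1: "Global warming is not happening",
    2: "Not caused by humans",
    3: "Not bad or beneficial",
    4: "Solutions harmful/unnecessary",
    5: "Science is unreliable",
    6: "Proponents are biased",
    7: "Fossil fuels are needed",
}

# Reverse mapping, useful when converting text labels to integer targets.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(ID2LABEL[5])
```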
## Performance
### Metrics
- **Accuracy**: ~12.5% (random chance with 8 classes)
- **Environmental Impact**:
  - Emissions tracked in gCO2eq
  - Energy consumption tracked in Wh
### Model Architecture
The baseline model picks one of the 8 possible labels at random, ignoring the input text, and serves as the simplest possible reference point.
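The ~12.5% figure follows from picking among 8 labels with equal probability: each guess is correct with probability 1/8. The simulation below is a minimal sketch of such a random baseline; it assumes uniform sampling, and the labels are synthetic, not drawn from the dataset.

```python
import random

N_LABELS = 8


def random_baseline_accuracy(true_labels, seed=0):
    """Accuracy of uniform random predictions against the given labels."""
    rng = random.Random(seed)
    predictions = [rng.randrange(N_LABELS) for _ in true_labels]
    correct = sum(p == t for p, t in zip(predictions, true_labels))
    return correct / len(true_labels)


# Synthetic labels, only to illustrate the expected value of 1/8 = 0.125.
label_rng = random.Random(1)
true_labels = [label_rng.randrange(N_LABELS) for _ in range(20000)]
print(f"accuracy = {random_baseline_accuracy(true_labels):.3f}")
```

Because the predictions are uniform, the expected accuracy stays at 1/8 whatever the class balance of the true labels.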
## Environmental Impact
Environmental impact is tracked using CodeCarbon, measuring:
- Carbon emissions during inference
- Energy consumption during inference
This tracking helps establish a baseline for the environmental impact of model deployment and inference.
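A minimal sketch of how such tracking might look, assuming the third-party `codecarbon` package is installed. CodeCarbon reports emissions in kg CO2eq and energy in kWh, so values are multiplied by 1000 to get the gCO2eq and Wh units used in this README. The helper names (`kg_to_g`, `kwh_to_wh`, `track_emissions`) are hypothetical, and this is not necessarily how this repository wires the tracker.

```python
def kg_to_g(kg_co2eq):
    """Convert kg CO2eq (CodeCarbon's unit) to gCO2eq."""
    return kg_co2eq * 1000.0


def kwh_to_wh(kwh):
    """Convert kWh (CodeCarbon's unit) to Wh."""
    return kwh * 1000.0


def track_emissions(run_inference):
    """Run `run_inference()` under a tracker; return (result, gCO2eq)."""
    # Imported lazily so the conversion helpers work without codecarbon.
    from codecarbon import EmissionsTracker

    tracker = EmissionsTracker()
    tracker.start()
    try:
        result = run_inference()
    finally:
        emissions_kg = tracker.stop()  # kg CO2eq for the tracked span
    return result, kg_to_g(emissions_kg)
```

A call could look like `predictions, grams = track_emissions(lambda: model.predict(texts))`, where `model` and `texts` are placeholders for the classifier and the evaluation inputs.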
## Limitations
- Makes completely random predictions
- No learning or pattern recognition
- No consideration of input text
- Serves only as a baseline reference
- Not suitable for any real-world applications
## Ethical Considerations
- Dataset contains sensitive topics related to climate disinformation
- Model makes random predictions and should not be used for actual classification
- Environmental impact is tracked to promote awareness of AI's carbon footprint