title: Frugal AI Challenge Submission
emoji: 🌍
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
Models for Climate Disinformation Classification
Evaluate locally
To evaluate the model locally, you can use the following command:
python main.py --config config_evaluation_{model_name}.json
where {model_name} is either distilBERT or embeddingML.
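A minimal sketch of how the command above could be assembled for either model. The helper name `build_eval_command` is hypothetical; only `main.py`, the `--config` flag, and the config file naming pattern come from the text above.

```python
import sys

# Model names listed above; the config file name is derived from them.
MODEL_NAMES = ("distilBERT", "embeddingML")


def build_eval_command(model_name: str) -> list[str]:
    """Return the evaluation command for one model as an argument list."""
    if model_name not in MODEL_NAMES:
        raise ValueError(f"unknown model {model_name!r}; expected one of {MODEL_NAMES}")
    config_path = f"config_evaluation_{model_name}.json"
    # sys.executable is used instead of a bare "python" so the same interpreter runs main.py.
    return [sys.executable, "main.py", "--config", config_path]


if __name__ == "__main__":
    for name in MODEL_NAMES:
        print(" ".join(build_eval_command(name)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the evaluation.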
Models Description
DistilBERT Model
The model uses the distilbert-base-uncased model from the Hugging Face Transformers library, fine-tuned on the training dataset (see below).
Embedding + ML Model
The model uses a simple embedding layer followed by a classic ML model. Currently, the embedding layer is a simple TF-IDF vectorizer, and the ML model is a logistic regression.
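The TF-IDF plus logistic regression combination described above can be sketched with scikit-learn as follows. This is an illustration, not the repository's actual code: the toy texts are invented, only two of the eight labels are used, and the hyperparameters (defaults plus `max_iter=1000`) are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy examples, invented for illustration only (not taken from the dataset).
texts = [
    "The planet has not warmed in decades.",
    "Temperatures are not rising at all.",
    "Climate models are unreliable and always wrong.",
    "Scientists cannot be trusted on climate data.",
]
labels = [
    "Global warming is not happening",
    "Global warming is not happening",
    "Science is unreliable",
    "Science is unreliable",
]

# TF-IDF turns each text into a sparse weighted bag-of-words vector;
# logistic regression then learns a linear classifier on those vectors.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)

predictions = model.predict(["Warming has stopped.", "The models are wrong."])
print(list(predictions))
```

Wrapping both steps in one pipeline keeps the vectorizer's vocabulary tied to the training data, so the same transformation is applied at inference time.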
Training Data
The model uses the QuotaClimat/frugalaichallenge-text-train dataset:
- Size: ~6000 examples
- Split: 80% train, 20% test
- 8 categories of climate disinformation claims
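As a rough illustration of the 80/20 split, the sketch below shuffles example indices with a fixed seed and cuts them into train and test parts. The function name, the seed, and the round figure of 6000 examples are assumptions; the text does not say how the actual split is produced.

```python
import random


def train_test_split_indices(n_examples: int, test_fraction: float = 0.2, seed: int = 42):
    """Shuffle indices 0..n-1 and split them into (train, test) index lists."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_test = int(round(n_examples * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(6000)
print(len(train_idx), len(test_idx))  # 4800 1200
```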
Labels
- No relevant claim detected
- Global warming is not happening
- Not caused by humans
- Not bad or beneficial
- Solutions harmful/unnecessary
- Science is unreliable
- Proponents are biased
- Fossil fuels are needed
Performance
Metrics
- Accuracy: ~12.5% (random chance with 8 classes)
- Environmental Impact:
  - Emissions tracked in gCO2eq
  - Energy consumption tracked in Wh
Model Architecture
The model implements a random choice between the 8 possible labels, serving as the simplest possible baseline.
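A random baseline of this kind can be sketched in a few lines. The function names and the integer label encoding (the position of each label in the list) are assumptions made for illustration.

```python
import random

LABELS = [
    "No relevant claim detected",
    "Global warming is not happening",
    "Not caused by humans",
    "Not bad or beneficial",
    "Solutions harmful/unnecessary",
    "Science is unreliable",
    "Proponents are biased",
    "Fossil fuels are needed",
]


def predict_random(texts, seed=None):
    """Return one uniformly random label index per input; the text itself is ignored."""
    rng = random.Random(seed)
    return [rng.randrange(len(LABELS)) for _ in texts]


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# With uniform random predictions, each guess is correct with probability 1/8.
expected_accuracy = 1 / len(LABELS)  # 0.125
```

Because predictions are drawn uniformly, the expected accuracy is 1/8 = 12.5% regardless of how the true labels are distributed across the 8 classes.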
Environmental Impact
Environmental impact is tracked using CodeCarbon, measuring:
- Carbon emissions during inference
- Energy consumption during inference
This tracking helps establish a baseline for the environmental impact of model deployment and inference.
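CodeCarbon reports emissions in kg CO2eq and energy in kWh, whereas the metrics in this card are given in gCO2eq and Wh. The sketch below shows only that unit conversion. The function names and dictionary keys are hypothetical; in a real run the two inputs would come from CodeCarbon's `EmissionsTracker`, started before inference and stopped after it.

```python
def kg_to_g_co2eq(emissions_kg: float) -> float:
    """Convert kilograms of CO2-equivalent to grams."""
    return emissions_kg * 1000.0


def kwh_to_wh(energy_kwh: float) -> float:
    """Convert kilowatt-hours to watt-hours."""
    return energy_kwh * 1000.0


def summarize_impact(emissions_kg: float, energy_kwh: float) -> dict:
    """Bundle both measurements in the units used by this card."""
    return {
        "emissions_gco2eq": kg_to_g_co2eq(emissions_kg),
        "energy_consumed_wh": kwh_to_wh(energy_kwh),
    }


if __name__ == "__main__":
    # Placeholder inputs for illustration, not measured values.
    print(summarize_impact(emissions_kg=0.5, energy_kwh=0.25))
```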
Limitations
- Makes completely random predictions
- No learning or pattern recognition
- No consideration of input text
- Serves only as a baseline reference
- Not suitable for any real-world applications
Ethical Considerations
- Dataset contains sensitive topics related to climate disinformation
- Model makes random predictions and should not be used for actual classification
- Environmental impact is tracked to promote awareness of AI's carbon footprint