Upload README.md
tasks/README.md +71 -0
ADDED
@@ -0,0 +1,71 @@
---
title: Submission Template
emoji: 🔥
colorFrom: yellow
colorTo: green
sdk: docker
pinned: false
---

# Logistic Regression Model for Climate Disinformation Classification

## Model Description

This is a logistic regression baseline model for the Frugal AI Challenge 2024, specifically for the text classification task of identifying climate disinformation. The model serves as a performance floor against which other submissions can be compared.

### Intended Use

- **Primary intended uses**: Baseline comparison for climate disinformation classification models
- **Primary intended users**: Researchers and developers participating in the Frugal AI Challenge
- **Out-of-scope use cases**: Not intended for production use or real-world classification tasks

## Training Data

The model uses the `QuotaClimat/frugalaichallenge-text-train` dataset:
- Size: ~6000 examples
- Split: 80% train, 20% test
- 8 labels: 7 categories of climate disinformation claims plus a "no relevant claim" class
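
As a rough illustration of the 80/20 split listed above, the following sketch partitions a list of examples after a seeded shuffle. The function name `split_examples`, the seed, and the use of shuffling are assumptions made for illustration; the card states only the ratio and the approximate dataset size.

```python
import random


def split_examples(examples, test_fraction=0.2, seed=42):
    """Shuffle a copy of `examples` and split it into (train, test).

    Hypothetical helper: the seed and the shuffling strategy are
    assumptions, not taken from the challenge code.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Integers stand in for roughly 6000 dataset rows.
train, test = split_examples(range(6000))
print(len(train), len(test))  # 4800 1200
```

Fixing the seed keeps the split reproducible between runs, which matters when the held-out 20% is used to compare models.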

### Labels
0. No relevant claim detected
1. Global warming is not happening
2. Not caused by humans
3. Not bad or beneficial
4. Solutions harmful/unnecessary
5. Science is unreliable
6. Proponents are biased
7. Fossil fuels are needed
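
For reference, the label list above can be expressed as a simple lookup table. This is only a sketch: the names `LABELS` and `label_name` are hypothetical, and the label strings stored in the dataset itself may be formatted differently.

```python
# Label ids and names as listed in this card. The dataset's own label
# strings may differ in format (assumption).
LABELS = {
    0: "No relevant claim detected",
    1: "Global warming is not happening",
    2: "Not caused by humans",
    3: "Not bad or beneficial",
    4: "Solutions harmful/unnecessary",
    5: "Science is unreliable",
    6: "Proponents are biased",
    7: "Fossil fuels are needed",
}


def label_name(label_id):
    """Return the readable name for a label id (KeyError if unknown)."""
    return LABELS[label_id]


print(label_name(5))  # Science is unreliable
```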

## Performance

### Metrics
- **Accuracy**: ~63.5%
- **Environmental Impact**:
  - Emissions tracked in gCO2eq
  - Energy consumption tracked in Wh
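
Accuracy here is presumably the fraction of examples whose predicted label matches the reference label. A minimal sketch of that computation follows; the function name and the sample label ids are illustrative, not taken from the challenge code.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of an empty evaluation set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical label ids for four examples; three of the four match.
print(accuracy([0, 1, 2, 3], [0, 1, 2, 7]))  # 0.75
```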

### Model Architecture
The model applies logistic regression to vectorized input text to predict one of the 8 labels, serving as a simple baseline.
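
Since this card names logistic regression over vectorized text as the method, the sketch below shows one way such a classifier can work end to end. It is not the submission's code. The card does not say which vectorizer or library is used, so this example assumes bag-of-words counts and a multinomial logistic regression trained with plain SGD, written in dependency-free Python. The toy texts and every function name are invented for illustration.

```python
import math
from collections import Counter


def vectorize(text, vocab):
    """Bag-of-words counts of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def class_scores(x, weights, biases):
    """Linear score per class: w_k . x + b_k."""
    return [sum(w * v for w, v in zip(w_k, x)) + b_k
            for w_k, b_k in zip(weights, biases)]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train(texts, labels, n_classes=8, epochs=200, lr=0.1):
    """Fit multinomial logistic regression with plain SGD on cross-entropy."""
    vocab = sorted({word for t in texts for word in t.lower().split()})
    rows = [vectorize(t, vocab) for t in texts]
    weights = [[0.0] * len(vocab) for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            probs = softmax(class_scores(x, weights, biases))
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. the class-k score.
                grad = probs[k] - (1.0 if k == y else 0.0)
                biases[k] -= lr * grad
                for j, v in enumerate(x):
                    weights[k][j] -= lr * grad * v
    return vocab, weights, biases


def predict(text, vocab, weights, biases):
    """Return the label id with the highest score."""
    scores = class_scores(vectorize(text, vocab), weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data, invented for illustration only.
texts = ["the climate is not warming",
         "humans do not cause warming",
         "weather report for today"]
labels = [1, 2, 0]
model = train(texts, labels)
print([predict(t, *model) for t in texts])  # [1, 2, 0]
```

In practice a library implementation would normally replace the hand-written training loop. The loop is spelled out here only to show that the model is a linear scorer per label followed by a softmax.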

## Environmental Impact

Environmental impact is tracked using CodeCarbon, measuring:
- Carbon emissions during inference
- Energy consumption during inference

This tracking helps establish a baseline for the environmental impact of model deployment and inference.
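
The Metrics section reports emissions in gCO2eq and energy in Wh. The sketch below converts tracker readings into those units. It assumes the tracker reports emissions in kg CO2eq and energy in kWh, which should be checked against the documentation of the installed CodeCarbon version. The helper name and dictionary keys are hypothetical.

```python
def to_reporting_units(emissions_kg, energy_kwh):
    """Convert kg CO2eq to gCO2eq and kWh to Wh.

    Assumption: the inputs are in kg CO2eq and kWh. Verify the units of
    your tracker before relying on this conversion.
    """
    return {
        "emissions_gco2eq": emissions_kg * 1000.0,
        "energy_consumed_wh": energy_kwh * 1000.0,
    }


# Made-up readings, used only to show the conversion.
print(to_reporting_units(0.5, 0.25))
```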

## Limitations
- Makes logistic regression predictions
- Limited pattern recognition: a linear model over vectorized features
- Input text is vectorized before classification
- Serves only as a logistic regression (LR) baseline reference
- Not suitable for any real-world applications

## Ethical Considerations

- Dataset contains sensitive topics related to climate disinformation
- Model is a simple baseline (~63.5% accuracy) and should not be used for actual classification
- Environmental impact is tracked to promote awareness of AI's carbon footprint