---
title: Fake News Detector
emoji: 📚
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
license: mit
short_description: 'Detects fake news using an ensemble of 3 models'
---

# 📚 Fake News Detector

**Detects Fake News using an ensemble of 3 Models (Naive Bayes, Logistic Regression, and GloVe-based embeddings)**

---

## 🚨 Important Disclaimer

> ⚠️ This project is built purely for **educational and experimental purposes** to explore basic Natural Language Processing (NLP) and Machine Learning (ML) techniques.  
> 
> ❗ It is **not suitable for real-world fact-checking or decision-making**.  
> 
> The models used are simple, non-contextual, and cannot understand language nuances or factual correctness. Misusing this tool for serious analysis may lead to incorrect or harmful conclusions.  
>
> **Please do not trust or rely on the outputs of this demo.** It is meant for **learning only.**

---

## 🎯 Purpose

This project was created during our research internship as a way to:
- Practice building an ensemble model using different NLP approaches
- Learn to deploy ML apps with Gradio and Hugging Face Spaces
- Experiment with basic text classification on news headlines/articles

It is **not** a robust or reliable system for determining truth or accuracy in media.  

---

## ⚙️ How It Works

This Fake News Detector uses an ensemble of 3 models:

1. **Naive Bayes with TF-IDF** – 55% weight  
2. **Logistic Regression** – 10% weight  
3. **GloVe Embedding-Based Classifier** – 35% weight

Each model contributes a score between 0 and 1 indicating the likelihood of the input text being "Real." The final prediction is based on a weighted average.
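The weighted average described above can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the names `WEIGHTS`, `ensemble_score`, and `label` are hypothetical, the example scores are made up, and the 0.5 decision threshold is an assumption, since this README does not state which cutoff the app uses.

```python
# Weights taken from the list above. The dictionary keys are hypothetical names.
WEIGHTS = {
    "naive_bayes": 0.55,
    "logistic_regression": 0.10,
    "glove": 0.35,
}


def ensemble_score(scores, weights=WEIGHTS):
    """Return the weighted average of per-model 'Real' scores (each in [0, 1])."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same models")
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")
    total_weight = sum(weights.values())
    # Dividing by the total keeps the result in [0, 1] even if the weights
    # do not sum exactly to 1.
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def label(score, threshold=0.5):
    """Map a combined score to a label. The 0.5 threshold is an assumption."""
    return "Real" if score >= threshold else "Fake"


# Made-up scores for a single input text.
example = {"naive_bayes": 0.80, "logistic_regression": 0.40, "glove": 0.30}
combined = ensemble_score(example)
print(round(combined, 3), label(combined))
```

With these weights the Naive Bayes score dominates: in the example it alone contributes 0.44 of the combined 0.585, so the input is labelled "Real" even though the other two models lean towards "Fake".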

---

## 📄 License & Attribution

This project is licensed under the **MIT License**.

### Libraries and Tools Used:
- 🧠 [GloVe Embeddings by Stanford NLP](https://nlp.stanford.edu/projects/glove/)
- 🌐 [Gradio Interface Library](https://www.gradio.app/)
- 📚 [scikit-learn](https://scikit-learn.org/) for model implementation
- 🛠️ [NLTK](https://www.nltk.org/) for basic NLP preprocessing
- [Dataset](https://www.kaggle.com/datasets/stevenpeutz/misinformation-fake-news-text-dataset-79k) (Kaggle)

## 📦 Installation

```bash
pip install -r requirements.txt
python app.py
```