# WhatsApp Chat Analyzer

A comprehensive tool for analyzing WhatsApp chat exports with sentiment analysis capabilities.

## Table of Contents
1. [System Overview](#system-overview)
2. [Architecture](#architecture)
3. [Components](#components)
4. [Data Flow](#data-flow)
5. [Installation](#installation)
6. [Usage](#usage)
7. [Analysis Capabilities](#analysis-capabilities)

## System Overview

The WhatsApp Chat Analyzer is a Python-based application that processes exported WhatsApp chat data to provide:
- Message statistics and metrics
- Temporal activity patterns
- User engagement analysis
- Content analysis (words, emojis, links)
- Sentiment analysis capabilities
- Topic analysis for group chats

Built with Streamlit for the web interface, it offers an interactive way to explore chat dynamics and analyze sentiment.

## Architecture

The system follows a modular architecture with clear separation of concerns:

```
Raw WhatsApp Chat → Preprocessing → Analysis → Visualization
```

Key architectural decisions:
- **Modular Design**: Components are separated by functionality
- **Pipeline Processing**: Data flows through discrete processing stages
- **Interactive UI**: Streamlit enables real-time exploration
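
As a rough illustration of the pipeline idea, the sketch below chains three small stages. The function names (`preprocess`, `analyze`, `visualize`, `run_pipeline`) and the simplified line layout are assumptions made for this example, not the project's actual code; the real stages live in `preprocessor.py`, `helper.py`, and `app.py`.

```python
from collections import Counter


def preprocess(raw_text):
    """Stage 1: turn raw export text into structured records."""
    records = []
    for line in raw_text.splitlines():
        # Assumed simplified layout: "<timestamp> - <author>: <message>"
        _, dash, rest = line.partition(" - ")
        user, colon, message = rest.partition(": ")
        if dash and colon:
            records.append({"user": user, "message": message})
    return records


def analyze(records):
    """Stage 2: compute a metric (messages per user) from the records."""
    return Counter(record["user"] for record in records)


def visualize(counts):
    """Stage 3: render the metric (a text bar chart stands in for a plot)."""
    return "\n".join(f"{user}: {'#' * n}" for user, n in counts.most_common())


def run_pipeline(raw_text):
    # Each stage only depends on the output of the previous one.
    return visualize(analyze(preprocess(raw_text)))


sample = (
    "1/2/23, 10:00 - Alice: hi\n"
    "1/2/23, 10:01 - Bob: hello\n"
    "1/2/23, 10:02 - Alice: bye"
)
print(run_pipeline(sample))
```

Because each stage takes plain data and returns plain data, any one of them can be replaced or tested without touching the others.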

## Components

### 1. App Module (`app.py`)
- **Responsibility**: User interface and visualization
- **Key Features**:
  - File upload handling
  - User selection interface
  - Visualization rendering
  - Interactive controls

### 2. Preprocessor (`preprocessor.py`)
- **Responsibility**: Data cleaning and structuring
- **Key Features**:
  - Handles multiple date/time formats
  - Extracts messages and metadata
  - Filters system messages
  - Creates structured DataFrame
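
The following is a minimal sketch of how such a preprocessor might work, using only the standard library. The regex, the list of date/time formats, and the names `LINE_RE`, `DATETIME_FORMATS`, `parse_timestamp`, and `parse_chat` are assumptions for illustration; actual export layouts vary by platform and locale, and `preprocessor.py` may handle them differently.

```python
import re
from datetime import datetime

# Assumed line layout: "<date>, <time> - <author>: <message>"
LINE_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}:\d{2}(?: [APap][Mm])?) - (.*)$"
)

# Formats tried in order. Assumption: 24-hour exports are day-first and
# 12-hour exports are month-first. Adjust for your locale.
DATETIME_FORMATS = (
    "%d/%m/%Y %H:%M",
    "%d/%m/%y %H:%M",
    "%m/%d/%Y %I:%M %p",
    "%m/%d/%y %I:%M %p",
)


def parse_timestamp(date_part, time_part):
    text = f"{date_part} {time_part}"
    for fmt in DATETIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized timestamp: {text!r}")


def parse_chat(raw_text):
    records = []
    current = None  # record that continuation lines are appended to
    for line in raw_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp: treat as the next line of a multi-line message.
            if current is not None:
                current["message"] += "\n" + line
            continue
        date_part, time_part, body = match.groups()
        user, sep, message = body.partition(": ")
        if not sep:
            # No "author: " prefix, so assume a system message and skip it.
            current = None
            continue
        current = {
            "date": parse_timestamp(date_part, time_part),
            "user": user,
            "message": message,
        }
        records.append(current)
    return records


sample = "\n".join([
    '31/12/2021, 22:14 - Alice created group "Friends"',
    "31/12/2021, 22:15 - Alice: Happy new year",
    "to everyone!",
    "12/31/21, 10:16 PM - Bob: Thanks",
])
records = parse_chat(sample)
```

The resulting list of dictionaries can be passed to `pandas.DataFrame(records)` to obtain the structured DataFrame. Note that this simple check would misclassify a system message that happens to contain `": "`.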

### 3. Helper Module (`helper.py`)
- **Responsibility**: Analytical computations
- **Key Features**:
  - Statistical metrics
  - Temporal analysis
  - Content analysis
  - Visualization data preparation

### 4. Notebook (`whatsAppAnalyzer.ipynb`)
- **Responsibility**: Prototyping and experimentation
- **Key Features**:
  - Initial pattern development
  - Data exploration
  - Algorithm testing

## Data Flow

1. **Input**: User uploads WhatsApp chat export (.txt)
2. **Preprocessing**:
   - Raw text is parsed using regex patterns
   - Messages are categorized and timestamped
   - Structured DataFrame is created
3. **Analysis**:
   - Selected metrics are computed
   - Temporal patterns are identified
   - Content features are extracted
4. **Visualization**:
   - Results are displayed in interactive charts
   - User can explore different views

## Installation

### Prerequisites
- Python 3.8+
- pip package manager

### Steps
1. Clone the repository:
   ```bash
   git clone [repository-url]
   cd whatsapp-analyzer
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:
   ```bash
   streamlit run srcs/app.py
   ```

## Usage

1. Launch the application
2. Upload a WhatsApp chat export file
3. Select a user or "Overall" for group analysis
4. Explore the various analysis tabs:
   - Statistics
   - Timelines
   - Activity Maps
   - Word Clouds
   - Emoji Analysis

## Analysis Capabilities

### 1. Basic Statistics
- Message counts
- Word counts
- Media shared
- Links shared
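
A minimal sketch of how these four counts could be computed from a list of message strings. The function name `basic_stats` is hypothetical, and the `<Media omitted>` placeholder is an assumption that holds for English-language exports made without media; other languages and platforms use different text.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Assumed placeholder written in place of media in a text-only export.
MEDIA_PLACEHOLDER = "<Media omitted>"


def basic_stats(messages):
    stats = {"messages": len(messages), "words": 0, "media": 0, "links": 0}
    for text in messages:
        if text.strip() == MEDIA_PLACEHOLDER:
            stats["media"] += 1
            continue  # placeholders do not count towards words or links
        stats["words"] += len(text.split())
        stats["links"] += len(URL_RE.findall(text))
    return stats


messages = ["hello there", "<Media omitted>", "see https://example.com now"]
stats = basic_stats(messages)
```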

### 2. Temporal Analysis
- Daily activity patterns
- Monthly trends
- Hourly distributions
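
One way to derive these distributions from parsed timestamps is sketched below. The function name `temporal_patterns` is hypothetical, and "daily" is interpreted here as day of the week, which is an assumption.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def temporal_patterns(timestamps):
    """Count messages per weekday, per calendar month, and per hour."""
    return {
        "by_weekday": Counter(WEEKDAYS[ts.weekday()] for ts in timestamps),
        "by_month": Counter(ts.strftime("%Y-%m") for ts in timestamps),
        "by_hour": Counter(ts.hour for ts in timestamps),
    }


timestamps = [
    datetime(2023, 1, 2, 9, 30),
    datetime(2023, 1, 2, 9, 45),
    datetime(2023, 2, 7, 21, 0),
]
patterns = temporal_patterns(timestamps)
```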

### 3. User Engagement
- Most active users
- User participation rates
- Message distribution
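
As a sketch, participation can be expressed as each user's share of all messages. The function name `user_engagement` and its `top_n` parameter are hypothetical.

```python
from collections import Counter


def user_engagement(users, top_n=5):
    """Return (user, message_count, percent_of_total) for the most active users.

    `users` holds one author name per message.
    """
    counts = Counter(users)
    total = sum(counts.values())
    return [
        (user, n, round(100 * n / total, 1))
        for user, n in counts.most_common(top_n)
    ]


authors = ["Alice", "Bob", "Alice", "Carol", "Alice"]
ranking = user_engagement(authors)
```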

### 4. Content Analysis
- Most common words
- Emoji usage
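
The sketch below shows one simple approach to both counts. The stop-word set and the emoji character ranges are deliberately small assumptions: the ranges cover common emoji blocks only and do not handle multi-codepoint sequences such as skin-tone or ZWJ combinations. The names `common_words` and `emoji_usage` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in"}
WORD_RE = re.compile(r"[a-z']+")
# Rough approximation of emoji code points (assumption, not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def common_words(messages, top_n=10):
    words = Counter()
    for text in messages:
        words.update(
            w for w in WORD_RE.findall(text.lower()) if w not in STOP_WORDS
        )
    return words.most_common(top_n)


def emoji_usage(messages):
    emojis = Counter()
    for text in messages:
        emojis.update(EMOJI_RE.findall(text))
    return emojis


messages = [
    "Great day today \U0001F602\U0001F602",
    "the day was great \u2764",
]
top_words = common_words(messages, top_n=2)
emojis = emoji_usage(messages)
```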

### 5. Sentiment Analysis
- Message sentiment scoring
- Sentiment trends over time
- User sentiment comparison
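
This README does not state which sentiment method the analyzer uses. Purely to illustrate the three capabilities above, here is a toy lexicon-based scorer; the word lists and the names `sentiment_score` and `average_sentiment_by` are assumptions, and a dedicated sentiment library would give far better results.

```python
import re
from collections import defaultdict

# Toy lexicons for illustration only.
POSITIVE = {"good", "great", "love", "happy", "thanks"}
NEGATIVE = {"bad", "hate", "sad", "angry", "terrible"}


def sentiment_score(text):
    """Score in [-1, 1]: (positive words - negative words) / total words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    return (positive - negative) / len(words)


def average_sentiment_by(records, key):
    """Average message score per group, e.g. key="user" or key="month"."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(sentiment_score(record["message"]))
    return {group: sum(s) / len(s) for group, s in groups.items()}


records = [
    {"user": "Alice", "month": "2023-01", "message": "I love this"},
    {"user": "Alice", "month": "2023-02", "message": "great"},
    {"user": "Bob", "month": "2023-02", "message": "so bad"},
]
by_user = average_sentiment_by(records, "user")    # user comparison
by_month = average_sentiment_by(records, "month")  # trend over time
```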

### 6. Topic Analysis
- Topic modeling
- Common topics over time
- User interests
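
The modeling technique behind this feature is not described here. As a simplified stand-in, the sketch below tags messages by keyword matching and counts topics per month or per user. This is not topic modeling proper, and the `TOPIC_KEYWORDS` table and the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical hand-written topics; a fitted model would learn these instead.
TOPIC_KEYWORDS = {
    "sports": {"match", "goal", "team"},
    "food": {"pizza", "dinner", "lunch"},
}


def message_topics(text):
    """Return the set of topics whose keywords appear in the message."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {topic for topic, keys in TOPIC_KEYWORDS.items() if words & keys}


def topic_counts_by(records, key):
    """Count topic mentions per group, e.g. key="month" or key="user"."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[key]].update(message_topics(record["message"]))
    return dict(counts)


records = [
    {"user": "Alice", "month": "2023-01", "message": "What a goal by our team"},
    {"user": "Bob", "month": "2023-01", "message": "Pizza for dinner?"},
    {"user": "Alice", "month": "2023-02", "message": "Lunch after the match"},
]
topics_over_time = topic_counts_by(records, "month")
user_interests = topic_counts_by(records, "user")
```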