Update README.md
README.md
CHANGED
@@ -12,3 +12,114 @@ short_description: Cleans Data for Sagemaker/Azure Training
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Call Center Data Analysis
A data analysis tool for call center logs, built on Hugging Face Spaces (free tier). This demo performs after-the-fact analysis of call center data: data cleaning, statistical visualization, and exports for downstream AI modeling in SageMaker or Azure AI. It reflects over 5 years of AI experience and focuses on mitigating junk data in enterprise CX workflows.
Features
Data Parsing and Cleaning: Processes large call center CSVs and removes nulls, duplicates, short entries, malformed queries, and invalid timestamps to preserve data integrity.
Statistical Visualization: Generates plots for call duration distribution, satisfaction scores by agent, and query frequency by language using Matplotlib and Seaborn.
Export Options: Provides downloadable cleaned CSV for SageMaker/Azure AI modeling and a PDF report summarizing data quality and statistics.
Gradio-Powered Interface: A responsive, dark-themed UI for viewing raw data, cleanup stats, and visualizations, optimized for enterprise workflows.
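The cleaning step described above can be sketched with pandas. This is a minimal illustration, not the Space's actual app.py: the column names `query` and `timestamp`, the 5-character minimum length, and the definition of "malformed" (a query with no word characters) are all assumptions, since the README does not specify the CSV schema or thresholds.

```python
import pandas as pd

# Hypothetical schema and threshold; adjust to the real CSV.
QUERY_COL = "query"
TIME_COL = "timestamp"
MIN_QUERY_LEN = 5  # assumed cutoff for a "short entry"


def clean_call_logs(df: pd.DataFrame) -> tuple[pd.DataFrame, dict]:
    """Return the cleaned frame plus a count of rows removed per rule."""
    stats = {"rows_in": len(df)}

    # 1. Drop rows with nulls in the columns the later rules rely on.
    out = df.dropna(subset=[QUERY_COL, TIME_COL])
    stats["nulls_removed"] = len(df) - len(out)

    # 2. Drop exact duplicate rows, keeping the first occurrence.
    before = len(out)
    out = out.drop_duplicates()
    stats["duplicates_removed"] = before - len(out)

    # 3. Drop short entries (after trimming whitespace).
    before = len(out)
    text = out[QUERY_COL].astype(str).str.strip()
    out = out[text.str.len() >= MIN_QUERY_LEN]
    stats["short_removed"] = before - len(out)

    # 4. Drop malformed queries: assumed here to mean "no word characters".
    before = len(out)
    out = out[out[QUERY_COL].astype(str).str.contains(r"\w", regex=True)]
    stats["malformed_removed"] = before - len(out)

    # 5. Drop rows whose timestamp cannot be parsed (coerced to NaT).
    before = len(out)
    parsed = pd.to_datetime(out[TIME_COL], errors="coerce")
    out = out[parsed.notna()]
    stats["bad_timestamps_removed"] = before - len(out)

    stats["rows_out"] = len(out)
    return out.reset_index(drop=True), stats
```

Returning the per-rule counts alongside the frame is one way to feed the cleanup statistics shown in the UI and the PDF report.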
Setup
Clone this repository to a Hugging Face Space (free tier, public visibility).
Upload your call_center_logs.csv to the Space.
Populate requirements.txt with the dependencies listed under Technical Architecture (Pandas, Matplotlib, Seaborn, Gradio, ReportLab, Pillow), ensuring compatibility with Python 3.9+ and CPU-only execution.
Deploy app.py and launch the Space with the Gradio SDK.
Usage
Click the "Analyze Data" button to process the call center logs.
View the raw data (first 50 rows), cleanup statistics, and statistical plots.
Download the cleaned CSV (cleaned_call_center_logs.csv) for SageMaker/Azure AI modeling.
Download the PDF report (data_analysis_report.pdf) summarizing the analysis.
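The usage flow above could be wired up in Gradio roughly as follows. This is a hedged sketch rather than the Space's real app.py: the function names `analyze` and `build_demo` are hypothetical, the cleaning is reduced to dropping nulls and duplicates, and the plots and PDF report are left out. Only the file names and the "Analyze Data" button label come from this README.

```python
import tempfile
from pathlib import Path

import pandas as pd

LOG_PATH = "call_center_logs.csv"  # file uploaded during Setup


def analyze(csv_path: str = LOG_PATH):
    """Return (first 50 raw rows, cleanup stats, path to the cleaned CSV)."""
    raw = pd.read_csv(csv_path)
    # Placeholder cleaning only; the full rule set is assumed to live elsewhere.
    cleaned = raw.dropna().drop_duplicates()
    stats = {
        "rows_in": len(raw),
        "rows_out": len(cleaned),
        "rows_removed": len(raw) - len(cleaned),
    }
    out_path = Path(tempfile.mkdtemp()) / "cleaned_call_center_logs.csv"
    cleaned.to_csv(out_path, index=False)
    return raw.head(50), stats, str(out_path)


def build_demo():
    # Imported lazily so analyze() can be run and tested without Gradio.
    import gradio as gr

    with gr.Blocks() as demo:
        button = gr.Button("Analyze Data")
        raw_view = gr.Dataframe(label="Raw data (first 50 rows)")
        stats_view = gr.JSON(label="Cleanup statistics")
        csv_file = gr.File(label="Cleaned CSV")
        # No inputs: analyze() falls back to the default LOG_PATH.
        button.click(fn=analyze, inputs=None, outputs=[raw_view, stats_view, csv_file])
    return demo


# To serve the UI on the Space: build_demo().launch()
```

Keeping `analyze` free of UI code means the same function can be exercised from a notebook or a test before the Space is deployed.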
Technical Architecture
Core Stack:
Python 3.9+: Foundation for data processing and analysis.
Pandas: High-performance CSV parsing and data cleaning.
Matplotlib/Seaborn: Statistical visualization of call center metrics.
Gradio: Interactive UI for data analysis and export.
ReportLab/Pillow: PDF report generation with embedded plots.
Free Tier Optimization: Designed for CPU-only execution, minimizing memory footprint.
Extensibility: Cleaned CSV is structured for SageMaker (e.g., BERT-based intent classification) and Azure AI (e.g., custom ML models).
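As an illustration of the visualization layer on CPU-only hardware, the call duration plot might be produced along these lines. The sketch uses plain Matplotlib with the headless Agg backend (Seaborn is omitted for brevity); the function name `plot_duration_hist` and the column name `call_duration` are assumptions, not taken from the actual code.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend: renders to files, no display needed

import matplotlib.pyplot as plt
import pandas as pd


def plot_duration_hist(df: pd.DataFrame,
                       column: str = "call_duration",  # hypothetical column name
                       out_path: str = "duration_hist.png") -> str:
    """Save a histogram of call durations and return the image path."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.hist(df[column].dropna(), bins=20)
    ax.set_xlabel("Call duration")
    ax.set_ylabel("Number of calls")
    ax.set_title("Call duration distribution")
    fig.tight_layout()
    fig.savefig(out_path, dpi=100)
    plt.close(fig)  # release the figure so repeated runs do not accumulate memory
    return out_path
```

Saving to a PNG and closing the figure keeps the memory footprint small, and the saved image can then be shown in the UI or embedded in the PDF report.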
Purpose
This Space demonstrates after-the-fact data analysis for call center environments: it addresses junk data and prepares logs for AI modeling, in line with enterprise CX needs.