ai-puppy

Update README.md

94bb6a1 3 days ago

	---
	title: DataForge
	emoji: 💬
	colorFrom: yellow
	colorTo: purple
	sdk: gradio
	sdk_version: 5.0.1
	app_file: app.py
	pinned: false
	license: mit
	short_description: CodeAct agent for processing large datasets
	tags:
	- agent-demo-track
	---

	## 🎥 Demo Video

	Watch DataForge in action:

	[![DataForge Demo](https://img.youtube.com/vi/f5jp2i3engM/maxresdefault.jpg)](https://www.youtube.com/watch?v=f5jp2i3engM)

	🎬 [Click here to watch the full demo on YouTube](https://www.youtube.com/watch?v=f5jp2i3engM)

	---

	# 🔍 DataForge - AI Assistant with File Analysis

	An AI assistant that combines conversational chat with file analysis driven by CodeAct agents. Built with Gradio, LangChain, and LangGraph.

	## ✨ Features

	### 💬 Chat Assistant
	- Interactive AI chatbot powered by OpenAI GPT-4
	- Customizable system messages and parameters
	- Real-time streaming responses
	- Conversation history support

	### 📁 File Analysis
	- Upload & Analyze: Support for various file formats (.txt, .log, .csv, .json, .xml, .py, .js, .html, .md)
	- Smart Analysis: Automatic file type detection and tailored analysis
	- CodeAct Integration: Uses LangGraph CodeAct agents for deep file analysis
	- Comprehensive Insights: Provides security analysis, performance insights, error detection, and statistical summaries

	## 🚀 Getting Started

	### Prerequisites
	- Python 3.11+
	- OpenAI API Key

	### Installation

	1. Create and activate a virtual environment:
	```bash
	uv venv --python 3.11
	source .venv/bin/activate # On Windows: .venv\Scripts\activate
	```

	2. Install dependencies:
	```bash
	uv pip install -r requirements.txt
	```

	3. Set up environment variables:
	```bash
	# Create a .env file in the project root containing your OpenAI API key
	echo "OPENAI_API_KEY=your_openai_api_key_here" > .env
	```
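
As a rough sketch of how the key could be picked up at startup, the snippet below parses a `.env` file using only the standard library. The helper names `load_env_file` and `apply_env` are hypothetical and not taken from the DataForge source; the app itself may well rely on a package such as `python-dotenv` instead.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines from a .env file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Copy parsed values into os.environ without overriding existing ones."""
    for key, value in values.items():
        os.environ.setdefault(key, value)
```

Calling `apply_env(load_env_file())` before the app creates its model client would make `OPENAI_API_KEY` available through `os.environ`, while leaving any value already exported in the shell untouched.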

	### Running the Application
	```bash
	python app.py
	```

	The application starts a Gradio interface, accessible at `http://localhost:7860` by default.

	## 📊 File Analysis Capabilities

	### Supported File Types
	- Log files (.log, .txt): Security analysis, performance bottlenecks, error detection
	- Data files (.csv, .json): Data quality assessment, statistical analysis
	- Code files (.py, .js, .html): Structure analysis, best practices review
	- Configuration and documentation files (.xml, .md): Content analysis and recommendations
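
The grouping above can be pictured as a simple extension lookup. The following is a minimal sketch under that assumption; `ANALYSIS_FOCUS` and `detect_analysis_focus` are hypothetical names, and the actual detection logic in DataForge may differ.

```python
from pathlib import Path

# Extension -> analysis focus, mirroring the four groups listed above.
# The category labels are illustrative, not taken from the project source.
ANALYSIS_FOCUS = {
    ".log": "log",
    ".txt": "log",
    ".csv": "data",
    ".json": "data",
    ".py": "code",
    ".js": "code",
    ".html": "code",
    ".xml": "content",
    ".md": "content",
}


def detect_analysis_focus(filename: str) -> str:
    """Return the analysis focus for a file name, or 'unsupported'."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    return ANALYSIS_FOCUS.get(suffix, "unsupported")
```

A lookup table keeps the supported formats in one place, so adding a new extension is a one-line change.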

	### Analysis Features
	- Security Analysis: Detect threats, suspicious activities, and security patterns
	- Performance Insights: Identify bottlenecks and performance issues
	- Error Analysis: Categorize and analyze errors and warnings
	- Statistical Summary: Basic statistics and data distribution
	- Pattern Recognition: Identify trends and anomalies
	- Actionable Recommendations: Suggested actions based on analysis
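
To make the error analysis and statistical summary items more concrete, here is a small sketch of the kind of code an agent might generate for a log file. It assumes each log line contains a standard level keyword (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`); the function names are hypothetical, and the real analysis is produced dynamically by the agent rather than by fixed code like this.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: each relevant line carries one of these level keywords.
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def summarize_log_levels(lines: Iterable[str]) -> Counter:
    """Count log lines per severity level; lines without a level are ignored."""
    counts: Counter = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)  # first level keyword on the line
        if match:
            counts[match.group(1)] += 1
    return counts


def error_rate(counts: Counter) -> float:
    """Share of ERROR and CRITICAL lines among all lines with a level."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["ERROR"] + counts["CRITICAL"]) / total
```

For example, `error_rate(summarize_log_levels(open("sample_server.log")))` would give a single number that can be tracked across log files.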

	## 🧪 Testing

	A sample server log file (`sample_server.log`) is included for testing the file analysis functionality.

	## 🛠️ Technical Architecture

	- Frontend: Gradio for web interface
	- Backend: LangChain for AI orchestration
	- Analysis Engine: LangGraph CodeAct agents with PyodideSandbox
	- File Processing: Custom FileInjectedPyodideSandbox for secure file analysis
	- Model: OpenAI GPT-4 for both chat and analysis
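
The file-injection idea behind the analysis engine can be sketched as a code evaluator that exposes the uploaded file to agent-generated code. The example below is an illustration only: `make_file_eval`, `file_name`, and `file_content` are hypothetical names, the `(code, locals) -> (output, new variables)` shape is an assumption about the evaluator a CodeAct-style agent calls, and plain `exec` is used here where DataForge uses a Pyodide sandbox, so this sketch provides no isolation.

```python
import builtins
import contextlib
import io
from typing import Any, Callable

EvalFn = Callable[[str, dict[str, Any]], tuple[str, dict[str, Any]]]


def make_file_eval(file_name: str, file_content: str) -> EvalFn:
    """Build an evaluator that exposes one file to generated code."""

    def eval_fn(code: str, _locals: dict[str, Any]) -> tuple[str, dict[str, Any]]:
        # Inject the uploaded file so generated code can reference it by name.
        scope = {**_locals, "file_name": file_name, "file_content": file_content}
        original_keys = set(scope)
        stdout = io.StringIO()
        try:
            # WARNING: exec is NOT a sandbox; it stands in for the real one here.
            with contextlib.redirect_stdout(stdout):
                exec(code, {"__builtins__": builtins}, scope)
            output = stdout.getvalue() or "<code ran, no output printed>"
        except Exception as exc:
            output = f"Error during execution: {exc!r}"
        # Report only variables the generated code created.
        new_vars = {k: v for k, v in scope.items() if k not in original_keys}
        return output, new_vars

    return eval_fn
```

Returning the captured output together with newly created variables lets an agent see the result of each step and reuse intermediate values in the next one.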

	## 📄 License

	MIT License