Spaces:
Running
Running
title: DataForge | |
emoji: π¬ | |
colorFrom: yellow | |
colorTo: purple | |
sdk: gradio | |
sdk_version: 5.0.1 | |
app_file: app.py | |
pinned: false | |
license: mit | |
short_description: CodeAct Agent to process large data set | |
tags: | |
- agent-demo-track | |
## π₯ Demo Video | |
**Watch DataForge in Action:** | |
[](https://www.youtube.com/watch?v=f5jp2i3engM) | |
π¬ **[Click here to watch the full demo on YouTube](https://www.youtube.com/watch?v=f5jp2i3engM)** | |
--- | |
# π DataForge - AI Assistant with File Analysis | |
An intelligent AI assistant that combines conversational chat capabilities with advanced file analysis using CodeAct agents. Built with Gradio, LangChain, and LangGraph. | |
## β¨ Features | |
### π¬ Chat Assistant | |
- Interactive AI chatbot powered by OpenAI GPT-4 | |
- Customizable system messages and parameters | |
- Real-time streaming responses | |
- Conversation history support | |
### π File Analysis | |
- **Upload & Analyze**: Support for various file formats (.txt, .log, .csv, .json, .xml, .py, .js, .html, .md) | |
- **Smart Analysis**: Automatic file type detection and tailored analysis | |
- **CodeAct Integration**: Uses LangGraph CodeAct agents for deep file analysis | |
- **Comprehensive Insights**: Provides security analysis, performance insights, error detection, and statistical summaries | |
## π Getting Started | |
### Prerequisites | |
- Python 3.11+ | |
- OpenAI API Key | |
### Installations | |
1. Create and activate virtual environment: | |
```bash | |
uv venv --python 3.11 | |
source .venv/bin/activate # On Windows: .venv\Scripts\activate | |
``` | |
2. Install dependencies: | |
```bash | |
uv pip install -r requirements.txt | |
``` | |
3. Set up environment variables: | |
```bash | |
# Create .env file and add your OpenAI API key | |
OPENAI_API_KEY=your_openai_api_key_here | |
``` | |
### Running the Application | |
```bash | |
python app.py | |
``` | |
The application will start a Gradio interface accessible at `http://localhost:7860` | |
## π File Analysis Capabilities | |
### Supported File Types | |
- **Log files** (.log, .txt): Security analysis, performance bottlenecks, error detection | |
- **Data files** (.csv, .json): Data quality assessment, statistical analysis | |
- **Code files** (.py, .js, .html): Structure analysis, best practices review | |
- **Configuration files** (.xml, .md): Content analysis and recommendations | |
### Analysis Features | |
- **Security Analysis**: Detect threats, suspicious activities, and security patterns | |
- **Performance Insights**: Identify bottlenecks and performance issues | |
- **Error Analysis**: Categorize and analyze errors and warnings | |
- **Statistical Summary**: Basic statistics and data distribution | |
- **Pattern Recognition**: Identify trends and anomalies | |
- **Actionable Recommendations**: Suggested actions based on analysis | |
## π§ͺ Testing | |
A sample server log file (`sample_server.log`) is included for testing the file analysis functionality. | |
## π οΈ Technical Architecture | |
- **Frontend**: Gradio for web interface | |
- **Backend**: LangChain for AI orchestration | |
- **Analysis Engine**: LangGraph CodeAct agents with PyodideSandbox | |
- **File Processing**: Custom FileInjectedPyodideSandbox for secure file analysis | |
- **Model**: OpenAI GPT-4 for both chat and analysis | |
## π License | |
MIT License |