Spaces: HamidOmarov/First_RAG_System

Hamid Omarov committed 15 days ago

Commit a6f5647

2 Parent(s): 2005cbc 5ba6d70

Merge remote-tracking branch 'hf/main'

Files changed (2)

README.md +12 -83
requirements .txt +0 -6

README.md CHANGED

@@ -1,86 +1,15 @@
-# RAG 30 Days Sprint 🚀
-This repository contains a 30-day sprint to master Retrieval-Augmented Generation (RAG) systems using Python, LangChain, and modern AI tools.
-## 📅 Day Tracker
-| Day | Folder | Description            | Status |
-|-----|--------|------------------------|--------|
-| 1   | day1   | Hello world test file  | ✅     |
-| 2   | day2   | TBD                    | ⏳     |
-| ... | ...    | ...                    | ...    |
-## 📂 Folder Structure
-rag-30-days/
-│
-├── day1/
-│ └── hello_ai.py
-│
-├── README.md
-markdown
-Copy
-Edit
-## 🧠 Goal
-To build a production-ready RAG pipeline in 30 days and land a remote AI job by the end of the sprint.
-## 🛠️ Tools
-- Python
-- LangChain
-- ChromaDB / Weaviate / FAISS
-- OpenAI API
-- Streamlit (optional UI)
-- Git & GitHub
-## 📈 Progress
-Check commits and folders daily to follow the sprint. Each folder corresponds to 1 day of learning and building.
-## 📅 Day 1 – Getting Started with Python & Flask
-### ✅ What I Learned
-- Refreshed core **Python basics** (variables, functions, classes, etc.)
-- Built my first **Flask API** with real-world JSON responses
-- Practiced structured coding with **Copilot assistance**
-### 🛠️ What I Built
-- `hello_ai.py`: A minimal Python script to print a welcome message
-- `api.py`: A Flask application with 3 endpoints:
-  - `/hello`: greeting message
-  - `/calculate`: accepts 2 numbers (POST) and returns their sum
-  - `/ai-ready`: motivational message for AI learning
-### 🔮 Tomorrow's Plan
-- Begin **LangChain** setup and environment configuration
-- Start working on **RAG-based document processing**
-- Set up folder structure and `day2` workflow
-> 👣 One day down, 29 to go. Keep shipping.
-## Day 3: First RAG System ✅
-### What I Built
-- PDF processing pipeline (loader + optimal chunker)
-- Compared 3 chunking strategies (fixed, recursive, token)
-- ChromaDB vector storage (persistent)
-- SentenceTransformer embeddings (MiniLM)
-- Gradio chat interface (upload PDF → ask)
-- Deployment on Hugging Face Spaces
-### Key Learnings
-- Fixed vs Recursive vs Token-based chunking trade-offs
-- Embedding format must be list[list[float]] for Chroma
-- New Chroma API uses `PersistentClient`
-- Prompt design: extractive answers + fallback
-### Live Demo
-🔗 [HuggingFace Space Link](https://didactic-winner-q7g79xg9gp4626w56-7860.app.github.dev/)
-## 📬 Contact
-Made by [Hamid Omarov](https://www.linkedin.com/in/hamidomarov)
-Check out my portfolio: [Notion Page](https://www.notion.so/AI-Content-Factory-Operations-2400a72a724c8050b5c6ddc0e6a0a77d)

+---
+title: PDF RAG (Chroma + Groq)
+emoji: 📚
+colorFrom: indigo
+colorTo: green
+sdk: gradio
+sdk_version: "4.44.0"
+app_file: app.py
+pinned: false
+---
+# PDF RAG (Chroma + Groq)
+Upload a PDF and ask questions. Uses ChromaDB for retrieval and Groq LLM for answers.

requirements .txt DELETED

@@ -1,6 +0,0 @@
-gradio
-chromadb
-sentence-transformers
-langchain-groq
-pypdf
-python-dotenv
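
Note on the removed "Key Learnings": the old README states that the embedding format must be list[list[float]] for Chroma. The snippet below is a minimal sketch of that normalization step, not code from this repository. The helper name `to_chroma_embeddings` is hypothetical, and the sketch assumes the embedder returns either a plain Python sequence or an array-like object with a `.tolist()` method (as NumPy arrays have).

```python
from numbers import Real
from typing import List


def to_chroma_embeddings(vectors) -> List[List[float]]:
    """Coerce one vector or a batch of vectors into list[list[float]].

    Hypothetical helper for illustration; not part of the repository.
    """
    # Array-like objects (e.g. NumPy arrays) expose .tolist(); use it if present.
    if hasattr(vectors, "tolist"):
        vectors = vectors.tolist()
    vectors = list(vectors)
    if not vectors:
        return []
    # A single flat vector (its first element is a number) becomes a batch of one.
    if isinstance(vectors[0], Real):
        vectors = [vectors]
    batch = []
    for row in vectors:
        # Rows may themselves be array-like; convert them before casting.
        if hasattr(row, "tolist"):
            row = row.tolist()
        batch.append([float(x) for x in row])
    return batch


single = to_chroma_embeddings([0.1, 0.2, 0.3])
batch = to_chroma_embeddings([(1, 2), (3, 4)])
print(single)
print(batch)
```

The result should be in the shape that can be passed as the `embeddings` argument when adding documents to a Chroma collection, with one inner list per document.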