RAG 30 Days Sprint
This repository contains a 30-day sprint to master Retrieval-Augmented Generation (RAG) systems using Python, LangChain, and modern AI tools.
Day Tracker
Day | Folder | Description | Status |
---|---|---|---|
1 | day1 | Hello world test file | ✅ |
2 | day2 | TBD | ⏳ |
... | ... | ... | ... |
Folder Structure
rag-30-days/
│
├── day1/
│   └── hello_ai.py
│
└── README.md
Goal
To build a production-ready RAG pipeline in 30 days and land a remote AI job by the end of the sprint.
Tools
- Python
- LangChain
- ChromaDB / Weaviate / FAISS
- OpenAI API
- Streamlit (optional UI)
- Git & GitHub
Progress
Check commits and folders daily to follow the sprint. Each folder corresponds to 1 day of learning and building.
Day 1: Getting Started with Python & Flask
What I Learned
- Refreshed core Python basics (variables, functions, classes, etc.)
- Built my first Flask API with real-world JSON responses
- Practiced structured coding with Copilot assistance
What I Built
- hello_ai.py: A minimal Python script that prints a welcome message
- api.py: A Flask application with 3 endpoints:
  - /hello: greeting message
  - /calculate: accepts 2 numbers (POST) and returns their sum
  - /ai-ready: motivational message for AI learning
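A minimal sketch of what an app with these three endpoints could look like. The repository's actual api.py is not shown here, so the JSON field names ("a", "b", "sum", "message"), the message texts, and the create_app factory are assumptions made for illustration.

```python
def add_numbers(payload):
    """Return the sum of the two numbers in a JSON payload.

    The keys "a" and "b" are assumed names, not taken from api.py.
    """
    return float(payload["a"]) + float(payload["b"])


def create_app():
    # Flask is imported inside the factory so add_numbers stays usable
    # (and testable) even where Flask is not installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/hello")
    def hello():
        # Greeting endpoint; the wording is a placeholder.
        return jsonify(message="Hello from the RAG 30 Days Sprint!")

    @app.route("/calculate", methods=["POST"])
    def calculate():
        # silent=True yields None instead of raising on a bad or missing body.
        payload = request.get_json(silent=True) or {}
        try:
            result = add_numbers(payload)
        except (KeyError, TypeError, ValueError):
            return jsonify(error="Send JSON with numeric 'a' and 'b'"), 400
        return jsonify(sum=result)

    @app.route("/ai-ready")
    def ai_ready():
        # Motivational endpoint; the wording is a placeholder.
        return jsonify(message="Keep learning, keep shipping.")

    return app
```

To try it locally, call create_app().run(debug=True) and send a POST request with a JSON body such as {"a": 2, "b": 3} to /calculate.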
Tomorrow's Plan
- Begin LangChain setup and environment configuration
- Start working on RAG-based document processing
- Set up folder structure and day2 workflow
One day down, 29 to go. Keep shipping.
Day 3: First RAG System ✅
What I Built
- PDF processing pipeline (loader + optimal chunker)
- Compared 3 chunking strategies (fixed, recursive, token)
- ChromaDB vector storage (persistent)
- SentenceTransformer embeddings (MiniLM)
- Gradio chat interface (upload PDF → ask)
- Deployment on Hugging Face Spaces
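The three chunking strategies compared above can be sketched in plain Python. This is an illustration only, not the code used in this project: the function names, default sizes, and separator order are assumptions, and the token variant uses whitespace splitting as a stand-in for a real tokenizer.

```python
def fixed_chunks(text, size=200, overlap=20):
    """Fixed strategy: character windows of `size` with a small overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def recursive_chunks(text, size=200, separators=("\n\n", "\n", ". ", " ")):
    """Recursive strategy: split on the coarsest separator first and
    recurse into any piece that is still longer than `size`."""
    if len(text) <= size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard character cuts.
        return fixed_chunks(text, size, overlap=0)
    sep, rest = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= size:
            current = candidate  # keep merging small pieces
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= size:
            current = piece
        else:
            chunks.extend(recursive_chunks(piece, size, rest))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks


def token_chunks(text, max_tokens=50, overlap=5):
    """Token strategy: windows over whitespace tokens (tokenizer stand-in)."""
    if overlap >= max_tokens:
        raise ValueError("overlap must be smaller than max_tokens")
    tokens = text.split()
    step = max_tokens - overlap
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), step)]
```

Fixed windows are the simplest but can cut sentences in half, the recursive variant tries to respect paragraph and sentence boundaries, and the token variant keeps chunks within a model's token budget.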
Key Learnings
- Fixed vs Recursive vs Token-based chunking trade-offs
- Embedding format must be list[list[float]] for Chroma
- New Chroma API uses PersistentClient
- Prompt design: extractive answers + fallback
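The two Chroma notes above (embedding format and PersistentClient) can be sketched together as follows. The helper names, the "./chroma_db" path, the "pdf_chunks" collection name, and the ID scheme are hypothetical and chosen for illustration; only chromadb.PersistentClient, get_or_create_collection, and collection.add are Chroma's own calls.

```python
def to_chroma_embeddings(vectors):
    """Convert any iterable of vectors (for example a NumPy array)
    into the list[list[float]] shape noted above."""
    return [[float(x) for x in vec] for vec in vectors]


def store_chunks(chunks, embeddings, path="./chroma_db", name="pdf_chunks"):
    """Persist text chunks and their embeddings in an on-disk collection.

    `path` and `name` are assumed values, not taken from this project.
    """
    # chromadb is imported lazily so the helper above runs without it.
    import chromadb

    client = chromadb.PersistentClient(path=path)  # data is kept on disk under `path`
    collection = client.get_or_create_collection(name=name)
    collection.add(
        ids=[f"chunk-{i}" for i in range(len(chunks))],  # IDs must be unique strings
        documents=list(chunks),
        embeddings=to_chroma_embeddings(embeddings),
    )
    return collection
```

With sentence-transformers, the embeddings could come from SentenceTransformer("all-MiniLM-L6-v2").encode(chunks), which returns a NumPy array that the helper converts. The exact MiniLM model name is an assumption here.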
Live Demo
Contact
Made by Hamid Omarov
Check out my portfolio: Notion Page