Hamid Omarov committed on
Commit 22f4b8b · 1 Parent(s): 0ebeee2

Add Day 3 README

  1. README.md +18 -0
README.md CHANGED
@@ -61,6 +61,24 @@ Check commits and folders daily to follow the sprint. Each folder corresponds to
 
  > 👣 One day down, 29 to go. Keep shipping.
 
+ ## Day 3: First RAG System ✅
+
+ ### What I Built
+ - PDF processing pipeline (loader + optimal chunker)
+ - Compared 3 chunking strategies (fixed, recursive, token)
+ - ChromaDB vector storage (persistent)
+ - SentenceTransformer embeddings (MiniLM)
+ - Gradio chat interface (upload PDF → ask)
+ - Deployment on Hugging Face Spaces
+
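The commit does not include the chunker code, so the following is only a minimal sketch of what the three strategies named above (fixed, recursive, token) could look like. The function names, the separator order, and the use of whitespace-separated words as a stand-in for real tokenizer tokens are all assumptions made for illustration, not the author's implementation.

```python
from typing import List, Sequence


def fixed_chunks(text: str, size: int, overlap: int = 0) -> List[str]:
    """Cut every `size` characters; neighbours share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += size - overlap
    return chunks


def recursive_chunks(
    text: str, size: int, separators: Sequence[str] = ("\n\n", "\n", " ")
) -> List[str]:
    """Split on the coarsest separator first; use finer ones for oversized pieces."""
    if len(text) <= size:
        return [text] if text else []
    if not separators:
        return fixed_chunks(text, size)  # nothing left to split on
    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= size:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= size:
            current = piece
        else:
            chunks.extend(recursive_chunks(piece, size, finer))
    if current:
        chunks.append(current)
    return chunks


def token_chunks(text: str, max_tokens: int) -> List[str]:
    """Group whitespace-separated words (an assumed proxy for model tokens)."""
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
```

The trade-off shows up even in this toy version: fixed chunking is predictable but can cut mid-word, recursive chunking respects paragraph and word boundaries at the cost of uneven chunk sizes, and token chunking bounds the count a model sees rather than the character length.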
+ ### Key Learnings
+ - Fixed vs Recursive vs Token-based chunking trade-offs
+ - Embedding format must be list[list[float]] for Chroma
+ - New Chroma API uses `PersistentClient`
+ - Prompt design: extractive answers + fallback
+
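The two Chroma points above can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the helper names `to_nested_floats` and `store_chunks`, the `./chroma_db` path, and the `pdf_chunks` collection name are hypothetical. `chromadb` is a third-party package and is imported lazily so the conversion helper works on its own.

```python
from typing import Any, List, Sequence


def to_nested_floats(embeddings: Any) -> List[List[float]]:
    """Coerce a batch of vectors (array, list of arrays, tuples) to list[list[float]]."""
    rows = embeddings.tolist() if hasattr(embeddings, "tolist") else list(embeddings)
    if rows and not hasattr(rows[0], "__iter__"):
        rows = [rows]  # a single flat vector was passed; wrap it as a batch of one
    nested = []
    for row in rows:
        row = row.tolist() if hasattr(row, "tolist") else row
        nested.append([float(x) for x in row])
    return nested


def store_chunks(
    chunks: Sequence[str],
    embeddings: Any,
    path: str = "./chroma_db",   # hypothetical on-disk location
    name: str = "pdf_chunks",    # hypothetical collection name
):
    """Persist chunks and their vectors with Chroma's PersistentClient."""
    import chromadb  # third-party dependency, imported only when storing

    client = chromadb.PersistentClient(path=path)
    collection = client.get_or_create_collection(name=name)
    collection.add(
        ids=[f"chunk-{i}" for i in range(len(chunks))],
        documents=list(chunks),
        embeddings=to_nested_floats(embeddings),
    )
    return collection
```

A SentenceTransformer model's `encode` call returns a NumPy array by default, so passing its output through a conversion like `to_nested_floats` is one way to get the nested-list shape the learning above refers to.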
+ ### Live Demo
+ 🔗 [HuggingFace Space Link](https://didactic-winner-q7g79xg9gp4626w56-7860.app.github.dev/)
 
  ## 📬 Contact