fix
app.py
CHANGED
@@ -107,56 +107,17 @@ def goto(page: str):
 page = st.query_params.get("page", "demo")
 
 if page == "info":
-    st.title("about this demo")
+    st.title("ℹ about this demo")
     st.write("""
-
-
-
-
-
-
-
-
-
-
----
-
-## 📌 What are Vector Embeddings?
-A **vector embedding** is a way of representing text (words, sentences, or documents) as a list of numbers — a point in a high-dimensional space.
-These numbers are produced by a trained **language model** that captures semantic meaning.
-
-In this space:
-- Words with **similar meanings** end up **near each other**
-- Dissimilar words are placed **far apart**
-- The model can detect relationships and groupings that aren’t obvious from spelling or grammar alone
-
-Example:
-`"cat"` and `"dog"` will likely be closer to each other than to `"table"`, because the model “knows” they are both animals.
-
----
-
-## 🔍 How the Demo Works
-1. **Embedding step** – Each word is converted into a high-dimensional vector (e.g., 384, 768, or 1024 dimensions depending on the model).
-2. **Dimensionality reduction** – Since humans can’t visualize hundreds of dimensions, the vectors are projected to 2D or 3D using **PCA** (Principal Component Analysis).
-3. **Visualization** – The projected points are plotted, with labels showing the original words.
-You can rotate the 3D view to explore groupings.
-
----
-
-## 💡 Typical Applications of Embeddings
-- **Semantic search** – Find relevant results even if exact keywords don’t match
-- **Clustering & topic discovery** – Group related items automatically
-- **Recommendations** – Suggest similar products, movies, or articles
-- **Deduplication** – Detect near-duplicate content
-- **Analogies** – Explore relationships like *"king" – "man" + "woman" ≈ "queen"*
-
----
-
-## 🚀 Try it Yourself
-- Pick a dataset or create your own by editing the list
-- Switch models to compare how the embedding space changes
-- Toggle between 2D and 3D to explore patterns
-
+**embeddings** turn words (or longer text) into numerical vectors.
+in this vector space, **semantically related** items end up **near** each other.
+use cases:
+- semantic search & retrieval
+- clustering & topic discovery
+- recommendations & deduplication
+- measuring similarity and analogies
+this demo embeds single words with a selectable model, reduces to 2d/3d with pca,
+and shows how related words appear near each other in the projected space.
     """.strip())
     if st.button("⬅ back to demo"):
         goto("demo")
@@ -184,10 +145,10 @@ with c2:
     st.session_state.model_name = MODELS[chosen_label]
 
 with c3:
-    #
+    # Default to 3D on first render; single-click thereafter
    radio_kwargs = dict(options=["2D", "3D"], horizontal=True, key="proj_mode")
    if "proj_mode" not in st.session_state:
-        radio_kwargs["index"] = 1  #
+        radio_kwargs["index"] = 1  # 3D default
    st.radio("projection", **radio_kwargs)
 
 with c4:
@@ -311,4 +272,4 @@ with right:
     )]
    )
 
-    st.plotly_chart(fig, use_container_width=True)
+    st.plotly_chart(fig, use_container_width=True)