Update README.md
README.md
CHANGED
@@ -9,4 +9,26 @@ app_file: app.py
 pinned: false
 ---
 
-
+# 🛰️ Real Browser Demo (with Playwright)
+
+This is an interactive web browser simulation that uses a **real headless browser (Firefox via Playwright)** on the server to access the live internet.
+
+It demonstrates how to combine the power of a modern web automation tool like Playwright with the simplicity of a Gradio UI.
+
+## How it Works
+
+1. **Enter a URL** (e.g., `https://news.ycombinator.com`) or a **search term** (e.g., `latest science news`) into the address bar and click "Go".
+2. On the server, **Playwright launches a real, headless Firefox browser** and navigates to the requested page. It waits for the page and its JavaScript to fully load.
+3. The complete, rendered HTML is then parsed using **BeautifulSoup**.
+4. The **page title, main text content, and a list of clickable links** are extracted.
+5. This extracted information is sent back and displayed in the Gradio UI.
+6. You can **"click" links** by entering their corresponding number and clicking the "Click Link" button.
+7. All standard browser functions like **tabs, back/forward, and refresh** are supported and operate on the real browser instance.
+
+### Why Playwright?
+
+- **JavaScript Rendering:** It renders modern Single-Page Applications (SPAs), which a plain `requests` fetch cannot do because `requests` does not execute JavaScript.
+- **Robustness:** It drives a real browser engine, so sites that reject bare HTTP clients are less likely to block it.
+- **Stateful Interaction:** It handles cookies, sessions, and multi-step navigation the way a browser does.
+
+**Note:** Because this demo makes live web requests, it may be slower than the "pseudo browser", and it depends on the speed and availability of the websites you visit.
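
The flow described under "How it Works" (steps 1, 2, 4, and 6) could look roughly like the sketch below. It is not the code in `app.py`, which this diff does not show. All function names, the `SEARCH_URL` placeholder, and the `networkidle` wait condition are assumptions made for illustration. The README names BeautifulSoup as the parser; here the standard library's `html.parser` is swapped in so the extraction part runs without extra packages.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus, urljoin, urlparse

# Hypothetical placeholder: the README does not say which search engine is used.
SEARCH_URL = "https://duckduckgo.com/html/?q="


def resolve_input(text):
    """Step 1: return an http(s) URL unchanged, otherwise build a search URL."""
    text = text.strip()
    parsed = urlparse(text)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return text
    return SEARCH_URL + quote_plus(text)


def fetch_rendered_html(url, timeout_ms=30_000):
    """Step 2: load the page in headless Firefox and return the rendered HTML."""
    # Imported lazily so the parsing helpers work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.firefox.launch(headless=True)
        try:
            page = browser.new_page()
            # Assumption: "networkidle" approximates "fully loaded".
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


class PageExtractor(HTMLParser):
    """Collects the title, visible text, and links (stand-in for BeautifulSoup)."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.text_parts = []
        self.links = []    # list of (label, absolute URL)
        self._stack = []   # currently open <title>/<script>/<style>/<noscript>
        self._href = None  # href of the <a> being read, if any
        self._label = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED or tag == "title":
            self._stack.append(tag)
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href, self._label = href, []

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()
        elif tag == "a" and self._href is not None:
            label = " ".join(self._label)
            if label:  # keep only links with visible text
                self.links.append((label, urljoin(self.base_url, self._href)))
            self._href = None

    def handle_data(self, data):
        data = data.strip()
        if not data:
            return
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.title += data
        elif current is None:  # visible text outside script/style/noscript
            self.text_parts.append(data)
            if self._href is not None:
                self._label.append(data)


def extract_page(html, base_url):
    """Step 4: extract the title, text content, and link list from HTML."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {"title": parser.title,
            "text": "\n".join(parser.text_parts),
            "links": parser.links}


def link_target(links, number):
    """Step 6: map a 1-based link number to its absolute URL."""
    if not 1 <= number <= len(links):
        raise ValueError(f"link number must be between 1 and {len(links)}")
    return links[number - 1][1]
```

With Playwright and its Firefox build installed (`pip install playwright`, then `playwright install firefox`), the pieces would chain as `extract_page(fetch_rendered_html(url), url)`, where `url` comes from `resolve_input`. Numbering links from 1 is an assumption; the README only says links are selected "by entering their corresponding number".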