Update app.py
### Handle File Uploads Correctly in Gradio Spaces
This change makes file-upload handling in Gradio-based Hugging Face Spaces robust across file types:
- **PDF files**: When a user uploads a PDF, the file object is a `NamedBytesIO` and supports `.read()`. We use `fitz` (PyMuPDF) to extract text from each page.
- **TXT files**: When a user uploads a TXT file, the file object is a `NamedString`, which behaves like a string and does **not** support `.read()`. We convert it with `str(file)`.
- **Other file types**: These are ignored and result in an empty input.
This distinction matters because Hugging Face Spaces passes different file-like objects depending on the file type. Calling `.read()` on a `NamedString` (TXT) raises an `AttributeError`.
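The failure mode described above can be reproduced without Gradio. The following is a minimal sketch, assuming the TXT upload behaves like a `str` subclass with a `.name` attribute; `PathLikeString` is a hypothetical stand-in written for this illustration, not a Gradio class.

```python
class PathLikeString(str):
    """Hypothetical stand-in for a string-based upload object with a .name attribute."""

    def __new__(cls, value):
        obj = super().__new__(cls, value)
        obj.name = value  # mirrors the .name attribute the handler inspects
        return obj


upload = PathLikeString("notes.txt")

# Strings have no .read() method, so the PDF-style call fails on this object.
try:
    upload.read()
    read_error = None
except AttributeError as exc:
    read_error = exc

print(hasattr(upload, "read"))     # whether the object can be read like a file
print(type(read_error).__name__)   # the exception class raised by upload.read()
```

Checking `hasattr(file, "read")` before calling it is one way to avoid the exception regardless of which object type arrives.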
**Summary:**
- Use `.read()` only for PDFs.
- Use `str(file)` for TXT files.
- This approach prevents runtime errors and ensures the summarizer works for both file types.
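As a complementary sketch, the same dispatch can be written so that it checks what the object supports rather than assuming one object type per extension. This is an illustration under stated assumptions, not the code in this commit: `extract_text` and `UploadedPath` are hypothetical names, and the sketch assumes that a string-based upload may wrap the path of a temporary file. If that is the case in the installed Gradio version, `str(file)` would return the path rather than the file's contents, so the sketch opens the path and reads it.

```python
import os
import tempfile


def extract_text(file):
    """Return text from an uploaded file object, or "" for unsupported input.

    Hypothetical helper: branches on the extension, then on whether the
    object supports .read(), instead of assuming a type per extension.
    """
    if file is None:
        return ""
    name = getattr(file, "name", str(file))
    suffix = os.path.splitext(name)[1].lower()

    if suffix == ".pdf":
        import fitz  # PyMuPDF; imported lazily so TXT handling works without it

        if hasattr(file, "read"):
            doc = fitz.open(stream=file.read(), filetype="pdf")
        else:
            doc = fitz.open(name)  # assumption: name is a readable file path
        with doc:
            return " ".join(page.get_text() for page in doc)

    if suffix == ".txt":
        if hasattr(file, "read"):
            data = file.read()
            if isinstance(data, bytes):
                return data.decode("utf-8", errors="replace")
            return data
        # Assumption: the string-like object holds a path to the uploaded file.
        with open(name, encoding="utf-8", errors="replace") as handle:
            return handle.read()

    return ""  # other file types are ignored


class UploadedPath(str):
    """Hypothetical string-based upload object exposing .name."""

    def __new__(cls, path):
        obj = super().__new__(cls, path)
        obj.name = path
        return obj


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("hello from a text upload")
    extracted = extract_text(UploadedPath(path))
    ignored = extract_text(UploadedPath(os.path.join(tmp, "image.png")))
```

The trade-off is a few extra branches in exchange for not depending on which object type a given Gradio version passes for each extension.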
@@ -28,19 +28,16 @@ def summarize(file, text, style, length):
     text_input = ""
     if file is not None:
         if file.name.endswith(".pdf"):
+            # Only PDFs have .read()
             with fitz.open(stream=file.read(), filetype="pdf") as doc:
                 text_input = " ".join([page.get_text() for page in doc])
         elif file.name.endswith(".txt"):
+            # TXT files are passed as NamedString -> use str(file)
             text_input = str(file)
-
-
-
-
-            # (which has a .read() method).
-            # ### #
-
-    elif text:
-        text_input = text
+        else:
+            text_input = ""
+    elif text:
+        text_input = text
     # If the input text is empty or contains only whitespace,
     # return early with a user message and placeholder values.
     if not text_input.strip():