Spaces: snehasanjana/image_to_text
snehasanjana committed on Jun 17

Commit a696afd (verified) · 1 Parent(s): b8305fb

Create app.py

Files changed (1)

app.py ADDED

+import gradio as gr
+from transformers import AutoProcessor, AutoModelForVision2Seq
+from PIL import Image
+import torch
+
+# Load the model and processor once at startup
+processor = AutoProcessor.from_pretrained("microsoft/kosmos-2-patch14-224")
+model = AutoModelForVision2Seq.from_pretrained("microsoft/kosmos-2-patch14-224")
+model.eval()  # inference mode (disables dropout)
+
+
+def grounding(image, prompt):
+    """Generate text for a PIL image and a text prompt."""
+    # Tokenize the prompt and preprocess the image into PyTorch tensors
+    inputs = processor(text=prompt, images=image, return_tensors="pt")
+    # Gradients are not needed for generation
+    with torch.no_grad():
+        generated_ids = model.generate(**inputs, max_new_tokens=256)
+    # Decode the first (and only) sequence in the batch
+    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+    return generated_text
+
+
+# Build the web UI: an image upload and a prompt box in, plain text out
+gr.Interface(
+    fn=grounding,
+    inputs=[gr.Image(type="pil"), gr.Textbox(label="Text Prompt")],
+    outputs="text",
+    title="Kosmos-2 Grounding Demo"
+).launch()
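
The app above returns the decoded string as-is. For a prompt that starts with `<grounding>`, Kosmos-2 output typically carries markup such as `<phrase>…</phrase><object><patch_index_0044><patch_index_0863></object>`. The following is a minimal sketch, not part of this commit, of how that markup could be turned into normalized bounding boxes. It assumes a 32 x 32 patch grid, one box per phrase, and box corners placed at patch centers; `extract_entities` and `patch_center` are hypothetical helper names. The Transformers Kosmos-2 processor also provides `post_process_generation` for this purpose, and it may treat edge cases differently.

```python
import re

# Assumption: Kosmos-2 location tokens index a 32 x 32 grid of image patches.
PATCHES_PER_SIDE = 32

# Matches "<phrase>text</phrase><object><patch_index_A><patch_index_B></object>".
# Assumption: one box per phrase; objects with several boxes are not handled.
ENTITY_RE = re.compile(
    r"<phrase>(.*?)</phrase><object>"
    r"<patch_index_(\d+)><patch_index_(\d+)></object>"
)


def patch_center(index, side=PATCHES_PER_SIDE):
    """Return the normalized (x, y) center of a patch from its row-major index."""
    row, col = divmod(index, side)
    cell = 1.0 / side
    return (col + 0.5) * cell, (row + 0.5) * cell


def extract_entities(text):
    """Return [(phrase, (x1, y1, x2, y2)), ...] with coordinates in [0, 1]."""
    entities = []
    for phrase, upper_left, lower_right in ENTITY_RE.findall(text):
        x1, y1 = patch_center(int(upper_left))
        x2, y2 = patch_center(int(lower_right))
        entities.append((phrase.strip(), (x1, y1, x2, y2)))
    return entities


sample = (
    "An image of<phrase> a snowman</phrase>"
    "<object><patch_index_0044><patch_index_0863></object> warming himself"
)
print(extract_entities(sample))
# [('a snowman', (0.390625, 0.046875, 0.984375, 0.828125))]
```

Multiplying each normalized coordinate by the image width or height gives pixel coordinates that could be drawn over the uploaded image.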