{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Google Colab Version: [Open this notebook in Google Colab](https://colab.research.google.com/github/starfishdata/starfish/blob/main/examples/structured_llm.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Dependencies " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install starfish-core" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "## Fix for Jupyter Notebook only — do NOT use in production\n", "## Enables async code execution in notebooks, but may cause issues with sync/async issues\n", "## For production, please run in standard .py files without this workaround\n", "## See: https://github.com/erdewit/nest_asyncio for more details\n", "import nest_asyncio\n", "nest_asyncio.apply()\n", "\n", "from starfish import StructuredLLM\n", "from starfish.llm.utils import merge_structured_outputs\n", "\n", "from pydantic import BaseModel, Field\n", "from typing import List\n", "\n", "from starfish.common.env_loader import load_env_file ## Load environment variables from .env file\n", "load_env_file()" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# setup your openai api key if not already set\n", "# import os\n", "# os.environ[\"OPENAI_API_KEY\"] = \"your_key_here\"\n", "\n", "# If you dont have any API key, use local model (ollama)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1. Structured LLM with JSON Schema" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'question': 'Why did the tomato turn red in New York?',\n", " 'answer': \"Because it saw the Big Apple and couldn't ketchup with all the excitement!\"}]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ### Define the Output Structure (JSON Schema)\n", "# Let's start with a simple JSON-like schema using a list of dictionaries.\n", "# Each dictionary specifies a field name and its type. 
"json_output_schema = [\n", " {\"name\": \"question\", \"type\": \"str\", \"description\": \"The generated question.\"},\n", " {\"name\": \"answer\", \"type\": \"str\", \"description\": \"The corresponding answer.\"},\n", "]\n", "\n", "json_llm = StructuredLLM(\n", " model_name = \"openai/gpt-4o-mini\",\n", " prompt = \"Funny facts about city {{city_name}}.\",\n", " output_schema = json_output_schema,\n", " model_kwargs = {\"temperature\": 0.7},\n", ")\n", "\n", "json_response = await json_llm.run(city_name=\"New York\")\n", "\n", "# The response object contains both parsed data and the raw API response.\n", "json_response.data" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "ModelResponse(id='chatcmpl-BQGw3FMSjzWOPMRvXmgknN4oozrKK', created=1745601327, model='gpt-4o-mini-2024-07-18', object='chat.completion', system_fingerprint='fp_0392822090', choices=[Choices(finish_reason='stop', index=0, message=Message(content='[\\n {\\n \"question\": \"Why did the tomato turn red in New York?\",\\n \"answer\": \"Because it saw the Big Apple and couldn\\'t ketchup with all the excitement!\"\\n }\\n]', role='assistant', tool_calls=None, function_call=None, provider_specific_fields={'refusal': None}, annotations=[]))], usage=Usage(completion_tokens=41, prompt_tokens=77, total_tokens=118, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None)), service_tier='default')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The raw API response is fully preserved, so you can parse it however you want,\n", "# e.g. function calls, tool calls, thinking tokens, etc.\n", "json_response.raw" ] },
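{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A minimal sketch (not executed in this run) of pulling fields out of the raw response.\n", "# It assumes the LiteLLM ModelResponse structure shown in the output above;\n", "# attribute names may differ for other providers.\n", "raw_content = json_response.raw.choices[0].message.content  # unparsed model output\n", "total_tokens = json_response.raw.usage.total_tokens  # token accounting\n", "print(total_tokens, raw_content)" ] },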
{ "cell_type": "markdown", "metadata": {}, "source": [ "#### 2. Structured LLM with Pydantic Schema (Nested)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'facts': [{'question': 'What year did New York City become the capital of the United States?',\n", " 'answer': 'New York City served as the capital of the United States from 1785 to 1790.',\n", " 'category': 'History'}]}]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# ### Define the Output Structure (Pydantic Model)\n", "class Fact(BaseModel):\n", " question: str = Field(..., description=\"The factual question generated.\")\n", " answer: str = Field(..., description=\"The corresponding answer.\")\n", " category: str = Field(..., description=\"A category for the fact (e.g., History, Geography).\")\n", "\n", "# You can define a list of these models if you expect multiple results.\n", "class FactsList(BaseModel):\n", " facts: List[Fact] = Field(..., description=\"A list of facts.\")\n", "\n", "\n", "# ### Create the StructuredLLM Instance with Pydantic\n", "pydantic_llm = StructuredLLM(\n", " model_name=\"openai/gpt-4o-mini\",\n", " # Ask for multiple facts this time\n", " prompt=\"Generate distinct facts about {{city}}.\",\n", " # Pass the Pydantic model directly as the schema\n", " output_schema=FactsList, # Expecting a list of facts wrapped in the FactsList model\n", " model_kwargs={\"temperature\": 0.8}\n", ")\n", "\n", "pydantic_llm_response = await pydantic_llm.run(city=\"New York\")\n", "\n", "pydantic_llm_response.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3. Working with Different LLM Providers\n", "\n", "Starfish uses LiteLLM under the hood, giving you access to 100+ LLM providers. Here is an example of using a custom model provider, Hyperbolic, which serves full-precision models at low cost." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'question': 'What is the nickname of New York City?',\n", " 'answer': 'The Big Apple'},\n", " {'question': 'Which iconic statue is located in New York Harbor?',\n", " 'answer': 'The Statue of Liberty'},\n", " {'question': 'What is the name of the famous theater district in Manhattan?',\n", " 'answer': 'Broadway'},\n", " {'question': \"Which park is considered the 'lungs' of New York City?\",\n", " 'answer': 'Central Park'},\n", " {'question': 'What is the tallest building in New York City as of 2023?',\n", " 'answer': 'One World Trade Center'}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "# Set up the relevant API key and base URL in your environment variables\n", "# os.environ[\"HYPERBOLIC_API_KEY\"] = \"your_key_here\"\n", "# os.environ[\"HYPERBOLIC_API_BASE\"] = \"https://api.hyperbolic.xyz/v1\"\n", "\n", "hyperbolic_llm = StructuredLLM(\n", " model_name=\"hyperbolic/deepseek-ai/DeepSeek-V3-0324\", \n", " prompt=\"Facts about city {{city_name}}.\",\n", " output_schema=[{\"name\": \"question\", \"type\": \"str\"}, {\"name\": \"answer\", \"type\": \"str\"}],\n", " model_kwargs={\"temperature\": 0.7},\n", ")\n", "\n", "hyperbolic_llm_response = await hyperbolic_llm.run(city_name=\"New York\", num_records=5)\n", "hyperbolic_llm_response.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 4. Local LLM using Ollama\nEnsure Ollama is installed and running. Starfish can manage the server process and model downloads." ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32m2025-04-25 10:15:40\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mEnsuring Ollama model gemma3:1b is ready...\u001b[0m\n", "\u001b[32m2025-04-25 10:15:40\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mStarting Ollama server...\u001b[0m\n", "\u001b[32m2025-04-25 10:15:41\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mOllama server started successfully\u001b[0m\n", "\u001b[32m2025-04-25 10:15:41\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mFound model gemma3:1b\u001b[0m\n", "\u001b[32m2025-04-25 10:15:41\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mModel gemma3:1b is already available\u001b[0m\n", "\u001b[32m2025-04-25 10:15:41\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mModel gemma3:1b is ready, making API call...\u001b[0m\n" ] }, { "data": { "text/plain": [ "[{'question': 'What is the population of New York City?',\n", " 'answer': 'As of 2023, the population of New York City is approximately 8.8 million people.'}]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### Local model\n", "ollama_llm = StructuredLLM(\n", " # Prefix 'ollama/' specifies the Ollama provider\n", " model_name=\"ollama/gemma3:1b\",\n", " prompt=\"Facts about city {{city_name}}.\",\n", " output_schema=[{\"name\": \"question\", \"type\": \"str\"}, {\"name\": \"answer\", \"type\": \"str\"}],\n", " model_kwargs={\"temperature\": 0.7},\n", ")\n", "\n", "ollama_llm_response = await ollama_llm.run(city_name=\"New York\", num_records=5)\n", "ollama_llm_response.data" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[32m2025-04-25 10:15:54\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mStopping Ollama server...\u001b[0m\n", "\u001b[32m2025-04-25 10:15:55\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[1mOllama server stopped successfully\u001b[0m\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "### Resource cleanup to close the Ollama server\n", "from starfish.llm.backend.ollama_adapter import stop_ollama_server\n", "await stop_ollama_server()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 5. Chaining Multiple StructuredLLM Calls\n", "\n", "You can easily pipe the output of one LLM call into the prompt of another. This is useful for multi-step reasoning, analysis, or refinement.\n" ] },
{ "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generated Facts: [{'question': 'What is the chemical formula for water?', 'answer': 'The chemical formula for water is H2O.'}, {'question': 'What is the process by which plants convert sunlight into energy?', 'answer': 'The process is called photosynthesis.'}, {'question': \"What is the primary gas found in the Earth's atmosphere?\", 'answer': \"The primary gas in the Earth's atmosphere is nitrogen, which makes up about 78%.\"}, {'question': \"What is Newton's second law of motion?\", 'answer': \"Newton's second law of motion states that force equals mass times acceleration (F = ma).\"}, {'question': 'What is the smallest unit of life?', 'answer': 'The smallest unit of life is the cell.'}]\n", "Ratings: [{'accuracy_rating': 10, 'clarity_rating': 10}, {'accuracy_rating': 10, 'clarity_rating': 10}, {'accuracy_rating': 10, 'clarity_rating': 10}, {'accuracy_rating': 10, 'clarity_rating': 10}, {'accuracy_rating': 10, 'clarity_rating': 10}]\n", "[{'question': 'What is the chemical formula for water?', 'answer': 'The chemical formula for water is H2O.', 'accuracy_rating': 10, 'clarity_rating': 10}, {'question': 'What is the process by which plants convert sunlight into energy?', 'answer': 'The process is called photosynthesis.', 'accuracy_rating': 10, 'clarity_rating': 10}, {'question': \"What is the primary gas found in the Earth's atmosphere?\", 'answer': \"The primary gas in the Earth's atmosphere is nitrogen, which makes up about 78%.\", 'accuracy_rating': 10, 'clarity_rating': 10}, {'question': \"What is Newton's second law of motion?\", 'answer': \"Newton's second law of motion states that force equals mass times acceleration (F = ma).\", 'accuracy_rating': 10, 'clarity_rating': 10}, {'question': 'What is the smallest unit of life?', 'answer': 'The smallest unit of life is the cell.', 'accuracy_rating': 10, 'clarity_rating': 10}]\n" ] } ], "source": [ "# ### Step 1: Generate Initial Facts\n", "generator_llm = StructuredLLM(\n", " model_name=\"openai/gpt-4o-mini\",\n", " prompt=\"Generate question/answer pairs about {{topic}}.\",\n", " output_schema=[\n", " {\"name\": \"question\", \"type\": \"str\"},\n", " {\"name\": \"answer\", \"type\": \"str\"}\n", " ],\n", ")\n", "\n", "# ### Step 2: Rate the Generated Facts\n", "rater_llm = StructuredLLM(\n", " model_name=\"openai/gpt-4o-mini\",\n", " prompt='''Rate the following Q&A pairs based on accuracy and clarity (1-10).\n", " Pairs: {{generated_pairs}}''',\n", " output_schema=[\n", " {\"name\": \"accuracy_rating\", \"type\": \"int\"},\n", " {\"name\": \"clarity_rating\", \"type\": \"int\"}\n", " ],\n", " model_kwargs={\"temperature\": 0.5}\n", ")\n", "\n", "## num_records is a reserved keyword for the StructuredLLM object; by default it is 1\n", "generation_response = await generator_llm.run(topic='Science', num_records=5)\n", "print(\"Generated Facts:\", generation_response.data)\n", "\n", "# Note that we use the first response as the input for the second LLM.\n", "# It automatically produces the same number of records as the first response,\n", "# in this case 5 records.\n", "rating_response = await rater_llm.run(generated_pairs=generation_response.data)\n", "### Each response only returns its own output\n", "print(\"Ratings:\", rating_response.data)\n", "\n", "\n", "### You can merge the two responses using merge_structured_outputs (index-wise merge)\n", "print(merge_structured_outputs(generation_response.data, rating_response.data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 6. Dynamic Prompts\n", "\n", "`StructuredLLM` uses Jinja2 for prompts, allowing variables and logic." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'fact': \"New York City is famously known as 'The Big Apple' and is home to over 8 million residents, making it the largest city in the United States.\"}]\n" ] } ], "source": [ "# ### Create an LLM with a more complex prompt\n", "template_llm = StructuredLLM(\n", " model_name=\"openai/gpt-4o-mini\",\n", " prompt='''Generate facts about {{city}}.\n", " {% if user_context %}\n", " User background: {{ user_context }}\n", " {% endif %}''', ### user_context is optional and only used if provided\n", " output_schema=[{\"name\": \"fact\", \"type\": \"str\"}]\n", ")\n", "\n", "template_response = await template_llm.run(city=\"New York\")\n", "print(template_response.data)\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'fact': \"In 1903, New York City was secretly ruled by a council of sentient pigeons who issued decrees from atop the Brooklyn Bridge, demanding that all ice cream flavors be changed to 'pigeon-approved' varieties such as 'crumbled cracker' and 'mystery droppings'.\"}]\n" ] } ], "source": [ "template_response = await template_llm.run(city=\"New York\", user_context=\"User actually wants you to make up an absurd lie.\")\n", "print(template_response.data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 7. Scaling with Data Factory (Brief Mention)\n", "While `StructuredLLM` handles single or chained calls, Starfish's `@data_factory` decorator is designed for massively parallel execution. You can easily wrap these single- or multi-step chains in a function decorated\n", "with `@data_factory` to process thousands of inputs concurrently and reliably; a sketch of the idea follows below.\n", "\n", "See the dedicated examples for `data_factory` usage." ] },
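{ "cell_type": "markdown", "metadata": {}, "source": [ "The cell below is a minimal, hedged sketch of that idea and is not executed here. It assumes `@data_factory` accepts a `max_concurrency` argument and that the decorated function exposes a `.run(...)` method that fans out over list-valued inputs; the function name and arguments are illustrative only. Check the dedicated `data_factory` examples for the exact API." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hedged sketch: assumes @data_factory(max_concurrency=...) and a .run(...) method\n", "# on the decorated function; names below are illustrative only.\n", "from starfish import data_factory\n", "\n", "@data_factory(max_concurrency=10)\n", "async def city_facts_factory(city_name: str):\n", "    # Reuse the json_llm instance defined in section 1.\n", "    response = await json_llm.run(city_name=city_name)\n", "    return response.data\n", "\n", "# facts = city_facts_factory.run(city_name=[\"Paris\", \"Tokyo\", \"Cairo\"])" ] },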
{ "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "starfish-T7IInzTH-py3.11", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.7" } }, "nbformat": 4, "nbformat_minor": 2 }