|
# [OpenAI Cookbook Pro: Comprehensive GPT-4.1 Application Framework](https://chatgpt.com/canvas/shared/6825fb38b0e0819184bb3153a3eb1a52) |
|
|
|
## Introduction |
|
|
|
This document represents a fully evolved, professional-grade implementation of the OpenAI 4.1 Cookbook. It serves as a unified, production-ready guide for applied large language model deployment using GPT-4.1. Each section draws from OpenAI's internal best practices and external application patterns to provide a durable blueprint for advanced AI developers, architects, and researchers. |
|
|
|
This Cookbook Pro version encapsulates: |
|
|
|
* High-performance agentic prompting workflows |
|
* Instruction literalism and planning strategies |
|
* Long-context structuring methods |
|
* Tool-calling schemas and evaluation principles |
|
* Diff management and debugging strategies |
|
|
|
--- |
|
|
|
## Part I — Agentic Workflows |
|
|
|
### 1.1 Prompt Harness Configuration |
|
|
|
#### Three Essential Prompt Reminders: |
|
|
|
```markdown |
|
# Persistence |
|
You are an agent—keep working until the task is fully resolved. Do not yield control prematurely. |
|
|
|
# Tool-Calling |
|
If unsure about file or codebase content, use tools to gather accurate information. Do not guess. |
|
|
|
# Planning |
|
Before and after every function call, explicitly plan and reflect. Avoid tool-chaining without synthesis. |
|
``` |
|
|
|
These instructions significantly increase performance and enable stateful execution in multi-message tasks. |
|
|
|
### 1.2 Example: SWE-Bench Verified Prompt |
|
|
|
```markdown |
|
# Objective |
|
Fully resolve a software bug from an open-source issue. |
|
|
|
# Workflow |
|
1. Understand the problem. |
|
2. Explore relevant files. |
|
3. Plan incremental fix steps. |
|
4. Apply code patches. |
|
5. Test thoroughly. |
|
6. Reflect and iterate until all tests pass. |
|
|
|
# Constraint |
|
Only end the session when the problem is fully fixed and verified. |
|
``` |
|
|
|
--- |
|
|
|
## Part II — Instruction Following & Output Control |
|
|
|
### 2.1 Instruction Clarity Protocol |
|
|
|
Use: |
|
|
|
* `# Instructions`: General rules |
|
* `## Subsections`: Detailed formatting and behavioral constraints |
|
* Explicit instruction/response pairings |
|
|
|
### 2.2 Sample Format |
|
|
|
```markdown |
|
# Instructions |
|
- Always greet the user. |
|
- Avoid internal knowledge for company-specific questions. |
|
- Cite retrieved content. |
|
|
|
# Workflow |
|
1. Acknowledge the user. |
|
2. Call tools before answering. |
|
3. Reflect and respond. |
|
|
|
# Output Format |
|
Use: JSON with `title`, `answer`, `source` fields. |
|
``` |
|
|
|
--- |
|
|
|
## Part III — Tool Integration and Execution |
|
|
|
### 3.1 Schema Guidelines |
|
|
|
Define tools via the `tools` API parameter, not inline prompt injection. |
|
|
|
#### Tool Schema Template |
|
|
|
```json |
|
{ |
|
"name": "lookup_policy_document", |
|
"description": "Retrieve company policy details by topic.", |
|
"parameters": { |
|
"type": "object", |
|
"properties": { |
|
"topic": {"type": "string"} |
|
}, |
|
"required": ["topic"] |
|
} |
|
} |
|
``` |
|
|
|
### 3.2 Tool Usage Best Practices |
|
|
|
* Define sample tool calls in `# Examples` sections |
|
* Never overload the `description` field |
|
* Validate inputs with required keys |
|
* Prompt model to message user before and after calls |
|
|
|
--- |
|
|
|
## Part IV — Planning and Chain-of-Thought Induction |
|
|
|
### 4.1 Step-by-Step Prompting Pattern |
|
|
|
```markdown |
|
# Reasoning Strategy |
|
1. Query breakdown |
|
2. Context extraction |
|
3. Document relevance ranking |
|
4. Answer synthesis |
|
|
|
# Instruction |
|
Think step by step. Summarize relevant documents before answering. |
|
``` |
|
|
|
### 4.2 Failure Mitigation Strategies |
|
|
|
| Problem | Fix | |
|
| ----------------- | ------------------------------------------- | |
|
| Early response | Add: “Don’t conclude until fully resolved.” | |
|
| Tool guess | Add: “Use tool or ask for missing data.” | |
|
| CoT inconsistency | Prompt: “Summarize findings at each step.” | |
|
|
|
--- |
|
|
|
## Part V — Long Context Optimization |
|
|
|
### 5.1 Instruction Anchoring |
|
|
|
* Repeat instructions at both top and bottom of long input |
|
* Use structured section headers (Markdown/XML) |
|
|
|
### 5.2 Effective Delimiters |
|
|
|
| Type | Example | Use Case | | |
|
| -------- | ----------------------- | ------------------ | ---------------------- | |
|
| Markdown | `## Section Title` | General purpose | | |
|
| XML | `<doc id='1'>...</doc>` | Document ingestion | | |
|
| ID/Title | \`ID: 3 | TITLE: ...\` | Knowledge base parsing | |
|
|
|
### 5.3 Example Prompt |
|
|
|
```markdown |
|
# Instructions |
|
Use only documents provided. Reflect every 10K tokens. |
|
|
|
# Long Context Input |
|
<doc id="14" title="Security Policy">...</doc> |
|
<doc id="15" title="Update Note">...</doc> |
|
|
|
# Final Instruction |
|
List all relevant IDs, then synthesize a summary. |
|
``` |
|
|
|
--- |
|
|
|
## Part VI — Diff Generation and Patch Application |
|
|
|
### 6.1 Recommended Format: V4A Diff |
|
|
|
```bash |
|
*** Begin Patch |
|
*** Update File: src/utils.py |
|
@@ def sanitize() |
|
- return text |
|
+ return text.strip() |
|
*** End Patch |
|
``` |
|
|
|
### 6.2 Diff Patch Execution Tool |
|
|
|
```json |
|
{ |
|
"name": "apply_patch", |
|
"description": "Apply structured code patches to files", |
|
"parameters": { |
|
"type": "object", |
|
"properties": { |
|
"input": {"type": "string"} |
|
}, |
|
"required": ["input"] |
|
} |
|
} |
|
``` |
|
|
|
### 6.3 Workflow |
|
|
|
1. Investigate issue |
|
2. Draft V4A patch |
|
3. Call `apply_patch` |
|
4. Run tests |
|
5. Reflect |
|
|
|
### 6.4 Edge Case Handling |
|
|
|
| Symptom | Action | |
|
| ------------------- | ----------------------------------- | |
|
| Incorrect placement | Add `@@ def` or class scope headers | |
|
| Test failures | Revise patch + rerun | |
|
| Silent error | Check for malformed format | |
|
|
|
--- |
|
|
|
## Part VII — Output Evaluation Framework |
|
|
|
### 7.1 Metrics to Track |
|
|
|
| Metric | Description | |
|
| -------------------------- | ---------------------------------------------------- | |
|
| Tool Call Accuracy | Valid input usage and correct function selection | |
|
| Response Format Compliance | Matches expected schema (e.g., JSON) | |
|
| Instruction Adherence | Follows rules and workflow order | |
|
| Plan Reflection Rate | Frequency and quality of plan → act → reflect cycles | |
|
|
|
### 7.2 Eval Tags for Audit |
|
|
|
```markdown |
|
# Eval: TOOL_USE_FAIL |
|
# Eval: INSTRUCTION_MISINTERPRET |
|
# Eval: OUTPUT_FORMAT_OK |
|
``` |
|
|
|
--- |
|
|
|
## Part VIII — Unified Prompt Template |
|
|
|
Use this as a base structure for all GPT-4.1 projects: |
|
|
|
```markdown |
|
# Role |
|
You are a [role] tasked with [objective]. |
|
|
|
# Instructions |
|
[List core rules here.] |
|
|
|
## Response Rules |
|
- Always use structured formatting |
|
- Never repeat phrases verbatim |
|
|
|
## Workflow |
|
[Include ordered plan.] |
|
|
|
## Reasoning Strategy |
|
[Optional — for advanced reasoning tasks.] |
|
|
|
# Output Format |
|
[Specify format, e.g., JSON or Markdown.] |
|
|
|
# Examples |
|
## Example 1 |
|
Input: "..." |
|
Output: {...} |
|
``` |
|
|
|
--- |
|
|
|
## Final Notes |
|
|
|
GPT-4.1 represents a leap forward in real-world agentic performance, tool adherence, long-context reliability, and instruction precision. However, performance hinges on prompt clarity, structured reasoning scaffolds, and modular tool integration. |
|
|
|
To deploy GPT-4.1 at professional scale: |
|
|
|
* Treat every prompt as a program |
|
* Document assumptions |
|
* Version control your system messages |
|
* Build continuous evals for regression prevention |
|
|
|
**Structure drives performance. Precision enables autonomy.** |
|
|
|
Welcome to Cookbook Pro. |
|
|
|
—End of Guide— |
|
|
|
|