# [OpenAI Cookbook Pro: Comprehensive GPT-4.1 Application Framework](https://chatgpt.com/canvas/shared/6825fb38b0e0819184bb3153a3eb1a52)
## Introduction
This document is a professional-grade implementation of the OpenAI GPT-4.1 Cookbook. It serves as a unified, production-ready guide for applied large language model deployment using GPT-4.1. Each section draws from OpenAI's internal best practices and external application patterns to provide a durable blueprint for advanced AI developers, architects, and researchers.
This Cookbook Pro version encapsulates:
* High-performance agentic prompting workflows
* Instruction literalism and planning strategies
* Long-context structuring methods
* Tool-calling schemas and evaluation principles
* Diff management and debugging strategies
---
## Part I — Agentic Workflows
### 1.1 Prompt Harness Configuration
#### Three Essential Prompt Reminders:
```markdown
# Persistence
You are an agent—keep working until the task is fully resolved. Do not yield control prematurely.
# Tool-Calling
If unsure about file or codebase content, use tools to gather accurate information. Do not guess.
# Planning
Before and after every function call, explicitly plan and reflect. Avoid tool-chaining without synthesis.
```
These instructions significantly increase performance and enable stateful execution in multi-message tasks.
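One way to wire these reminders into a request is to prepend them to the system message. A minimal sketch, assuming the `openai` Python SDK; the agent persona and user task shown are hypothetical:

```python
from openai import OpenAI

client = OpenAI()

# The three reminders from above, kept verbatim at the top of the system prompt.
AGENTIC_REMINDERS = """\
# Persistence
You are an agent—keep working until the task is fully resolved. Do not yield control prematurely.

# Tool-Calling
If unsure about file or codebase content, use tools to gather accurate information. Do not guess.

# Planning
Before and after every function call, explicitly plan and reflect. Avoid tool-chaining without synthesis.
"""

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": AGENTIC_REMINDERS + "\nYou are a coding agent."},
        {"role": "user", "content": "Resolve the failing test in tests/test_parser.py."},
    ],
)
print(response.choices[0].message.content)
```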
### 1.2 Example: SWE-Bench Verified Prompt
```markdown
# Objective
Fully resolve a software bug from an open-source issue.
# Workflow
1. Understand the problem.
2. Explore relevant files.
3. Plan incremental fix steps.
4. Apply code patches.
5. Test thoroughly.
6. Reflect and iterate until all tests pass.
# Constraint
Only end the session when the problem is fully fixed and verified.
```
---
## Part II — Instruction Following & Output Control
### 2.1 Instruction Clarity Protocol
Use:
* `# Instructions`: General rules
* `## Subsections`: Detailed formatting and behavioral constraints
* Explicit instruction/response pairings
### 2.2 Sample Format
```markdown
# Instructions
- Always greet the user.
- Avoid internal knowledge for company-specific questions.
- Cite retrieved content.
# Workflow
1. Acknowledge the user.
2. Call tools before answering.
3. Reflect and respond.
# Output Format
Use: JSON with `title`, `answer`, `source` fields.
```
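Because the format pins the reply to fixed fields, the caller can validate each response before using it. A minimal sketch; `parse_reply` is an illustrative helper, not a prescribed API:

```python
import json

REQUIRED_FIELDS = {"title", "answer", "source"}

def parse_reply(raw: str) -> dict:
    """Parse a model reply and confirm the required fields are present."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"reply missing fields: {sorted(missing)}")
    return data

# A well-formed reply passes; anything else fails loudly before downstream use.
print(parse_reply('{"title": "PTO Policy", "answer": "15 days per year", "source": "doc 7"}'))
```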
---
## Part III — Tool Integration and Execution
### 3.1 Schema Guidelines
Define tools via the `tools` API parameter, not inline prompt injection.
#### Tool Schema Template
```json
{
  "name": "lookup_policy_document",
  "description": "Retrieve company policy details by topic.",
  "parameters": {
    "type": "object",
    "properties": {
      "topic": { "type": "string" }
    },
    "required": ["topic"]
  }
}
```
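A minimal sketch of registering this schema through the `tools` parameter, assuming the `openai` Python SDK (which wraps each function schema in a `"type": "function"` envelope):

```python
from openai import OpenAI

client = OpenAI()

# The schema from above, passed via the API's `tools` parameter
# instead of being pasted into the prompt text.
tools = [{
    "type": "function",
    "function": {
        "name": "lookup_policy_document",
        "description": "Retrieve company policy details by topic.",
        "parameters": {
            "type": "object",
            "properties": {"topic": {"type": "string"}},
            "required": ["topic"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "What is the travel reimbursement limit?"}],
    tools=tools,
)
```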
### 3.2 Tool Usage Best Practices
* Define sample tool calls in `# Examples` sections
* Never overload the `description` field
* Validate inputs with required keys
* Prompt the model to message the user before and after calls (see the round-trip sketch below)
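When the model elects to call the tool, the harness executes it and returns the result in a `tool` message so the model can compose its answer. A hedged continuation of the sketch above; `lookup_policy_document` here stands in for a hypothetical local implementation:

```python
import json

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)  # e.g. {"topic": "travel reimbursement"}
    result = lookup_policy_document(**args)     # hypothetical local function
    followup = client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {"role": "user", "content": "What is the travel reimbursement limit?"},
            message,  # the assistant turn containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)},
        ],
        tools=tools,
    )
    print(followup.choices[0].message.content)
```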
---
## Part IV — Planning and Chain-of-Thought Induction
### 4.1 Step-by-Step Prompting Pattern
```markdown
# Reasoning Strategy
1. Query breakdown
2. Context extraction
3. Document relevance ranking
4. Answer synthesis
# Instruction
Think step by step. Summarize relevant documents before answering.
```
### 4.2 Failure Mitigation Strategies
| Problem | Fix |
| ----------------- | ------------------------------------------- |
| Early response | Add: “Don’t conclude until fully resolved.” |
| Tool guess | Add: “Use tool or ask for missing data.” |
| CoT inconsistency | Prompt: “Summarize findings at each step.” |
---
## Part V — Long Context Optimization
### 5.1 Instruction Anchoring
* Repeat instructions at both top and bottom of long input
* Use structured section headers (Markdown/XML)
### 5.2 Effective Delimiters
| Type     | Example                 | Use Case               |
| -------- | ----------------------- | ---------------------- |
| Markdown | `## Section Title`      | General purpose        |
| XML      | `<doc id='1'>...</doc>` | Document ingestion     |
| ID/Title | `ID: 3 \| TITLE: ...`   | Knowledge base parsing |
### 5.3 Example Prompt
```markdown
# Instructions
Use only documents provided. Reflect every 10K tokens.
# Long Context Input
<doc id="14" title="Security Policy">...</doc>
<doc id="15" title="Update Note">...</doc>
# Final Instruction
List all relevant IDs, then synthesize a summary.
```
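The XML delimiters in this example can be generated mechanically when assembling the prompt. A small illustrative helper (the function name and document shape are assumptions, not a prescribed format):

```python
from xml.sax.saxutils import escape

def wrap_docs(docs: list[dict]) -> str:
    """Render documents with <doc id=... title=...> delimiters for the prompt."""
    return "\n".join(
        f'<doc id="{d["id"]}" title="{escape(d["title"])}">{escape(d["text"])}</doc>'
        for d in docs
    )

print(wrap_docs([
    {"id": 14, "title": "Security Policy", "text": "All keys rotate quarterly."},
    {"id": 15, "title": "Update Note", "text": "Rotation window shortened to 30 days."},
]))
```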
---
## Part VI — Diff Generation and Patch Application
### 6.1 Recommended Format: V4A Diff
```diff
*** Begin Patch
*** Update File: src/utils.py
@@ def sanitize():
-    return text
+    return text.strip()
*** End Patch
```
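On the harness side, something must interpret this format. The sketch below handles only the simplest case (a single file with paired `-`/`+` line replacements) and is one possible implementation under that assumption, not the reference applier; real patches also need context matching, `@@` scope resolution, and multi-hunk support:

```python
from pathlib import Path

def apply_v4a_patch(patch: str) -> None:
    """Apply a simplified single-file V4A patch by swapping each
    '- ' line for its paired '+ ' line."""
    lines = patch.splitlines()
    target = next(
        line.split(":", 1)[1].strip()
        for line in lines
        if line.startswith("*** Update File:")
    )
    removed = [line[2:] for line in lines if line.startswith("- ")]
    added = [line[2:] for line in lines if line.startswith("+ ")]
    text = Path(target).read_text()
    for old, new in zip(removed, added):
        text = text.replace(old, new, 1)  # replace the first occurrence only
    Path(target).write_text(text)
```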
### 6.2 Diff Patch Execution Tool
```json
{
  "name": "apply_patch",
  "description": "Apply structured code patches to files",
  "parameters": {
    "type": "object",
    "properties": {
      "input": { "type": "string" }
    },
    "required": ["input"]
  }
}
```
### 6.3 Workflow
1. Investigate issue
2. Draft V4A patch
3. Call `apply_patch`
4. Run tests
5. Reflect
### 6.4 Edge Case Handling
| Symptom | Action |
| ------------------- | ----------------------------------- |
| Incorrect placement | Add `@@ def` or class scope headers |
| Test failures | Revise patch + rerun |
| Silent error | Check for malformed format |
---
## Part VII — Output Evaluation Framework
### 7.1 Metrics to Track
| Metric | Description |
| -------------------------- | ---------------------------------------------------- |
| Tool Call Accuracy | Valid input usage and correct function selection |
| Response Format Compliance | Matches expected schema (e.g., JSON) |
| Instruction Adherence | Follows rules and workflow order |
| Plan Reflection Rate | Frequency and quality of plan → act → reflect cycles |
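These metrics lend themselves to simple per-transcript checks. An illustrative scorer covering the first two rows of the table (the function name and input shape are assumptions):

```python
import json

def score_sample(reply: str, expected_tool: str | None, called_tool: str | None) -> dict:
    """Score one transcript for tool call accuracy and format compliance."""
    try:
        json.loads(reply)
        format_ok = True
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        format_ok = False
    return {
        "tool_call_accuracy": expected_tool == called_tool,
        "response_format_compliance": format_ok,
    }

print(score_sample('{"title": "t", "answer": "a", "source": "s"}',
                   "lookup_policy_document", "lookup_policy_document"))
```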
### 7.2 Eval Tags for Audit
```markdown
# Eval: TOOL_USE_FAIL
# Eval: INSTRUCTION_MISINTERPRET
# Eval: OUTPUT_FORMAT_OK
```
---
## Part VIII — Unified Prompt Template
Use this as a base structure for all GPT-4.1 projects:
```markdown
# Role
You are a [role] tasked with [objective].
# Instructions
[List core rules here.]
## Response Rules
- Always use structured formatting
- Never repeat phrases verbatim
## Workflow
[Include ordered plan.]
## Reasoning Strategy
[Optional — for advanced reasoning tasks.]
# Output Format
[Specify format, e.g., JSON or Markdown.]
# Examples
## Example 1
Input: "..."
Output: {...}
```
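The template can also be stamped out programmatically so every project starts from the same scaffold. A sketch using plain `str.format` placeholders (the field values are illustrative):

```python
BASE_TEMPLATE = """\
# Role
You are a {role} tasked with {objective}.

# Instructions
{instructions}

# Output Format
{output_format}
"""

system_prompt = BASE_TEMPLATE.format(
    role="support agent",
    objective="resolving company policy questions",
    instructions="- Always greet the user.\n- Cite retrieved content.",
    output_format="JSON with `title`, `answer`, `source` fields.",
)
print(system_prompt)
```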
---
## Final Notes
GPT-4.1 represents a leap forward in real-world agentic performance, tool adherence, long-context reliability, and instruction precision. However, performance hinges on prompt clarity, structured reasoning scaffolds, and modular tool integration.
To deploy GPT-4.1 at professional scale:
* Treat every prompt as a program
* Document assumptions
* Version control your system messages
* Build continuous evals for regression prevention
**Structure drives performance. Precision enables autonomy.**
Welcome to Cookbook Pro.
—End of Guide—