Update README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,122 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
---
|
4 |
+
Model Page: [[GPT-5 nano](https://www.cometapi.com/gpt-5-nano-api/)](https://www.cometapi.com/gpt-5-mini-api/)
|
5 |
+
|
6 |
+
**GPT-5 Nano** is the ultra-light, low-latency variant of OpenAI’s GPT-5 family, designed for **cost-sensitive**, **real-time**, and high-throughput applications where speed and price matter more than deep multi-step reasoning. It keeps the GPT-5 instruction-following and safety improvements but trades off reasoning depth and some long-context capabilities to deliver **very low latency** and **very low token cost**.
|
7 |
+
|
8 |
+
## Basic Information & Features
|
9 |
+
|
10 |
+
- **Model Name**: `gpt-5-nano`
|
11 |
+
|
12 |
+
- **Multimodal Support**: Text & Vision (up to 400K context tokens)
|
13 |
+
|
14 |
+
- **Context Window**: 400,000 input tokens; 128,000 output tokens
|
15 |
+
|
16 |
+
- Pricing
|
17 |
+
|
18 |
+
:
|
19 |
+
|
20 |
+
- Input: $0.05 per 1M tokens
|
21 |
+
- Output: $0.40 per 1M tokens
|
22 |
+
|
23 |
+
Compared to GPT-5 main, GPT-5 nano trades off **raw power** for **ultra-low latency** and **reduced cost**, making it ideal for **interactive applications** where speed and budget are critical .
|
24 |
+
|
25 |
+
## Technical Details
|
26 |
+
|
27 |
+
GPT-5 nano leverages the same **transformer architecture** as its larger siblings but incorporates advanced **quantization** and **parameter pruning** techniques to shrink its footprint. It features:
|
28 |
+
|
29 |
+
- **Minimal Reasoning**: A streamlined reasoning pathway optimized for single-turn inference, emulating GPT-5’s “built-in thinking” at reduced compute.
|
30 |
+
- **Verbosity Control**: Adjustable verbosity parameter to fine-tune response length and detail.
|
31 |
+
- **Efficient Attention**: Custom attention kernels for low-memory deployment without sacrificing the model’s ability to handle long sequences.
|
32 |
+
|
33 |
+
When benchmarked against GPT-4 o mini, GPT-5 nano demonstrates up to **2× faster** throughput on identical hardware, thanks to its **lightweight** design .
|
34 |
+
|
35 |
+
------
|
36 |
+
|
37 |
+
## Benchmark Performance
|
38 |
+
|
39 |
+
Although GPT-5 main leads in absolute performance, GPT-5 nano delivers **competitive accuracy** on key benchmarks:
|
40 |
+
|
41 |
+
- **SWE-Bench (Software Engineering)**: Achieves ~75% of GPT-5 main’s code-generation accuracy while reducing inference time by ~50%.
|
42 |
+
- **HealthBench**: Maintains ~80% of clinical reasoning performance of GPT-5 main, suitable for basic triage and summary tasks .
|
43 |
+
- **Multilingual Tests**: Retains robust support across 12 languages, declining by less than 10% in translation quality compared to GPT-5 main .
|
44 |
+
|
45 |
+
These results underscore GPT-5 nano’s suitability for **cost-sensitive** and **latency-critical** environments where slight trade-offs in accuracy are acceptable.
|
46 |
+
|
47 |
+
------
|
48 |
+
|
49 |
+
## Model Version & Lineage
|
50 |
+
|
51 |
+
- **Model Card Name**: `gpt-5-nano`
|
52 |
+
|
53 |
+
- **Knowledge Cut-off**: May 30, 2024 for nano variant
|
54 |
+
|
55 |
+
- Position in Family
|
56 |
+
|
57 |
+
:
|
58 |
+
|
59 |
+
- Replaces GPT-4.1 nano as the entry-level offering
|
60 |
+
- Sits below GPT-5 mini and GPT-5 main in the performance hierarchy
|
61 |
+
|
62 |
+
The nano variant inherits improvements from GPT-5 main’s training, including **reduced hallucinations** and **structural reasoning**, albeit at a smaller scale.
|
63 |
+
|
64 |
+
------
|
65 |
+
|
66 |
+
## Limitations
|
67 |
+
|
68 |
+
While GPT-5 nano excels in **speed** and **cost**, it has inherent drawbacks:
|
69 |
+
|
70 |
+
- **Reduced Depth**: Limited capacity for **multi-step reasoning** compared to GPT-5 main, making it less ideal for complex planning tasks.
|
71 |
+
- **Higher Hallucination Rate**: Slightly elevated risk of generating incorrect details under **ambiguous prompts**.
|
72 |
+
- **Lower Contextual Recall**: Although the raw token window is large, internal mechanisms favor *recent* context, potentially overlooking earlier details in very long dialogues .
|
73 |
+
|
74 |
+
Developers should weigh these constraints when choosing GPT-5 nano for applications demanding **high factual integrity**.
|
75 |
+
|
76 |
+
------
|
77 |
+
|
78 |
+
## Use Cases
|
79 |
+
|
80 |
+
GPT-5 nano shines in scenarios where **real-time** responses and **cost control** are paramount:
|
81 |
+
|
82 |
+
1. **Mobile Assistants**: On-device chatbots for messaging apps, delivering **instant replies** without cloud overhead.
|
83 |
+
2. **IoT Interfaces**: Voice-enabled controls in smart home devices, capitalizing on **low-latency inference**.
|
84 |
+
3. **Edge Analytics**: Summarizing sensor data locally before batching uploads, reducing bandwidth usage.
|
85 |
+
4. **Educational Tools**: Lightweight tutoring bots that operate in-browser or on low-end hardware, providing **interactive learning**.
|
86 |
+
|
87 |
+
Compared to running GPT-5 main in a heavy cloud environment, nano enables **distributed deployment** at scale with **predictable per-token costs**.
|
88 |
+
|
89 |
+
## How to call ***\*`gpt-5-nano`\**** API from CometAPI
|
90 |
+
|
91 |
+
### **`\**\*\*`gpt-5-nano`\*\**\*`** API Pricing in CometAPI,20% off the official price:
|
92 |
+
|
93 |
+
| Input Tokens | $0.04 |
|
94 |
+
| ------------- | ----- |
|
95 |
+
| Output Tokens | $0.32 |
|
96 |
+
|
97 |
+
**See Also [Price](https://api.cometapi.com/pricing)**
|
98 |
+
|
99 |
+
### Required Steps
|
100 |
+
|
101 |
+
- Log in to [cometapi.com](http://cometapi.com/). If you are not our user yet, please register first
|
102 |
+
- Get the access credential API key of the interface. Click “Add Token” at the API token in the personal center, get the token key: sk-xxxxx and submit.
|
103 |
+
- Get the url of this site: https://api.cometapi.com/
|
104 |
+
|
105 |
+
### Use Method
|
106 |
+
|
107 |
+
1. Select the “`**`gpt-5-nano`**`” / “**`gpt-5-nano-2025-08-07`**” endpoint to send the API request and set the request body. The request method and request body are obtained from our website API doc. Our website also provides Apifox test for your convenience.
|
108 |
+
2. Replace <YOUR_API_KEY> with your actual CometAPI key from your account.
|
109 |
+
3. Insert your question or request into the content field—this is what the model will respond to.
|
110 |
+
4. . Process the API response to get the generated answer.
|
111 |
+
|
112 |
+
CometAPI provides a fully compatible REST API—for seamless migration. Key details to [API doc](https://apidoc.cometapi.com/api-13851472):
|
113 |
+
|
114 |
+
- **Core Parameters**: `prompt`, `max_tokens_to_sample`, `temperature`, `stop_sequences`
|
115 |
+
- **Endpoint:** https://api.cometapi.com/v1/chat/completions
|
116 |
+
- **Model Parameter:** “`gpt-5-nano`” / “`gpt-5-nano-2025-08-07`“
|
117 |
+
- **Authentication:** ` Bearer YOUR_CometAPI_API_KEY`
|
118 |
+
- **Content-Type:** `application/json` .
|
119 |
+
|
120 |
+
API Call Instructions: gpt-5-chat-latest should be called using the standard `/v1/chat/completions forma`t. For other models (gpt-5, gpt-5-mini, gpt-5-nano, and their dated versions), using `the /v1/responses format`[ is recommended](https://apidoc.cometapi.com/api-18535147).Currently two modes are available.
|
121 |
+
|
122 |
+
**See Also [GPT-5](https://www.cometapi.com/gpt-5-api/)** Model
|