# Model Version

### What is a Model?

A model is the AI engine behind your agent. Each model has different speed, accuracy, and cost trade-offs.

### What's the Difference?

Here's a quick breakdown of the models you can currently use, along with each model's average cost per response:

#### OpenAI Models

**GPT-4o (2024-11-20)**

* Best-in-class performance and reasoning
* Ideal for advanced use cases, voice agents, and long replies
* $0.0109 /avg response

**GPT-4o-mini**

* Fast, efficient, and highly affordable
* Great for general tasks, FAQs, and quick replies
* $0.0007 /avg response

**GPT-4o (2024-08-06)**

* Previous version of GPT-4o with solid performance
* Reliable for production workloads
* $0.0109 /avg response

**GPT-4.1**

* Strong reasoning and accuracy
* Works well for creative tasks or agents needing nuance
* $0.0088 /avg response

**GPT-4.1-mini**

* Balanced power and cost
* Smart choice for mid-volume use
* $0.0017 /avg response

**GPT-4.1-nano**

* Ultra lightweight, fast, and budget-friendly
* Best for simple Q&A and high-volume bots
* $0.0004 /avg response

**GPT-5**

* OpenAI's most advanced model with unified reasoning
* Exceptional for coding, complex problem-solving, and expert-level responses
* Built-in smart routing between fast and deep-thinking modes
* $0.0055 /avg response

**GPT-5-mini**

* Cost-effective variant with strong capabilities
* Optimized for speed while maintaining quality
* Great for lighter applications with good reasoning
* $0.0011 /avg response

**GPT-5-nano**

* Ultra-fast, ultra-low-latency model
* Optimized for real-time and embedded applications
* Perfect for high-throughput, speed-critical tasks
* $0.0002 /avg response

#### Anthropic (Claude) Models

**Claude 3.7 Sonnet**

* First hybrid reasoning model combining instant and extended thinking
* State-of-the-art for coding, agentic tasks, and complex workflows
* Exceptional at long context understanding and structured outputs
* Can think through problems step-by-step when needed
* $0.0131 /avg response

**Claude 3.5 Haiku**

* Fast, light, and efficient
* Great choice for speed-focused use cases
* Good balance of quality and performance
* $0.0011 /avg response

#### xAI Models

**Grok-2-1212**

* Enhanced accuracy and instruction-following
* Strong multilingual capabilities
* More personality and conversational style
* Good mix of speed, accuracy, and creativity
* $0.0088 /avg response

***

**Key takeaways:**

* **For complex reasoning & coding**: GPT-5, Claude 3.7 Sonnet
* **For best value**: GPT-4o-mini, GPT-4.1-nano, GPT-5-nano, Claude 3.5 Haiku
* **For balanced performance**: GPT-4.1, GPT-5-mini, Grok-2-1212
* **For production reliability**: GPT-4o (2024-11-20), GPT-4.1
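
To see how the per-response prices add up at volume, here is a minimal Python sketch. It copies the average costs listed on this page and multiplies them by a response count. The `estimate_cost` helper and the 10,000-response volume are hypothetical, chosen only for illustration, and actual spend will vary with reply length and usage.

```python
# Average cost per response in USD, copied from the model list on this page.
AVG_COST_PER_RESPONSE = {
    "GPT-4o (2024-11-20)": 0.0109,
    "GPT-4o-mini": 0.0007,
    "GPT-4o (2024-08-06)": 0.0109,
    "GPT-4.1": 0.0088,
    "GPT-4.1-mini": 0.0017,
    "GPT-4.1-nano": 0.0004,
    "GPT-5": 0.0055,
    "GPT-5-mini": 0.0011,
    "GPT-5-nano": 0.0002,
    "Claude 3.7 Sonnet": 0.0131,
    "Claude 3.5 Haiku": 0.0011,
    "Grok-2-1212": 0.0088,
}


def estimate_cost(model: str, responses: int) -> float:
    """Rough estimate (hypothetical helper): average cost times response count."""
    return round(AVG_COST_PER_RESPONSE[model] * responses, 2)


if __name__ == "__main__":
    monthly_responses = 10_000  # hypothetical volume, not taken from the docs
    # List models from cheapest to most expensive at that volume.
    for model, _ in sorted(AVG_COST_PER_RESPONSE.items(), key=lambda kv: kv[1]):
        print(f"{model:<22} ${estimate_cost(model, monthly_responses):>8.2f}")
```

At 10,000 responses, the listed averages put GPT-5-nano at roughly $2 and Claude 3.7 Sonnet at roughly $131, so the gap between models matters most for high-volume agents.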

{% embed url="<https://youtu.be/BpF_3IoRV18>" %}

