Anthropic SDK

TL;DR

Official Anthropic client libraries for the Claude family of models, available in Python, TypeScript, Java, Go, and Ruby with parity across the messages API surface.
Supports tool use via JSON-schema tool definitions, streaming via Server-Sent Events, vision input, prompt caching, batch processing, and extended thinking on Claude Opus and Sonnet tiers.
Supports the Model Context Protocol (MCP), the open protocol Anthropic published in late 2024, through an MCP connector that lets Claude call tools hosted on remote MCP servers.
Authenticates with an `ANTHROPIC_API_KEY` environment variable; also available via Amazon Bedrock and Google Vertex AI with provider-specific adapters in the same SDK.

The Messages API

The Anthropic SDK's primary surface is the Messages API. A request takes a model identifier, a list of messages (each with a role and content), an optional system prompt, and parameters such as `max_tokens` (required) and `temperature`. The response is a Message object containing one or more content blocks, such as `text`, `tool_use`, or `thinking`. Image blocks are an input type: they appear in user messages, not in responses.

The content-block model is what makes Claude's outputs structured rather than just a string. A response that uses a tool typically contains an optional `text` block (the assistant's explanation of what it is about to do) followed by a `tool_use` block (an `id`, the tool name, and JSON arguments), and ends with `stop_reason: "tool_use"`. The application runs the tool, appends the assistant's content to the history, adds a user message containing a `tool_result` block that references the `tool_use_id`, and resends. Every iteration has the same shape.

python

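import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

response = client.messages.create(
    # Model alias; pin a dated snapshot ID from the models list in production.
    model="claude-opus-4-5",
    max_tokens=1024,
    # One application-side tool, described by a JSON Schema for its arguments.
    tools=[{
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }],
    messages=[{"role": "user", "content": "What's the weather in London?"}],
)

# The response is a list of content blocks; print any tool call the model requested.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)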
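
The request above only shows the first half of the round trip. A minimal sketch of the rest of the loop might look like the following. The `get_weather` implementation, the `HANDLERS` registry, and `run_tool_loop` are hypothetical names introduced for illustration; `client` is assumed to be an `anthropic.Anthropic()` instance, or anything with the same `messages.create` shape.

```python
def get_weather(city: str) -> str:
    """Hypothetical stand-in for a real weather lookup."""
    return f"Sunny in {city}"


# Hypothetical registry mapping tool names to local Python callables.
HANDLERS = {"get_weather": get_weather}


def run_tool_loop(client, model, tools, messages, max_turns=5):
    """Call the model until it stops requesting tools, then return the final response."""
    messages = list(messages)  # work on a copy of the caller's history
    for _ in range(max_turns):
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response
        # Echo the assistant turn back, including its tool_use blocks.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = HANDLERS[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,  # ties the result to the request
                    "content": output,
                })
        # Tool results are sent back inside a user message.
        messages.append({"role": "user", "content": results})
    raise RuntimeError("tool loop did not finish within max_turns")
```

The `max_turns` cap is a design choice for this sketch, not an SDK feature: it keeps a misbehaving tool from driving an unbounded number of API calls.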

Tool Use

Tool use is JSON-schema based: each tool has a name, description, and `input_schema` (a JSON Schema describing the expected arguments). The model decides when to call a tool and emits a `tool_use` block with structured arguments. The application is responsible for executing the tool and returning the result.
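
Because execution is the application's job, so is failure handling. The sketch below wraps one tool call and reports exceptions back to the model through the `is_error` field of the `tool_result` block instead of crashing the loop. `lookup_order`, `TOOL_HANDLERS`, and `execute_tool_call` are hypothetical names used only for illustration.

```python
import json


def lookup_order(order_id: str) -> dict:
    """Hypothetical tool implementation; a real one would query a database."""
    if not order_id.startswith("ord_"):
        raise ValueError(f"unknown order id: {order_id}")
    return {"order_id": order_id, "status": "shipped"}


# Hypothetical registry of tools the application knows how to run.
TOOL_HANDLERS = {"lookup_order": lookup_order}


def execute_tool_call(tool_use_id: str, name: str, arguments: dict) -> dict:
    """Run one tool call and wrap the outcome in a tool_result block."""
    try:
        handler = TOOL_HANDLERS[name]
        content = json.dumps(handler(**arguments))
        is_error = False
    except Exception as exc:  # report the failure to the model rather than raising
        content = f"{type(exc).__name__}: {exc}"
        is_error = True
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": content,
        "is_error": is_error,
    }
```

Returning the error text gives the model a chance to correct its arguments and retry, which is usually preferable to aborting the conversation.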

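Anthropic also defines a set of built-in tools. Two of them, `web_search` and `code_execution`, are server tools: Anthropic's platform executes them and returns the results in the same response, with no round trip through the application. The others, `computer_use` (screen, keyboard, and mouse control), `text_editor`, and `bash`, have Anthropic-defined schemas but are executed by the application, which must still return a `tool_result`. Built-in tools are declared with a versioned `type` string rather than a hand-written `input_schema`, which significantly simplifies common agentic patterns.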
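
As a rough illustration of how the two kinds of tool call differ in a response: calls the platform has already executed arrive as `server_tool_use` blocks, while calls the application still has to run arrive as `tool_use` blocks. The version suffix in the `type` string below is the one assumed current at the time of writing and may have been superseded; `pending_client_calls` is a hypothetical helper.

```python
# Declaration of the server-side web search tool. The version suffix is an
# assumption; check the current documentation before relying on it.
WEB_SEARCH_TOOL = {"type": "web_search_20250305", "name": "web_search", "max_uses": 3}


def pending_client_calls(content_blocks):
    """Return only the tool calls the application still has to execute.

    Blocks of type "server_tool_use" were already run on Anthropic's side,
    so they are skipped here.
    """
    return [block for block in content_blocks if block.type == "tool_use"]
```

Passing `WEB_SEARCH_TOOL` in the `tools` list alongside custom tools is enough to enable it; no `input_schema` is written by hand.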

Prompt Caching

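Prompt caching is one of the most impactful production features in the SDK. Mark a content block with `cache_control: {"type": "ephemeral"}` and Anthropic caches the prompt prefix up to and including that block for five minutes by default, refreshed on each hit; a one-hour lifetime is available by adding `"ttl": "1h"` to the same object. Subsequent requests that share the same prefix hit the cache and pay roughly 10% of the normal input price for the cached tokens. Writing to the cache costs somewhat more than a normal input token (about 25% more for the five-minute cache), so the saving starts with the first reuse.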

For agents with large system prompts, RAG context, or long conversation history, this can cut input cost by 70-90% with no change in behaviour, provided most requests arrive within the cache lifetime. A cache breakpoint covers the entire prefix up to the marked block, and the prefix is assembled in the order tools, system, then messages, so order matters: put static content (tool definitions, system prompts, retrieved documents) before dynamic content (the user's latest message).
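
A back-of-the-envelope calculation shows where a figure in that range can come from. The sketch below assumes a cache write costs 1.25x and a cache read 0.10x the normal input price, and that every request after the first hits the cache; the token counts are made-up illustration values, and `relative_input_cost` is a hypothetical helper.

```python
def relative_input_cost(prefix_tokens, dynamic_tokens, requests,
                        write_multiplier=1.25, read_multiplier=0.10):
    """Cached input cost as a fraction of the uncached cost.

    Assumes the first request writes the cache and all later requests
    arrive within the cache lifetime and read from it.
    """
    uncached = requests * (prefix_tokens + dynamic_tokens)
    first = prefix_tokens * write_multiplier + dynamic_tokens
    later = (requests - 1) * (prefix_tokens * read_multiplier + dynamic_tokens)
    return (first + later) / uncached


# 10,000-token static prefix, 200-token question, 100 requests.
ratio = relative_input_cost(prefix_tokens=10_000, dynamic_tokens=200, requests=100)
print(f"cached input cost is {ratio:.1%} of uncached")
```

Under these assumptions the cached bill is about 13% of the uncached one, a saving of roughly 87%. With a single request the ratio is above 1, which is why caching only pays off when the prefix is reused.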

If you are not using prompt caching in production, you are likely leaving money on the table. Mark the end of your system prompt and the end of your retrieved-context block as cache breakpoints, confirm hits by checking `cache_read_input_tokens` in each response's `usage` object, and measure your input token bill the next day.
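
The two breakpoints suggested above could be placed as in this sketch. `build_cached_request` is a hypothetical helper that only assembles request arguments; it assumes the retrieved context is sent as the first block of the user message, ahead of the question.

```python
def build_cached_request(system_prompt: str, retrieved_context: str, question: str) -> dict:
    """Assemble `system` and `messages` with two cache breakpoints."""
    system = [{
        "type": "text",
        "text": system_prompt,
        "cache_control": {"type": "ephemeral"},  # breakpoint 1: end of system prompt
    }]
    messages = [{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": retrieved_context,
                "cache_control": {"type": "ephemeral"},  # breakpoint 2: end of context
            },
            # The dynamic part comes last so it never invalidates the cached prefix.
            {"type": "text", "text": question},
        ],
    }]
    return {"system": system, "messages": messages}
```

The result can be unpacked into a call such as `client.messages.create(model=..., max_tokens=..., **build_cached_request(...))`. On the response, `usage.cache_creation_input_tokens` and `usage.cache_read_input_tokens` show whether the prefix was written or read.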

Extended Thinking

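On Claude Opus and Sonnet, extended thinking lets the model spend additional compute on a `thinking` block before responding. The thinking text is returned in the response, so it can be logged or audited; on newer models it may be a summary of the full reasoning. Depending on the model, the API strips thinking blocks from earlier turns before the model sees the history. The exception is a tool-use loop: there the thinking block must be passed back unmodified alongside the `tool_result` so the model can continue its reasoning.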

Enable it with the `thinking` parameter on the request, which takes a token budget (`budget_tokens`, at least 1,024 and normally less than `max_tokens`). The trade-off is straightforward: a larger thinking budget tends to improve accuracy on hard problems at the cost of more output tokens and higher latency. Calibrate against your task: most chat workloads do not benefit, while most reasoning-heavy agent workloads do.
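
A minimal sketch of building the request arguments and separating reasoning from the answer follows. `thinking_kwargs` and `split_thinking` are hypothetical helpers; the validation reflects the budget limits as understood at the time of writing and may not cover every model or mode.

```python
def thinking_kwargs(max_tokens: int, budget_tokens: int) -> dict:
    """Return request arguments that enable extended thinking."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


def split_thinking(content_blocks):
    """Separate the reasoning text from the final answer text."""
    reasoning = [b.thinking for b in content_blocks if b.type == "thinking"]
    answer = [b.text for b in content_blocks if b.type == "text"]
    return "\n".join(reasoning), "\n".join(answer)
```

A call might then look like `client.messages.create(model=..., messages=..., **thinking_kwargs(16000, 8000))`, with `split_thinking(response.content)` used to log the reasoning separately from what is shown to the user.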

Native MCP Integration

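The SDK exposes Anthropic's MCP connector, which lets Claude use remote MCP servers as tool sources. At the time of writing this is a beta feature configured per request rather than on the client: pass a list of server connections (URL, name, and optional authorization token) in the `mcp_servers` parameter, and the servers' tools become available to the model without manual schema work. The connector reaches servers over HTTP, so local stdio servers such as a filesystem server need a separate MCP client library. For hosted servers, including GitHub, database services, and dozens of community MCP servers, integration becomes a configuration change rather than a code change.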
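
The configuration-only nature of this can be sketched as a helper that turns a server list into request arguments. Everything here is an assumption to verify against the current documentation: the connector is in beta, the `mcp_servers` entry shape and the beta flag below reflect one published version of it, and `mcp_request_kwargs` and the example URL are hypothetical.

```python
# Assumed beta flag for the MCP connector; newer versions may use a different one.
MCP_BETA = "mcp-client-2025-04-04"


def mcp_request_kwargs(servers: dict) -> dict:
    """Build request arguments from a mapping of name -> (url, token or None)."""
    mcp_servers = []
    for name, (url, token) in servers.items():
        entry = {"type": "url", "url": url, "name": name}
        if token:
            # Only include credentials for servers that need them.
            entry["authorization_token"] = token
        mcp_servers.append(entry)
    return {"mcp_servers": mcp_servers, "betas": [MCP_BETA]}


# Hypothetical server list; in practice this could come from a config file.
kwargs = mcp_request_kwargs({"docs": ("https://mcp.example.com/sse", None)})
```

Under these assumptions the arguments would be unpacked into the beta endpoint, for example `client.beta.messages.create(model=..., max_tokens=..., messages=..., **kwargs)`.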

Provider Adapters

The same SDK supports Claude via Amazon Bedrock (`AnthropicBedrock`) and Google Vertex AI (`AnthropicVertex`), with credential handling delegated to the respective cloud SDK. For regulated workloads where data residency or BAA requirements rule out direct API access, the adapter approach means the application code stays largely identical: only the client construction and the provider-specific model identifier change.
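
One way to keep that difference in a single place is a small factory, sketched below. `CLIENT_CLASSES` and `make_client` are hypothetical names; the SDK is imported lazily so the module loads even where the package or its provider extras are not installed.

```python
import importlib

# Provider name -> client class exported by the anthropic package.
CLIENT_CLASSES = {
    "anthropic": "Anthropic",
    "bedrock": "AnthropicBedrock",
    "vertex": "AnthropicVertex",
}


def make_client(provider: str, **kwargs):
    """Construct the client for a provider; raises KeyError for unknown names."""
    class_name = CLIENT_CLASSES[provider]
    sdk = importlib.import_module("anthropic")  # lazy import of the SDK
    return getattr(sdk, class_name)(**kwargs)
```

Typical calls would be `make_client("bedrock", aws_region="us-west-2")` or `make_client("vertex", project_id="my-project", region="us-east5")`, where the region and project values are placeholders. The model identifier passed to `messages.create` still has to match the provider's naming.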
