Skip to main content

AWS Bedrock

ALL Bedrock models (Anthropic, Meta, Mistral, Amazon, etc.) are Supported

LiteLLM requires boto3 to be installed on your system for Bedrock requests

pip install boto3>=1.28.57

Required Environment Variables

os.environ["AWS_ACCESS_KEY_ID"] = ""  # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2

Usage

Open In Colab
import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)

LiteLLM Proxy Usage

Here's how to call Anthropic with the LiteLLM Proxy Server

1. Setup config.yaml

model_list:
- model_name: bedrock-claude-v1
litellm_params:
model: bedrock/anthropic.claude-instant-v1
aws_access_key_id: os.environ/CUSTOM_AWS_ACCESS_KEY_ID
aws_secret_access_key: os.environ/CUSTOM_AWS_SECRET_ACCESS_KEY
aws_region_name: os.environ/CUSTOM_AWS_REGION_NAME

All possible auth params:

aws_access_key_id: Optional[str],
aws_secret_access_key: Optional[str],
aws_session_token: Optional[str],
aws_region_name: Optional[str],
aws_session_name: Optional[str],
aws_profile_name: Optional[str],
aws_role_name: Optional[str],
aws_web_identity_token: Optional[str],

2. Start the proxy

litellm --config /path/to/config.yaml

3. Test it

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
"model": "bedrock-claude-v1",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
]
}
'

Set temperature, top p, etc.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
temperature=0.7,
top_p=1
)

Pass provider-specific params

If you pass a non-openai param to litellm, we'll assume it's provider-specific and send it as a kwarg in the request body. See more

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[{ "content": "Hello, how are you?","role": "user"}],
top_k=1 # 👈 PROVIDER-SPECIFIC PARAM
)

Usage - Function Calling

LiteLLM uses Bedrock's Converse API for making tool calls

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

tools = [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA",
},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
},
"required": ["location"],
},
},
}
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=messages,
tools=tools,
tool_choice="auto",
)
# Add any assertions, here to check response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
response.choices[0].message.tool_calls[0].function.arguments, str
)

Usage - Vision

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


def encode_image(image_path):
import base64

with open(image_path, "rb") as image_file:
return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = litellm.completion(
model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
messages=[
{
"role": "user",
"content": [
{"type": "text", "text": "Whats in this image?"},
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64," + base64_image
},
},
],
}
],
)
print(f"\nResponse: {resp}")

Usage - Bedrock Guardrails

Example of using Bedrock Guardrails with LiteLLM

from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="anthropic.claude-v2",
messages=[
{
"content": "where do i buy coffee from? ",
"role": "user",
}
],
max_tokens=10,
guardrailConfig={
"guardrailIdentifier": "ff6ujrregl1q", # The identifier (ID) for the guardrail.
"guardrailVersion": "DRAFT", # The version of the guardrail.
"trace": "disabled", # The trace behavior for the guardrail. Can either be "disabled" or "enabled"
},
)

Usage - "Assistant Pre-fill"

If you're using Anthropic's Claude with Bedrock, you can "put words in Claude's mouth" by including an assistant role message as the last item in the messages array.

[!IMPORTANT] The returned completion will not include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
{"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
{"role": "assistant", "content": "{"},
]
response = completion(model="bedrock/anthropic.claude-v2", messages=messages)

Example prompt sent to Claude


Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:

{ "Hello": "Hallo" }

Assistant: {

Usage - "System" messages

If you're using Anthropic's Claude 2.1 with Bedrock, system role messages are properly formatted for you.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
{"role": "system", "content": "You are a snarky assistant."},
{"role": "user", "content": "How do I boil water?"},
]
response = completion(model="bedrock/anthropic.claude-v2:1", messages=messages)

Example prompt sent to Claude

You are a snarky assistant.

Human: How do I boil water?

Assistant:

Usage - Streaming

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
stream=True
)
for chunk in response:
print(chunk)

Example Streaming Output Chunk

{
"choices": [
{
"finish_reason": null,
"index": 0,
"delta": {
"content": "ase can appeal the case to a higher federal court. If a higher federal court rules in a way that conflicts with a ruling from a lower federal court or conflicts with a ruling from a higher state court, the parties involved in the case can appeal the case to the Supreme Court. In order to appeal a case to the Sup"
}
}
],
"created": null,
"model": "anthropic.claude-instant-v1",
"usage": {
"prompt_tokens": null,
"completion_tokens": null,
"total_tokens": null
}
}

Cross-region inferencing

LiteLLM supports Bedrock cross-region inferencing across all supported bedrock models.

from litellm import completion 
import os


os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


litellm.set_verbose = True # 👈 SEE RAW REQUEST

response = completion(
model="bedrock/us.anthropic.claude-3-haiku-20240307-v1:0",
messages=messages,
max_tokens=10,
temperature=0.1,
)

print("Final Response: {}".format(response))

Alternate user/assistant messages

Use user_continue_message to add a default user message, for cases (e.g. Autogen) where the client might not follow alternating user/assistant messages starting and ending with a user message.

model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"
user_continue_message: {"role": "user", "content": "Please continue"}

OR

just set litellm.modify_params=True and LiteLLM will automatically handle this with a default user_continue_message.

model_list:
- model_name: "bedrock-claude"
litellm_params:
model: "bedrock/anthropic.claude-instant-v1"

litellm_settings:
modify_params: true

Test it!

curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "bedrock-claude",
"messages": [{"role": "assistant", "content": "Hey, how's it going?"}]
}'

Boto3 - Authentication

Passing credentials as parameters - Completion()

Pass AWS credentials as parameters to litellm.completion

import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
)

Passing extra headers + Custom API Endpoints

This can be used to override existing headers (e.g. Authorization) when calling custom api endpoints

import os
import litellm
from litellm import completion

litellm.set_verbose = True # 👈 SEE RAW REQUEST

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_access_key_id="",
aws_secret_access_key="",
aws_region_name="",
aws_bedrock_runtime_endpoint="https://my-fake-endpoint.com",
extra_headers={"key": "value"}
)

SSO Login (AWS Profile)

  • Set AWS_PROFILE environment variable
  • Make bedrock completion call
import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}]
)

or pass aws_profile_name:

import os
from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_profile_name="dev-profile",
)

STS based Auth

  • Set aws_role_name and aws_session_name in completion() / embedding() function

Make the bedrock completion call

from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)

If you also need to dynamically set the aws user accessing the role, add the additional args in the completion()/embedding() function

from litellm import completion

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=messages,
max_tokens=10,
temperature=0.1,
aws_region_name=aws_region_name,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key,
aws_role_name=aws_role_name,
aws_session_name="my-test-session",
)

Passing an external BedrockRuntime.Client as a parameter - Completion()

danger

This is a deprecated flow. Boto3 is not async. And boto3.client does not let us make the http call through httpx. Pass in your aws params through the method above 👆. See Auth Code Add new auth flow

Experimental - 2024-Jun-23: aws_access_key_id, aws_secret_access_key, and aws_session_token will be extracted from boto3.client and be passed into the httpx client

Pass an external BedrockRuntime.Client object as a parameter to litellm.completion. Useful when using an AWS credentials profile, SSO session, assumed role session, or if environment variables are not available for auth.

Create a client from session credentials:

import boto3
from litellm import completion

bedrock = boto3.client(
service_name="bedrock-runtime",
region_name="us-east-1",
aws_access_key_id="",
aws_secret_access_key="",
aws_session_token="",
)

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)

Create a client from AWS profile in ~/.aws/config:

import boto3
from litellm import completion

dev_session = boto3.Session(profile_name="dev-profile")
bedrock = dev_session.client(
service_name="bedrock-runtime",
region_name="us-east-1",
)

response = completion(
model="bedrock/anthropic.claude-instant-v1",
messages=[{ "content": "Hello, how are you?","role": "user"}],
aws_bedrock_client=bedrock,
)

Provisioned throughput models

To use provisioned throughput Bedrock models pass

  • model=bedrock/<base-model>, example model=bedrock/anthropic.claude-v2. Set model to any of the Supported AWS models
  • model_id=provisioned-model-arn

Completion

import litellm
response = litellm.completion(
model="bedrock/anthropic.claude-instant-v1",
model_id="provisioned-model-arn",
messages=[{"content": "Hello, how are you?", "role": "user"}]
)

Embedding

import litellm
response = litellm.embedding(
model="bedrock/amazon.titan-embed-text-v1",
model_id="provisioned-model-arn",
input=["hi"],
)

Supported AWS Bedrock Models

Here's an example of using a bedrock model with LiteLLM. For a complete list, refer to the model cost map

Model NameCommand
Anthropic Claude-V3.5 Sonnetcompletion(model='bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0', messages=messages)
Anthropic Claude-V3 sonnetcompletion(model='bedrock/anthropic.claude-3-sonnet-20240229-v1:0', messages=messages)
Anthropic Claude-V3 Haikucompletion(model='bedrock/anthropic.claude-3-haiku-20240307-v1:0', messages=messages)
Anthropic Claude-V3 Opuscompletion(model='bedrock/anthropic.claude-3-opus-20240229-v1:0', messages=messages)
Anthropic Claude-V2.1completion(model='bedrock/anthropic.claude-v2:1', messages=messages)
Anthropic Claude-V2completion(model='bedrock/anthropic.claude-v2', messages=messages)
Anthropic Claude-Instant V1completion(model='bedrock/anthropic.claude-instant-v1', messages=messages)
Meta llama3-1-405bcompletion(model='bedrock/meta.llama3-1-405b-instruct-v1:0', messages=messages)
Meta llama3-1-70bcompletion(model='bedrock/meta.llama3-1-70b-instruct-v1:0', messages=messages)
Meta llama3-1-8bcompletion(model='bedrock/meta.llama3-1-8b-instruct-v1:0', messages=messages)
Meta llama3-70bcompletion(model='bedrock/meta.llama3-70b-instruct-v1:0', messages=messages)
Meta llama3-8bcompletion(model='bedrock/meta.llama3-8b-instruct-v1:0', messages=messages)
Amazon Titan Litecompletion(model='bedrock/amazon.titan-text-lite-v1', messages=messages)
Amazon Titan Expresscompletion(model='bedrock/amazon.titan-text-express-v1', messages=messages)
Cohere Commandcompletion(model='bedrock/cohere.command-text-v14', messages=messages)
AI21 J2-Midcompletion(model='bedrock/ai21.j2-mid-v1', messages=messages)
AI21 J2-Ultracompletion(model='bedrock/ai21.j2-ultra-v1', messages=messages)
AI21 Jamba-Instructcompletion(model='bedrock/ai21.jamba-instruct-v1:0', messages=messages)
Meta Llama 2 Chat 13bcompletion(model='bedrock/meta.llama2-13b-chat-v1', messages=messages)
Meta Llama 2 Chat 70bcompletion(model='bedrock/meta.llama2-70b-chat-v1', messages=messages)
Mistral 7B Instructcompletion(model='bedrock/mistral.mistral-7b-instruct-v0:2', messages=messages)
Mixtral 8x7B Instructcompletion(model='bedrock/mistral.mixtral-8x7b-instruct-v0:1', messages=messages)

Bedrock Embedding

API keys

This can be set as env variables or passed as params to litellm.embedding()

import os
os.environ["AWS_ACCESS_KEY_ID"] = "" # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # us-east-1, us-east-2, us-west-1, us-west-2

Usage

from litellm import embedding
response = embedding(
model="bedrock/amazon.titan-embed-text-v1",
input=["good morning from litellm"],
)
print(response)

Supported AWS Bedrock Embedding Models

Model NameUsageSupported Additional OpenAI params
Titan Embeddings V2embedding(model="bedrock/amazon.titan-embed-text-v2:0", input=input)here
Titan Embeddings - V1embedding(model="bedrock/amazon.titan-embed-text-v1", input=input)here
Titan Multimodal Embeddingsembedding(model="bedrock/amazon.titan-embed-image-v1", input=input)here
Cohere Embeddings - Englishembedding(model="bedrock/cohere.embed-english-v3", input=input)here
Cohere Embeddings - Multilingualembedding(model="bedrock/cohere.embed-multilingual-v3", input=input)here

Advanced - Drop Unsupported Params

Advanced - Pass model/provider-specific Params

Image Generation

Use this for stable diffusion on bedrock

Usage

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
)
print(f"response: {response}")

Set optional params

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
prompt="A cute baby sea otter",
model="bedrock/stability.stable-diffusion-xl-v0",
### OPENAI-COMPATIBLE ###
size="128x512", # width=128, height=512
### PROVIDER-SPECIFIC ### see `AmazonStabilityConfig` in bedrock.py for all params
seed=30
)
print(f"response: {response}")

Supported AWS Bedrock Image Generation Models

Model NameFunction Call
Stable Diffusion - v0embedding(model="bedrock/stability.stable-diffusion-xl-v0", prompt=prompt)
Stable Diffusion - v0embedding(model="bedrock/stability.stable-diffusion-xl-v1", prompt=prompt)