Amazon SageMaker AI introduces support for OpenAI-compatible APIs for inference endpoints
Amazon SageMaker Inference now supports OpenAI-compatible APIs, allowing seamless integration with familiar tools by simply changing an endpoint URL. This update is available in multiple global regions.
Amazon SageMaker Inference has announced support for OpenAI-compatible APIs, enabling users to utilize familiar tools and frameworks such as the OpenAI SDK, LangChain, and Strands Agents to connect directly to SageMaker endpoints. This integration requires only a change in the endpoint URL, eliminating the need for custom integration code, SDK wrappers, or rewrites. Users do not need to adopt a different API format or alter their authentication method; a simple change of the endpoint URL ensures that existing SDK calls, streaming logic, and framework integrations function seamlessly.
With this update, users can select their preferred GPU instances, maintain data within their own VPC, operate any open source or fine-tuned model, and implement auto-scaling policies that are optimized for their specific workloads. Authentication is managed through existing AWS credentials with automatic token refresh, simplifying production management.
This new capability is currently available in various regions, including US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Europe (Ireland), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Seoul), Europe (London), Asia Pacific (Singapore), Asia Pacific (Sydney), and Canada (Central). For further information and to begin utilizing this feature, users are encouraged to read the launch blog or consult the SageMaker Inference documentation.