Amazon SageMaker HyperPod expands support to G7e and r5d.16xlarge instances
Amazon SageMaker HyperPod now supports G7e and r5d.16xlarge instances, improving performance and scalability for AI/ML workloads.
Amazon SageMaker HyperPod has expanded its supported instance types to include G7e and r5d.16xlarge. SageMaker HyperPod is purpose-built for developing, training, and deploying foundation models at scale. It provides built-in fault tolerance, automated cluster recovery, and optimized libraries for distributed training, which together reduce the operational complexity of managing large AI/ML infrastructure.
G7e instances, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, deliver up to 2.3x higher inference performance than G6e instances, processing more requests per second at lower latency. With up to 768 GB of total GPU memory, G7e instances can host larger language models and serve multiple models concurrently on a single endpoint. They are well suited for deploying large language models (LLMs), agentic AI, multimodal generative AI, and physical AI models. They also offer a cost-effective option for single-node fine-tuning or training of natural language processing (NLP), computer vision, and smaller generative AI models, with up to 1.27x the TFLOPS and up to 4x the GPU-to-GPU bandwidth of G6e instances.
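As a minimal sketch of provisioning a HyperPod cluster with a G7e instance group, the snippet below assembles a request payload shaped like the SageMaker `CreateCluster` API. The instance type string (`ml.g7e.48xlarge`), group name, role ARN, and S3 URI are illustrative placeholders, not values from this announcement; consult the SageMaker documentation for the exact names available in your Region.

```python
# Hypothetical sketch: assembling a SageMaker HyperPod CreateCluster
# request with a G7e instance group for LLM serving. All identifiers
# below (instance size, ARN, S3 URI) are placeholders.

def build_cluster_request(cluster_name, instance_groups):
    """Assemble the request payload for sagemaker:CreateCluster."""
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": instance_groups,
    }

g7e_group = {
    "InstanceGroupName": "llm-serving",
    "InstanceType": "ml.g7e.48xlarge",    # placeholder G7e size
    "InstanceCount": 2,
    "LifeCycleConfig": {
        "SourceS3Uri": "s3://my-bucket/lifecycle/",  # placeholder
        "OnCreate": "on_create.sh",
    },
    "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodRole",  # placeholder
}

request = build_cluster_request("g7e-inference-cluster", [g7e_group])
# You would then submit it with:
#   boto3.client("sagemaker").create_cluster(**request)
print(request["ClusterName"])
```

The payload is built as a plain dictionary so it can be inspected or version-controlled before being passed to the boto3 SageMaker client.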
HyperPod now also supports the r5d.16xlarge instance, which offers 64 vCPUs, 512 GB of memory, and four 600 GB NVMe SSD instance store volumes, powered by Intel Xeon Platinum 8000 series processors with a sustained all-core turbo frequency of up to 3.1 GHz. It is well suited for distributed training data preprocessing (for example, with frameworks like Ray), large-scale feature engineering, and running memory-intensive orchestration services alongside GPU computation.
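To illustrate running CPU-side preprocessing alongside GPU computation, the sketch below builds an `UpdateCluster`-shaped payload that adds a memory-optimized r5d.16xlarge group next to an existing GPU group. This is an assumption-laden example: the group names, GPU instance size, role ARN, and S3 URI are placeholders, and `UpdateCluster` is assumed to take the full desired set of instance groups.

```python
# Hypothetical sketch: extending a HyperPod cluster with an
# ml.r5d.16xlarge CPU group (e.g., for Ray-based preprocessing)
# beside an existing GPU group. All identifiers are placeholders.

def build_update_request(cluster_name, instance_groups):
    """Assemble the request payload for sagemaker:UpdateCluster."""
    return {
        "ClusterName": cluster_name,
        "InstanceGroups": instance_groups,
    }

gpu_group = {
    "InstanceGroupName": "training",           # existing GPU workers
    "InstanceType": "ml.g7e.48xlarge",         # placeholder G7e size
    "InstanceCount": 2,
    "LifeCycleConfig": {"SourceS3Uri": "s3://my-bucket/lifecycle/",
                        "OnCreate": "on_create.sh"},
    "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodRole",
}

cpu_group = {
    "InstanceGroupName": "ray-preprocessing",  # new memory-optimized group
    "InstanceType": "ml.r5d.16xlarge",         # 64 vCPUs, 512 GB memory
    "InstanceCount": 4,
    "LifeCycleConfig": {"SourceS3Uri": "s3://my-bucket/lifecycle/",
                        "OnCreate": "on_create.sh"},
    "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodRole",
}

request = build_update_request("ml-cluster", [gpu_group, cpu_group])
# Submit with: boto3.client("sagemaker").update_cluster(**request)
print(len(request["InstanceGroups"]))
```

Keeping preprocessing on a dedicated CPU group this way frees the GPU group for training and serving, which is the pattern the announcement highlights for r5d.16xlarge.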
G7e instances are available in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) Regions, while r5d.16xlarge is available in all Regions where Amazon SageMaker HyperPod is offered.