NVIDIA and AWS: EC2 G7 instances with Blackwell GPU deliver up to 4.6× better AI inference
NVIDIA and AWS announced EC2 G7 instances with the RTX PRO 4500 Blackwell Server Edition GPU, which deliver up to 4.6× better AI inference performance than the previous G6 generation. The cuVS library also becomes the default in Amazon OpenSearch Serverless, bringing 10× faster vector indexing.
This article was generated using artificial intelligence from primary sources.
EC2 G7: the new AWS standard for AI inference
Amazon EC2 G7 instances are powered by the NVIDIA RTX PRO 4500 Blackwell Server Edition GPU, a fifth-generation architecture designed for inference and graphics workloads in data centers. They become the new AWS standard for AI inference, meaning the running of trained model predictions in production. Compared to the previous G6 instances, G7 delivers up to 4.6× better AI inference performance, a gain the announcement attributes to the Blackwell microarchitecture and higher memory bandwidth.
The largest G7 configuration offers up to 8 GPUs per instance with a combined 256 GB of GPU memory (32 GB per GPU), 700 Gbps EFA (Elastic Fabric Adapter) networking for low-latency communication between nodes, and 7.6 TB of NVMe SSD storage. This configuration is intended to let large language models and multimodal AI applications run without data transfer becoming the bottleneck.
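To make the memory figures concrete, here is a rough sizing sketch. It uses only the GPU count and combined memory quoted above; the 70-billion-parameter model, the bytes-per-parameter values, and the 80% headroom factor are hypothetical illustration values, and the estimate covers model weights only (no KV cache, activations, or framework overhead).

```python
import math

# Figures quoted in the article for the largest G7 configuration.
GPUS_PER_INSTANCE = 8
TOTAL_GPU_MEMORY_GB = 256


def per_gpu_memory_gb(total_gb=TOTAL_GPU_MEMORY_GB, gpus=GPUS_PER_INSTANCE):
    """Memory per GPU, assuming the combined figure is split evenly."""
    return total_gb / gpus


def weights_gb(params_billion, bytes_per_param):
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def min_gpus_for_weights(params_billion, bytes_per_param, headroom=0.8):
    """Smallest GPU count whose usable memory holds the weights alone.

    `headroom` is a hypothetical safety factor: the fraction of each GPU
    that we allow the weights to occupy.
    """
    usable_per_gpu = per_gpu_memory_gb() * headroom
    return math.ceil(weights_gb(params_billion, bytes_per_param) / usable_per_gpu)


if __name__ == "__main__":
    # Hypothetical 70B-parameter model at 16-bit and 8-bit precision.
    print(min_gpus_for_weights(70, 2))  # 6
    print(min_gpus_for_weights(70, 1))  # 3
```

Under these assumptions, the 16-bit weights of such a model would fit within a single 8-GPU instance, though a real deployment would need additional memory for the KV cache and activations.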
Why is cuVS in OpenSearch a milestone?
NVIDIA cuVS (CUDA Vector Search), a library for GPU-accelerated vector indexing and semantic search, has become the default option in Amazon OpenSearch Serverless. Text, images, or audio are first converted into embeddings, numerical vectors that capture their meaning. Vector indexing then builds a search structure over those vectors so they can be quickly searched by semantic similarity. This is the foundation of RAG (Retrieval-Augmented Generation) systems and modern AI search engines.
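For readers new to the idea, the following minimal sketch shows what "search by semantic similarity" means. It is a brute-force cosine-similarity search in plain Python, not the cuVS or OpenSearch API; the document IDs and three-dimensional vectors are made up for illustration. A vector index exists precisely to avoid this exhaustive comparison on large collections.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the IDs of the k vectors most similar to the query.

    This compares the query against every stored vector, which is the
    work a vector index is built to cut down.
    """
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
DOCS = {
    "gpu-pricing": [0.9, 0.1, 0.0],
    "vector-search": [0.1, 0.9, 0.2],
    "cooking": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    query = [0.2, 0.8, 0.1]
    print(top_k(query, DOCS, k=2))  # ['vector-search', 'gpu-pricing']
```

The exhaustive scan costs one comparison per stored vector per query, which is why large catalogs rely on approximate indexes and why faster index construction matters.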
By integrating cuVS as the default setting, OpenSearch Serverless users automatically get 10× faster vector indexing at just a quarter of the previous cost — with no changes to their code or configuration. This is particularly significant for companies building AI applications with large document catalogs or products based on semantic search.
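As a back-of-the-envelope illustration of those two headline figures, the sketch below applies the 10× speedup and the quarter-cost ratio to a baseline indexing job. The baseline of 20 hours and 400 cost units is hypothetical, and the calculation assumes both ratios apply uniformly to the whole job, which real workloads may not match.

```python
# Ratios quoted in the article.
INDEXING_SPEEDUP = 10.0   # 10x faster vector indexing
COST_FRACTION = 0.25      # a quarter of the previous cost


def projected_indexing(baseline_hours, baseline_cost):
    """Project indexing time and cost from a baseline job.

    Assumes the quoted ratios hold for the entire job.
    """
    return baseline_hours / INDEXING_SPEEDUP, baseline_cost * COST_FRACTION


if __name__ == "__main__":
    # Hypothetical baseline: 20 hours of indexing costing 400 units.
    hours, cost = projected_indexing(20.0, 400.0)
    print(hours, cost)  # 2.0 100.0
```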
AWS achieves NVIDIA Exemplar Cloud status
Amazon Web Services has achieved NVIDIA Exemplar Cloud status for GB300 training, described as the highest level of NVIDIA certification for cloud partners. The designation confirms that AWS infrastructure meets NVIDIA's requirements for training complex AI models on GB300 NVL72 clusters, and it strengthens AWS's position as a platform for large-scale enterprise AI projects.
Frequently Asked Questions
- What are Amazon EC2 G7 instances and how do they differ from G6?
- EC2 G7 instances run the NVIDIA RTX PRO 4500 Blackwell Server Edition GPU, which offers up to 4.6× better AI inference performance than the previous G6 instances based on the older Ada Lovelace architecture.
- What is NVIDIA cuVS and why does it matter that it becomes the default in OpenSearch?
- cuVS (CUDA Vector Search) is NVIDIA's library for GPU-accelerated vector indexing and search; its integration as the default in Amazon OpenSearch Serverless automatically gives users 10× faster indexing at a quarter of the previous cost.