NVIDIA Hopper

NVIDIA H200 GPU Cloud

Run NVIDIA H200 SXM GPUs with 141GB of HBM3e: the same Hopper compute as the H100 with about 76% more memory and 43% more memory bandwidth, built for long-context LLMs and larger models that need less sharding.

Technical specifications

Architecture: Hopper
GPU memory: 141GB HBM3e
Memory bandwidth: 4.8 TB/s
NVLink: 900 GB/s
FP16/BF16 (Tensor): 1,979 TFLOPS*
FP8 (Tensor): 3,958 TFLOPS*
Max TDP: 700W
GPUs per node: 8 (HGX H200)

*Peak performance with sparsity, per NVIDIA official specs. Final specs may vary by node configuration.

Pricing & availability

NVIDIA H200 SXM: Available on request

Talk to sales

What's the H200 best for?

Long-context & large-model inference

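141GB of HBM3e fits a 70B-class model quantized to FP8 (about 70GB of weights) on a single GPU with room left for a large KV cache, so you can serve longer context windows without tensor-parallel sharding. At 16-bit precision the weights alone come to about 140GB, so plan on FP8 or a second GPU.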
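A rough way to check the single-GPU claim is to add up weight memory and KV-cache memory. The sketch below is a back-of-the-envelope estimate, not a sizing tool: the model shape (80 layers, 8 KV heads, head dimension 128) is an assumed Llama-70B-like configuration, and the 10GB reserve for activations and runtime overhead is a placeholder you should replace with measurements.

```python
from dataclasses import dataclass

GB = 1e9  # decimal gigabytes, matching the 141GB figure


@dataclass
class ModelShape:
    """Hypothetical model description used only for this estimate."""
    params_billions: float
    n_layers: int
    n_kv_heads: int
    head_dim: int


def weight_bytes(shape: ModelShape, bytes_per_param: int) -> float:
    """Memory taken by the weights at the given precision."""
    return shape.params_billions * 1e9 * bytes_per_param


def kv_bytes_per_token(shape: ModelShape, bytes_per_value: int) -> int:
    """KV-cache bytes for one token: a key and a value vector per layer."""
    return 2 * shape.n_layers * shape.n_kv_heads * shape.head_dim * bytes_per_value


def max_cached_tokens(shape: ModelShape, gpu_gb: float,
                      weight_bytes_per_param: int, kv_bytes_per_value: int,
                      reserve_gb: float = 10.0) -> int:
    """Tokens of KV cache that fit after weights and a fixed reserve."""
    free = (gpu_gb - reserve_gb) * GB - weight_bytes(shape, weight_bytes_per_param)
    if free <= 0:
        return 0  # weights plus reserve already exceed GPU memory
    return int(free // kv_bytes_per_token(shape, kv_bytes_per_value))


# Assumed Llama-70B-like shape; adjust to your model's actual config.
llama70b_like = ModelShape(params_billions=70, n_layers=80, n_kv_heads=8, head_dim=128)

for label, bytes_per_param in (("FP16 weights", 2), ("FP8 weights", 1)):
    tokens = max_cached_tokens(llama70b_like, gpu_gb=141,
                               weight_bytes_per_param=bytes_per_param,
                               kv_bytes_per_value=2)  # 16-bit KV cache
    print(f"{label}: ~{tokens:,} cached tokens on one 141GB GPU")
```

Under these assumptions, 16-bit weights leave no room for a cache, while FP8 weights leave space for well over a hundred thousand cached tokens shared across all concurrent requests.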

Memory-bound training & fine-tuning

Larger batches and longer sequences fit in memory; 4.8 TB/s bandwidth keeps Hopper FP8 tensor cores fed for memory-bound workloads.
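To see why bandwidth matters for memory-bound work, a simple roofline estimate helps: attainable throughput is the smaller of peak compute and arithmetic intensity times memory bandwidth. The sketch below uses the peak FP8 figure from this page, which is quoted with sparsity; dense peaks are roughly half, which halves the ridge point. The 100 FLOP/byte kernel is an illustrative assumption, not a measured workload.

```python
def ridge_point(peak_tflops: float, bandwidth_tbs: float) -> float:
    """Arithmetic intensity (FLOP/byte) where a kernel stops being memory-bound."""
    return peak_tflops / bandwidth_tbs  # (TFLOP/s) / (TB/s) = FLOP/byte


def attainable_tflops(intensity: float, peak_tflops: float, bandwidth_tbs: float) -> float:
    """Roofline model: capped by compute or by memory traffic, whichever is lower."""
    return min(peak_tflops, intensity * bandwidth_tbs)


PEAK_FP8_TFLOPS = 3958  # same for H100 and H200 (peak, with sparsity)
GPUS = {"H100": 3.35, "H200": 4.8}  # memory bandwidth in TB/s

kernel_intensity = 100  # FLOP/byte, hypothetical memory-bound kernel

for name, bandwidth in GPUS.items():
    ridge = ridge_point(PEAK_FP8_TFLOPS, bandwidth)
    ceiling = attainable_tflops(kernel_intensity, PEAK_FP8_TFLOPS, bandwidth)
    print(f"{name}: ridge ~{ridge:.0f} FLOP/byte, "
          f"ceiling ~{ceiling:.0f} TFLOPS at {kernel_intensity} FLOP/byte")
```

For any kernel below the ridge point, the ceiling scales with bandwidth alone, so the H200's advantage over the H100 on such kernels tracks the 4.8 to 3.35 TB/s ratio rather than the identical compute peak.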

Drop-in Hopper upgrade

Same CUDA, PyTorch, and NeMo stack as the H100 — move memory-constrained workloads over with no code changes and more headroom.

Compare NVIDIA data-center GPUs

	H100	H200 (this page)	B200	B300
Architecture	Hopper	Hopper	Blackwell	Blackwell
GPU memory	80GB HBM3	141GB HBM3e	192GB HBM3e	up to 288GB HBM3e
Memory bandwidth	3.35 TB/s	4.8 TB/s	8 TB/s	8 TB/s
FP8 (Tensor)	3,958 TFLOPS	3,958 TFLOPS	9 PFLOPS	10 PFLOPS
Access	from $2.39/hr	Available on request	Available on request	Available on request
Best for	Cost-efficient training & inference	Long-context & large-model inference	Frontier-scale training (FP4)	Largest models & reasoning inference

Why industry-leading teams run GPUs on VESSL Cloud

No waitlists

Access capacity across clouds through one platform — skip quotas and procurement.

Scale to multi-node

Spin up a single GPU or scale to large multi-node clusters over high-speed InfiniBand — as much as you need.

Transparent pricing

Spot, on-demand, and reserved options with pay-as-you-go billing.

Enterprise-ready

SOC 2 Type II compliance, with dedicated support for production AI.

Frequently asked questions

Is the H200 available now?

H200 capacity is available on request. Talk to our team for current availability and pricing — we'll match capacity to your timeline.

What's the difference between the H100 and H200?

Both share the same Hopper compute (peak 1,979 TFLOPS FP16 / 3,958 TFLOPS FP8, with sparsity). The H200 carries 141GB of HBM3e at 4.8 TB/s versus the H100's 80GB of HBM3 at 3.35 TB/s, so it fits larger models, bigger batches, and longer context windows.

How much memory does the H200 have?

The H200 SXM has 141GB of HBM3e memory at 4.8 TB/s bandwidth — about 76% more capacity and 43% more bandwidth than the H100.
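The percentages follow directly from the spec-sheet numbers; a minimal check:

```python
def percent_more(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1) * 100


capacity_gain = percent_more(141, 80)     # GB: H200 vs H100
bandwidth_gain = percent_more(4.8, 3.35)  # TB/s: H200 vs H100

print(f"Capacity: +{capacity_gain:.0f}%, bandwidth: +{bandwidth_gain:.0f}%")
```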

Can I run multi-node H200 training?

Yes. We provision HGX H200 nodes (8 GPUs each) with high-speed InfiniBand for distributed training, with auto-checkpointing.

Should I pick the H200 or a Blackwell B200?

The H200 is the memory-rich Hopper option for large-model inference today. If you need FP4 acceleration and 192GB+ for frontier-scale work, look at the Blackwell B200/B300 — talk to us and we'll help you choose.