NVIDIA H100 vs H200
Compare NVIDIA H100 and H200 side by side — compute, memory, and pricing. H100 on-demand from $2.39/hr; H200 (141GB HBM3e) on request. Run either on VESSL Cloud.
NVIDIA H100: from $2.39/hr
- GPU memory: 80GB HBM3
- Memory bandwidth: 3.35 TB/s

NVIDIA H200: on request
- GPU memory: 141GB HBM3e
- Memory bandwidth: 4.8 TB/s
Technical specifications
| | NVIDIA H100 SXM | NVIDIA H200 SXM |
|---|---|---|
| Architecture | Hopper | Hopper |
| GPU memory | 80GB HBM3 | 141GB HBM3e |
| Memory bandwidth | 3.35 TB/s | 4.8 TB/s |
| NVLink | 900 GB/s | 900 GB/s |
| FP16/BF16 (Tensor)* | 1,979 TFLOPS | 1,979 TFLOPS |
| FP8 (Tensor)* | 3,958 TFLOPS | 3,958 TFLOPS |
| Max TDP | 700W | 700W |
| GPUs per node | 8 (HGX H100) | 8 (HGX H200) |
*Peak performance with sparsity, per NVIDIA official specs. Final specs may vary by node configuration.
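To put the memory rows in relative terms, the short sketch below divides the H200 figures by the H100 figures. The numbers are copied from the table; the dictionary layout and the `ratio` helper are illustrative names, not part of any VESSL or NVIDIA tooling.

```python
# Figures copied from the spec table above (SXM variants).
SPECS = {
    "H100": {"memory_gb": 80, "bandwidth_tb_s": 3.35},
    "H200": {"memory_gb": 141, "bandwidth_tb_s": 4.8},
}


def ratio(field: str) -> float:
    """Return the H200 / H100 ratio for one spec field."""
    return SPECS["H200"][field] / SPECS["H100"][field]


for field in ("memory_gb", "bandwidth_tb_s"):
    print(f"{field}: H200 is {ratio(field):.2f}x the H100")
```

By these figures the H200 has roughly 1.76x the memory capacity and 1.43x the memory bandwidth of the H100, while the tensor throughput rows are identical.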
Pricing & availability
What's Hopper best for?
Large-scale LLM training & fine-tuning
Run 70B–400B-class training on a proven Hopper stack — FP8 tensor cores, multi-node InfiniBand, NeMo and Megatron support, with auto-checkpointing baked in.
High-throughput LLM inference
H200's 141GB HBM3e fits a 70B-class model on a single GPU at FP8 (about 70GB of weights) with room left for the KV cache, while FP8 on the H100 keeps token throughput high and tail latency low.
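As a rough check on the single-GPU claim, weight memory can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation only: it counts weights and ignores the KV cache, activations, and runtime overhead, and it treats the nominal capacity as fully usable, which is an assumption. The function names are illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}

# Nominal capacities from this page; usable memory is somewhat lower in practice.
GPU_MEMORY_GB = {"H100": 80, "H200": 141}


def weight_gb(params_billion: float, dtype: str) -> float:
    """Weights only: billions of parameters x bytes per parameter = decimal GB."""
    return params_billion * BYTES_PER_PARAM[dtype]


def headroom_gb(gpu: str, params_billion: float, dtype: str) -> float:
    """Memory left after loading the weights; negative means they do not fit."""
    return GPU_MEMORY_GB[gpu] - weight_gb(params_billion, dtype)


for gpu in ("H100", "H200"):
    for dtype in ("bf16", "fp8"):
        print(f"70B {dtype} on {gpu}: {headroom_gb(gpu, 70, dtype):+.0f} GB headroom")
```

Under these assumptions a 70B model at FP8 leaves about 71GB free on an H200 and about 10GB on an H100, while at BF16 the weights alone take 140GB and leave almost nothing on an H200.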
Research & HPC
Mature CUDA, PyTorch, JAX, and NeMo ecosystem — spin up Hopper nodes on demand for experiments, scientific compute, and deadline-driven runs.
Why industry-leading teams run GPUs on VESSL Cloud
No waitlists
Access capacity across clouds through one platform — skip quotas and procurement.
Scale to multi-node
Spin up a single GPU or scale to large multi-node clusters over high-speed InfiniBand — as much as you need.
Transparent pricing
Spot, on-demand, and reserved options with pay-as-you-go billing.
Enterprise-ready
SOC 2 Type II compliance, with dedicated support for production AI.
Frequently asked questions
How much does an NVIDIA H100 cost on VESSL Cloud?
H100 SXM (80GB) starts at $2.39/hr on-demand, with up to 15% off on reserved commitments. You can start instantly at cloud.vessl.ai — no waitlist.
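For budgeting, the quoted figures can be turned into a quick estimate. The sketch below assumes the $2.39/hr starting price applies per GPU and that billing scales linearly with GPU-hours; both are assumptions, as is the `estimate_cost` helper itself. The 15% figure is the stated maximum reserved discount, so an actual quote may differ.

```python
H100_ON_DEMAND_USD_PER_HR = 2.39  # "starts at" price quoted on this page
MAX_RESERVED_DISCOUNT = 0.15      # "up to 15%"; the real discount may be lower


def estimate_cost(gpus: int, hours: float, discount: float = 0.0) -> float:
    """Estimated USD cost, assuming linear per-GPU-hour billing."""
    if not 0.0 <= discount <= MAX_RESERVED_DISCOUNT:
        raise ValueError("discount must be between 0 and the quoted maximum")
    return gpus * hours * H100_ON_DEMAND_USD_PER_HR * (1.0 - discount)


# One 8-GPU node for 24 hours, on-demand and at the maximum reserved discount.
print(f"on-demand: ${estimate_cost(8, 24):,.2f}")
print(f"reserved:  ${estimate_cost(8, 24, MAX_RESERVED_DISCOUNT):,.2f}")
```

Under these assumptions, one 8-GPU node for a day comes to about $458.88 on-demand and about $390.05 at the full 15% discount.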
What's the difference between the H100 and H200?
Both use the same Hopper compute, but the H200 carries 141GB of HBM3e memory at 4.8 TB/s (vs 80GB of HBM3 at 3.35 TB/s on the H100), so it fits larger models, bigger batches, and longer context windows.
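The link between memory and context length comes from the KV cache, which grows with every token held in context. The sketch below estimates how many tokens fit in whatever memory is left after loading the weights. The model shape (80 layers, 8 key/value heads of dimension 128, FP16 cache) is an illustrative 70B-class configuration assumed here, not a specification from this page, and the free-memory inputs assume 70GB of FP8 weights (141 - 70 = 71GB on an H200, 80 - 70 = 10GB on an H100).

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache per token: one key and one value vector per layer."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


def max_cached_tokens(free_gb: int, per_token_bytes: int) -> int:
    """Tokens that fit in free_gb decimal gigabytes of cache space."""
    return (free_gb * 10**9) // per_token_bytes


# Hypothetical 70B-class shape with grouped-query attention (an assumption).
PER_TOKEN = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128)

for gpu, free_gb in (("H100", 10), ("H200", 71)):
    tokens = max_cached_tokens(free_gb, PER_TOKEN)
    print(f"{gpu}: about {tokens:,} cached tokens in {free_gb} GB")
```

With this assumed shape each token costs 327,680 bytes of cache, so the H200 holds roughly seven times as many tokens as the H100 across all concurrent requests. Real deployments leave some of that memory for activations and runtime overhead.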
Is the H200 available now?
H200 capacity is available on request. Talk to our team for current availability and pricing.
Can I run multi-node H100/H200 training?
Yes. VESSL Cloud provisions HGX nodes (8 GPUs each) with high-speed InfiniBand for distributed training, plus auto-checkpointing.
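When sizing a distributed job on 8-GPU nodes, the total process count and each process's global rank follow from the node count. The sketch below uses the common node-major layout (one process per GPU, ranks contiguous within a node); this is a convention assumed for illustration, not a description of how VESSL Cloud launches jobs, and the helper names are hypothetical.

```python
GPUS_PER_NODE = 8  # HGX H100 / HGX H200, per the spec table


def world_size(nodes: int) -> int:
    """Total processes when running one process per GPU."""
    return nodes * GPUS_PER_NODE


def global_rank(node_rank: int, local_rank: int) -> int:
    """Node-major global rank: ranks are contiguous within each node."""
    if not 0 <= local_rank < GPUS_PER_NODE:
        raise ValueError("local_rank must be in [0, GPUS_PER_NODE)")
    return node_rank * GPUS_PER_NODE + local_rank


# Four nodes give 32 processes; the last GPU on the last node has rank 31.
print(world_size(4), global_rank(node_rank=3, local_rank=7))
```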
Do you offer reserved or academic pricing?
Reserved commitments (3+ months) save up to 15% and guarantee capacity. Research labs and universities can access discounted academic rates — contact us.
Stop chasing GPUs.
Start shipping AI.
Unified access to GPU capacity across providers. One platform, transparent pricing.
- Start in minutes
- Scale to multi-node clusters
- High availability built-in
- 24/7 support available