NVIDIA H100 GPU Cloud
Spin up NVIDIA H100 SXM GPUs in minutes from $2.39/hr — the proven Hopper workhorse for large-scale training, fine-tuning, and high-throughput inference.
Technical specifications
- Architecture: Hopper
- GPU memory: 80GB HBM3
- Memory bandwidth: 3.35 TB/s
- NVLink: 900 GB/s
- FP16/BF16 (Tensor): 1,979 TFLOPS*
- FP8 (Tensor): 3,958 TFLOPS*
- Max TDP: 700W
- GPUs per node: 8 (HGX H100)
*Tensor TFLOPS figures are peak performance with sparsity, per NVIDIA official specs. Final specs may vary by node configuration.
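To put the Tensor figures in context, here is a rough sketch that converts the with-sparsity peaks into dense peaks and estimates how many FLOPs per byte a kernel needs before it stops being limited by memory bandwidth. The 2x sparsity factor is an assumption based on NVIDIA's 2:4 structured sparsity, and the function names are illustrative only; real kernels rarely reach peak numbers.

```python
# Back-of-the-envelope figures derived from the spec list above.
SPARSE_FP16_TFLOPS = 1979.0  # FP16/BF16 Tensor peak, with sparsity
SPARSE_FP8_TFLOPS = 3958.0   # FP8 Tensor peak, with sparsity
MEM_BANDWIDTH_TBPS = 3.35    # HBM3 bandwidth in TB/s

# Assumption: 2:4 structured sparsity doubles the quoted peak.
SPARSITY_SPEEDUP = 2.0


def dense_tflops(sparse_tflops, speedup=SPARSITY_SPEEDUP):
    """Estimate the dense peak from a with-sparsity peak."""
    return sparse_tflops / speedup


def ridge_point(tflops, bandwidth_tbps=MEM_BANDWIDTH_TBPS):
    """FLOPs per byte of memory traffic at which compute and bandwidth balance."""
    return tflops / bandwidth_tbps


dense_fp16 = dense_tflops(SPARSE_FP16_TFLOPS)
dense_fp8 = dense_tflops(SPARSE_FP8_TFLOPS)

print(f"Dense FP16/BF16 peak: {dense_fp16:.1f} TFLOPS")
print(f"Dense FP8 peak:       {dense_fp8:.1f} TFLOPS")
print(f"FP16 ridge point:     {ridge_point(dense_fp16):.0f} FLOPs/byte")
print(f"FP8 ridge point:      {ridge_point(dense_fp8):.0f} FLOPs/byte")
```

Workloads below the ridge point (for example small-batch LLM decoding) tend to be bound by the 3.35 TB/s of memory bandwidth rather than by Tensor throughput.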
What's the H100 best for?
Large-scale LLM training & fine-tuning
Run 70B–400B-class training on a proven Hopper stack — FP8 tensor cores, multi-node InfiniBand, NeMo and Megatron support, with auto-checkpointing baked in.
High-throughput inference
H100 FP8 keeps token throughput high and tail latency low for production LLM serving.
Research & HPC
Mature CUDA, PyTorch, JAX, and NeMo ecosystem — spin up H100 nodes on demand for experiments, scientific compute, and deadline-driven runs.
Compare NVIDIA data-center GPUs
| | H100 (you're viewing) | H200 | B200 | B300 |
|---|---|---|---|---|
| Architecture | Hopper | Hopper | Blackwell | Blackwell |
| GPU memory | 80GB HBM3 | 141GB HBM3e | 192GB HBM3e | up to 288GB HBM3e |
| Memory bandwidth | 3.35 TB/s | 4.8 TB/s | 8 TB/s | 8 TB/s |
| FP8 (Tensor) | 3,958 TFLOPS | 3,958 TFLOPS | 9 PFLOPS | 10 PFLOPS |
| Access | from $2.39/hr | Available on request | Available on request | Available on request |
| Best for | Cost-efficient training & inference | Long-context & large-model inference | Frontier-scale training (FP4) | Largest models & reasoning inference |
Why industry-leading teams run GPUs on VESSL Cloud
No waitlists
Access capacity across clouds through one platform — skip quotas and procurement.
Scale to multi-node
Spin up a single GPU or scale to large multi-node clusters over high-speed InfiniBand — as much as you need.
Transparent pricing
Spot, on-demand, and reserved options with pay-as-you-go billing.
Enterprise-ready
SOC 2 Type II compliance, with dedicated support for production AI.
Frequently asked questions
How much does an NVIDIA H100 cost on VESSL Cloud?
H100 SXM (80GB) starts at $2.39/hr on-demand, with up to 15% off on reserved commitments. You can start instantly at cloud.vessl.ai — no waitlist.
How much VRAM does the H100 have?
The H100 SXM has 80GB of HBM3 memory at 3.35 TB/s bandwidth. If you need more memory for larger models or longer context, the H200 offers 141GB HBM3e.
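Not sure whether your model fits? A quick estimate is parameters times bytes per parameter. The sketch below counts weights only and reserves a 20% headroom for KV cache, activations, and runtime overhead. That headroom and the helper names are assumptions for illustration, not VESSL or NVIDIA sizing guidance, and training needs considerably more memory than this.

```python
import math

GPU_MEMORY_GB = {"H100": 80, "H200": 141}           # from the specs above
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}  # storage size per weight


def weight_footprint_gb(params_billion, dtype):
    """Weights-only footprint: 1e9 params * bytes each = that many GB."""
    return params_billion * BYTES_PER_PARAM[dtype]


def min_gpus_for_weights(params_billion, dtype, gpu="H100", headroom=0.2):
    """Smallest GPU count whose usable memory holds the weights.

    headroom is an assumed fraction left free for KV cache and overhead.
    """
    usable_gb = GPU_MEMORY_GB[gpu] * (1 - headroom)
    return math.ceil(weight_footprint_gb(params_billion, dtype) / usable_gb)


for dtype in ("bf16", "fp8"):
    for gpu in ("H100", "H200"):
        n = min_gpus_for_weights(70, dtype, gpu)
        print(f"70B in {dtype} on {gpu}: at least {n} GPU(s)")
```

Under these assumptions a 70B model in BF16 (140GB of weights) has to be sharded across several H100s, while the same model in FP8 fits on a single H200.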
What's the difference between the H100 and H200?
Both share the same Hopper compute, but the H200 carries 141GB of HBM3e memory at 4.8 TB/s (vs 80GB of HBM3 at 3.35 TB/s on the H100), so it fits larger models, bigger batches, and longer context windows.
Can I run multi-node H100 training?
Yes. VESSL Cloud provisions HGX H100 nodes (8 GPUs each) with high-speed InfiniBand for distributed training, plus auto-checkpointing. Dedicated H100 VM Clusters (beta) add root SSH and bare-metal-class performance.
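With 8 GPUs per HGX H100 node, the process layout of a multi-node job is simple arithmetic. The sketch below uses the common one-process-per-GPU convention, where the global rank is the node rank times GPUs per node plus the local rank. The helper names are hypothetical, and how VESSL Cloud actually launches and numbers processes is not specified here.

```python
GPUS_PER_NODE = 8  # HGX H100, per the specs above


def world_size(num_nodes, gpus_per_node=GPUS_PER_NODE):
    """Total number of processes when running one process per GPU."""
    return num_nodes * gpus_per_node


def global_rank(node_rank, local_rank, gpus_per_node=GPUS_PER_NODE):
    """Map (node index, GPU index on that node) to a job-wide rank."""
    if not 0 <= local_rank < gpus_per_node:
        raise ValueError(f"local_rank must be in [0, {gpus_per_node})")
    return node_rank * gpus_per_node + local_rank


def locate(rank, gpus_per_node=GPUS_PER_NODE):
    """Inverse mapping: job-wide rank back to (node_rank, local_rank)."""
    return divmod(rank, gpus_per_node)


print(world_size(4))      # processes in a 4-node job
print(global_rank(1, 3))  # GPU 3 on the second node
print(locate(11))         # which node and GPU hold rank 11
```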
Do you offer reserved or academic pricing?
Reserved commitments (3+ months) save up to 15% and guarantee capacity. Research labs and universities can access discounted academic rates — contact us.
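For a rough budget, multiply GPU-hours by the hourly rate. The sketch below assumes the $2.39/hr starting price applies to every GPU-hour and that the full 15% reserved discount is available. Both are best-case assumptions, since the page quotes "from" and "up to" figures, so treat the result as a lower bound rather than a quote.

```python
ON_DEMAND_USD_PER_GPU_HOUR = 2.39  # starting on-demand price quoted above
MAX_RESERVED_DISCOUNT = 0.15       # "up to 15%" on reserved commitments
GPUS_PER_NODE = 8                  # HGX H100


def run_cost_usd(nodes, hours, discount=0.0):
    """Estimated cost of running `nodes` full nodes for `hours` hours."""
    gpu_hours = nodes * GPUS_PER_NODE * hours
    return round(gpu_hours * ON_DEMAND_USD_PER_GPU_HOUR * (1 - discount), 2)


# One 8-GPU node running continuously for 90 days (2,160 hours).
hours = 90 * 24
on_demand = run_cost_usd(1, hours)
reserved = run_cost_usd(1, hours, discount=MAX_RESERVED_DISCOUNT)

print(f"On-demand:        ${on_demand:,.2f}")
print(f"Reserved (best):  ${reserved:,.2f}")
print(f"Savings:          ${on_demand - reserved:,.2f}")
```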
Explore other GPUs
Different workload? Pick the GPU that fits your memory, throughput, and budget.
H200: Same Hopper compute as the H100 with 141GB HBM3e, for long-context LLMs and larger models without sharding.
B200: Blackwell with 192GB HBM3e and FP4 acceleration, for frontier-scale training and high-throughput inference.
B300: Blackwell Ultra with up to 288GB HBM3e, for the largest models and high-concurrency reasoning inference.
Stop chasing GPUs.
Start shipping AI.
Unified access to GPU capacity across providers. One platform, transparent pricing.
- Start in minutes
- Scale to multi-node clusters
- High availability built-in
- 24/7 support available