NVIDIA Blackwell Ultra

NVIDIA B300 GPU Cloud

Reserve NVIDIA B300 (Blackwell Ultra) capacity on VESSL Cloud — up to 288GB HBM3e and FP4 acceleration for the largest models and high-concurrency reasoning inference.

Reserve capacity


Technical specifications

Architecture: Blackwell
GPU memory: up to 288GB HBM3e
Memory bandwidth: 8 TB/s
NVLink: 1.8 TB/s
FP8 (Tensor): 10 PFLOPS*
FP4 (Tensor): 20 PFLOPS*
Max TDP: 1,400W
GPUs per node: 8 (HGX B300)

*Peak performance with sparsity, per NVIDIA official specs. Final specs may vary by node configuration.

Pricing & availability

NVIDIA B300: Available on request

Talk to sales

What's the B300 best for?

Largest-model training

Up to 288GB HBM3e per GPU and 1.8 TB/s NVLink keep trillion-parameter models resident with fewer partitions and less communication overhead.
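As a rough illustration of the memory arithmetic behind this, the sketch below estimates how many 288GB GPUs are needed just to hold a model's weights at different precisions. The 1-trillion-parameter size and the helper names are illustrative assumptions, and GB is treated as 10^9 bytes. The estimate covers weights only; optimizer state, gradients, activations, and KV cache add to the total and are not modeled here.

```python
import math

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

GPU_MEMORY_GB = 288   # upper bound per GPU, from the spec list above
GPUS_PER_NODE = 8     # HGX B300


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes, an assumption)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def min_gpus_for_weights(num_params: float, precision: str,
                         gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Smallest GPU count whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(num_params, precision) / gpu_memory_gb)


if __name__ == "__main__":
    params = 1e12  # hypothetical 1-trillion-parameter model
    node_gb = GPU_MEMORY_GB * GPUS_PER_NODE
    print(f"Node HBM total: {node_gb} GB")
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(params, precision)
        gpus = min_gpus_for_weights(params, precision)
        print(f"{precision}: {gb:.0f} GB of weights, at least {gpus} GPUs")
```

Under these assumptions the weights of a 1-trillion-parameter model fit inside a single 8-GPU node at any of the three precisions, which is what allows fewer partitions across nodes.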

Reasoning & long-context inference

Blackwell Ultra's up to 288GB of HBM3e holds large KV caches, and roughly 1.5× the FP4 compute of the B200 serves agentic and reasoning workloads at high concurrency.
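To make the KV-cache point concrete, here is a minimal sizing sketch using the standard per-token formula for transformer attention caches (keys plus values, per layer, per KV head). The model shape (80 layers, 8 KV heads, head dimension 128), the 128K context, and the 200GB cache budget are hypothetical values chosen for illustration, not figures from this page.

```python
def kv_bytes_per_token(num_layers: int, num_kv_heads: int,
                       head_dim: int, bytes_per_element: int) -> int:
    """KV-cache bytes stored for one token of one sequence.

    The factor of 2 accounts for one key and one value vector
    per layer and per KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element


def max_concurrent_sequences(budget_bytes: int, context_len: int,
                             per_token_bytes: int) -> int:
    """How many full-length sequences fit in the cache budget."""
    return budget_bytes // (context_len * per_token_bytes)


if __name__ == "__main__":
    # Hypothetical model shape and serving setup (assumptions).
    per_token = kv_bytes_per_token(num_layers=80, num_kv_heads=8,
                                   head_dim=128, bytes_per_element=2)
    context_len = 131_072            # 128K-token context
    budget = 200 * 10**9             # memory left for KV cache after weights
    per_sequence_gb = per_token * context_len / 1e9
    print(f"KV cache per token: {per_token} bytes")
    print(f"KV cache per full sequence: {per_sequence_gb:.1f} GB")
    print("Concurrent full-length sequences:",
          max_concurrent_sequences(budget, context_len, per_token))
```

The same functions can be rerun with a smaller `bytes_per_element` (for example 1 for an 8-bit cache) to see how cache precision changes the achievable concurrency.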

Consolidate inference fleets

More memory and FP4 throughput per GPU mean fewer GPUs for the same serving capacity, which can lower cost-per-token for large deployments.
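The cost-per-token comparison can be sketched as simple arithmetic: total hourly fleet cost divided by tokens served per hour. Every number in the example below (GPU counts, hourly prices, aggregate throughput) is a placeholder assumption, since B300 pricing is available on request and throughput depends on the model and serving stack.

```python
def cost_per_million_tokens(num_gpus: int, price_per_gpu_hour: float,
                            tokens_per_second: float) -> float:
    """Fleet cost in dollars per one million generated tokens.

    tokens_per_second is the aggregate throughput of the whole fleet.
    """
    hourly_cost = num_gpus * price_per_gpu_hour
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1e6


if __name__ == "__main__":
    # Placeholder fleets serving the same aggregate throughput (assumptions).
    larger_fleet = cost_per_million_tokens(
        num_gpus=8, price_per_gpu_hour=4.0, tokens_per_second=20_000)
    smaller_fleet = cost_per_million_tokens(
        num_gpus=4, price_per_gpu_hour=7.0, tokens_per_second=20_000)
    print(f"8 GPUs at $4.00/hr: ${larger_fleet:.4f} per 1M tokens")
    print(f"4 GPUs at $7.00/hr: ${smaller_fleet:.4f} per 1M tokens")
```

The point of the sketch is the relationship, not the figures: a smaller fleet lowers cost-per-token only when the per-GPU price premium is smaller than the reduction in GPU count at equal throughput.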

Compare NVIDIA data-center GPUs

	H100	H200	B200	B300 (this page)
Architecture	Hopper	Hopper	Blackwell	Blackwell
GPU memory	80GB HBM3	141GB HBM3e	192GB HBM3e	up to 288GB HBM3e
Memory bandwidth	3.35 TB/s	4.8 TB/s	8 TB/s	8 TB/s
FP8 (Tensor)	3.958 PFLOPS	3.958 PFLOPS	9 PFLOPS	10 PFLOPS
Access	from $2.39/hr	Available on request	Available on request	Available on request
Best for	Cost-efficient training & inference	Long-context & large-model inference	Frontier-scale training (FP4)	Largest models & reasoning inference

Why industry-leading teams run GPUs on VESSL Cloud

No waitlists

Access capacity across clouds through one platform — skip quotas and procurement.

Scale to multi-node

Spin up a single GPU or scale to large multi-node clusters over high-speed InfiniBand — as much as you need.

Transparent pricing

Spot, on-demand, and reserved options with pay-as-you-go billing.

Enterprise-ready

SOC 2 Type II compliance, with dedicated support for production AI.

Frequently asked questions

How do I get access to NVIDIA B300 GPUs?

B300 (Blackwell Ultra) capacity is allocated on request. Talk to our team and we'll secure capacity matched to your timeline.

How much memory does the B300 have?

The B300 (Blackwell Ultra) offers up to 288GB of HBM3e per GPU, so an 8-GPU HGX B300 node provides up to 2,304GB in total. Talk to our team for current node configurations and availability.

What's the difference between the B200 and B300?

The B300 (Blackwell Ultra) increases memory to up to 288GB HBM3e (vs 192GB on the B200) and delivers roughly 1.5× the FP4 compute. It is built for the largest models and high-concurrency reasoning inference.

Is the B300 better for training or inference?

Both. FP4/FP8 acceleration and up to 288GB HBM3e make the B300 ideal for frontier-scale training and high-throughput, low-latency reasoning inference.