Snapshot from Replicate
Verified on 2026-03-09
GPU Hardware Tier Pricing
Pay for compute time by hardware tier.
Replicate offers different GPU tiers (T4, A100, H100) with per-second billing; you pay only for the time your model is actually running.
This aligns cost with the compute actually consumed: hardware tiers let customers trade off speed against cost, and per-second billing keeps pricing fair for short inference jobs.
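The per-second, per-tier arithmetic can be sketched as follows. The hourly rates are taken from the snapshot values below (81 and 504, interpreted here as cents per GPU-hour, matching Replicate's roughly $0.81/hr T4 and $5.04/hr A100 pricing); the function name is illustrative, not part of any vendor API.

```typescript
// Sketch: per-second billing by GPU tier.
// Rates assume pricePerUnit is cents per 3600 metered seconds (one GPU-hour).
const hourlyRateCents: Record<string, number> = {
  t4: 81,    // ≈ $0.81 per GPU-hour
  a100: 504, // ≈ $5.04 per GPU-hour
};

// Cost of a single inference, billed per second of runtime.
function inferenceCostCents(
  tier: keyof typeof hourlyRateCents,
  seconds: number
): number {
  // Multiply before dividing to keep the arithmetic exact for whole cents.
  return (hourlyRateCents[tier] * seconds) / 3600;
}

// A 90-second job on an A100 costs (504 * 90) / 3600 = 12.6 cents.
console.log(inferenceCostCents("a100", 90)); // 12.6
```

This is why per-second granularity matters: a flat hourly minimum would charge that same 90-second job the full 504 cents.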
Implementation
This snippet is the closest Owostack implementation of the live pricing shape above. It is not a literal copy of the vendor's internal billing system.
const t4Seconds = metered("t4-seconds", {
  name: "T4 Inference Seconds",
});

const a100Seconds = metered("a100-seconds", {
  name: "A100 Inference Seconds",
});

plan("inference", {
  name: "Inference",
  price: 0, // no base fee; all revenue is usage-based
  currency: "USD",
  interval: "monthly",
  features: [
    t4Seconds.config({
      usageModel: "usage_based",
      pricePerUnit: 81, // $0.81 per GPU-hour...
      billingUnits: 3600, // ...priced per 3600 metered seconds
      reset: "monthly",
    }),
    a100Seconds.config({
      usageModel: "usage_based",
      pricePerUnit: 504, // $5.04 per GPU-hour
      billingUnits: 3600,
      reset: "monthly",
    }),
  ],
});
Rules
Billing starts when inference begins and ends when it completes.
Different models can run on different GPU tiers.
Per-second granularity ensures fair pricing.