Compare the cost of self-hosting a model on GPUs with the cost of using an inference API.
The share of paid GPU time you expect to actually use. Idle time is still billed, so lower utilization means a higher effective cost for the work you do.
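To make the effect of utilization concrete, here is a minimal sketch of the arithmetic, not necessarily the formula this calculator uses. It assumes the GPU is billed at a flat hourly rate whether busy or idle, and that throughput is constant while the GPU is in use. The function names and the parameters `hourly_rate`, `utilization`, and `tokens_per_second` are hypothetical, chosen only for illustration.

```python
def effective_cost_per_used_hour(hourly_rate: float, utilization: float) -> float:
    """Cost of one hour of *useful* GPU time.

    Assumption: you pay hourly_rate for every wall-clock hour,
    but only a fraction `utilization` (0 < u <= 1) does real work.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return hourly_rate / utilization


def self_host_cost_per_million_tokens(
    hourly_rate: float, utilization: float, tokens_per_second: float
) -> float:
    """Self-hosting cost per one million tokens served.

    Assumption: the GPU sustains tokens_per_second whenever it is busy.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    # Tokens produced in one billed hour, counting only the busy fraction.
    tokens_per_billed_hour = tokens_per_second * 3600 * utilization
    return effective_cost_rate(hourly_rate, tokens_per_billed_hour)


def effective_cost_rate(hourly_rate: float, tokens_per_billed_hour: float) -> float:
    """Scale the per-token cost up to a per-million-token price."""
    return hourly_rate / tokens_per_billed_hour * 1_000_000
```

For example, with a hypothetical $2.00/hour GPU at 50% utilization, `effective_cost_per_used_hour(2.0, 0.5)` is 4.0, double the list price. The per-million-token figure can then be compared directly with an inference API's per-token price.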
Configure your workload, then click Calculate.