How Hicap Works

Same models you already use, routed through reserved GPU capacity so you pay less. Here's how it works.

STEP 01

Swap Your Base URL

Hicap is a drop-in replacement for the OpenAI API. Point your existing SDK, CLI tool, or extension to our endpoint and you're done.

  • No new SDK — works with the OpenAI client you already use
  • Access OpenAI, Anthropic, and Google models through one endpoint
  • Integrate in under five minutes with minimal code changes
Code Example
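import OpenAI from "openai";

// Point the standard OpenAI client at the Hicap endpoint.
const openai = new OpenAI({
  baseURL: "https://api.hicap.ai/v1",
  // The SDK will not construct without an apiKey (or OPENAI_API_KEY in the
  // environment). Reusing the Hicap key here is an assumption; this example
  // authenticates through the "api-key" header below.
  apiKey: process.env.HICAP_API_KEY,
  defaultHeaders: {
    "api-key": process.env.HICAP_API_KEY
  }
});

// Same call shape as the OpenAI API; only the endpoint has changed.
const response = await openai.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello!" }]
});

console.log(response.choices[0].message.content);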
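Because every provider sits behind the same endpoint, switching models should come down to changing the `model` string. The sketch below builds the raw HTTP request that roughly corresponds to the SDK call above, once per model. The helper name `buildChatRequest` is hypothetical, the `/chat/completions` path is assumed from the OpenAI API convention, and the model identifiers are the ones shown elsewhere on this page; none of this is confirmed Hicap API detail.

```typescript
// Hypothetical helper: describes the HTTP request for one chat completion.
// The base URL and "api-key" header come from the example above; the
// "/chat/completions" path is an assumption based on the OpenAI API convention.
interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const HICAP_BASE_URL = "https://api.hicap.ai/v1";

function buildChatRequest(apiKey: string, model: string, prompt: string): ChatRequest {
  return {
    url: `${HICAP_BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": apiKey,
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Only the model string changes between providers.
const models = ["gpt-5.4", "claude-opus-4.6", "gemini-3.0-flash"];
const requests = models.map((m) => buildChatRequest("YOUR_HICAP_API_KEY", m, "Hello!"));

for (const req of requests) {
  console.log(req.url, JSON.parse(req.body).model);
}
```

Each `ChatRequest` can be handed to `fetch(req.url, req)` in any runtime that provides `fetch`; the URL and headers stay identical across all three models.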
STEP 02

We Route to Reserved GPUs

Hicap routes your requests to our reserved GPU capacity across multiple providers, so performance stays predictable and you pay less.

  • Dedicated capacity means consistent, predictable performance
  • Automatic load balancing across providers
  • When reserved capacity overflows, requests fall back to on-demand
Request Flow

Your App → Hicap Gateway → Reserved, with overflow to On-Demand

Reserved (dedicated to your app)
  • GPT-5.4: 3,400 TPM, 112% utilized (overflow)
  • Claude Opus 4.6: 2,100 TPM, 108% utilized (overflow)
  • Gemini 3.0 Flash: 5,000 TPM, 115% utilized (overflow)

On-Demand (overflow & additional models)
  • GPT-5.4: active
  • Claude Opus 4.6: active
  • Gemini 3.0 Flash: active
  • GPT-4.1: standby
  • Claude Sonnet 4.5: standby
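The fallback rule in this step can be sketched as a simple split: fill reserved capacity first, then send the remainder to on-demand. This is a minimal illustration, not Hicap's actual router. The function name `splitDemand` is hypothetical, and it assumes that "112% utilized" means demand equal to 112% of the reserved tokens per minute.

```typescript
// Hypothetical model of the fallback rule: fill reserved capacity first,
// then send whatever does not fit to on-demand.
interface RoutingSplit {
  reservedTokens: number;  // tokens per minute served from reserved capacity
  onDemandTokens: number;  // tokens per minute that overflow to on-demand
  utilizationPct: number;  // demand as a percentage of reserved capacity
}

function splitDemand(demandTpm: number, reservedTpm: number): RoutingSplit {
  if (demandTpm < 0 || reservedTpm <= 0) {
    throw new RangeError("demand must be >= 0 and reserved capacity must be > 0");
  }
  const reservedTokens = Math.min(demandTpm, reservedTpm);
  return {
    reservedTokens,
    onDemandTokens: demandTpm - reservedTokens,
    utilizationPct: Math.round((demandTpm / reservedTpm) * 100),
  };
}

// Assumption: 112% utilization of a 3,400 TPM reservation means
// 3,808 tokens per minute of demand.
const split = splitDemand(3808, 3400);
console.log(split);
```

Under that reading, the GPT-5.4 reservation serves 3,400 tokens per minute and the remaining 408 fall back to on-demand.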
STEP 03

Spot Market for Excess Capacity

When your reserved capacity sits idle, it enters the spot market. Other customers buy that excess at dynamic prices — you earn revenue, they save money, and the platform takes a small fee.

  • Sellers monetize unused reserved capacity instead of letting it sit idle
  • Buyers access capacity below on-demand rates with dynamic spot pricing
  • Platform earns a transparent fee on every spot trade
  • Three-way value creation: seller, buyer, and platform all benefit
Spot Market Flow

Seller
  • Reserved capacity: 100k tokens
  • 62% used internally; 38k excess → spot market
  • $0.62 / token

Buyer
  • Demand: 50k tokens
  • Filled 76% from spot, 24% from on-demand
  • 38% savings

Platform
  • Fee (5%): $1.18
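The figures in this example can be checked with a few lines of arithmetic. The sketch below reproduces the 38k excess, the 76% / 24% split, and the $1.18 fee. One assumption matters: the $1.18 fee only follows from a 5% rate if the $0.62 price is read as the spot price per 1,000 tokens rather than per token, so the code uses that reading. The variable names are illustrative.

```typescript
// Inputs are the figures shown in the Spot Market Flow example.
const reservedTokens = 100000;
const usedInternallyPct = 62;
const buyerDemandTokens = 50000;
const feePct = 5;
// Assumption: $0.62 is the spot price per 1,000 tokens (see the note above).
const spotPricePer1k = 0.62;

// Seller side: whatever is not used internally is offered on the spot market.
const excessTokens = (reservedTokens * (100 - usedInternallyPct)) / 100;

// Buyer side: take as much as possible from spot, the rest from on-demand.
const spotTokens = Math.min(buyerDemandTokens, excessTokens);
const onDemandTokens = buyerDemandTokens - spotTokens;
const spotSharePct = (spotTokens * 100) / buyerDemandTokens;

// Platform side: a percentage fee on the value of the spot trade.
const tradeValue = (spotTokens / 1000) * spotPricePer1k;
const platformFee = (tradeValue * feePct) / 100;

console.log(`excess: ${excessTokens} tokens`);
console.log(`spot: ${spotTokens} tokens (${spotSharePct}%), on-demand: ${onDemandTokens} tokens`);
console.log(`trade value: $${tradeValue.toFixed(2)}, platform fee: $${platformFee.toFixed(2)}`);
```

The 38% savings figure depends on the on-demand price, which the example does not state, so it is left out of the calculation.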
STEP 04

See Where Every Dollar Goes

Get full visibility into token usage, costs, and model performance across dev tools and production apps — all in one dashboard.

  • Track usage by model, application, and team
  • Compare dev tooling vs production spend at a glance
  • Spot your most expensive models and optimize
Usage Insights

Coding Tools

Dev tooling with BYOK configuration.

Total: $246.45 (28.6M tokens)

  Tool        Model              Type       Usage      Cost
  Codex       codex-mini-latest  On-Demand  18.2M      $118.30
  Cline       claude-opus-4.5    On-Demand  10.4M      $128.15

Your App

Production workload with reserved capacity.

Total: $1,704.00 (10,500 TPM + 66.4M tokens)

  Component   Model              Type       Usage      Cost
  Planner     gpt-5.4            Reserved   3,400 TPM  $720.00
  Reasoning   claude-opus-4.6    Reserved   2,100 TPM  $480.00
  Summarizer  gemini-3.0-flash   Reserved   5,000 TPM  $240.00
  Tool Use    gpt-4.1            On-Demand  52M        $156.00
  Classifier  claude-sonnet-4.5  On-Demand  14.4M      $108.00

Last used Apr 13, 5:14 PM
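The roll-ups in this dashboard example are plain aggregations over per-model rows. The sketch below recomputes the two group totals and finds the most expensive model from the sample figures above. The `UsageRow` shape and the function names are hypothetical and do not describe a Hicap API or export format.

```typescript
// Hypothetical row shape; costs are stored as integer cents to avoid
// floating-point drift when summing.
interface UsageRow {
  group: "Coding Tools" | "Your App";
  name: string;
  model: string;
  costCents: number;
}

// Sample data copied from the dashboard example above.
const rows: UsageRow[] = [
  { group: "Coding Tools", name: "Codex", model: "codex-mini-latest", costCents: 11830 },
  { group: "Coding Tools", name: "Cline", model: "claude-opus-4.5", costCents: 12815 },
  { group: "Your App", name: "Planner", model: "gpt-5.4", costCents: 72000 },
  { group: "Your App", name: "Reasoning", model: "claude-opus-4.6", costCents: 48000 },
  { group: "Your App", name: "Summarizer", model: "gemini-3.0-flash", costCents: 24000 },
  { group: "Your App", name: "Tool Use", model: "gpt-4.1", costCents: 15600 },
  { group: "Your App", name: "Classifier", model: "claude-sonnet-4.5", costCents: 10800 },
];

// Sum cost per group (dev tooling vs. production).
function totalByGroup(data: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of data) {
    totals.set(row.group, (totals.get(row.group) ?? 0) + row.costCents);
  }
  return totals;
}

// Return the row with the highest cost; throws on an empty array.
function mostExpensive(data: UsageRow[]): UsageRow {
  return data.reduce((max, row) => (row.costCents > max.costCents ? row : max));
}

const totals = totalByGroup(rows);
totals.forEach((cents, group) => {
  console.log(`${group}: $${(cents / 100).toFixed(2)}`);
});
console.log(`Most expensive: ${mostExpensive(rows).name} (${mostExpensive(rows).model})`);
```

The group totals come out to 24,645 and 170,400 cents, matching the $246.45 and $1,704.00 shown, and the Planner on gpt-5.4 is the largest single line item.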

Why Choose Hicap?

Production-ready AI infrastructure with enterprise features built in.

Enterprise Security

We never store your prompts or completions. Your data passes through our gateway and is never retained.

Multi-Provider Flexibility

Access OpenAI, Anthropic, Google, and more through a single API endpoint. Switch models by changing the model name; your endpoint and client stay the same.

Spot Market Economics

Sell idle reserved capacity or buy excess from others at dynamic spot prices. Three-way value creation for sellers, buyers, and platform.

Real-Time Analytics

Monitor usage, costs, and performance metrics in real time with detailed dashboards.

Flexible Pricing

Pay as you go with no commitments, or lock in reserved throughput for even deeper savings.

Ready to start saving?

Create an account, swap your base URL, and start paying less for the same models. Setup takes under five minutes.