The three cost components
Custom GPT pricing has three discrete components. They behave very differently and are often conflated, which leads to nasty budget surprises six months in.
Build cost: One-time. The work to design, develop, and deploy your custom GPT. Typically $15,000–$200,000+ depending on scope.
Operational cost: Ongoing monthly. Hosting, monitoring, admin overhead, retraining cycles. Typically $1,500–$10,000/month depending on uptime requirements and complexity.
Inference cost: Per-query, charged by your AI model provider (OpenAI, Anthropic, or open-source running on your infrastructure). Typically $200–$5,000/month for an SMB; $5,000–$50,000/month for enterprise.
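The way the three components combine is simple arithmetic, but it's worth making explicit. A minimal sketch, using illustrative mid-range figures (not a quote):

```python
# First-year total cost of ownership from the three components above.
# All figures are illustrative assumptions, not a quote.

def first_year_tco(build: float, ops_monthly: float, inference_monthly: float) -> float:
    """One-time build cost plus 12 months of operational and inference spend."""
    return build + 12 * (ops_monthly + inference_monthly)

# Mid-complexity build with mid-range ongoing costs (assumed figures):
total = first_year_tco(build=50_000, ops_monthly=4_000, inference_monthly=1_500)
print(f"First-year TCO: ${total:,.0f}")  # First-year TCO: $116,000
```

Note how the ongoing components dominate over time: in this example, year two costs $66,000 with no build spend at all.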
Build cost ranges (real, 2026 prices)
Simple ($15k–$30k)
Single-purpose bot. One data source. No complex integrations. Basic retrieval. Example: customer service FAQ bot trained on your help centre.
Mid-complexity ($30k–$80k)
Multi-step workflow. 2–4 data sources integrated. CRM or ERP integration. Custom UI. Example: sales enablement bot with Salesforce + content library + Gong integration.
Complex ($80k–$200k)
Multi-agent orchestration. 5+ integrations. Custom fine-tuning. Specific compliance requirements. Example: compliance GPT for an APRA-regulated firm with SOC 2 attestation.
Enterprise ($200k+)
Custom model training, on-premise deployment, multi-region rollout, specific certification. Example: defence-cleared deployment with sovereign cloud and ITAR awareness.
Operational cost ranges
- Hosting (Azure/AWS): $300–$3,000/month depending on uptime and scale
- Vector database: $200–$1,500/month (Pinecone, Weaviate, Qdrant)
- Monitoring & observability: $200–$800/month (Datadog, New Relic, Langfuse)
- Admin/maintenance: $1,000–$5,000/month (us, your team, or a hybrid)
- Compliance overhead: $0–$3,000/month additional for regulated industries
Inference cost reality
Inference is metered usage — every prompt and response burns tokens, and tokens cost money. Real numbers for an Australian business:
- Light internal tool (50 users, 10 queries/day each): ~$300–$600/month on OpenAI, ~$400–$800/month on Anthropic
- Customer-facing bot (1,000 conversations/day): ~$1,500–$4,000/month
- Heavy multi-agent system (10,000+ daily interactions, complex chains): ~$8,000–$25,000/month
- Self-hosted open source (Llama 3.1 70B on your infrastructure): typically 30–50% lower than OpenAI/Anthropic at scale, but with higher operational overhead
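You can rough out your own estimate from query volume and token counts. A sketch, assuming illustrative per-million-token rates and that retrieved context inflates input tokens well beyond the visible prompt (check your provider's current pricing before budgeting):

```python
# Rough monthly inference estimate from usage volume and per-token pricing.
# The $/1M-token rates below are assumptions for illustration only.

def monthly_inference_cost(
    queries_per_day: int,
    input_tokens_per_query: int,
    output_tokens_per_query: int,
    input_price_per_m: float,   # $ per 1M input tokens (assumed)
    output_price_per_m: float,  # $ per 1M output tokens (assumed)
    days: int = 30,
) -> float:
    monthly_in = queries_per_day * input_tokens_per_query * days
    monthly_out = queries_per_day * output_tokens_per_query * days
    return (monthly_in * input_price_per_m + monthly_out * output_price_per_m) / 1_000_000

# Light internal tool: 50 users x 10 queries/day. Assumed ~6,000 input
# tokens per query (prompt + retrieved context) and ~800 output tokens,
# at assumed rates of $2.50 in / $10.00 out per 1M tokens:
cost = monthly_inference_cost(500, 6_000, 800, 2.50, 10.00)
print(f"~${cost:,.0f}/month")  # ~$345/month
```

The retrieved context is usually the biggest lever: halving the context per query nearly halves the bill for retrieval-heavy tools.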
What drives the price up
If you're shopping multiple proposals, here's where the genuine cost variance lives:
- Number of integrations: Each integration adds 3–10 days of build effort. CRM + ATS + ERP + comms + ticketing = a serious bill.
- Compliance requirements: APRA, ASIC, AHPRA, ITAR, classified — each adds documentation, attestation, and architectural complexity.
- Custom model training: Fine-tuning isn't always needed. When it is, add $20k–$80k.
- Deployment surface: Web only is cheap. Web + mobile + Slack + Teams + voice/IVR is expensive.
- SLA requirements: 99.5% uptime is standard. 99.95% requires multi-region active-active and roughly doubles the operational cost.
The cheapest proposal is rarely the cheapest outcome. Underspecified projects routinely come in at 2–3x the original estimate by month four. We'd rather quote conservatively and deliver close to the number than win on price and have the scope-creep conversation later. Most of our quotes include a 15% contingency line that we'd rather not use.
Where you can compress cost
Three ways to genuinely save money without sacrificing outcomes:
- Start narrow: one use case, one data source, one user group. Prove it, then expand.
- Use ChatGPT Team or Copilot for use cases that actually fit; save custom builds for the use cases that don't.
- Run inference on Anthropic Claude Haiku or smaller open-source models where Opus/Sonnet is overkill. Most internal tools don't need the strongest model.
Frequently asked questions
Why is a custom GPT 10x the cost of ChatGPT Team?
Different products solving different problems. ChatGPT Team is generic productivity for individuals. Custom GPT is a deeply integrated system that does specific work. The cost difference reflects the engineering work, the specialised compliance posture, and the ongoing operational responsibility. Don't pay 10x unless the use case justifies it.
How long does the build take?
Simple builds: 4–6 weeks. Mid-complexity: 8–14 weeks. Complex: 4–6 months. Enterprise with certifications: 6–12 months. We can run multiple parallel workstreams to compress timelines on time-sensitive projects, but at higher cost.
Is there a way to predict our specific cost?
Yes — we run a 1-week paid scoping engagement ($3,500–$7,500) for projects we expect to come in over $50k. The output is a detailed scope of work with an honest cost estimate. Most of our clients say the scoping phase is the most valuable part of the engagement; it surfaces requirements that would have caused overruns.
Can we self-host to avoid cloud costs?
Yes — self-hosting Llama 3.1 70B or similar open-source models on your own GPU infrastructure can cut inference costs significantly at scale (typically break-even point is around $5,000/month of OpenAI/Anthropic spend). The trade-off is operational complexity. We've deployed self-hosted models for clients with strong DevOps capability or specific compliance needs; for most clients, cloud-based commercial models are simpler and cheaper.
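That break-even point follows from a simple trade: self-hosting trims per-token cost but adds a fixed monthly overhead for GPU hosting and DevOps time. A back-of-envelope sketch, with both figures assumed for illustration:

```python
# Break-even for self-hosting vs. commercial API spend. Assumption:
# self-hosting cuts per-token cost by savings_rate but adds a fixed
# monthly overhead (GPU infrastructure + ops time). Figures illustrative.

def self_host_break_even(extra_ops_monthly: float, savings_rate: float) -> float:
    """Monthly API spend above which self-hosting starts paying off."""
    return extra_ops_monthly / savings_rate

# $2,000/month of assumed extra GPU + ops overhead, 40% per-token savings:
print(f"${self_host_break_even(2_000, 0.40):,.0f}/month")  # $5,000/month
```

Below that spend level, the fixed overhead eats the per-token savings, which is why most smaller deployments stay on commercial APIs.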
Ready to build your custom GPT?
Get a free 30-minute scoping call. We'll map your use case, data sources, and ROI before you commit.
Start the Conversation