IBM’s watsonx.ai free tier (the Lite plan) gives you 20 Capacity Unit Hours (CUH) per month plus 300,000 tokens per month for foundation model inference. That’s enough to prototype with IBM Granite, Meta Llama, Mistral, and other models without paying a monthly bill.
Startup founders trying to stretch runway, developers validating an LLM feature, and researchers who need a small-but-real inference budget can all get value here. It’s also a nice “sandbox” if you want to learn watsonx.ai before pitching it internally.
This guide covers IBM watsonx.ai Lite Plan eligibility, the exact signup steps, what the credits cover, the limits that matter, and a few tactics to make 300K tokens go further.
Program at a Glance
| Field | Details |
|---|---|
| Provider | IBM |
| Credit Amount | 20 CUH/month + 300,000 tokens/month |
| Duration | Always-free (monthly limits reset each month) |
| Eligibility | IBM Cloud account with identity verification; 1 Lite instance/account |
| Credit Card Required? | Yes, for identity verification (no charges unless upgraded) |
| Difficulty | Easy; signup is quick if verification passes. |
| Best For | Experimentation, prototyping, learning foundation models |
| Official Page | IBM Program Page |
What You Actually Get
The IBM watsonx.ai Lite plan is an always-free tier for IBM’s enterprise AI platform. You get two separate buckets each month: 20 Capacity Unit Hours (CUH) for watsonx.ai Runtime activity (training AutoAI models, scoring deployed ML models, running Decision Optimization jobs), plus 300,000 tokens per month for foundation model inference. The foundation model catalog on Lite includes IBM Granite models, Meta Llama models, Mistral models, and additional third-party options, with model availability varying by data center.
In practical terms, 300,000 tokens per month can support about 1,500 short Q&A interactions (around 200 tokens each), roughly 300 document summaries (about 1,000 tokens each), or a few hundred RAG chatbot exchanges. Add the 20 CUH and you can also test ML workflows like AutoAI or small Decision Optimization batches, as long as you stay within the Lite runtime limits.
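The estimates above are simple integer division of the monthly token allowance by a per-interaction size. A minimal sketch of that arithmetic follows. The per-interaction sizes are the ones quoted in this guide, while the function and dictionary names are illustrative only and not part of any IBM tooling.

```python
MONTHLY_TOKENS = 300_000  # Lite plan inference allowance per month

# Per-interaction token sizes quoted in this guide (input + output combined).
TOKENS_PER_INTERACTION = {
    "short_qa": 200,
    "document_summary": 1_000,
    "rag_exchange": 800,
}


def interactions_per_month(tokens_each: int, budget: int = MONTHLY_TOKENS) -> int:
    """Return how many whole interactions fit into the monthly token budget."""
    if tokens_each <= 0:
        raise ValueError("tokens_each must be positive")
    return budget // tokens_each


if __name__ == "__main__":
    for name, size in TOKENS_PER_INTERACTION.items():
        print(f"{name}: {interactions_per_month(size)} per month")
```

Run as a script, this prints 1500, 300, and 375 for the three workload types. Swap in your own measured token counts to get a more realistic number for your prompts.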
Who Qualifies (and Who Doesn’t)
IBM positions the Lite plan as an always-free onramp for watsonx.ai, so there’s no application review or startup-accelerator gate. The main “gate” is account creation plus identity verification, and the one-instance-per-account rule.
- You need an IBM Cloud account (log in with your IBMid or create a new account).
- A credit card is used for identity verification during signup, even though the plan itself is free.
- Only one Lite plan instance is allowed per IBM Cloud account.
- If you collaborate with others, each collaborator must have their own Lite plan.
If you’re hoping to do foundation model tuning, run GPU notebook runtimes, or request quota increases, Lite is the wrong tier. Those are explicitly not supported, so you’ll hit a wall fast.
How to Sign Up
Signup usually takes about 10 minutes, assuming the verification step goes smoothly.
- Go to dataplatform.cloud.ibm.com/registration/stepone?context=wx.
- Select an IBM Cloud region (Dallas or Frankfurt is recommended for full foundation model access).
- Log in with your IBMid or create a new IBM Cloud account (email, personal info, and credit card for identity verification).
- If prompted, select your account and resource group.
- Click Continue and wait for activation to complete.
- Bookmark your watsonx.ai home page for future logins.
Two common gotchas: you only get one Lite instance per IBM Cloud account, and if you end up on the IBM Cloud Dashboard, go back to the signup page and click “Log in with existing account.” Also, IBM notes there are no charges unless you manually upgrade to a paid plan.
What the Credits Cover
The Lite allowance is split the same way watsonx.ai billing is. CUH covers all watsonx.ai Runtime activity except foundation model inference, while tokens (tracked as Resource Units, where 1 RU = 1,000 tokens) cover foundation model inference. That separation matters, because you can run out of tokens even if you still have CUH left, and vice versa.
| Service / Feature | What It Does | Included? |
|---|---|---|
| Foundation model inference (RU/tokens) | Run prompts against Granite, Llama, Mistral, and other models. | ✓ |
| AutoAI training (CUH) | Train AutoAI models; CUH consumed during training runs. | ✓ |
| Deployed ML model scoring (CUH) | Score/run deployed ML models; idle deployments stop consuming CUH. | ✓ |
| Decision Optimization jobs (CUH) | Execute batch optimization jobs with Lite limits on parallelism and retention. | ✓ |
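Because CUH and tokens are independent buckets, it can help to track them separately on your side. The sketch below is a hypothetical local tracker, not an IBM API. The class and method names are assumptions made for illustration, and the only facts it relies on are the Lite limits and the 1 RU = 1,000 tokens conversion stated above.

```python
from dataclasses import dataclass

TOKENS_PER_RU = 1_000  # 1 Resource Unit = 1,000 tokens


@dataclass
class LiteBudget:
    """Hypothetical local tracker for the two independent Lite buckets."""

    cuh_remaining: float = 20.0
    tokens_remaining: int = 300_000

    def ru_remaining(self) -> float:
        """Express the remaining token budget in Resource Units."""
        return self.tokens_remaining / TOKENS_PER_RU

    def spend_tokens(self, input_tokens: int, output_tokens: int) -> None:
        """Deduct an inference call; input and output tokens both count."""
        used = input_tokens + output_tokens
        if used > self.tokens_remaining:
            raise RuntimeError("monthly token budget exhausted")
        self.tokens_remaining -= used

    def spend_cuh(self, cuh: float) -> None:
        """Deduct runtime usage (AutoAI, scoring, Decision Optimization)."""
        if cuh > self.cuh_remaining:
            raise RuntimeError("monthly CUH budget exhausted")
        self.cuh_remaining -= cuh
```

Spending all 20 CUH leaves the token bucket untouched, and the reverse is also true, which is the behavior the split implies. The authoritative numbers are still the ones IBM reports for your account.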
Notable exclusions are real, not fine print: foundation model tuning (prompt tuning/fine-tuning) and custom foundation model deployments are not supported on Lite, and neither are GPU runtime environments for notebooks.
Limitations to Know About
Every free plan has boundaries. With watsonx.ai Lite, the limits are mostly about monthly capacity, request throughput, and which platform features IBM keeps behind paid tiers.
- You are capped at 20 CUH per month for watsonx.ai Runtime activity (foundation model inference is separate).
- Foundation model usage is capped at 300,000 tokens per month, with an inference request rate limit of 2 requests per second.
- Deployments go idle after 1 day of inactivity, at which point they stop consuming CUH.
- The Lite plan does not support foundation model tuning, custom foundation model deployments, GPU notebook runtimes, large runtime environments (8+ vCPU), project export, BYOK encryption, or quota increases beyond Lite limits.
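The 2 requests per second limit is easy to exceed in a batch script, so a small client-side throttle can keep you under it. This is a generic sketch rather than anything provided by IBM. The `Throttle` class is a hypothetical helper, and the injectable `clock` and `sleep` arguments exist only so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Space out calls so they never exceed max_per_second."""

    def __init__(self, max_per_second: float = 2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot one interval after this call's start.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Call `throttle.wait()` immediately before each inference request. With the default of 2 per second, back-to-back calls end up spaced at least 0.5 seconds apart.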
When you hit the monthly limits, you don’t get surprise charges by default because IBM notes there are no charges unless you manually upgrade to a paid plan. If you do upgrade, take the “no spending caps by default” warning seriously and set billing alerts so a demo doesn’t turn into a bill.
Have Unused IBM Credits?
IBM credits can pile up in real teams, especially when a project shifts or a pilot ends early. Sometimes the credits are time-boxed, and watching them expire is basically watching budget evaporate. If you’re sitting on unused IBM credits you won’t realistically burn through, AI Credit Mart lets you list them and recover value instead of letting them go to zero.
Need More IBM Credits?
Once you outgrow Lite (which happens quickly if you’re doing repeated demos or heavier inference), paying retail is not your only option. AI Credit Mart lists discounted IBM credits from organizations with surplus allocations. Typical discounts land around 30-70% below retail, which can stretch your experimentation budget a lot further.
Tips for Getting the Most Out of Your Credits
- Use smaller models like granite-4-h-small or mistral-small when you can, because they’re typically more token-efficient for basic tasks.
- Monitor usage in the IBM Cloud dashboard so CUH and RU don’t hit zero mid-test.
- Use the Prompt Lab to iterate interactively before you build a full pipeline or integrate into an app.
- Pick Dallas or Frankfurt as your region if you want the widest foundation model availability.
- Plan around the 1-day deployment idle timeout; redeploy when needed rather than expecting a dormant deployment to stay “hot.”
- Stack it with IBM’s free demo at ibm.com/products/watsonx/get-started for an additional 20,000 tokens over 30 days (it only requires an IBM/Google/LinkedIn login).
- If you’re building anything repeatable, use the Python SDK (ibm-watsonx-ai) for automation instead of clicking around the UI forever.
- If you ever upgrade, set up billing alerts immediately, because there are no spending caps by default on paid usage.
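For the SDK tip above, a minimal sketch of a single inference call with `ibm-watsonx-ai` might look like the following. Treat the details as assumptions to verify against IBM's documentation: the environment variable names `WATSONX_API_KEY` and `WATSONX_PROJECT_ID` are hypothetical, the `ibm/`-prefixed model ID is assumed from the model name mentioned in this guide, and the endpoint URLs are the Dallas and Frankfurt regional hosts.

```python
import os

# Regional endpoints for the two regions this guide recommends.
REGION_URLS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
}


def build_params(max_new_tokens: int = 100) -> dict:
    """Generation parameters; capping max_new_tokens bounds output token spend."""
    return {"decoding_method": "greedy", "max_new_tokens": max_new_tokens}


def ask(prompt: str, model_id: str = "ibm/granite-4-h-small",
        region: str = "dallas") -> str:
    """Send one prompt and return the generated text."""
    # Imported lazily so the helpers above work without the SDK installed.
    from ibm_watsonx_ai import Credentials
    from ibm_watsonx_ai.foundation_models import ModelInference

    model = ModelInference(
        model_id=model_id,
        credentials=Credentials(
            url=REGION_URLS[region],
            api_key=os.environ["WATSONX_API_KEY"],  # hypothetical variable name
        ),
        project_id=os.environ["WATSONX_PROJECT_ID"],  # hypothetical variable name
        params=build_params(),
    )
    return model.generate_text(prompt=prompt)
```

Keeping `max_new_tokens` low is a cheap way to protect the 300,000-token budget, since output tokens count against it as well as input tokens.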
Frequently Asked Questions
How much free usage does the IBM watsonx.ai Lite plan include?
You get 20 CUH per month plus 300,000 tokens per month. IBM estimates 300,000 tokens is enough for about 1,500 short Q&A prompts (~200 tokens each), around 300 document summaries (~1,000 tokens each), or roughly 375 RAG chatbot exchanges (~800 tokens per exchange). CUH value depends on what you run, since CUH consumption varies by tool, hardware spec, and runtime environment.
Is a credit card required to sign up?
Yes, a credit card is used for identity verification during signup.
How long does the Lite plan last?
It’s an always-free tier, and the 20 CUH and 300,000-token limits reset monthly.
Can I sell unused IBM credits?
Yes. If you have IBM credits you won’t use before they expire, you can list them on AI Credit Mart and sell them at up to 70% of face value. Companies regularly list surplus credits from startup programs and enterprise agreements.
Where can I get more IBM credits?
AI Credit Mart has discounted IBM credits available from companies with surplus allocations. Prices are typically 30-70% below retail.
Will I be charged if I hit the monthly limits?
You won’t be charged unless you manually upgrade to a paid plan.
What is a Capacity Unit Hour (CUH)?
CUH is IBM’s billing unit for watsonx.ai Runtime activity except foundation model inference, and it’s consumed by things like training AutoAI models, scoring deployed ML models, and running Decision Optimization jobs. Foundation model inference uses Resource Units (RU), where 1 RU equals 1,000 tokens. CUH is measured to the millisecond with a one-minute minimum per operation, and idle deployments can stop consuming CUH after inactivity. Tokens count input and output combined, so verbose prompts burn budget fast.
Which region should I choose?
Dallas or Frankfurt is recommended for full foundation model access, since model availability varies by data center.
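The CUH metering rule mentioned in this FAQ (millisecond granularity with a one-minute minimum per operation) can be sketched as a small calculation. This is an illustration, not IBM's billing code. The `capacity_units_per_hour` rate depends on the tool and hardware spec and is not given in this guide, so it is a caller-supplied, hypothetical input here, and applying the minimum to each operation's duration is an assumed reading of the rule.

```python
MS_PER_HOUR = 3_600_000
MIN_BILLED_MS = 60_000  # one-minute minimum per operation


def cuh_consumed(duration_ms: int, capacity_units_per_hour: float) -> float:
    """Estimate CUH for one operation.

    capacity_units_per_hour is a hypothetical rate for the environment used;
    look up the real value for your tool and hardware spec.
    """
    if duration_ms < 0 or capacity_units_per_hour < 0:
        raise ValueError("duration and rate must be non-negative")
    billed_ms = max(duration_ms, MIN_BILLED_MS)  # apply the one-minute floor
    return capacity_units_per_hour * billed_ms / MS_PER_HOUR
```

Under these assumptions, a 5-second job is billed as a full minute, so many tiny operations can cost more CUH than one batched run of the same total length.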
IBM watsonx.ai Lite is one of the more practical always-free tiers: real model access, real monthly limits, and enough room to build a convincing prototype. Use it to learn fast, then either upgrade or source discounted IBM credits when you need to scale.