Guide: LLM token calculator & cost estimate
Also called an LLM token counter, GPT token calculator, or tiktoken calculator online — this page helps you estimate prompt tokens, compare illustrative API cost per million tokens, and learn words-to-tokens rules of thumb. Count tokens for pasted text using the same modes as Spoold's context budget tool: approximate (bytes ÷ 4), cl100k, or o200k via tiktoken-compatible BPE in the browser. Dollar amounts in tables are illustrative planning estimates — not live API quotes. For billing, use your provider's dashboard and tokenizer. Layout inspiration: token-calculator.net.
Quick read: start with Understanding tokenization if you are new to BPE counts; see what 10K tokens looks like for pages/chat/vision scale; then use the cost table for rough budgets.
What does 10,000 tokens look like?
A visual scale for thinking about token budgets — handy when someone says "stay under 10k" or you are comparing a prompt to a page of prose. Figures below are rules of thumb for English-like text; code, JSON, and other languages differ. Framing aligns with public token "cheatsheet" guides such as this token cheatsheet.
| Scale | ≈ 10,000 tokens |
|---|---|
| Words & characters | ≈ 7,500 words · ≈ 40,000 characters (heuristic: 1 token ≈ ¾ word ≈ 4 chars for English prose) |
| Printed pages | ~15 pages single-spaced · ~30 pages double-spaced — think one dense book chapter. |
| Conversation | ≈ 45–50 minutes of two-way chat (rough), depending on turns and verbosity. |
| Code footprint | On the order of a few thousand lines of commented application code (language and style change the ratio a lot). |
| JSON / data | ~350 KB of raw JSON is in the same ballpark — useful when planning vector chunks or ETL. |
| Images (vision) | A 1024 × 1024 photo under OpenAI-style rules is often about 85 tokens at detail:"low" vs ~765 tokens at detail:"high" (tiling) — crop, resize, or pair with a caption/URL to stay lean. Use Vision token estimator for your exact pixels. |
| Docs / slides | A 15-slide deck at ~75 words/slide is roughly 1,500 tokens of slide text alone — OCR'd scans chunk differently; embed for RAG in slices. |
| Support / cases | Ballpark: many short notes (e.g. dozens to hundreds of brief case summaries) can land near 10k tokens total — good for clustering or agent-style workflows over a corpus. |
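The rules of thumb in the table above can be turned into a quick planning helper. This sketch hard-codes the same heuristics (1 token ≈ 0.75 words ≈ 4 characters, ~500 words per single-spaced page); the function name and page ratio are illustrative choices, not part of the calculator itself.

```python
def scale_from_tokens(tokens: int) -> dict:
    """Rough English-prose equivalents for a token budget.
    Heuristics: 1 token ~ 0.75 words ~ 4 characters, ~500 words
    per single-spaced page. Planning estimate only, not BPE-accurate."""
    words = tokens * 0.75
    return {
        "words": round(words),
        "characters": tokens * 4,
        "pages_single_spaced": round(words / 500, 1),
    }

print(scale_from_tokens(10_000))
# 10,000 tokens -> ~7,500 words, ~40,000 chars, ~15 pages
```

Running it on 10,000 tokens reproduces the figures in the first two table rows; for code, JSON, or CJK text the ratios drift, so treat the output as a first pass.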
Understanding token usage (short version)
- What is a token? — The model's tokenizer splits text into pieces (not always whole words). English often averages near ~0.75 words per token, but it varies widely.
- Why tokens matter — Context limits, latency, and cost are usually expressed in tokens. Efficient use affects budget and responsiveness at scale.
- Optimizing — Be concise, prefer structured formats when they help, lower image detail when quality allows, chunk large documents for RAG, and monitor usage in your provider dashboard.
For BPE mechanics and encoding modes, see Understanding tokenization below.
Understanding tokenization
Large language models do not read raw characters directly the way humans skim text. They consume tokens: integer IDs produced by a tokenizer that maps byte or text fragments to a fixed vocabulary. API pricing, context limits, and latency discussions are almost always phrased in tokens, not words or Unicode code points.
From text to tokens (BPE-style)
Modern chat models typically use subword tokenization (often byte-pair encoding, BPE, or close variants). The tokenizer learns frequent chunks of characters—common words may become one token, rare words split into several, and punctuation or spaces often merge with neighbors. That is why "hello" might be one token while "tokenization" might be two or three, depending on the vocabulary.
- Frequent tokens — Short, common strings in the training data tend to get dedicated IDs (fewer tokens per word on average).
- Rare tokens — Long words, rare names, or typos may split into many small pieces.
- Structure — Braces, quotes, and indentation in code or JSON usually add extra tokens beyond "plain English" prose.
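To make the merging idea concrete, here is a toy byte-pair-style sketch: start from characters and repeatedly merge the most frequent adjacent pair into one token. This is illustrative only — real tokenizers like tiktoken apply a learned vocabulary of merges rather than computing them greedily per input.

```python
from collections import Counter

def toy_bpe(text: str, num_merges: int) -> list[str]:
    """Greedy pair merging over characters, BPE-style.
    Illustrative only; production tokenizers use fixed, learned merges."""
    tokens = list(text)
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]  # most frequent pair
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)  # fuse the pair into one token
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

print(toy_bpe("hello hello", 3))
```

After three merges, repeated substrings collapse into multi-character tokens while the single space stays separate — the same reason frequent words cost fewer tokens than rare ones.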
Tokens vs words vs characters
- Words — Human-oriented; token counts rarely match word counts 1:1.
- Characters — Character counts run higher than token counts for Latin text, but they are not a substitute for BPE: combining marks, emoji, and CJK can map to very different token densities.
- Tokens — What the model and billing pipeline use. Always prefer tokenizer output for budget math.
Encoding modes in this calculator
This page uses js-tiktoken in the browser for exact encodings when you choose cl100k_base or o200k_base. Approximate mode divides UTF-8 byte length by 4 (a fast planning heuristic; it is not BPE-accurate).
| Mode | What it is | Best for |
|---|---|---|
| Approximate (bytes ÷ 4) | Rough token estimate from UTF-8 length | Quick sizing when exact BPE is not required |
| cl100k_base | Tiktoken encoding used for many GPT-3.5 / GPT-4–class chat models | Matching "classic" OpenAI-style token counts |
| o200k_base | Tiktoken encoding aligned with GPT-4o / newer o-series–style vocabularies | Closer counts for 4o-era prompts when you select this mode |
Visualization: in approximate mode, token "chips" are evenly sliced segments for display only. In cl100k/o200k, chips reflect real BPE pieces (IDs shown when available).
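Approximate mode is simple enough to reproduce anywhere. The sketch below mirrors the bytes ÷ 4 heuristic described above; the function name and rounding choice are mine, not the calculator's exact code.

```python
def approx_tokens(text: str) -> int:
    """Planning estimate: UTF-8 byte length divided by 4.
    Not BPE-accurate; multi-byte scripts and code skew the ratio."""
    return max(1, round(len(text.encode("utf-8")) / 4)) if text else 0

print(approx_tokens("Hello, world!"))  # 13 bytes -> 3
```

Note the divergence for non-Latin text: "你好世界" is 12 UTF-8 bytes (3 per character), so the heuristic reports ~3 tokens, while real tokenizers often spend roughly one or more tokens per CJK character.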
Input tokens, output tokens, and billing
Providers typically bill input (prompt) tokens and output (completion) tokens separately. Output is often priced higher per token. The calculator lets you set an expected output length so totals reflect prompt + reply planning—not prompt alone. Your real bill also depends on whether the provider charges for cached prompt prefixes, tool calls, or system wrappers, so treat numbers as estimates.
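The split billing described above is straightforward arithmetic once you have token counts. This sketch uses hypothetical rates (the $3/$15 figures are placeholders, not any provider's pricing) and ignores caching, tool calls, and system wrappers.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  in_rate_per_m: float, out_rate_per_m: float) -> float:
    """Illustrative prompt + completion cost in USD.
    Rates are per million tokens; real bills add caching/tool overhead."""
    return (input_tokens * in_rate_per_m
            + output_tokens * out_rate_per_m) / 1_000_000

# Hypothetical rates: $3/M input, $15/M output
print(f"${estimate_cost(10_000, 2_000, 3.0, 15.0):.4f}")  # $0.0600
```

Because output tokens are often priced several times higher than input, capping expected reply length matters as much as trimming the prompt.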
Context windows and limits
A context window is the maximum number of tokens the model can process in one forward pass—often counting input + output together (exact rules vary by API). Exceeding the limit causes truncation, errors, or forced summarization. When you pack RAG chunks, system prompts, and user messages, sum their tokenizer counts with the same encoding you deploy against.
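The "sum everything with the same encoding" advice above can be sketched as a simple budget check; the part names and window sizes here are hypothetical examples.

```python
def fits_context(parts: dict[str, int], context_window: int,
                 reserved_output: int) -> bool:
    """True if all prompt parts plus reserved reply room fit the window.
    All counts must come from the same tokenizer encoding."""
    used = sum(parts.values())
    return used + reserved_output <= context_window

parts = {"system": 400, "rag_chunks": 6_000, "user": 350}
print(fits_context(parts, context_window=8_192, reserved_output=1_024))
# True: 6,750 used + 1,024 reserved = 7,774 <= 8,192
```

Reserving output room up front avoids the common failure mode where the prompt fits but the reply gets truncated mid-sentence.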
Why another model can disagree
Different families use different vocabularies (Gemini, Claude, Llama, etc.). A token count from cl100k is informative for OpenAI-compatible pipelines but may not match Anthropic or Google tokenizers byte-for-byte. For production budgets, run the provider's official tokenizer or API "count tokens" endpoint when available.
How to use the token calculator
- Paste your text into the editor above.
- View token count using approximate, cl100k, or o200k encodings (cl100k/o200k use tiktoken-compatible BPE in the browser).
- Compare costs using the dropdown estimate and the multi-provider table above—always verify on the provider site.
- Optimize prompts using the visualization and the word/token guide to spot where length piles up.
Word-to-token conversion guide
Ratios are rules of thumb. Always run representative text through the tokenizer above—especially for code, JSON, or non-English text. Inspiration: token-calculator.net.
| Content type | Example | Typical ratio | ~1,000 words | Notes |
|---|---|---|---|---|
| English text | Hello world | ~1.3 tokens/word | ~1,300–1,500 | Standard prose averages about 1.3 tokens per word for Latin script. |
| Code (Python/JS) | def func(): | ~2–3 tokens/word | ~2,000–3,000 | Symbols, operators, and indentation usually add tokens compared to prose. |
| Chinese / Japanese | 你好世界 | ~2+ tokens/char | ~2,000+ | CJK characters often map to multiple tokens; counts vary by tokenizer. |
| Technical writing | API endpoint | ~1.5 tokens/word | ~1,500–1,800 | Abbreviations and domain terms can merge or split unpredictably. |
| JSON / XML | {"key":"value"} | ~3–4 tokens/word | ~3,000–4,000 | Braces, quotes, and structure characters often consume extra tokens. |
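The ratio table above lends itself to a small conversion helper. The ratios below are the table's midpoint rules of thumb (my choice of midpoints within each range); always verify real text with the tokenizer.

```python
RATIOS = {  # tokens per word, rules of thumb from the table above
    "english": 1.3,
    "technical": 1.5,
    "code": 2.5,
    "json_xml": 3.5,
}

def tokens_from_words(words: int, content_type: str = "english") -> int:
    """Heuristic words-to-tokens conversion; not tokenizer output."""
    return round(words * RATIOS[content_type])

print(tokens_from_words(1_000))              # ~1,300 for English prose
print(tokens_from_words(1_000, "json_xml"))  # ~3,500 for JSON/XML
```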
FAQ
What is a token in LLMs?
A token is a chunk of text from the model’s tokenizer (often BPE). It can be a word, part of a word, or punctuation. Billing is usually per token, not per character. For a deeper walkthrough, see the “Understanding tokenization” section on this page.
What is the difference between cl100k and o200k here?
Both are tiktoken encodings in the browser: cl100k_base matches many GPT-3.5 / GPT-4–class chat models; o200k_base aligns with GPT-4o / newer o-series–style vocabularies. Token counts differ between them—choose the mode that matches the stack you are estimating. Other providers (Anthropic, Google, etc.) use different tokenizers entirely.
Why does token count matter?
APIs charge by tokens, context windows are measured in tokens, and latency often scales with prompt length. Fewer tokens usually means lower cost and more room for the reply.
What metrics does this calculator show?
Tokens (via tiktoken-style encodings or a rough byte-based estimate), words, characters, and illustrative cost rows. Your exact bill depends on the provider’s tokenizer and pricing.
Is there a fee to use this page?
No. Spoold runs the calculator in your browser. You are not calling paid APIs from this tool by default.
How is my text handled?
Processing is client-side in this tool. Don’t paste secrets or PII you wouldn’t put in a local script.
Roughly how many tokens is 1,000 words?
For English prose, often about 1,300–1,500 tokens, but code, JSON, or CJK text can be very different—use the calculator on a sample.
What is a context window?
The maximum number of tokens (input + output combined, depending on the API) the model can attend to in one request. Exceeding it causes truncation or errors.
What is cached input pricing?
Some providers discount repeated prompt prefixes when you enable prompt caching. The same logical text may bill at cached rates on subsequent calls—check your provider’s docs.
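The caching discount works out to simple arithmetic. This sketch assumes a hypothetical 90% discount on cache hits (the rates and discount here are placeholders — check your provider's actual cached-input pricing).

```python
def cost_with_cache(prefix_tokens: int, fresh_tokens: int,
                    rate_per_m: float, cached_rate_per_m: float,
                    cache_hit: bool) -> float:
    """Hypothetical cached-input billing: a repeated prompt prefix
    bills at a discounted rate when the provider's cache hits."""
    prefix_rate = cached_rate_per_m if cache_hit else rate_per_m
    return (prefix_tokens * prefix_rate
            + fresh_tokens * rate_per_m) / 1_000_000

# Hypothetical: $3/M normal, $0.30/M cached; 8k-token shared prefix
print(cost_with_cache(8_000, 500, 3.0, 0.30, cache_hit=True))
print(cost_with_cache(8_000, 500, 3.0, 0.30, cache_hit=False))
```

With these placeholder rates, the cached call costs roughly a sixth of the uncached one, which is why reusing a stable system prompt and document prefix across calls pays off.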
How can I reduce API cost?
Shorten prompts, compress structured data, set max output tokens, pick a smaller model when quality allows, and reuse stable prefixes to benefit from caching where supported.
Which providers are in the comparison table?
The table includes illustrative rows for OpenAI, Anthropic, Google, xAI, Mistral, Cohere, DeepSeek, Meta-hosted Llama, Perplexity, and Amazon Bedrock—plus more models over time. Rates and model names change; always confirm on the provider’s pricing page.
More questions? See Contact.