Definition
Tokens are the units of text that a machine learning model takes as input and produces as output. A token can be an individual character, a whole word, part of a word, or even a larger chunk of text. As a rule of thumb, one token corresponds to roughly 4 characters of common English text, or about ¾ of a word (so 100 tokens ≈ 75 words) [1]. AI tokens are not limited to text alone: they can take various data forms and play a key role in AI’s ability to understand and learn from them. For instance, in computer vision, a token may denote an image segment, such as a group of pixels or a single pixel; in audio processing, a token might be a snippet of sound [2].
To better illustrate tokens in AI, let's take a look at two short sentences and treat each token as a distinct unit of text that the AI processes. Comparing the token, word, and character counts of both sentences clearly demonstrates how the three metrics differ: token count is typically close to, but not exactly the same as, word count, while character count is substantially higher than both. This comparison also shows how AI companies calculate their pricing, as they typically charge based on token count rather than words or characters.
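The snippet below is a minimal sketch of such a comparison, assuming OpenAI's open-source `tiktoken` tokenizer; other models use different tokenizers, so the exact token counts will vary.

```python
# Minimal sketch: comparing token, word, and character counts for two sentences.
# Assumes OpenAI's open-source `tiktoken` tokenizer; other models use different
# tokenizers, so the exact token counts will vary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

sentences = [
    "Tokens are the building blocks of language models.",
    "AI companies typically charge per token, not per word.",
]

for text in sentences:
    tokens = enc.encode(text)
    print(f"text:       {text}")
    print(f"tokens:     {len(tokens)}")
    print(f"words:      {len(text.split())}")
    print(f"characters: {len(text)}")
    print()
```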
Origin
The concept of tokens dates back to the early days of computer science, when researchers were exploring ways to represent and process human language. One of the earliest related formalisms was introduced in the 1950s by the linguist Noam Chomsky: Chomsky Normal Form, a standardized way of writing context-free grammars, was used to parse and analyze the structure of sentences.
In the 1980s, the development of the “bag-of-words” model revolutionized the field of natural language processing (NLP). The bag-of-words model represented text as a collection of individual words, without considering the order or context of the words. This model was widely used in early NLP applications, such as text classification and information retrieval [3].
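As a quick illustration of the idea, a bag-of-words representation can be sketched as a simple word-count dictionary; the `bag_of_words` helper below is illustrative, not a historical implementation.

```python
# Toy sketch of a bag-of-words representation: count each word, ignoring order
# and context entirely.
from collections import Counter

def bag_of_words(text: str) -> Counter:
    return Counter(text.lower().split())

print(bag_of_words("the cat sat on the mat"))
# Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})
```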
Context and Usage
Tokens play a key role in natural language processing (NLP) and generative AI, and understanding them is fundamental: they are the building blocks that models like ChatGPT, Gemini, Meta AI, and Claude use to process and generate language [4].
Why it Matters
Grasping how AI tokens work is necessary for using language models efficiently. The ability to count tokens is critical for optimizing costs and getting the best use out of tools like ChatGPT, and skillful token management lets you control spending and ensure that your inputs fit comfortably within the imposed limits [5].
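For example, one way to stay within a model's context window is to count tokens before sending a request and trim the input if it is too long. The sketch below assumes the `tiktoken` tokenizer and an illustrative 4,096-token limit; real limits depend on the specific model.

```python
# Minimal sketch: trimming a prompt so it fits within a model's token limit.
# Assumes `tiktoken`; the 4096-token limit is an illustrative value only.
import tiktoken

def truncate_to_token_limit(text: str, max_tokens: int = 4096) -> str:
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    # Keep only the first max_tokens tokens and decode them back to text.
    return enc.decode(tokens[:max_tokens])

long_prompt = "word " * 10_000                     # far more than 4096 tokens
trimmed = truncate_to_token_limit(long_prompt)
print(len(tiktoken.get_encoding("cl100k_base").encode(trimmed)))  # roughly 4096
```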
Related Terms
- Word Tokens: Each word is treated as a separate token.
- Subword Tokens: Words are broken down into smaller meaningful units to handle out-of-vocabulary words better; e.g., "cats" can be broken down into "cat" and "s" (see the toy sketch after this list).
- Phrase Tokens: These consist of multiple words that are grouped together, such as "Benin City" or "Artificial Intelligence".
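The following toy sketch illustrates the idea of subword tokenization with a tiny hand-made vocabulary. Real tokenizers (such as the byte-pair encoding used by GPT models) learn their vocabularies from data, so actual splits will differ; `subword_tokenize` here is purely illustrative.

```python
# Toy sketch of subword tokenization using a tiny hand-made vocabulary.
# Real tokenizers learn their vocabularies from data, so actual splits differ;
# this only illustrates the idea of breaking words into known pieces.
VOCAB = {"cat", "s", "run", "ning", "un", "happy"}

def subword_tokenize(word: str) -> list[str]:
    """Greedily split a word into the longest known subwords, left to right."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:                               # no known piece: emit the character
            pieces.append(word[i])
            i += 1
    return pieces

print(subword_tokenize("cats"))      # ['cat', 's']
print(subword_tokenize("running"))   # ['run', 'ning']
print(subword_tokenize("unhappy"))   # ['un', 'happy']
```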
In Practice
A real-life case study of tokens in AI can be seen in OpenAI's GPT models, which use a token system for processing language. When you use ChatGPT or the OpenAI API, your text is broken down into tokens (word fragments), and you are charged based on token usage. This token-based approach allows for precise measurement of the computational resources used.
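As a rough, non-authoritative sketch of how such billing can be estimated, the example below counts input tokens with `tiktoken` and multiplies by an assumed per-token price. The rates and the `estimate_cost` helper are placeholders, not OpenAI's actual pricing or API.

```python
# Hedged sketch: estimating the cost of an API call from token counts.
# The per-token prices below are placeholders, not actual OpenAI rates;
# always check the provider's current pricing page.
import tiktoken

PRICE_PER_1K_INPUT_TOKENS = 0.0005   # assumed example rate, in USD
PRICE_PER_1K_OUTPUT_TOKENS = 0.0015  # assumed example rate, in USD

def estimate_cost(prompt: str, expected_output_tokens: int) -> float:
    enc = tiktoken.get_encoding("cl100k_base")
    input_tokens = len(enc.encode(prompt))
    return (
        (input_tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS
        + (expected_output_tokens / 1000) * PRICE_PER_1K_OUTPUT_TOKENS
    )

print(f"${estimate_cost('Summarize the history of tokenization in NLP.', 300):.6f}")
```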
References
- The Ministry of AI. (2023). Demystifying Tokens: A Beginner's Guide to Understanding AI Building Blocks.
- Miquido. (2025). What is an AI Token?
- Ikangai. (2025). A Brief History of Tokens.
- Beekman, J. (n.d.). Tokens 101: Understanding Tokens in Generative AI Models.
- The Story. (2025). AI tokens. What are AI tokens?