From Sand to Superintelligence · Drill cards · Chapter 32
Drills
Tokens on the Wire
10 atomic recall cards. Export to Anki and let spaced repetition do its slow work.
In Anki: File → Import, choose this TSV, set field separator to Tab, deck = Sand to Silicon · Ch 32, note type = Basic.
| Front | Back |
|---|---|
| What is a token, as used by inference providers? | The smallest unit a language model deals with — a subword piece produced by a tokenizer, from a vocabulary of around 100,000 units. |
| How many characters is one English token, roughly? | Roughly four characters, or three quarters of a word. |
| What is the context window of the long-context Claude Sonnet 4.5 and 4.6 settings? | Up to about one million tokens. |
| What context window size do some Gemini variants support? | Two million tokens. |
| What is the median price of one million input tokens, as of end of 2025? | ~$0.30. |
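| How many dimensions does an OpenAI text-embedding-3 vector have? | 3,072 dimensions by default for text-embedding-3-large and 1,536 for text-embedding-3-small; shorter vectors (768 is a common choice) can be requested through the `dimensions` parameter. |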
| What geometric operation measures semantic similarity between two embeddings? | Cosine similarity or inner product (dot product) of the two vectors. |
| Why are output tokens more expensive to produce than input tokens? | Output is generated sequentially, with one forward pass per new token; input is processed in a single parallel prefill pass. |
| What are the three objects the chapter says travel on the fifth wire? | Tokens, embeddings, and structured calls (function/tool calls conforming to JSON schemas). |
| What physical transport layer does the AI API wire run on? | The standard web stack: TCP (or QUIC, for HTTP/3), TLS, HTTP/2 or HTTP/3, and JSON. No new physical layer is required. |
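
Several of the cards above reduce to short arithmetic, so a minimal Python sketch may help them stick. It uses the four-characters-per-token rule of thumb, the ~$0.30 per million input tokens figure, and cosine similarity between two vectors. The function names (`estimate_tokens`, `input_cost_usd`, `cosine_similarity`) are hypothetical helpers invented for this illustration, not part of any provider's SDK, and the character-based count is only an estimate; a real tokenizer will give different numbers.

```python
import math

# Figures taken from the cards; both are rough, illustrative values.
CHARS_PER_TOKEN = 4              # about four characters per English token
PRICE_PER_MILLION_INPUT = 0.30   # USD per one million input tokens


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (rule of thumb only)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def input_cost_usd(tokens: int,
                   price_per_million: float = PRICE_PER_MILLION_INPUT) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    prompt = "x" * 4_000                      # a 4,000-character prompt
    tokens = estimate_tokens(prompt)          # about 1,000 tokens
    print(tokens, input_cost_usd(tokens))
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Under these assumptions, filling a one-million-token context window costs about $0.30 in input tokens. Vectors that point in the same direction score close to 1 on cosine similarity, and orthogonal vectors score 0.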