
Chain-of-thought prompting


When humans solve complex problems, they naturally break them into smaller logical steps before reaching the final conclusion. This approach helps us solve complex tasks, especially those that require logical reasoning or calculations.

Traditional LLM prompts often skip this "show your work" phase and jump straight to an answer. While this approach may work for simpler text generation tasks, it falls short for tasks that need multi-step inference, logical reasoning, or arithmetic. This is where Chain of Thought (CoT) prompting comes in. Let's explore how it works.

What is CoT Prompting?

Chain‐of‐thought prompting is a prompt‐engineering technique that instructs a model to create a sequence of thought steps before generating the final response. CoT prompts encourage the model to "think aloud" by generating its reasoning chain step by step before arriving at an answer. This approach improves performance when handling complex tasks.

Here's a high-level overview of the CoT process:

  1. Prompt analysis — the model reads the question and outlines a plan.

  2. Thought 1 — first inference or calculation.

  3. Thought 2, thought 3, … thought N — subsequent steps, each building on the last.

  4. Final response — conclude once the full reasoning chain is complete.

Each thought is a self-contained statement that makes the chain easy to follow and verify. Without CoT, this might happen:

Q: A bat and a ball cost $1.10. The bat costs $1.00 more than the ball. How much does the ball cost?
A: $0.10

In this example, the model jumps to the intuitive (but incorrect) answer without working through the algebra. Now, let's use the chain of thought prompting technique and see what happens:

Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost? Let's think step by step.
A: 

Along with the query, we instruct the model to "think step by step". Now, the model will reason through the problem and reach an accurate answer:

First, let the ball cost x dollars, so the bat costs x + 1 dollars. 
Since together they sum to 1.10, we have x + (x + 1) = 1.10, which simplifies to 2x + 1 = 1.10, so 2x = 0.10 and x = 0.05. 
Therefore, the ball costs $0.05.

This approach not only improves accuracy on tasks that need multi-step inference or logical reasoning but also makes the model's conclusions more transparent.
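
In practice, zero-shot CoT can be applied by simply appending the trigger phrase to the question before sending it to the model. Here is a minimal Python sketch; it only builds the prompt string, and the actual model call is left to whichever LLM client you use:

COT_TRIGGER = "Let's think step by step."

def build_cot_prompt(question: str) -> str:
    # Append the zero-shot CoT trigger so the model reasons before answering.
    return f"Q: {question} {COT_TRIGGER}\nA:"

prompt = build_cot_prompt(
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)
print(prompt)  # send this string to your LLM client of choice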

Designing effective CoT prompts

So, is asking the LLM to “think step-by-step” or “show its reasoning” all it takes? In some cases, yes: LLMs are trained on massive amounts of data and have picked up common reasoning patterns along the way. However, to improve generation, especially for smaller models, we can provide a few examples showing the model how it should reason about a given problem.

For example, imagine you want the LLM to think like a detective and identify the prime suspect in a burglary. Instead of a generic prompt like:

You’re a detective solving a burglary. There is a muddy footprint, someone heard shouting at 2 AM, and the owner was away between 1 and 3 AM. Identify the suspect.

You can design a better CoT prompt by including a few example reasoning steps, similar to how a detective might actually work through such a case:

You’re a detective solving a burglary. Using evidence X (a muddy footprint), witness Y (heard shouting at 2 AM), and timeline Z (owner away 1–3 AM), think step by step to identify the suspect.
- The muddy footprint suggests entry through the garden gate.
- Gate is near the neighbor’s yard—check who lives there.
- Shouting at 2 AM indicates confrontation—ask neighbor if they saw anyone.
- Owner’s absence 1–3 AM matches suspect’s window—interview neighbor’s guest.

For your CoT prompts to be highly effective, you should include:

  • The question;

  • A detailed step-by-step reasoning path;

  • An example of what the final response might look like.

This shows the model exactly how you want it to unpack its reasoning, leading to even more accurate results. For example:

Example 1:
Q: If Alice has 3 apples and buys 2 more, how many does she have?
Thoughts:
1. Alice starts with 3 apples.
2. She buys 2 more, so add 2 to 3 → 5.
Answer: 5

Example 2:
Q: A car travels 60 km in 2 hours. What is its average speed?
Thoughts:
1. Distance = 60 km, Time = 2 h.
2. Speed = distance ÷ time = 60 / 2 = 30.
Answer: 30 km/h

Now solve:
Q: A rectangular garden’s length is twice its width. If the garden’s perimeter is 60 meters, what are its length and width?
Thoughts:

The model’s response would look similar to this:

Thoughts:
1. Let the width be w meters, so the length is 2w meters.
2. The perimeter of a rectangle is 2 × (length + width).
3. Therefore, 2 × (2w + w) = 60.
4. Simplify inside: 2 × 3w = 60 ⇒ 6w = 60.
5. Solve for w: w = 10.
6. Then length = 2w = 20.
Answer: Length = 20 m, Width = 10 m.

As you can see, we now get more accurate results and can trace exactly how the LLM derived the final answer.
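
If you assemble such few-shot prompts programmatically, it helps to keep the worked examples as structured data and render them into the Question/Thoughts/Answer format shown above. The sketch below is one possible way to do this; the model call itself is omitted, and the extract_answer helper is just an illustrative parsing convention, not a standard API:

few_shot_examples = [
    {
        "question": "If Alice has 3 apples and buys 2 more, how many does she have?",
        "thoughts": ["Alice starts with 3 apples.", "She buys 2 more, so add 2 to 3 -> 5."],
        "answer": "5",
    },
    {
        "question": "A car travels 60 km in 2 hours. What is its average speed?",
        "thoughts": ["Distance = 60 km, Time = 2 h.", "Speed = distance / time = 60 / 2 = 30."],
        "answer": "30 km/h",
    },
]

def build_few_shot_cot_prompt(examples, new_question):
    # Render each worked example as Q / Thoughts / Answer, then append the new
    # question with an open "Thoughts:" section so the model continues the pattern.
    parts = []
    for i, ex in enumerate(examples, start=1):
        thoughts = "\n".join(f"{n}. {t}" for n, t in enumerate(ex["thoughts"], start=1))
        parts.append(
            f"Example {i}:\nQ: {ex['question']}\nThoughts:\n{thoughts}\nAnswer: {ex['answer']}"
        )
    parts.append(f"Now solve:\nQ: {new_question}\nThoughts:")
    return "\n\n".join(parts)

def extract_answer(model_response):
    # Convention used here: take whatever follows the last "Answer:" marker.
    marker = "Answer:"
    if marker not in model_response:
        return model_response.strip()
    return model_response.rsplit(marker, 1)[-1].strip()

prompt = build_few_shot_cot_prompt(
    few_shot_examples,
    "A rectangular garden's length is twice its width. "
    "If the garden's perimeter is 60 meters, what are its length and width?",
)
print(prompt)  # send this to your model, then pass the raw response to extract_answer()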

Benefits and limitations of CoT

CoT provides several advantages:

  • Breaking a complex problem into small, clear steps helps the model solve difficult tasks more accurately.

  • Showing each step allows users to see how the model reached its answer, making the process easier to follow.

  • Tackling one part at a time produces more reliable answers, especially when a problem requires multiple steps.

  • Explaining every step works like a tutor teaching students—it helps learners understand and remember better.

  • You can use this approach for various tasks: math, common-sense questions, or complex puzzles.

There are also some limitations. Reasoning through multiple steps takes extra time and tokens compared to direct prompting, which can be costly. Refine your prompts to keep only the best examples and avoid adding unnecessary reasoning steps.
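
To make the cost concrete, you can count how many extra output tokens a reasoning chain adds. The sketch below assumes the open-source tiktoken tokenizer is installed and reuses the bat-and-ball example from earlier; exact counts will vary by tokenizer and model:

import tiktoken  # tokenizer library, used here only to count tokens

enc = tiktoken.get_encoding("cl100k_base")

direct_answer = "The ball costs $0.05."
cot_answer = (
    "Let the ball cost x dollars, so the bat costs x + 1 dollars. "
    "Since x + (x + 1) = 1.10, we get 2x = 0.10, so x = 0.05. "
    "Therefore, the ball costs $0.05."
)

print("Direct answer:", len(enc.encode(direct_answer)), "tokens")
print("CoT answer:   ", len(enc.encode(cot_answer)), "tokens")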

Research has also shown that CoT prompting improves reasoning and accuracy mainly in large (~100B-parameter) models; smaller models tend to produce inaccurate chains of thought and can perform worse than with standard prompting. Thus, the benefits of CoT increase with model size. Fine-tuning smaller models on instructions and example responses (known as instruction tuning) may improve their generation.

As with any other prompting technique, poorly written prompts lead to poor results. In CoT, this may cause flawed reasoning steps. Therefore, make sure you understand the domain and the model's capabilities when designing prompts. This ensures that the model doesn't produce a chain of thought that sounds logical but contains incorrect reasoning.

Conclusion

Chain-of-Thought prompting guides the model to break down a problem into small, clear steps instead of jumping directly to an answer. The model then thinks through each step, using information from previous steps to make decisions. This approach helps LLMs solve complex problems, like math or logic puzzles, more accurately while showing you their reasoning process. As a result, you get more reliable and transparent results.
