Prompt to Make an Agent Summarize Its Own Progress

A research agent was asked to gather competitive intelligence on five companies in a specific market. After 35 tool calls, it had produced detailed analysis of the first company's history, leadership changes over the past decade, and press coverage. It had not yet looked at the second company, even though the task was to compare all five. The agent had not forgotten the goal; it had simply been so thorough on step one that it never reached steps two through five.

This is agent drift. The agent was working correctly on a subtask and had simply lost the proportion between that subtask and the overall objective. Without a mechanism for stepping back and checking alignment, agents in long-running tasks drift reliably. They do not fail. They just gradually become less useful, completing ever-more-detailed work on a shrinking slice of the original task.

Progress summarization is the standard intervention. By making the agent periodically generate a structured summary of its own state, you force it to compare current position to original objective. Misalignment becomes visible at step 10, not step 50.

The Two Problems Summarization Solves

Progress summarization addresses two distinct problems that both increase with task length.

The first is drift, described above. As the agent processes more information and completes more sub-steps, the weight of recent context grows. Recent context pulls attention. The original objective, stated far back in the context window, receives progressively less weight. The agent does not forget the goal but begins to treat recent sub-goals as if they were the primary objective.

The second problem is context window saturation. Every production language model has a finite context window. As a long-running agent accumulates action logs, tool outputs, and observations, the raw action history eventually becomes too long to fit in that window. When context is truncated or compressed automatically, the agent loses access to its own earlier reasoning. Structured self-generated summaries can compress that history into a dense format that preserves the critical information in far fewer tokens.

An agent that regularly compresses its own history into summaries has, in effect, a longer working memory than an agent that relies on the raw action log. The summary does discard detail, but what it keeps is a distillation of the information that matters.

The Prompt

## PROGRESS TRACKING PROTOCOL

At every [N] actions (recommended: 5), and at every major phase transition, generate a structured progress summary before continuing.

Progress summary format:

---PROGRESS SUMMARY [Step N]---
ORIGINAL GOAL: [Restate the original task exactly as given. If it is long, quote its first sentence and its core task statement verbatim.]

COMPLETED:
- [Specific item or sub-task completed with concrete result, one line each]

PENDING:
- [Remaining items or sub-tasks not yet started or not yet completed, one line each]

CURRENT FOCUS: [What you are currently working on]

ALIGNMENT CHECK: Is the current focus still directly contributing to the original goal? [Yes / No / Partially — with one sentence explanation if Partially or No]

BLOCKERS: [Any issue preventing progress, or "None"]
---END SUMMARY---

After generating the summary, review the ALIGNMENT CHECK. If the answer is No or Partially:
1. Stop the current line of work.
2. Identify which pending item most directly serves the original goal.
3. Redirect to that item.

Do not generate the summary and then ignore it. The summary is a decision-making tool, not a log entry.

ADDITIONAL RULE — SCOPE CHECK:
If the PENDING list at any summary point contains more items than it did at the previous summary point (scope has grown), flag this explicitly: "NOTE: Scope has expanded. Original task had [N] items. Current pending list has [M] items. Confirm whether expanded scope is intended."

At task completion, generate a final summary with the same format plus a RESULT field summarizing what was produced.
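Because the summary follows a fixed format, the surrounding system can read it as data rather than prose. The following is a minimal sketch of a parser for the format above, assuming the agent reproduces the field labels and the start and end markers exactly. The function name `parse_summary` and the `FIELDS` list are illustrative, not part of any framework.

```python
import re
from typing import Dict, List, Union

# Field labels taken from the summary format in the prompt.
FIELDS = ["ORIGINAL GOAL", "COMPLETED", "PENDING",
          "CURRENT FOCUS", "ALIGNMENT CHECK", "BLOCKERS"]
LIST_FIELDS = {"COMPLETED", "PENDING"}


def parse_summary(text: str) -> Dict[str, Union[str, List[str]]]:
    """Split one progress summary into its labelled fields.

    COMPLETED and PENDING become lists of items; every other
    field becomes a single string.
    """
    # Keep only the text between the start and end markers.
    match = re.search(
        r"---PROGRESS SUMMARY.*?---\n(.*?)---END SUMMARY---", text, re.DOTALL
    )
    if match is None:
        raise ValueError("no progress summary found")

    collected: Dict[str, List[str]] = {}
    current = None
    for line in match.group(1).splitlines():
        stripped = line.strip()
        # A line that starts with a known label opens a new field.
        label = next((f for f in FIELDS if stripped.startswith(f + ":")), None)
        if label is not None:
            current = label
            rest = stripped[len(label) + 1:].strip()
            collected[current] = [rest] if rest else []
        elif stripped and current is not None:
            # Continuation line or bullet: drop the leading "- ".
            collected[current].append(stripped.lstrip("- ").strip())

    return {
        key: values if key in LIST_FIELDS else " ".join(values)
        for key, values in collected.items()
    }
```

A parser like this fails loudly when the markers are missing, which is usually preferable to silently accepting a malformed summary.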

Why Each Component Matters

Restating the original goal verbatim

The single most important element of the progress summary is the verbatim restatement of the original goal. Not a paraphrase. Not a summary of the goal. The exact original instruction.

This matters because paraphrases drift. Each paraphrase is a small interpretation. After five summaries, the agent's working version of the goal may have evolved significantly from the original. Requiring verbatim restatement prevents this semantic drift.

If the original goal is long, quote the first sentence and the core task statement. Do not summarize it.
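Verbatim restatement is easy to verify outside the model. The sketch below is one possible check, assuming the system still holds the original task text. The function name `restates_goal_verbatim` is hypothetical. It accepts an exact quote of all or part of the original goal and rejects anything reworded.

```python
def restates_goal_verbatim(original_goal: str, restated_goal: str) -> bool:
    """Return True if the restated goal is an exact quote of all or
    part of the original goal, ignoring differences in whitespace."""
    def normalize(text: str) -> str:
        # Collapse runs of spaces and newlines so line wrapping does not matter.
        return " ".join(text.split())

    restated = normalize(restated_goal)
    return bool(restated) and restated in normalize(original_goal)


original = "Compare five companies in the enterprise CRM market on pricing and features."
print(restates_goal_verbatim(original, "Compare five companies in the enterprise CRM market"))
print(restates_goal_verbatim(original, "Research the main CRM vendors"))
```

This is a substring test, so a very short quote would also pass. A stricter version could require a minimum quoted length.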

The alignment check as a decision gate

The alignment check is not merely a logging field. The prompt explicitly states: "Do not generate the summary and then ignore it. The summary is a decision-making tool, not a log entry." This instruction is necessary because models often generate the format correctly and then continue on the same path regardless of what the alignment check says, unless the prompt explicitly specifies that the check triggers a decision.

A progress summary that triggers no action is a log. A progress summary that changes behavior is a control mechanism. The prompt must specify which it is, or the model will default to treating it as a log.

The scope expansion flag

Scope expansion is one of the most common causes of long-running agent failures. The agent starts with a five-item task, discovers related items while completing item one, adds them to the pending list, and by step 30 has a 20-item pending list that was never authorized. The scope check rule catches this early, when it can still be corrected.
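The scope check can also be enforced outside the model by comparing the PENDING lists of consecutive summaries. The following is a minimal sketch of that comparison, assuming the pending items have already been extracted as lists of strings. The function name `scope_expansion_note` and its parameters are hypothetical.

```python
from typing import List, Optional


def scope_expansion_note(
    original_item_count: int,
    previous_pending: List[str],
    current_pending: List[str],
) -> Optional[str]:
    """Return the scope-expansion note from the prompt if the pending
    list grew since the previous summary, otherwise None."""
    if len(current_pending) <= len(previous_pending):
        return None
    return (
        f"NOTE: Scope has expanded. Original task had {original_item_count} items. "
        f"Current pending list has {len(current_pending)} items. "
        "Confirm whether expanded scope is intended."
    )


previous = ["Company B", "Company C", "Company D", "Company E"]
current = previous + ["Company A suppliers", "Company A investors"]
print(scope_expansion_note(5, previous, current))
```

Like the rule in the prompt, this compares list lengths only, so it would miss a case where one item is completed and one unauthorized item is added in the same interval. Comparing the item text as well would close that gap.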

Summary Interval	Best For	Notes
Every 5 actions	Complex tasks with many short steps	Without it, drift is detected late, after significant wasted work
Every 10 actions	Tasks with moderate-length steps	Good balance for most agent tasks
At phase transitions	Tasks with distinct phases (gather, analyze, produce)	Alignment checked at each phase boundary; drift within a phase still possible
Every 20+ actions	Highly structured tasks where phases are long	Significant drift possible before detection
Never	Not recommended for tasks over 10 actions	Drift, scope creep, and context saturation uncontrolled
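Whichever interval is chosen, the trigger itself is a small piece of logic that can live in the agent loop rather than in the model's judgment. The sketch below assumes the loop keeps a running action count and knows when a phase has changed. The name `summary_due` and its parameters are illustrative.

```python
def summary_due(action_count: int, interval: int = 5, phase_changed: bool = False) -> bool:
    """Decide whether the agent should produce a progress summary now.

    A summary is due at every phase transition and after every
    `interval` completed actions.
    """
    if interval <= 0:
        raise ValueError("interval must be a positive number of actions")
    if phase_changed:
        return True
    return action_count > 0 and action_count % interval == 0


# With the default interval, summaries fall after actions 5, 10, 15, ...
due_steps = [step for step in range(1, 16) if summary_due(step)]
print(due_steps)
```

Driving the trigger from the loop means a summary cannot be skipped simply because the model did not count its own steps correctly.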

Using Summaries for Context Compression

In systems with explicit context window management, progress summaries can be used to replace the raw action log. After each summary, the actions in the previous interval can be compressed or removed from the active context, with the summary serving as the canonical record of that period.

This requires a system-level implementation in your agent framework, but the prompt produces summaries in a format that supports this use case. The COMPLETED field is a dense record of what was done. The PENDING field is a prioritized task queue. Together, they contain enough information to resume the task from the summary point without needing the full action history.

In LangChain or similar frameworks, this pattern can be implemented by having the agent generate a summary every N steps, injecting the summary into a persistent memory store, and allowing the raw action buffer to be truncated once a summary is confirmed.
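The pattern can be sketched without any framework. The class below is a hypothetical illustration, not LangChain code: it keeps every summary in a list that stands in for the persistent store, and clears the raw action buffer each time a summary is committed. The names `AgentContext`, `record_action`, `commit_summary`, and `render` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentContext:
    """Hypothetical holder for what the agent sees on its next step."""
    summaries: List[str] = field(default_factory=list)    # every summary so far
    raw_actions: List[str] = field(default_factory=list)  # raw log since the last summary

    def record_action(self, entry: str) -> None:
        """Append one tool call or observation to the raw buffer."""
        self.raw_actions.append(entry)

    def commit_summary(self, summary: str) -> None:
        """Store a confirmed summary and drop the raw entries it covers."""
        self.summaries.append(summary)
        self.raw_actions.clear()

    def render(self) -> str:
        """Build the history text for the next model call: the latest
        summary followed by the raw actions recorded since it."""
        parts = self.summaries[-1:] + self.raw_actions
        return "\n".join(parts)


context = AgentContext()
for step in range(1, 6):
    context.record_action(f"action {step}: fetched a page")
context.commit_summary("---PROGRESS SUMMARY [Step 5]--- ... ---END SUMMARY---")
context.record_action("action 6: fetched a page")
print(context.render())
```

In a real system the `summaries` list would be written to durable storage, and truncation would happen only after the summary has been validated.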

Recovering a Stalled or Failed Agent Run

One underused application of progress summaries is partial recovery. When an agent run fails midway through a long task, the standard approach is to restart from the beginning. If the agent has generated progress summaries, you can restart from the last summary point instead.

To do this, inject the last summary into the system prompt or first user message of a new agent run:

You are resuming a task that was partially completed. The last progress summary was:

[PASTE SUMMARY]

Continue from where this left off. The items in PENDING are your remaining work. Do not redo work in COMPLETED.

This technique requires that summaries are stored outside the agent's context, which means your agent system must capture summary outputs explicitly. This is worth implementing for any task that takes more than a few minutes to run.
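One simple way to capture summaries outside the agent's context is to append each one to a file as it is produced. The sketch below assumes a JSON Lines file on local disk and reuses the resume wording shown above. The function names `save_summary` and `build_resume_prompt` are hypothetical, as is the file name in the usage lines.

```python
import json
from pathlib import Path

RESUME_TEMPLATE = (
    "You are resuming a task that was partially completed. "
    "The last progress summary was:\n\n{summary}\n\n"
    "Continue from where this left off. The items in PENDING are your "
    "remaining work. Do not redo work in COMPLETED."
)


def save_summary(path: Path, step: int, summary: str) -> None:
    """Append one summary as a single JSON line."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"step": step, "summary": summary}) + "\n")


def build_resume_prompt(path: Path) -> str:
    """Build the resume message from the most recent stored summary."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines:
        raise ValueError("no stored summaries to resume from")
    latest = json.loads(lines[-1])
    return RESUME_TEMPLATE.format(summary=latest["summary"])
```

Usage, assuming the agent loop calls `save_summary` every time a summary appears in the model output:

```python
store = Path("run_summaries.jsonl")
save_summary(store, 10, "---PROGRESS SUMMARY [Step 10]--- ... ---END SUMMARY---")
print(build_resume_prompt(store))
```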

Common Mistakes

The first mistake is treating the summary as optional. If the progress summary is listed as one of several things the agent should do, it will often skip it under time or token pressure. The protocol must specify that the summary is required at every interval, not recommended.

The second mistake is a PENDING list that is too vague. "Complete the analysis" is not a pending item. "Analyze company B's pricing structure" is a pending item. Vague pending lists do not catch scope creep and do not support recovery. Push for specific, checkable items.

The third mistake is not reading the summaries. In automated systems where summaries are generated but never reviewed, drift can persist uncorrected. If you are running agents in automated pipelines, build a summary reviewer step that pauses the agent for human review whenever the ALIGNMENT CHECK is No or Partially, or whenever a scope expansion note appears.
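A reviewer step of this kind can start as a few lines of text matching. The sketch below assumes the summary uses the field label and the scope note wording from the prompt in this article. The function name `review_flags` is hypothetical. It treats any alignment answer that does not begin with "Yes" as needing review, which is a deliberately conservative assumption.

```python
import re
from typing import List


def review_flags(summary_text: str) -> List[str]:
    """Return the reasons this summary should go to a human reviewer.

    An empty list means the agent may continue.
    """
    flags: List[str] = []

    # Read the answer that follows the ALIGNMENT CHECK label on its line.
    match = re.search(r"^ALIGNMENT CHECK:[ \t]*(.*)$", summary_text, re.MULTILINE)
    if match is None:
        flags.append("ALIGNMENT CHECK field missing")
    else:
        answer = match.group(1).strip()
        # Anything other than a leading "Yes" is escalated.
        if not re.match(r"yes\b", answer, re.IGNORECASE):
            flags.append(f"alignment not confirmed: {answer or 'empty'}")

    # The scope rule in the prompt asks the agent to emit this exact note.
    if "NOTE: Scope has expanded" in summary_text:
        flags.append("scope expansion reported")

    return flags


print(review_flags("ALIGNMENT CHECK: Partially. Too much detail on company A.\n"))
```

Failing closed in this way costs an occasional unnecessary review, but it avoids letting a malformed or evasive answer pass unnoticed.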

The discipline to stop at regular intervals and account for progress is as valuable in automated agents as it is in human project management. It is not overhead. It is how complex tasks stay on track.

For controlling agent behavior at the constraint level, see Prompt to Define What an Agent Must Never Do. For preventing runaway token usage on long tasks, see Prompt to Reduce Token Usage Without Losing Quality.

Frequently Asked Questions

How often should an agent summarize its progress?

Every 5-10 actions is a good starting point for most agents. For tasks with clear phases (research, analysis, output), summarize at phase transitions. For tasks with no natural phases, use a fixed step count. The summary should not be so frequent that it becomes noise, nor so infrequent that drift goes undetected.

What should the progress summary contain?

At minimum, three things: what has been completed and confirmed, what is still pending, and whether the current trajectory is still aligned with the original goal. The third element is the most important and the most often omitted.

Can these summaries be used to resume a failed agent run?

Yes. A well-structured progress summary contains enough state information to restart an agent mid-task without starting over. This is especially valuable for expensive tasks where a failure partway through would otherwise require a full restart.

Does self-summarization help with context window limits?

Yes, directly. When an agent summarizes its own progress, the summary can replace detailed action logs in the context window, compressing prior steps into dense, structured state. This is a standard technique for extending effective context in long-running agents.

What is agent drift and why does progress summarization prevent it?

Agent drift is when an agent gradually shifts focus away from the original objective while pursuing sub-tasks. The agent is not failing, but each step takes it slightly further from the goal. Progress summaries force explicit comparison between current state and original objective, making drift visible before it becomes significant.
