Today, we’re introducing a new review step inside the IndexerLabs subject indexing pipeline: Checkpoints.
Checkpoints are built around a simple idea: AI book indexing has two different problems.
The first problem is knowing what to index.
The second problem is indexing it correctly.
These two problems are related, but they are not the same task.
A great deal of discussion around AI book indexing, automated book indexing, and automatic subject indexing treats the process as a single generation problem. A manuscript is handed to a model, the model is asked to produce a finished back-of-book index, and the system is judged entirely by the final output.
We think that framing misses something important.
A serious AI book indexing tool has to separate editorial judgment from mechanical extraction. It has to decide what belongs in the index before it can reliably index those subjects across the manuscript.
Checkpoints give authors, editors, and publishers a way to review the candidate term list at exactly that moment: after the system has identified likely index entries, but before full-scale locator extraction begins.
Automated book indexing has two parts
A professional back-of-book index is a selective editorial structure.
Building one means deciding which topics deserve entries, which ideas can be omitted, how headings should be phrased, which discussions should be grouped together, and how much space to allocate to different parts of the book.
That is the first problem in automated book indexing: knowing what to index.
Once those decisions have been made, the second problem becomes more technical. The system has to find the right passages, assign the right locators, preserve structural relationships, verify the evidence, and deliver the result in a format that remains reliable in production.
That is the second problem: indexing it correctly.
We believe that a strong AI book indexing tool has to solve both problems, and that the workflow is much better when those problems are handled as separate stages.
The hard part of AI book indexing is knowing what to index
In our earlier benchmark work on indexing the same book 120 times, we argued that subject indexing is shaped by judgments about significance, wording, structure, emphasis, and space.
Many automated book indexing systems still treat indexing as though it were mainly an extraction problem. In practice, extraction is only one part of the workflow. The harder challenge is editorial judgment.
A system can generate thousands of plausible terms and still produce a weak index if it does not know what to keep, what to remove, how to compress an overgenerated candidate pool, and how to preserve the topical core of the book under realistic space constraints.
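To make that compression step concrete, here is a minimal sketch of selecting candidates under a space budget. The `Candidate` fields, the significance scores, and the greedy strategy are illustrative assumptions, not how IndexLM-1.0 actually scores or selects entries.

```python
# Illustrative sketch only: the fields and the greedy-by-significance
# strategy are hypothetical, not the IndexLM-1.0 selection model.
from dataclasses import dataclass

@dataclass
class Candidate:
    heading: str
    significance: float  # hypothetical editorial-importance score in [0, 1]
    est_lines: int       # estimated space the entry occupies in the index

def compress_pool(candidates: list[Candidate], budget_lines: int) -> list[Candidate]:
    """Keep the most significant entries that fit within a line budget."""
    kept, used = [], 0
    for cand in sorted(candidates, key=lambda c: c.significance, reverse=True):
        if used + cand.est_lines <= budget_lines:
            kept.append(cand)
            used += cand.est_lines
    return kept
```

Even this toy version shows the failure mode the text describes: with a tight budget, a system that ranks by frequency instead of significance would keep the wrong entries.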
That is why our work on IndexLM-1.0 has focused so heavily on knowing what to index.
We believe that a strong AI book indexing tool can learn editorial judgment from real back-of-book indexes. Indexing has a kind of professional taste: the ability to recognize which topics matter, which phrasings are useful to readers, which entries are redundant, and which terms should survive when the draft has to be reduced to a realistic length.
This is difficult to achieve with prompting alone. Keyword frequency is not enough either. A name, place, or concept can appear many times and still be unimportant to the final index. Another idea may appear only briefly but still deserve an entry because it is central to the argument of the book.
As we showed in our benchmark work, the central question is not simply whether a model can generate many plausible index terms. The more important question is whether it can preserve the kinds of topics that a professional human indexer would judge worth retaining once the index has to fit within a realistic budget.
We believe this is one of the central problems in AI book indexing, and we believe our custom index model has already made substantial progress toward solving it.
The second problem is indexing it correctly
Once a system knows what belongs in the index, the remaining task becomes much more concrete.
At that point, an AI book indexing tool has to extract locators accurately, verify the evidence behind those locators, and deliver the finished index in a format that works inside real publishing workflows.
A large part of our recent work has focused on this second half of automated book indexing.
We care deeply about accuracy and verifiability. Our Quick Check workflow was designed to make locator review faster, more transparent, and more useful for human reviewers. Instead of asking a reviewer to inspect an opaque final index, the system surfaces evidence for each proposed locator so that entries can be accepted, rejected, or corrected efficiently.
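As an illustration of what "surfacing evidence" can look like in code, here is a minimal sketch of an evidence-backed review record: each proposed locator carries the passage that supports it, plus a review decision. The names and fields are hypothetical, not the actual Quick Check data model.

```python
# Hypothetical review record; names are illustrative, not the Quick Check API.
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"

@dataclass
class LocatorEvidence:
    heading: str
    locator: str        # e.g. a page or section reference
    passage: str        # supporting text surfaced for the reviewer
    decision: Decision = Decision.PENDING

def review_summary(items: list[LocatorEvidence]) -> dict:
    """Count decisions so a reviewer can see progress at a glance."""
    counts = {d: 0 for d in Decision}
    for item in items:
        counts[item.decision] += 1
    return counts
```

The point of a structure like this is that accepting or rejecting an entry becomes a judgment about visible evidence, not about an opaque final index.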
We have also approached the same problem from the delivery side.
In our recent post on embedded index support for Word .docx documents, we explained why embedded Word indexing is so useful for AI book indexing.
One of the hardest practical problems in automatic book indexing is page-number reliability at the final stage of production. Even when a model identifies the right passages, static page numbers can drift if the document layout changes.
Embedded DOCX indexing gives us a much better solution.
Instead of asking AI to guess the final page numbers that will appear in the delivered Word index, we can use AI to determine what should be indexed and where the index markers belong. Those markers can then be inserted directly into the original .docx file. After that, Microsoft Word regenerates the final page numbers from the real layout of the document itself.
That means the final delivered page numbers are grounded in the actual Word document, rather than in a model’s prediction.
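For readers curious what an embedded index marker looks like under the hood, here is a sketch of the WordprocessingML that a Word XE (index entry) field consists of, built with the standard library. This illustrates the OOXML field structure only; it is not our production insertion code, which has to splice runs like these into the `document.xml` part of the `.docx` package.

```python
# Sketch of the run triplet (begin / instrText / end) that encodes a Word
# XE field in WordprocessingML, per the OOXML spec. Illustrative only.
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("w", W)

def xe_field_runs(term: str) -> list[ET.Element]:
    """Build the three runs that make up an XE field for `term`."""
    def run(child_tag: str):
        r = ET.Element(f"{{{W}}}r")
        child = ET.SubElement(r, f"{{{W}}}{child_tag}")
        return r, child

    begin, fld_begin = run("fldChar")
    fld_begin.set(f"{{{W}}}fldCharType", "begin")

    instr_run, instr = run("instrText")
    instr.set(f"{{{XML_NS}}}space", "preserve")
    instr.text = f' XE "{term}" '

    end, fld_end = run("fldChar")
    fld_end.set(f"{{{W}}}fldCharType", "end")
    return [begin, instr_run, end]
```

Because the field carries only the term, not a page number, Word computes the locator from the document's real layout whenever the index is regenerated.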
For us, that is a major part of solving the second half of AI book indexing. Once the system knows what to index, indexing it correctly should be grounded in verification, evidence, and production-safe delivery.
So where does human review belong?
This is where Checkpoints come in.
We think the most useful place for human intervention is between the two stages: after candidate term generation, but before full extraction begins.
At that stage, the candidate pool already exists. The system has already done the discovery work. It has proposed the headings that it believes belong in the index.
The expensive extraction stage has not yet run across the full candidate list.
That gives the reviewer a high-leverage moment to improve the index before the rest of the pipeline commits to a final set of entries.
The reviewer can:
- remove weak or unnecessary headings
- add missing terms
- adjust phrasing
- narrow or broaden the scope of the draft
- hand the downstream extraction stage a cleaner list to work from
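The overall flow can be sketched as a pipeline with a review gate between the two stages. Everything below is illustrative: the function names and the toy discovery and extraction stubs are assumptions for the sketch, not the IndexerLabs API.

```python
# Illustrative pipeline with a Checkpoint between discovery and extraction.
# The stubs stand in for model-driven stages; names are hypothetical.

def generate_candidates(manuscript: str) -> list[str]:
    # Stub: a real system proposes headings with a model such as IndexLM-1.0.
    return sorted({w.strip(".,").lower() for w in manuscript.split() if len(w) > 6})

def extract_locators(manuscript: str, approved: list[str]) -> dict[str, list[int]]:
    # Stub: locate each approved heading; real extraction verifies evidence.
    paragraphs = manuscript.split("\n\n")
    return {h: [i + 1 for i, p in enumerate(paragraphs) if h in p.lower()]
            for h in approved}

def run_pipeline(manuscript: str, review) -> dict[str, list[int]]:
    candidates = generate_candidates(manuscript)   # stage 1: what to index
    approved = review(candidates)                  # Checkpoint: human edits the list
    return extract_locators(manuscript, approved)  # stage 2: index it correctly
```

The key design property is that `review` runs before extraction, so edits to the candidate list change what the expensive stage is asked to do, not what it has already done.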
This is very different from cleaning up a finished AI output after the fact.
With Checkpoints, the reviewer intervenes while editorial judgment still has the greatest effect on the final index. The system proposes a candidate structure, the human can refine that structure, and the downstream pipeline then performs the extraction, verification, and delivery work against the approved list.
A better workflow for an AI book indexing tool
We think this creates a better workflow for automated book indexing.
A weak automated book indexing workflow forces the human to repair a bloated final draft after the most important decisions have already been made.
A stronger AI book indexing tool lets the human intervene earlier, while the entry list is still editable and before locator extraction has begun. This is human-in-the-loop indexing at its best: the reviewer is placed where their judgment has the greatest leverage over the final index.
That is what Checkpoints are designed to do.
Once the reviewed list is approved, the pipeline continues using that revised set of entries. The system then performs the extraction, verification, and delivery work needed to turn that editorially reviewed candidate list into a finished back-of-book index.
This means the human and the system are not doing the same job.
The model proposes what should be indexed.
The reviewer can correct, add, remove, and reshape the candidate list.
The extraction and delivery system then handles the mechanical problem of indexing that material correctly.
AI book indexing should separate judgment from extraction
We think this is a more realistic way to build an AI book indexing tool.
The editorial problem and the extraction problem are different enough that they should be handled as different stages in the workflow.
The first stage is about significance, omission, phrasing, and selection.
The second stage is about locators, verification, and reliable output.
A strong automatic book indexing system should be able to do both.
That is the direction we have been building toward across the platform: better editorial judgment through IndexLM-1.0, stronger locator verification through structured extraction, production-safe delivery through embedded DOCX support, and cleaner manual oversight through features like Quick Check and Checkpoints.
A good AI book indexing tool must know what to index.
Then it must index it correctly.
Checkpoints sit directly between those two problems, and we think they make the entire automated book indexing workflow stronger.