Reconciling conflicting meeting notes using AI document comparison
Three sets of notes from the same meeting will agree on most of it and disagree on the parts that matter most: who owns what, what was decided, what the dates are. If you ask AI to ‘merge these notes,’ it blends the disagreements into smooth prose that hides them. The right move is to ask the model to compare, not combine, and to surface conflicts as a separate column so you can resolve each one yourself.
Why 'merge these for me' produces dangerous output
Blending is the default, and blending erases information.
A merge prompt treats agreement and disagreement as the same kind of input. If two notes say the meeting decided Tuesday and one says Wednesday, a polite blend writes ‘the team discussed Tuesday or Wednesday.’ That sentence sounds reasonable and is actively wrong, because there was a decision and the AI just hid it. Worse, you lose the signal that the notes disagreed at all, which is exactly the information you needed to chase down.
Comparison is a different shape. The model reads each source, identifies what’s shared and what’s unique, and lists conflicts as conflicts. The output isn’t a polished memo; it’s a working document you can act on. The polished memo comes later, after you’ve decided whose version of Tuesday-or-Wednesday is correct.
The conflict-resolution prompt map
Three prompts in sequence beat one big merge.
Figure 1. Extract, compare, resolve, in that order [PASS 1] EXTRACT (separately, per source) For each set of notes, pull a clean list of: • decisions made • action items (owner, task, date if stated) • open questions • risks or blockers raised Output: one structured list per source. │ ▼ [PASS 2] COMPARE (across sources) Take the three structured lists and produce: • AGREED: items that appear in all sources • PARTIAL: items in some sources but not others • CONFLICT: items where sources disagree on a fact (owner, date, decision, number) Each item carries the source(s) it came from. │ ▼ [PASS 3] RESOLVE (with human in the loop) You resolve each conflict by deciding which source to trust. The AI’s only job here is to rewrite the final memo from your resolved list, NOT to guess the resolution itself. |
The reason this works is that the model never has to choose between conflicting versions. It only has to find them and surface them. The choosing is yours, which is correct, because only you know which note-taker was actually in the meeting for the part that’s in question.
The extraction prompt
Run this once per source, separately, before anything else.
Pass 1 prompt, run once per source Read the meeting notes below. Extract a structured list of: decisions, action items, open questions, and risks.
Rules: – Use only what’s in the notes. Invent nothing. – For each action, capture: owner (name), task, due date IF stated, source line. If a field is missing, write null. Do not guess. – For each decision, quote the exact decision phrasing. – Tag every item with this source label: SOURCE_A.
Output JSON only.
Notes: “”” [notes from one attendee] “”” |
Running this per source, separately, is the part most people skip. Pasting all three sources into one prompt and asking for a combined list is faster and produces a worse output, because the model starts blending before the comparison ever happens. Keep the sources isolated until Pass 2.
The compare prompt
Now the model gets all three structured lists at once and runs the comparison.
Pass 2: compare, never merge Below are three structured lists, one from each attendee of the same meeting. Compare them and output:
AGREED: items present in all three sources (consensus) PARTIAL: items in 1 or 2 sources but not all (might be a real point the others missed) CONFLICT: items where sources contradict each other on a specific fact: owner, date, number, or the decision itself. Show each version with its source label.
Rules: – Do not pick a winner. Show every version. – Two items are the same if they refer to the same action, decision, or topic, even if worded differently. – Quote the source phrasing in conflicts so the reviewer can judge which to trust.
Output JSON only, with three top-level sections.
Lists: “”” [paste the JSON from Pass 1, all three sources] “”” |
The ‘do not pick a winner’ line is the one that earns its keep. Without it, the model will quietly choose one version, usually the most plausible-sounding one, and the conflict disappears into a confident merge. With it, every disagreement lands on your desk for a one-second decision.
A worked example you can picture
Pull it out of the abstract for a moment.
Three people take notes in a 30-minute planning meeting. Source A writes ‘Maria will draft the launch email by Friday.’ Source B writes ‘Maria to draft launch comms, deadline TBD.’ Source C writes ‘Raj is sending the launch announcement next week.’ A merge prompt blends these into a polite sentence that hides three problems: the owner is contested (Maria vs Raj), the artifact differs (email vs announcement), and the date is a guess from one source and absent from two. The comparison prompt instead returns one conflict block showing all three versions, source-labeled, which lets you ping the room and ask who’s right. The whole resolution takes three minutes; the merge prompt would have hidden the issue until the deadline.
What goes in the final memo
Once you’ve resolved the conflicts, the polished memo writes itself.
- Agreed items go in unchanged. These are facts every source confirmed.
- Partial items get a one-line note about which source raised them, since they may need a second confirmation.
- Resolved conflicts go in with the winning version, and a small footnote on what was contested if the conflict was material.
- Unresolved conflicts go in a separate ‘needs clarification’ section, with a name attached for follow-up. Don’t bury them in the body.
The memo is shorter than a free-merge one and far more useful, because the reader can trust every sentence. The model wrote the prose; you owned the decisions.
Common reconciliation mistakes
Three patterns produce most of the rework.
- Asking for a merge in one prompt. The conflicts hide; you ship a memo with quiet wrong facts.
- Letting the model resolve conflicts. The model picks based on which version sounds more confident, not which is true. That’s worse than no resolution.
- Treating the AI’s output as the final memo. The extraction, compare, and resolve passes are working steps. The polished memo is a separate pass once you’ve made the calls.
What 'conflict' actually means here
Not every disagreement is a conflict worth raising.
Two notes saying ‘we decided to ship Tuesday’ and ‘team agreed to launch on Tuesday’ aren’t in conflict; they’re saying the same thing with different words. The compare prompt should fold those into the agreed bucket. Real conflicts are factual disagreements: different owner, different date, different decision, different number. Setting that line clearly in the prompt prevents the compare output from drowning you in cosmetic differences that don’t matter. Tell the model to ignore wording variation and to flag only contradictions on the specific fields, the names, dates, decisions, and numbers, that you’d act on differently depending on which version was right.
Where comparison falls short
The model can find disagreements; it can’t tell which note-taker was paying attention.
A confident note that’s wrong looks identical to a confident note that’s right. The pipeline surfaces conflicts; it doesn’t tell you whose version of an action item to trust. That’s a judgement call, and it usually depends on who was in the room for that specific decision. For high-stakes meetings, the right pattern is to circulate the resolved memo to all attendees for a final check, with the resolved conflicts called out so they have a chance to push back. The pipeline saves you the conflict-finding step, not the human verification.
Questions people actually ask
Can I just have the AI compare to an audio transcript?
If you have a transcript, that’s a better source than the notes, since it’s a record of what was said. Treat the transcript as a fourth source in the compare pass, and let it win on factual disagreements about who said what. It can’t tell you what was decided when the room nodded silently; it can tell you which name was actually spoken.
What if one set of notes is much more detailed than the others?
Run the extraction prompt with the same rules across all sources, so detail differences become ‘partial’ items rather than dominating the comparison. Don’t weight the longer source higher just because it has more text; long notes can carry more noise as well as more signal.
Should I tell the AI which source to trust?
Only if you’ve decided in advance that one source is the official one (a designated note-taker, for instance). Even then, the comparison pass should still surface conflicts; the trust hierarchy applies at the resolution step, not at the comparison step. Otherwise you’ve quietly turned this back into a biased merge.
How long should each source be?
Whatever the note-takers actually wrote. The pipeline doesn’t care about length; it cares about coverage. A two-line jotting and a two-page set of notes can still be compared cleanly if the structured extraction pulls the same fields from each.
Can I run this on chat logs instead of notes?
Yes, and it’s often even cleaner than handwritten notes because the chat is verbatim. The same three-pass shape applies: extract a structured list from the chat log, compare against any other accounts of the meeting, resolve conflicts yourself. Chat logs win on exact wording and lose on context that wasn’t typed, so they’re a strong source on what was said and a weak one on what was meant. Combine them with at least one human’s notes when you can.
What about meetings with no notes at all, just memory?
Have each attendee write what they remember as if they were taking notes after the fact, then run the pipeline as normal. Memory-based notes carry more risk of contradiction, which is exactly the case where comparison earns its biggest gain over a merge.
Run the three passes on the next contested meeting
Next time you’re handed two or three sets of notes from the same meeting, resist asking the AI to merge them. Run the extraction prompt on each source separately. Run the compare prompt on the structured outputs. Resolve the conflicts yourself, then ask the model to write the final memo from your resolved list. The whole exercise takes longer than a one-shot merge and produces a memo your readers can actually trust, which is the point of having notes at all.
