Building Discharge Summary Agent
Last Updated on June 14, 2026 by Editorial Team
Author(s): Paramveer Singh
Originally published on Towards AI.
Problem statement
We are given a patient's folder containing different kinds of PDF documents that record what happened during their hospital admission. The task is to summarize that record and state what had actually changed by the time the patient was discharged.
Why do we even need an agent for this?
You might be thinking: why do we need an agent? Can't we just hand the folder to any LLM and let it do the job? No. There are several problems. First, an LLM does not understand document structure on its own, and the data may be lab reports, tables, handwritten notes, and so on. Given raw input like this, an LLM is prone to hallucinating and mixing things up, so we need to provide it with carefully refined information and let it reason in a tightly controlled fashion.
Safety Principles
Our agent must follow the rules below. The data passes through the agent first, and only a refined version is passed on to the LLM.
- never guess
- never fabricate (never create anything by self)
- unknown > wrong
- if data not available -> say not available
- conflicts -> flag, don’t decide
- provenance for every fact (every fact must point to the source text it came from)

Ingestion
Our agent takes each PDF and extracts structured text from it.
We go through the document page by page and store each page's text where it can be extracted directly; otherwise we flag the page for OCR extraction.
We store the text in the following format:
{
page_number,
text,
has_text,
ocr_applied
}
Data in the folder could be of two types:-
— digitally typed
- It's easy to extract text from these. We can use PDF parsers like PyMuPDF or Docling for this use case.
— handwritten notes
- This is the harder case in hospitals, as doctors' handwriting is often hard even for humans to read. For these pages we use OCR (optical character recognition). There are various tools, such as TrOCR or Tesseract; I am using Tesseract as it's free and open source. OCR takes each such page, extracts the text from it, and stores it.
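The page-by-page ingestion with an OCR fallback could be sketched roughly as follows. This is a minimal illustration, not the repository's code: `PageRecord`, `ingest_pages`, the `min_chars` threshold, and the `ocr_page` callback are all hypothetical names. The parser and OCR engine are passed in from outside, so the sketch runs without any PDF library installed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PageRecord:
    page_number: int   # 1-based page index
    text: str          # embedded text, or OCR output if none was found
    has_text: bool     # True if the PDF had a usable embedded text layer
    ocr_applied: bool  # True if the text came from OCR


def ingest_pages(
    embedded_texts: List[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[PageRecord]:
    """Build one PageRecord per page, falling back to OCR for near-empty pages.

    embedded_texts[i] is whatever the PDF parser returned for page i
    (for example PyMuPDF's page.get_text()). ocr_page(i) is a hypothetical
    callback that rasterises page i and runs OCR on it (for example with
    pytesseract.image_to_string). min_chars is an assumed threshold.
    """
    records = []
    for index, raw in enumerate(embedded_texts):
        text = raw.strip()
        has_text = len(text) >= min_chars
        ocr_applied = False
        if not has_text:
            # No usable text layer: treat the page as scanned or handwritten.
            text = ocr_page(index).strip()
            ocr_applied = True
        records.append(PageRecord(index + 1, text, has_text, ocr_applied))
    return records


def fake_ocr(index: int) -> str:
    """Stand-in for a real OCR call, used only for this demo."""
    return "BP 120/80, continue metformin"


pages = ingest_pages(
    ["Discharge diagnosis: type 2 diabetes mellitus.", "   "],
    ocr_page=fake_ocr,
)
for page in pages:
    print(page)
```

Keeping `has_text` and `ocr_applied` as separate flags means later steps can treat OCR output with more suspicion than typed text.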
Evidence Extraction
After ingestion we have an array of page records, each holding a long string of text. We pass that text to the LLM to generate our initial state in a structured format:
{
"diagnoses": [{"fact": "...", "source_text": "..."}],
"medications": [{"name": "...", "dosage": "...", "frequency": "...", "route": "...", "status": "...", "source_text": "..."}],
"allergies": [{"fact": "...", "source_text": "..."}],
"procedures": [{"fact": "...", "source_text": "..."}],
"pending_results": [{"fact": "...", "source_text": "..."}]
}
Gemini, or whichever model we use, can process the text page by page or in batches, however we choose to feed it, and returns JSON in the format above. Below is a template prompt; it can be changed to suit your preferences and how strict you want the guardrails to be.
You are an expert clinical data extractor prioritizing safety over completeness.
Unknown > Wrong. Never fabricate, never infer, never guess.
Extract the following clinical entities from the text provided:
- Diagnoses
- Medications (with dosage, frequency, route, and status like Admission/Discharge/Discontinued)
- Allergies
- Procedures
- Pending Results
For each entity, you MUST extract the exact "source_text" snippet from the text that proves it.
If a field or entire category is missing, use "Not Documented".
Respond strictly in JSON format with the following exact structure:
{{
"diagnoses": [{{"fact": "...", "source_text": "..."}}],
"medications": [{{"name": "...", "dosage": "...", "frequency": "...", "route": "...", "status": "...", "source_text": "..."}}],
"allergies": [{{"fact": "...", "source_text": "..."}}],
"procedures": [{{"fact": "...", "source_text": "..."}}],
"pending_results": [{{"fact": "...", "source_text": "..."}}]
}}
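Because the prompt demands a verbatim "source_text" for every entity, the model's reply can be checked mechanically before anything enters the state. The sketch below is one possible way to do that, not the author's implementation; `check_extraction` and `_squash` are hypothetical names, and the whitespace normalisation is an assumption made to tolerate line breaks in extracted text.

```python
import json
from typing import Dict, List, Tuple

CATEGORIES = ("diagnoses", "medications", "allergies", "procedures", "pending_results")


def _squash(text: str) -> str:
    """Collapse runs of whitespace so line breaks do not defeat the match."""
    return " ".join(str(text).split())


def check_extraction(raw_json: str, page_text: str) -> Tuple[Dict[str, list], List[str]]:
    """Parse the model's JSON and keep only entities whose source_text
    appears verbatim in page_text.

    Returns (accepted, flags). Nothing is repaired or guessed: anything
    that fails a check is left out of `accepted` and described in `flags`.
    """
    accepted: Dict[str, list] = {name: [] for name in CATEGORIES}
    flags: List[str] = []
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return accepted, [f"invalid JSON from model: {exc.msg}"]
    if not isinstance(data, dict):
        return accepted, ["model output is not a JSON object"]

    page = _squash(page_text)
    for name in CATEGORIES:
        items = data.get(name)
        if items == "Not Documented":
            continue  # the model reported nothing for this category
        if not isinstance(items, list):
            flags.append(f"{name}: missing or not a list")
            continue
        for item in items:
            snippet = _squash(item.get("source_text", "")) if isinstance(item, dict) else ""
            if snippet and snippet in page:
                accepted[name].append(item)
            else:
                flags.append(f"{name}: source_text not found in page: {item!r}")
    return accepted, flags


page = "Dx: community acquired pneumonia.\nAllergy: penicillin (rash)."
reply = json.dumps({
    "diagnoses": [{"fact": "Pneumonia", "source_text": "community acquired pneumonia"}],
    "medications": [],
    "allergies": [{"fact": "Sulfa allergy", "source_text": "allergic to sulfa"}],
    "procedures": [],
    "pending_results": [],
})
accepted, flags = check_extraction(reply, page)
print(accepted["diagnoses"])
print(flags)
```

In this example the sulfa allergy is rejected because its quoted evidence does not occur on the page, which is the "never fabricate" rule enforced in code rather than only in the prompt.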
We take the output and push it into our state variable, which gives us the initial state. Next we aggregate the per-page results, grouping like with like; for example, all the diagnoses go into the same list. Suppose page 4 says diabetes and page 15 says hypertension. A human reader might wonder how one person can have both, and keep only the first or the latest one. We are not going to make that mistake: we simply record both. At this stage we only collect and bundle things, without much reasoning. Along with this, we also track and store conflicts like the one I just mentioned, flags for review, missing fields, trace logs, and the step count of the agent loop. The final state looks something like this:
{
"patient_info": {},
"diagnoses": [],
"procedures": [],
"medications": {
"admission": [],
"discharge": []
},
"allergies": [],
"follow_up": [],
"pending_results": [],
"hospital_course": [],
"conflicts": [],
"missing_fields": [],
"flags_for_review": [],
"evidence": [],
"trace": [],
"step_count": 0
}
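The "collect, don't decide" aggregation could look roughly like the sketch below. `new_state` and `merge_page` are hypothetical names, and sending medications whose status is neither Admission nor Discharge to `flags_for_review` is an assumption of this sketch, since the state above has no slot for them.

```python
import copy

EMPTY_STATE = {
    "patient_info": {},
    "diagnoses": [],
    "procedures": [],
    "medications": {"admission": [], "discharge": []},
    "allergies": [],
    "follow_up": [],
    "pending_results": [],
    "hospital_course": [],
    "conflicts": [],
    "missing_fields": [],
    "flags_for_review": [],
    "evidence": [],
    "trace": [],
    "step_count": 0,
}


def new_state() -> dict:
    """Return a fresh state; deepcopy keeps the nested lists independent."""
    return copy.deepcopy(EMPTY_STATE)


def merge_page(state: dict, page_number: int, extracted: dict) -> dict:
    """Append one page's entities to the state without deduplicating or choosing."""
    for category in ("diagnoses", "allergies", "procedures", "pending_results"):
        for item in extracted.get(category, []):
            state[category].append({**item, "page": page_number})

    for med in extracted.get("medications", []):
        entry = {**med, "page": page_number}
        status = str(med.get("status", "")).strip().lower()
        if status in ("admission", "discharge"):
            state["medications"][status].append(entry)
        else:
            # Assumption: any other status (e.g. Discontinued) goes to a human.
            state["flags_for_review"].append(
                {"reason": "medication status is not admission/discharge", "item": entry}
            )

    state["trace"].append(f"merged page {page_number}")
    return state


state = new_state()
merge_page(state, 4, {"diagnoses": [{"fact": "Diabetes", "source_text": "h/o diabetes"}]})
merge_page(state, 15, {"diagnoses": [{"fact": "Hypertension", "source_text": "known hypertensive"}]})
print([d["fact"] for d in state["diagnoses"]])
```

Both diagnoses end up in the list, each tagged with its page number, so any later decision about them can be traced back to the evidence.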
Agent Loop
Once we have our initial state, we start our loop.
— Reconciliation
- We check what changed between admission and discharge, for example which medications were changed, added, or stopped. If we find a difference whose reason is not documented, we store it in the state variables so that it is sent for review.
— Conflict Detection
- If we find any conflicts, as in the case of multiple diagnoses, we mark them for review (escalate).
— validate
- If some fields are missing or a source does not exist, we mark it for review.
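As an illustration of the reconciliation step, the sketch below diffs the admission and discharge medication lists. `reconcile_medications` is a hypothetical name, and matching on the lower-cased drug name is a simplifying assumption: it will not link a brand name to its generic, for instance.

```python
from typing import Dict, List


def _key(med: Dict[str, str]) -> str:
    """Normalise a medication name for comparison (assumption: name-only match)."""
    return med.get("name", "").strip().lower()


def reconcile_medications(admission: List[dict], discharge: List[dict]) -> List[dict]:
    """List what was added, stopped, or modified between admission and discharge.

    The function only reports differences; it does not judge whether a
    change was justified. That decision is left to review.
    """
    before = {_key(m): m for m in admission}
    after = {_key(m): m for m in discharge}
    changes = []
    for name in sorted(before.keys() | after.keys()):
        old, new = before.get(name), after.get(name)
        if old is None:
            changes.append({"name": name, "change": "added"})
        elif new is None:
            changes.append({"name": name, "change": "stopped"})
        else:
            differing = [f for f in ("dosage", "frequency", "route") if old.get(f) != new.get(f)]
            if differing:
                changes.append({"name": name, "change": "modified", "fields": differing})
    return changes


admission = [
    {"name": "Metformin", "dosage": "500 mg", "frequency": "BID", "route": "PO"},
    {"name": "Heparin", "dosage": "5000 units", "frequency": "TID", "route": "SC"},
]
discharge = [
    {"name": "metformin", "dosage": "1000 mg", "frequency": "BID", "route": "PO"},
    {"name": "Aspirin", "dosage": "75 mg", "frequency": "OD", "route": "PO"},
]
for change in reconcile_medications(admission, discharge):
    print(change)
```

Each reported change would then be stored in the state for review unless the record documents a reason for it.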
We keep running this loop for a finite number of steps or until the complete state is validated.
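A bounded loop of this kind could be sketched as below. The names `run_agent_loop` and `validate_sources`, the `max_steps` default, and the convention that each step returns True when it changed the state are all assumptions of this sketch rather than details from the repository.

```python
from typing import Callable, List

Step = Callable[[dict], bool]  # a step returns True if it changed the state


def run_agent_loop(state: dict, steps: List[Step], max_steps: int = 10) -> dict:
    """Run all steps repeatedly until a full pass changes nothing,
    or until max_steps passes have been made."""
    while state["step_count"] < max_steps:
        state["step_count"] += 1
        changed = False
        for step in steps:
            if step(state):
                changed = True
                state["trace"].append(f"pass {state['step_count']}: {step.__name__} updated state")
        if not changed:
            state["trace"].append(f"pass {state['step_count']}: stable, stopping")
            return state
    # The budget ran out before the state settled: escalate instead of guessing.
    state["flags_for_review"].append("step limit reached before state stabilised")
    return state


def validate_sources(state: dict) -> bool:
    """Example step: flag each diagnosis that has no source_text, once."""
    changed = False
    for item in state["diagnoses"]:
        if not item.get("source_text") and item not in state["flags_for_review"]:
            state["flags_for_review"].append(item)
            changed = True
    return changed


state = {
    "diagnoses": [{"fact": "Diabetes", "source_text": ""}],
    "flags_for_review": [],
    "trace": [],
    "step_count": 0,
}
run_agent_loop(state, [validate_sources], max_steps=5)
print(state["step_count"], state["flags_for_review"])
```

The hard step limit guarantees termination even if a step keeps reporting changes, and hitting the limit is itself recorded as something for a human to look at.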
LLM
Finally, we pass our validated state, after all the review, to the LLM and let it generate the output. Below is the prompt template with guardrails:
You are an expert clinical documentation assistant.
Write a clear, structured Markdown Discharge Summary based ONLY on the provided clinical evidence.
CRITICAL RULES:
- NEVER infer, guess, or fabricate clinical information.
- If a field or category is missing or empty, explicitly state "Not Documented".
- Organize into standard clinical headings: Diagnoses, Medications, Allergies, Procedures, Pending Results.
- Do not invent patient names, dates, or hospital locations unless they are strictly provided in the text.
Provided Evidence:
{state_text}
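Filling the `{state_text}` placeholder might look like the sketch below, which serialises the clinical part of the state as JSON and replaces empty fields with "Not Documented" before the model ever sees them. `build_summary_prompt` and `_or_not_documented` are hypothetical names, and the choice of which state keys to include is an assumption based on the headings the prompt asks for.

```python
import json

SUMMARY_PROMPT = """You are an expert clinical documentation assistant.
Write a clear, structured Markdown Discharge Summary based ONLY on the provided clinical evidence.
CRITICAL RULES:
- NEVER infer, guess, or fabricate clinical information.
- If a field or category is missing or empty, explicitly state "Not Documented".
- Organize into standard clinical headings: Diagnoses, Medications, Allergies, Procedures, Pending Results.
- Do not invent patient names, dates, or hospital locations unless they are strictly provided in the text.
Provided Evidence:
{state_text}"""

CLINICAL_KEYS = ("diagnoses", "medications", "allergies", "procedures", "pending_results")


def _or_not_documented(value):
    """Replace empty values with an explicit marker, recursing into dicts."""
    if isinstance(value, dict):
        return {key: _or_not_documented(inner) for key, inner in value.items()}
    return value if value else "Not Documented"


def build_summary_prompt(state: dict) -> str:
    """Render the final prompt from the validated state."""
    clinical = {key: _or_not_documented(state.get(key)) for key in CLINICAL_KEYS}
    state_text = json.dumps(clinical, indent=2, ensure_ascii=False)
    return SUMMARY_PROMPT.format(state_text=state_text)


prompt = build_summary_prompt({
    "diagnoses": [{"fact": "Pneumonia", "source_text": "Dx: pneumonia"}],
    "medications": {"admission": [], "discharge": []},
    "allergies": [],
})
print(prompt)
```

Writing the marker into the evidence itself means the model is shown an explicit "Not Documented" rather than an empty list it might be tempted to fill in.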
Architecture Design

Implementation — https://github.com/paramcodes/junior-doctor
Thanks for making it this far.
Published via Towards AI
Note: Article content contains the views of the contributing authors and not Towards AI.