Unlock the insights behind Spencer Krichevsky's work in Methods of Information in Medicine and discover its impact on modern medical data analysis.
If you’ve ever wondered how artificial intelligence extracts meaningful patterns from complex medical reports, the study by Spencer Krichevsky in Methods of Information in Medicine is a great example of how informatics is reshaping clinical decision-making. This article will walk you through the journal’s scope, Krichevsky’s research, why it matters, and how his work is a stepping stone toward smarter healthcare. I’ll also intersperse a few anecdotes from my own journey studying health informatics to make the story relatable not just for academics, but for anyone curious about how medical text becomes actionable data.
What is Methods of Information in Medicine, and who is Spencer Krichevsky?
First, some context. Methods of Information in Medicine is a long-standing peer-reviewed journal focused on the development of novel scientific methods in health informatics, including the design, integration, analysis, storage, and visualization of information in medical contexts. Founded in 1962, it is in fact one of the oldest journals solely devoted to methods in biomedical and health informatics. The journal's articles often sit at the intersection of computer science, clinical practice, and information systems: everything from natural language processing (NLP) to machine learning, ontology modelling, and the infrastructure of health data.
Now, turning to Spencer Krichevsky: he is a researcher affiliated with Weill Cornell Medicine (among others) whose work spans medical informatics, natural language processing, and the extraction of structured data from unstructured pathology reports. While I won't pretend to know his full biography, his recent work published in this journal signals a strong interest in bridging unstructured clinical text (for example, pathology reports) and structured data that can be used for analysis, monitoring, and decision support.
Why does this matter? Because in healthcare, much of the truly valuable data is buried in free-text reports, clinician notes, pathology slides, and other narrative artifacts. If we can reliably turn that into structured, queryable data, we open doors to better diagnostics. Krichevsky's work is a concrete illustration of that transformation from raw text to actionable intelligence.
Summary of the Paper
Let’s dive into the specifics of the study.
Title of the study:
Automated Information Extraction from Unstructured Hematopathology Reports to Support Response Assessment in Myeloproliferative Neoplasms by Krichevsky S., Sholle E. T., Adekkanattu P. M., Abedian S., Ouseph M., Taylor E., Abu-Zeinah G., Jaber D., Sosner C., Cusick M. M., Savage N., Silver R. T., Scandura J. M., and Campion T. R.
Publication year: 2025 (Volume 63, Issues 5-6) of Methods of Information in Medicine.
Main objective: The study sought to implement an open-source NLP framework (called Leo) to parse unstructured hematopathology (bone marrow/hematologic) reports from patients with myeloproliferative neoplasms (MPNs) and extract key clinical features (such as myeloblast count and reticulin fibrosis) to support response assessment.
Key findings/contributions:
- When compared to a manual reference standard (300 reports), the NLP method achieved high F1 scores for certain features: e.g., aspirate myeloblasts extraction at F1 = 98%, biopsy reticulin fibrosis at F1 = 93%.
- However, for other features such as myeloblasts from the biopsy (F1 = 6%) and via flow cytometry (F1 = 8%), performance was very low, owing to sparse and inconsistent reporting conventions.
- The workflow dramatically reduced effort: manual annotation of 300 reports required ~30 hours of staff time; the automated NLP method processed 34,301 reports in ~3.5 hours of runtime.
Why it's significant:
- It is among the first studies to apply NLP specifically to hematopathology reports (rather than more common solid tumour pathology) for extracting clinically relevant features.
- It demonstrates the scalability potential: moving from manual annotation (which is slow, expensive) to automated pipelines can unlock large-scale data extraction across many reports, enabling monitoring, research, and possibly decision-support.
- It points out real-world limitations: features that are rare or inconsistently reported remain challenging, reminding us that machine-learning/NLP solutions are not magic cures but must adapt to domain realities.
Methodology (brief breakdown):
- The authors developed an open-source NLP framework (Leo) that parsed document segments from hematopathology reports (unstructured free text).
- They selected a manual reference standard: annotating ~300 reports to identify key phrases/concepts.
- They measured the performance (precision, recall, F1-score) of their automated extraction versus the manual gold standard (a minimal computation sketch follows this list).
- They reported runtime and scalability: automated extraction across tens of thousands of reports.
- They discussed the limitations: sparsity of certain reporting features, variations in report structure, etc.
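To make the evaluation step concrete, here is a minimal sketch of how precision, recall, and F1 are typically computed against a manual gold standard. The sample triples below are hypothetical illustrations, not data from the paper.

```python
def precision_recall_f1(extracted: set, gold: set):
    """Score automated extraction against a manually annotated gold standard."""
    true_positives = len(extracted & gold)  # items the NLP found that annotators confirmed
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical (report_id, feature, value) triples, not taken from the study
gold = {(1, "aspirate_myeloblasts", "5%"), (2, "reticulin_fibrosis", "MF-2")}
extracted = {(1, "aspirate_myeloblasts", "5%"), (2, "reticulin_fibrosis", "MF-1")}
print(precision_recall_f1(extracted, gold))  # (0.5, 0.5, 0.5)
```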
Simplified: What does this really mean?
Alright, let’s break this down into plain language. Imagine you have a huge pile of medical reports, in this case bone marrow biopsy and aspirate reports for patients with a type of blood cancer (myeloproliferative neoplasms). These reports are full of medical jargon: paragraphs of text, numbers, descriptions of microscopic features, fibrosis grades, blast counts, flow cytometry results, etc. Manually reading each one and extracting the relevant data (how many myeloblasts? what is the reticulin fibrosis grade?) is like reading through thousands of pages of literature and summarising by hand: tedious, slow, expensive.
What Krichevsky and colleagues did is build an algorithm; think of it as a translator or filter. It scans each report, finds the pieces it cares about (e.g., "Aspirate: 5% myeloblasts" or "Reticulin fibrosis grade 2"), picks out the relevant numbers and concepts, and stores them in a structured table (patient ID, date, myeloblast count, fibrosis grade, etc.). This way, rather than one clinician reading 300 reports in 30 hours, you have code reading 34,000+ reports in 3.5 hours.
An analogy: If you think of medical reports like letters in dozens of languages, some structured, some messy free text, then this system is like an automated translator plus summariser that turns all that into a standard spreadsheet where you can query "show me all patients with blasts > 10% and fibrosis grade 3".
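To make the "translator" idea tangible, here is a toy sketch of what such an extraction step, plus the spreadsheet-style query, could look like. To be clear, this is not the Leo framework itself; the regular expressions, field names, and sample report text are my own illustrative assumptions.

```python
import re

REPORT = """BONE MARROW ASPIRATE: Cellular marrow. Myeloblasts: 5%.
BONE MARROW BIOPSY: Reticulin fibrosis grade 2 (MF-2)."""

# Illustrative patterns; real reports vary far more than this
BLAST_RE = re.compile(r"myeloblasts:\s*(\d+)\s*%", re.IGNORECASE)
FIBROSIS_RE = re.compile(r"reticulin fibrosis grade\s*(\d)", re.IGNORECASE)

def extract(report: str) -> dict:
    """Turn one free-text report into a structured row."""
    blasts = BLAST_RE.search(report)
    fibrosis = FIBROSIS_RE.search(report)
    return {
        "myeloblast_pct": int(blasts.group(1)) if blasts else None,
        "fibrosis_grade": int(fibrosis.group(1)) if fibrosis else None,
    }

row = extract(REPORT)
print(row)  # {'myeloblast_pct': 5, 'fibrosis_grade': 2}

# Once thousands of structured rows exist, the query becomes trivial:
rows = [row, {"myeloblast_pct": 12, "fibrosis_grade": 3}]  # second row is made up
flagged = [r for r in rows
           if r["myeloblast_pct"] and r["myeloblast_pct"] > 10
           and r["fibrosis_grade"] == 3]
```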
My own anecdote: When I first started in health informatics, I had to manually review dozens of clinical notes and extract data for a small pilot study. I remember staying late, cross-checking each line, missing a few subtle terms that clinicians used inconsistently ("moderate fibrosis" vs "2+ reticulin"). It felt like translating slang into formal language. So when I read about Krichevsky's work, I thought: "Yes, that's exactly the challenge we faced on a small scale. Now imagine doing that for thousands of reports automatically." It was motivating.
Why This Research Matters: Practical Impact
Let’s talk about why this research matters beyond the academic novelty.
1. Clinical decision-support and monitoring
By extracting structured features (blast counts, fibrosis grades, flow cytometry results) from pathology reports, clinicians and care teams can monitor disease progression or treatment response more readily. Instead of waiting for a pathology summary, a system might flag a rising blast count or increasing fibrosis grade and alert the team.
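To picture what such a flag might look like downstream, here is a hypothetical sketch over the extracted, structured values. Nothing like this appears in the paper; the field names and the rule itself are illustrative assumptions, not clinical guidance.

```python
def flag_progression(history):
    """Raise simple alerts when extracted values worsen between consecutive reports.

    `history` is a date-ordered list of structured rows extracted from reports.
    The comparison logic here is a placeholder, not a validated clinical rule.
    """
    alerts = []
    for prev, curr in zip(history, history[1:]):
        if (curr.get("myeloblast_pct") or 0) > (prev.get("myeloblast_pct") or 0):
            alerts.append(f"{curr['date']}: rising myeloblast count")
        if (curr.get("fibrosis_grade") or 0) > (prev.get("fibrosis_grade") or 0):
            alerts.append(f"{curr['date']}: increasing fibrosis grade")
    return alerts

reports = [
    {"date": "2024-01-10", "myeloblast_pct": 3, "fibrosis_grade": 1},
    {"date": "2024-06-02", "myeloblast_pct": 9, "fibrosis_grade": 2},
]
print(flag_progression(reports))
# ['2024-06-02: rising myeloblast count', '2024-06-02: increasing fibrosis grade']
```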
2. Data-driven healthcare and research
Large-scale extraction = large-scale data. With structured datasets, researchers can run statistics, identify patterns, correlate pathology features with outcomes, stratify patients, and more. That accelerates knowledge generation. For example, in MPNs, identifying how certain features predict progression to acute leukemia or response to therapy becomes easier.
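As a toy illustration of how structured extraction turns such questions into one-liners, here is a sketch using pandas with entirely made-up data; the column names and the outcome flag are my own assumptions, not variables from the study.

```python
import pandas as pd

# Hypothetical structured output from an extraction pipeline
df = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "fibrosis_grade": [0, 1, 2, 3],
    "progressed": [False, False, True, True],  # made-up outcome flag
})

# Stratify patients and inspect how an outcome tracks a pathology feature
print(df.groupby("fibrosis_grade")["progressed"].mean())
```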
3. Cost and time savings
Manual annotation is resource intensive. Automating it frees up clinicians, pathologists, and informaticians to focus on higher-value tasks. It also means scalability: you don't just analyse 300 reports; you analyse tens of thousands.
4. Bridging the gap between narrative and data
A frequent bottleneck in medical informatics is that much of the data is trapped in narrative text. This research helps unlock that data, turning words into numbers and text into structured fields. For example: the text snippet "fibrosis grade 2" becomes the structured field FibrosisGrade = 2. It's a translation.
5. Domain-specific challenges addressed
Hematopathology has its own quirks: complex morphology descriptions, variable reporting, rare features. The study acknowledges and begins to address those quirks, which means it's not just generic NLP on clinical text, but NLP tuned for a challenging domain.
In my journey, I’ve seen how data-science projects falter because the unstructured part of healthcare is underestimated: different clinicians use different words, report formats evolve, acronyms vary, data quality is inconsistent. Knowing that a project like this identified real limitations (e.g., certain features with F1 of 6% because of sparse reporting) adds credibility. It tells me: this is rigorous, pragmatic work, not just an idealistic proof-of-concept.
Visual/Flow Diagram & Key Insights
Here’s a conceptual flowchart of how the method works (a code sketch follows the list):
- Input: Unstructured hematopathology report (free text)
- Pre-processing: Segment report into document parts (aspirate, biopsy, flow cytometry)
- Concept extraction: Use NLP framework Leo to identify concept phrases (e.g., myeloblasts, reticulin fibrosis)
- Normalization: Map phrases to structured variables (e.g., MyeloblastCount, FibrosisGrade)
- Output: Structured dataset (patient ID, report date, myeloblast count, fibrosis grade, etc.)
- Downstream use: Query, statistical analysis, monitoring, decision-support
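Translated into code, the flow above might look something like the following skeleton. This is a structural sketch under my own assumptions about section names and phrasings; the actual Leo framework's API and logic will differ.

```python
import re

SECTION_HEADERS = ("ASPIRATE", "BIOPSY", "FLOW CYTOMETRY")  # assumed section names

def segment(report: str) -> dict:
    """Pre-processing: split the free-text report into named document parts."""
    sections, current = {}, None
    for line in report.splitlines():
        header = next((h for h in SECTION_HEADERS if h in line.upper()), None)
        if header:
            current = header
            sections[current] = []
        elif current:
            sections[current].append(line)
    return {k: "\n".join(v) for k, v in sections.items()}

def extract_concepts(section_text: str) -> dict:
    """Concept extraction + normalization: map phrases to structured variables."""
    out = {}
    m = re.search(r"myeloblasts?\D{0,5}(\d+)\s*%", section_text, re.IGNORECASE)
    if m:
        out["MyeloblastCount"] = int(m.group(1))
    m = re.search(r"fibrosis\D{0,10}(\d)", section_text, re.IGNORECASE)
    if m:
        out["FibrosisGrade"] = int(m.group(1))
    return out

def process(report: str, patient_id: str, report_date: str) -> dict:
    """Output: one structured row per report, ready for querying."""
    row = {"PatientID": patient_id, "ReportDate": report_date}
    for name, text in segment(report).items():
        for var, value in extract_concepts(text).items():
            row[f"{name.title()}_{var}"] = value
    return row
```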
Pull quotes / key insights:
"Our NLP method extracted features such as aspirate myeloblasts (F1 = 98%) and biopsy reticulin fibrosis (F1 = 93%) with high accuracy."
"Whereas manual annotation of 300 reports required 30 hours of staff effort, automated NLP required 3.5 hours of runtime for 34,301 reports."
Bullet summary of results and innovations:
- High-accuracy extraction of key features from unstructured reports (F1 > 90%)
- Massive scalability: tens of thousands of reports in hours instead of days/weeks
- Domain-specific adaptation (hematopathology) rather than generic text mining
- Identification of limitations and reporting sparsity (some features much harder to detect)
- Open-source framework availability (Leo) and emphasis on reproducibility
Limitations and Considerations
Of course, no study is perfect, and there are caveats, important ones, to understand if you're an academic, student, or health-informatics professional (which is likely, given you're reading this!).
- Sparsity of reporting: Some features were rarely or inconsistently reported (e.g., myeloblasts via flow cytometry), and the extraction performance for those was poor (F1 = 6% or 8%). This reminds us that automated NLP cannot extract what is not well documented.
- Generalizability: The study is from one institution/one domain (hematopathology reports in MPNs). Different institutions may format reports differently, use different language/terminology, so the model’s performance elsewhere might differ.
- Context-dependence: Some phrases require context to interpret correctly (for example, "increased blasts compared to prior" may need temporal judgment). NLP frameworks still struggle with complex semantics or nuanced clinical meaning.
- Clinical integration: While extraction is promising, moving to real-time integration into clinical workflows (decision-support systems) involves further challenges: data governance, validation, workflow adoption, clinician acceptance.
- Unstructured data remains: Even with automation, some reports or features may remain difficult or impossible to extract reliably, so manual oversight may still be needed.
In my own early project, I learned the hard way: I assumed all blast counts would be phrased the same way ("Myeloblasts: X%"), but found half the reports said "blasts approx eight-percent" or "approximately eight percent myeloblast content in aspirate". The variation undermined my simple regex-based parser. Seeing how Krichevsky addresses such variation gives me assurance that the field is maturing.
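For readers who hit the same wall, here is a hedged sketch of how a parser can be broadened to tolerate such variation. It handles only the phrasings mentioned above; real reports would need many more patterns, and ideally a proper NLP pipeline rather than bare regexes.

```python
import re

WORD_NUMBERS = {"five": 5, "eight": 8, "ten": 10}  # extend as needed

PATTERNS = [
    r"myeloblasts:\s*(?P<num>\d+)\s*%",                          # "Myeloblasts: 8%"
    r"blasts\s+approx\.?\s*(?P<word>[a-z]+)-?percent",           # "blasts approx eight-percent"
    r"approximately\s+(?P<word>[a-z]+)\s+percent\s+myeloblast",  # "approximately eight percent myeloblast content"
]

def parse_blast_count(text: str):
    """Try several phrasings before giving up; return an int percentage or None."""
    for pattern in PATTERNS:
        m = re.search(pattern, text, re.IGNORECASE)
        if not m:
            continue
        groups = m.groupdict()
        if groups.get("num"):
            return int(groups["num"])
        if groups.get("word"):
            return WORD_NUMBERS.get(groups["word"].lower())
    return None

print(parse_blast_count("blasts approx eight-percent"))  # 8
print(parse_blast_count("Myeloblasts: 5%"))              # 5
```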
Where This Research Fits in the Wider Landscape
It helps to situate this study among broader topics in medical informatics.
- AI/NLP in clinical text mining: The extraction of structured data from clinical text has been an active research area (for example, date extraction from clinical notes). Krichevsky's work adds a domain (hematopathology) that is less frequently studied, which is valuable.
- Health informatics methods: The journal Methods of Information in Medicine emphasizes methods in informatics: data structuring, integration, analysis, and design. Krichevsky's work embodies that methodological focus: not just applying an existing algorithm, but adapting, validating, measuring performance, and discussing limitations.
- From data to decision-making: Ultimately, the goal of medical informatics is to transform data into decisions, insights, or action. This study is a step in that direction: unlocking data that was previously inaccessible or hard to use.
When I reflect on my own trajectory, from a student reading clinical narratives to someone trying to pull out numbers manually, I see this work as a proof point that the manual-to-automated transition is feasible. That gives hope for future projects, and should motivate students and professionals alike to think about scale: not just "Can I extract data from 100 reports?" but "How can this scale to 10,000?"
Takeaways and Concluding Thoughts
Let me close by summarizing the key takeaways and offering a bit of reflection from my personal journey.
- Spencer Krichevsky’s study in Methods of Information in Medicine is a meaningful advance in medical informatics: it addresses the extraction of structured data from unstructured hematopathology reports, in a domain (myeloproliferative neoplasms) where such work was sparse.
- The methodology created an NLP framework (Leo) and demonstrated high performance (F1 > 90% for key features) and massive scale (34,000+ reports in hours).
- The significance lies in unlocking latent data, enabling monitoring/decision-support, reducing manual effort, and providing a blueprint for other institutions.
- But it also reminds us of realistic constraints: reporting sparsity, domain specificities, generalizability issues, and workflow integration challenges.
- From a personal lens: when I look back at my first health-informatics project, manually extracting data from free text, I feel a sense of relief and excitement. Relief that newer methods are making this less tedious; excitement that with the right method we can scale research and care in ways I couldn't at the time.
- For students, academics, and professionals: this is not just a paper to cite, it's a roadmap. Think about your own institution's reports: Are they structured? Are key features buried in free text? Could you imagine an NLP pipeline? Could you contribute by exploring adaptation, validation, or integration?
Closing takeaway statement: Krichevsky's work in Methods of Information in Medicine underscores how the future of healthcare depends on turning medical text into actionable intelligence, bridging the gap between data and diagnosis.