Reproducible, parser-agnostic benchmarks for turning PDFs into Markdown and for measuring downstream usefulness with retrieval QA, not just visual fidelity. We use two human-in-the-loop methods plus one ...
A focused pipeline that parses medical guidelines (PDF/HTML) into structured JSON for downstream clinical RAG or summarization. It implements models, parsers, normalization utilities, and a CLI to ingest ...