Release history

docling releases

Get your documents ready for gen AI

All releases

No immediate action
v2.97.0 Breaking risk

Parameter rename

No immediate action
v2.96.1 Bug fix

FFmpeg error + DrawingML text

No immediate action
v2.96.0 Mixed

PDF backend + JSON fix + docs update

No immediate action
v2.95.0 Bug fix

Preserve DOCX text on DrawingML images

No immediate action
v2.94.0 New feature

TikZ rendering + new options + Vision 4.1 + HF

v2.93.0 New feature
Notable features
  • Upgraded Granite Vision model to version 4.1 for enhanced table and chart extraction
Full changelog

Feature

  • vlm: Upgrade Granite Vision model to 4.1 for table + chart extraction (#3382) (24f2d14)

Fix

  • docx: Fix OMML equation handling and improve type safety (#3381) (e00735d)
v2.92.0 New feature
Notable features
  • Multi-lingual support for kserve-triton OCR
  • Checkbox parsing support for DOCX
  • Modular docling-slim package
Full changelog

Feature

  • Extend the kserve-triton OCR model to have multi-lingual support (#3368) (8b67fae)
  • docx: Add checkbox parsing support (#3349) (c455a65)
  • Introduce modular docling-slim package (#3285) (ed32c5e)
  • Add ResponseFormat.DOCLANG and parsing branch in VLM pipeline (#3350) (0f6f8d0)

Fix

  • pptx: Skip malformed picture shapes instead of aborting conversion (#3372) (7294248)
  • docx: OMML conversion failures for unsupported limit functions (#3359) (3df80e7)
  • Make VLLM model_impl configurable (#3358) (a6a37ca)
v2.91.0 Bug fix
Security fixes
  • Path traversal prevention in LaTeX macro handlers
Notable features
  • VML image extraction with v:imagedata elements
Full changelog

Feature

  • docx: Extract VML images with v:imagedata elements (#3343) (2ddaa3b)

Fix

  • Strengthen input validation for METS‑GBS processing (#3336) (c1dbac2)
  • EasyOCR model downloading (#3339) (5e161ac)
  • vlm: Remove bogus preamble from VLM chat template (#3351) (c190ba2)
  • html: Refine image URL and size handling (#3348) (cd0cb69)
  • Fixes to html_backend (#3342) (9813190)
  • pptx: Assign pptx notes to ContentLayer.NOTES (#3341) (3a3c8f6)
  • Prevent path traversal in LaTeX macro handlers (#3330) (65ef180)
  • service: Add explicit usage exceeded exception handling (#3325) (075fa69)

Documentation

  • uspto: Improve documentation of USPTO XML parser security config (#3338) (09de7f9)
v2.90.0 New feature
Notable features
  • GraniteVisionTableStructureModel for VLM-based table extraction
Full changelog

Feature

  • Implement GraniteVisionTableStructureModel for VLM-based table extraction (#3323) (1569e42)

Fix

  • latex: Fully unwrap deeply nested formatting macros (#3249) (101233e)
  • docx: Handle inline formulas in list items (#3304) (c761512)
  • format: Add MD fallback for .txt files in _guess_from_content (#3311) (3bab6b4)
  • Strip soft hyphen when joining merged text elements (#3232) (8274892)
  • pptx: Handle NotImplementedError from shape.shape_type (#3309) (043ed2d)

Documentation

  • Fix nanonets_ocr2 runtime support matrix (#3317) (8ec14f2)
v2.89.0 New feature
Notable features
  • Explicit TikZ environment handling in LaTeX backend
  • Aligned RapidOCR English assets with 3.8 mobile models
  • Fixed list state isolation in table cells for DOCX documents
v2.88.0 New feature
Notable features
  • Client SDK for docling serve
  • Support for RapidOCR 3.8 mobile model naming
v2.87.0 Mixed
Notable features
  • Nanonets OCR2 onboarding
  • Transformers v5 compatibility for AUTOMODEL_CAUSALLM VLMs
  • VLM tool-calling API responses support
v2.86.0 New feature
Notable features
  • Support for GraniteVision v4
  • Add signature/stamp html block to DC document
  • Add PARTIAL_SUCCESS status for VLM pipeline pages
v2.84.0 New feature
Notable features
  • GLM OCR support
  • DocumentFigureClassifier v2.5
v2.83.0 New feature
Notable features
  • Upgrade to transformers v5
  • OCR model for remote KServe v2 API
v2.82.0 New feature
Notable features
  • Implementation of HTML backend with headless browser
v2.81.0 New feature
Notable features
  • Route plain-text and Quarto/R Markdown files to the Markdown backend
v2.79.0 New feature
Notable features
  • Add fact metadata and linkbase relationships for XBRL
v2.78.0 New feature
Notable features
  • TableFormer v2 support
  • gRPC transport for KServe v2 API
v2.77.0 New feature
Notable features
  • VLM inference time tracking for mlx_model
  • Configurable ONNX Runtime graph optimization
v2.75.0 New feature
Notable features
  • XBRL instance report backend parser
  • KServe v2 API support
Full changelog

Feature

  • Create a backend parser for XBRL instance reports (#3017) (334ba6e)
  • Unified model-family inference engines (including image-classification) and KServe v2 API support (#2979) (0353293)

Fix

  • Skip ASR segments when length is zero (#2998) (6b824f8)
  • docx: Guard against None hyperlink address in _get_paragraph_elements (#2367) (#3022) (236216e)
v2.74.0 Security relevant
Security fixes
  • XML External Entity and related attack vulnerabilities
Notable features
  • docling-parse v5 released
v2.73.0 New feature
Notable features
  • LaTeX document parsing
  • Inference engines abstraction for object detection
  • Pluggable VLM runtime with preset configuration
v2.71.0 New feature
Notable features
  • Word document comments extraction
  • WebVTT and source tracker
v2.69.0 New feature
Notable features
  • Picture classifier v2.0
  • Classification filters for picture description
v2.67.0 New feature
Notable features
  • XPU device support for Intel GPUs
  • Enrichment annotations in meta format
Full changelog

Fix

  • Lock new deps and update Python 3.14 warnings (#2844) (d9295df)
  • Correct type hint for table_structure_options usage (#2823) (a0530a2)
  • Transformers models lazy-loaded (#2826) (3ef4525)
  • Font download by passing font_path to RapidOcr (#2822) (ffafe58)
  • cli: Add Layout and Table models to --show-external-plugins (#2832) (ed57089)
v2.66.0 New feature
Notable features
  • Add preset for using granite-docling via VLLM and other APIs
v2.64.0 New feature
Notable features
  • Add experimental TableCropsLayoutModel
  • Factory and plugin-capability for Layout and Table models
v2.63.0 New feature
Notable features
  • Add save and load for conversion result
  • Enable GPU for RapidOCR when available
v2.62.0 New feature
Notable features
  • Add the Image backend
  • Layout + VLM model with layout prompt (experimental)
v2.61.0 Bug fix

Minor fixes and improvements.

Full changelog

Feature

  • vlm: Track generated tokens and stop reasons for VLM models (#2543) (6a04e27)

Fix

  • Temporarily pin NuExtract to working revision (#2588) (fa92574)
  • ocr: Use PSM integer values directly instead of constructor (#2578) (1a5146a)
v2.60.0 New feature
Notable features
  • Threading in standard pipeline
Full changelog

Feature

  • Use threading in the standard pipeline and move old behavior to legacy (#2452) (268d027)
v2.59.0 New feature
Notable features
  • Python 3.14 support
  • Added num_tokens attribute for VlmPrediction
v2.58.0 New feature
Notable features
  • Password-protected PDF document support
  • MLX Whisper support for Apple Silicon ASR
  • Generic options support and HTML image handling modes
v2.57.0 New feature
Notable features
  • Process DrawingML objects in DOCX
Full changelog

Fix

  • Use proper page concatenation in VLM pipeline MD/HTML conversion (#2458) (cd7f7ba)
v2.56.0 Breaking risk
Notable features
  • AutoOCR model selecting best available OCR model, deprecating EasyOCR
  • Tesseract PSM options support
v2.55.0 New feature
Notable features
  • Rich tables support for HTML backend
  • Repetition-based StoppingCriteria for GraniteDocling
v2.54.0 New feature
Notable features
  • Rich tables support for MSWord backend
  • New WebVTT file backend parser
v2.53.0 New feature
Notable features
  • Granite-docling model for document understanding
  • Generic extra arguments support for RapidOCR
v2.52.0 New feature
Notable features
  • Enrichment steps on all convert pipelines (including DOCX, HTML, etc.)
v2.51.0 New feature

Improved performance with updated docling-parse backend and optimized default parameters.

v2.47.0 New feature
Notable features
  • VLM batching in transformers backend
  • VLLM backend
  • HTML formatting tags support
v2.42.0 Bug fix
Notable features
  • Option to control empty clusters in layout postprocessing
v2.41.0 New feature

Adds image-text-to-text models and enables layout model configuration.

v2.40.0 New feature
Notable features
  • Introduce LayoutOptions to control layout postprocessing behaviour
  • Integrate ListItemMarkerProcessor into document assembly
v2.38.0 Breaking risk
Notable features
  • Support audio input
  • Add formatting & improve inline support
  • Maximum image size for VLM models
v2.37.0 New feature
Notable features
  • Support xlsm files
  • Make Page.parsed_page the only source of truth for text cells
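
Two entries in this history carry security fixes: v2.91.0 (path traversal in LaTeX macro handlers) and v2.74.0 (XML External Entity and related attacks). The sketch below is one possible way to check whether an installed docling predates them. It is not part of docling: the helper names `parse_version` and `missing_security_fixes` are hypothetical, the table is copied by hand from the entries above, and the parser assumes plain X.Y.Z version strings with no pre-release suffix.

```python
from importlib.metadata import PackageNotFoundError, version

# Hand-copied from the "Security fixes" entries in this release history.
SECURITY_RELEASES = {
    (2, 74, 0): "XML External Entity and related attack vulnerabilities",
    (2, 91, 0): "Path traversal prevention in LaTeX macro handlers",
}


def parse_version(text):
    """Turn 'v2.91.0' or '2.91.0' into the tuple (2, 91, 0).

    Assumes a plain X.Y.Z string; suffixes such as 'rc1' are not handled.
    """
    return tuple(int(part) for part in text.lstrip("v").split(".")[:3])


def missing_security_fixes(installed):
    """List the security notes of releases newer than `installed`."""
    current = parse_version(installed)
    return [
        note
        for release, note in sorted(SECURITY_RELEASES.items())
        if current < release
    ]


if __name__ == "__main__":
    try:
        installed = version("docling")  # reads the installed package metadata
    except PackageNotFoundError:
        print("docling is not installed")
    else:
        for note in missing_security_fixes(installed):
            print(f"docling {installed} predates: {note}")
```

For example, `missing_security_fixes("2.90.0")` returns only the path traversal note, since v2.74.0 is already included in that version.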