Skip to content

PaddleOCR

Developer Productivity
Python Latest v3.6.0 · 6d ago Security brief →

Features

  • Intelligent document parsing with a lightweight vision‑language model (PaddleOCR-VL-1.6) delivering Markdown and JSON outputs at 96.3% accuracy on OmniDocBench v1.6.
  • Universal scene text recognition supporting over 100 languages via the PP-OCRv5 single‑model solution, including IDs, street signs, books, and industrial components.
  • Production‑ready efficiency: achieves commercial‑grade accuracy with a small footprint suitable for edge or cloud deployment.

Recent releases

View all 5 releases →
No immediate action
v3.6.0 New feature

PaddleOCR-VL-1.6 release + SDKs

v3.5.0 New feature
Notable features
  • Deep integration with Hugging Face ecosystem supporting 20 major models via Transformers as inference backend
  • Flexible switching among PaddlePaddle static graph, dynamic graph, or Transformers inference engines
  • Conversion of Word, Excel, PowerPoint documents to Markdown
Full changelog

2026.4.21 v3.5.0 released

  • Deeply integrated with the Hugging Face ecosystem, with 20 major models supporting Transformers as the inference backend. Supports flexible switching of inference engines, including PaddlePaddle static graph, PaddlePaddle dynamic graph, or Transformers.
  • Supports conversion of common document formats (Word, Excel, Powerpoint) to Markdown.
  • The PaddleOCR-VL series, PP-StructureV3, and PP-DocTranslation support exporting parsing results to DOCX format, making it convenient to view and edit in Word.
  • Official browser inference SDK PaddleOCR.js is released, supporting running PP-OCRv5 in the browser.

2026.4.21 v3.5.0 发布

  • 深度适配 Hugging Face 生态,20 个主要模型支持以 Transformers 作为推理后端。支持灵活切换推理引擎,可选飞桨静态图、飞桨动态图或 Transformers。
  • 支持常见文档格式(Word、Excel、Powerpoint)转 Markdown。
  • PaddleOCR-VL 系列、PP-StructureV3、PP-DocTranslation 支持将解析结果导出为 DOCX 格式,便于在 Word 中查看和编辑。
  • 发布官方浏览器推理 SDK PaddleOCR.js,支持在浏览器中运行 PP-OCRv5。

Full Changelog: https://github.com/PaddlePaddle/PaddleOCR/compare/v3.4.1...v3.5.0

v3.4.1 New feature
Notable features
  • Added `llama-cpp-server` backend support for PaddleOCR‑VL
  • Added AMD GPU and Intel Arc GPU hardware support
  • Removed default maximum pages per request limit in Docker Compose service
Full changelog

2026.4.14 v3.4.1 released

  • PaddleOCR-VL adds llama-cpp-server backend support.
  • PaddleOCR-VL adds AMD GPU and Intel Arc GPU hardware support.
  • Fixed dependency issues in the PaddleOCR-VL images for Huawei NPU and KunlunXin XPU*
  • For the PaddleOCR-VL Docker Compose service, the default configuration no longer limits the maximum number of pages per request.

2026.4.14 v3.4.1 发布

  • PaddleOCR-VL 新增 llama-cpp-server 后端支持。
  • PaddleOCR-VL 新增 AMD GPU、Intel Arc GPU 硬件支持。
  • 修复 PaddleOCR-VL 华为 NPU、昆仑芯 XPU 镜像中的依赖问题。
  • 对于 PaddleOCR-VL Docker Compose 服务,默认不限制单请求的页数上限。

Full Changelog: https://github.com/PaddlePaddle/PaddleOCR/compare/v3.4.0...v3.4.1

v3.4.0 New feature
⚠ Upgrade required
  • Fixed error when accessing the `/docs` endpoint in the official PaddleOCR‑VL image
Notable features
  • PaddleOCR-VL-1.5 supports irregular‑shaped bounding box localization and achieves 94.5% accuracy on OmniDocBench v1.5
  • Adds seal recognition and integrates spotting tasks into PaddleOCR-VL-1.5
  • PP-StructureV3 gains `format_block_content` and `markdown_ignore_labels` parameters
Full changelog

2026.1.29 v3.4.0 released

  • Release the PaddleOCR-VL-1.5 complex document parsing solution.

    PaddleOCR-VL-1.5 is a new iterative version of the PaddleOCR-VL series. Based on comprehensive optimization of the core capabilities of version 1.0, the model achieves 94.5% accuracy on the authoritative document parsing benchmark OmniDocBench v1.5, surpassing top global general-purpose large models and document parsing–specific models.

    PaddleOCR-VL-1.5 innovatively supports irregular-shaped bounding box localization of document elements, enabling excellent performance in real-world application scenarios such as scanning, skew, warping, screen-photography, and complex illumination, achieving comprehensive SOTA performance. In addition, the model further integrates seal recognition and spotting tasks, with key metrics continuing to lead mainstream models.

    You can use it online on the PaddleOCR official website or call the model API.

  • Add support for calling MLX-VLM inference services.

  • PaddleOCR-VL now supports cross-page table merging and multi-level heading reconstruction.

  • PP-StructureV3 adds support for the format_block_content and markdown_ignore_labels parameters.

  • Fixed an issue where accessing the /docs endpoint in the official PaddleOCR-VL image would result in an error.

2026.1.29 v3.4.0 发布

  • 发布 PaddleOCR-VL-1.5 复杂文档解析方案。

    PaddleOCR-VL-1.5 是 PaddleOCR-VL 系列的全新迭代版本。在全面优化 1.0 版本核心能力的基础上,该模型在文档解析权威评测集 OmniDocBench v1.5 上斩获了 94.5% 的高精度,超越了全球的顶尖通用大模型及文档解析专用模型。

    PaddleOCR-VL-1.5 创新性地支持了文档元素的异形框定位,使得 PaddleOCR-VL-1.5 在扫描、倾斜、弯折、屏幕拍摄及复杂光照等真实落地场景中均表现卓越,实现了全面的 SOTA。此外,模型进一步集成了印章识别与文本检测识别任务,关键指标持续领跑主流模型。

    您可以在 PaddleOCR官网 在线使用或者调用该模型的API。

  • 新增对 MLX-VLM 推理服务的调用支持。

  • PaddleOCR-VL 支持合并跨页表格、多级标题重建功能。

  • PP-StructureV3 支持 format_block_contentmarkdown_ignore_labels 参数。

  • 修复 PaddleOCR-VL 官方镜像访问 /docs 接口报错的问题。

New Contributors

  • @AmirHosseinOmidi0 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16659
  • @ZhangX-21 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16745
  • @AdlerFleurant made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16756
  • @tianyuzhou668 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16518
  • @shiyuasuka made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17041
  • @1250890838 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16996
  • @Ihebdhouibi made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/16994
  • @Ghazi-raad made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17201
  • @orbisai0security made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17289
  • @danghoangnhan made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17019
  • @Luxorion-12 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17158

Full Changelog: https://github.com/PaddlePaddle/PaddleOCR/compare/v3.3.3...v3.4.0

v3.3.3 New feature
Notable features
  • PP-StructureV3 MCP Server can use Qianfan platform hosted services as the inference engine
  • Documentation for PP-OCRv5 and PaddleOCR-VL comprehensively improved with error fixes
  • Added inference support for Muxi GPUs
Full changelog

2026.1.20 v3.3.3 released

  • PaddleOCR-VL now supports specifying custom model names and API keys, and can seamlessly integrate with inference services from third-party platforms such as SiliconFlow and Novita AI.
  • The PP-StructureV3 MCP Server supports using hosted services on the Qianfan platform as the underlying inference engine.
  • The documentation for PP-OCRv5 and PaddleOCR-VL has been comprehensively improved, with known errors fixed to enhance readability and accuracy.
  • Added support for inference on Muxi GPUs, further expanding hardware compatibility and deployment flexibility.

2026.1.20 v3.3.3 发布

  • PaddleOCR-VL 现已支持指定自定义模型名称与 API Key,并可无缝对接硅基流动、Novita AI 等第三方平台的推理服务。
  • PP-StructureV3 MCP Server 支持基于千帆平台的托管服务作为底层推理引擎。
  • 全面优化 PP-OCRv5 与 PaddleOCR-VL 相关文档,修复已知错漏,提升可读性与准确性。
  • 新增对沐曦 GPU 的推理支持,进一步扩展硬件兼容性与部署灵活性。

New Contributors

  • @metax666 made their first contribution in https://github.com/PaddlePaddle/PaddleOCR/pull/17269

Full Changelog: https://github.com/PaddlePaddle/PaddleOCR/compare/v3.3.2...v3.3.3

Weekly OSS security release digest.

The CVE patches and breaking changes that affected production tools this week. One email, every Sunday.

No spam, unsubscribe anytime.

About

Stars
79,213
Forks
10,546
Languages
Python C++ TypeScript
Downloads/week
1,308 ↑86%
NPM Maintainers
1
Contributors
295

Install & Platforms

Platforms
linux macos windows

Beta — feedback welcome: [email protected]