
Baidu Releases PP-OCRv6: Million-Scale Parameters Rival Billion-Scale VLMs, Single Model Supports 50 Languages

According to Dongcha Beating monitoring, the Baidu PaddlePaddle team has released PP-OCRv6, its new-generation OCR system. It comes in three versions, Tiny (1.5M parameters), Small (7.7M), and Medium (34.5M), covering edge-device, browser, and cloud deployment scenarios. Compared with the previous-generation PP-OCRv5, detection accuracy has improved by 4.6% and recognition accuracy by 5.1%. Chinese, English, Japanese, and 46 Latin-script languages are now handled by a single unified model.

PP-OCRv6 redesigns the detection and recognition networks, introducing a unified module structure and structural reparameterization to improve accuracy while reducing computational cost. With OpenVINO optimization, the Medium version achieves up to a 5.2x speedup in end-to-end CPU inference.

Official test results show that PP-OCRv6, with a parameter count in the tens of millions, performs close to or even better than some billion-parameter vision-language models (VLMs) on various OCR benchmarks. The team has also carried out specialized optimizations for scenarios such as handwriting, industrial component identification, digital displays, PCB silkscreens, and CAD drawings. The code has been merged into the PaddleOCR project and released as open source.
