
Baidu Releases PP-OCRv6: Models with Millions of Parameters Rival Billion-Parameter VLMs, Single Model Supports 50 Languages

According to Dongcha Beating monitoring, the Baidu PaddlePaddle team has released PP-OCRv6, a new-generation OCR system available in three versions: Tiny (1.5M parameters), Small (7.7M), and Medium (34.5M), covering edge-device, browser, and cloud deployment scenarios. Compared with the previous-generation PP-OCRv5, detection accuracy improves by 4.6% and recognition accuracy by 5.1%, and a single unified model now covers Chinese, English, Japanese, and 46 Latin-script languages.

PP-OCRv6 redesigns the detection and recognition networks, introducing a unified module structure and structural reparameterization to improve accuracy while reducing computational cost. With OpenVINO optimization, the Medium version achieves up to a 5.2x speedup in end-to-end CPU inference.
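For readers unfamiliar with structural reparameterization, the general idea can be sketched as follows. This is only an illustration of the generic technique in the style popularized by RepVGG, not PP-OCRv6's actual code: the three-branch layout (3x3 convolution, 1x1 convolution, identity), the single-channel setup, and all function names here are assumptions made for the example. A multi-branch block used during training is folded into one equivalent 3x3 kernel for inference, so the output is unchanged while only one convolution is computed.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (odd square kernel)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form (hypothetical): 3x3 branch + 1x1 branch + identity branch."""
    h, w = len(image), len(image[0])
    a = conv2d_same(image, k3)
    b = conv2d_same(image, [[k1]])
    return [[a[i][j] + b[i][j] + image[i][j] for j in range(w)] for i in range(h)]


def reparameterize(k3, k1):
    """Fold the 1x1 and identity branches into the centre tap of the 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0  # the 1x1 weight plus an identity weight of 1
    return fused


# Illustrative values only; these are not taken from any real model.
image = [[1.0, 2.0, 0.5, -1.0],
         [0.0, 3.0, 1.5, 2.0],
         [4.0, -2.0, 0.25, 1.0]]
k3 = [[0.1, -0.2, 0.3],
      [0.0, 0.5, -0.1],
      [0.2, 0.4, -0.3]]
k1 = 0.7

fused = reparameterize(k3, k1)
train_out = multi_branch(image, k3, k1)
infer_out = conv2d_same(image, fused)

# Largest elementwise gap between the two forms; expected to be at rounding level.
max_diff = max(abs(train_out[i][j] - infer_out[i][j])
               for i in range(len(image)) for j in range(len(image[0])))
print(max_diff)
```

Because convolution is linear, summing the branch outputs equals convolving once with the summed kernels. Real implementations typically also fold batch-normalization statistics into the fused weights and handle multiple channels, which this sketch omits.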

Official test results show that PP-OCRv6, at a parameter scale in the tens of millions, achieves performance close to, and in some cases surpassing, some billion-parameter vision-language models (VLMs) on various OCR benchmarks. The team has also performed specialized optimizations for scenarios such as handwriting, industrial component identification, digital displays, PCB silkscreens, and CAD drawings. The relevant code has been merged into the PaddleOCR project and released as open source.
