Liquid AI Open-Sources Lightweight Multimodal Models That Extract Structured JSON from Images Directly on the Edge

According to Dynamic Beating monitoring, Liquid AI has open-sourced two lightweight multimodal models, LFM2.5-VL-1.6B-Extract and LFM2.5-VL-450M-Extract, both optimized for extracting structured data from images. Given a user-specified list of fields, the models convert an image directly into JSON-formatted data on the device, removing the traditional two-step process of having a multimodal model generate full text and then parsing that text a second time.
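The field-list workflow described above can be illustrated with a minimal Python sketch. It is not Liquid AI's API: the prompt wording, the helper names `build_prompt` and `parse_extraction`, the invoice fields, and the hard-coded model output are all hypothetical, chosen only to show how a caller might request specific fields and load the returned JSON in one step.

```python
import json


def build_prompt(fields):
    """Build a hypothetical instruction asking for the listed fields as JSON.

    The wording is an assumption, not the models' documented prompt format.
    """
    return "Extract the following fields as a JSON object: " + ", ".join(fields)


def parse_extraction(raw_output, fields):
    """Load the model's JSON output and keep only the requested fields.

    Fields missing from the output are returned as None, and keys that
    were not requested are dropped.
    """
    data = json.loads(raw_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {name: data.get(name) for name in fields}


fields = ["invoice_number", "total", "date"]
prompt = build_prompt(fields)

# Stand-in for the text a model might return for a scanned invoice.
# In a real pipeline this string would come from the model, not a literal.
raw_output = '{"invoice_number": "INV-001", "total": 42.5, "date": "2025-01-31"}'

record = parse_extraction(raw_output, fields)
print(record)
```

Because the output is already JSON keyed by the requested fields, the only post-processing left is a single `json.loads` call plus a field check, rather than a second parsing pass over free-form text.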

The models come in two sizes, 1.6 billion (1.6B) and 450 million (450M) parameters, and are released under the LFM Open License v1.0. According to Liquid AI's official benchmarks, they perform well in document scanning, in-vehicle cabin understanding, industrial inspection, and other scenarios: the 1.6B model is on par with general-purpose multimodal models at the 4 billion (4B) parameter scale, while the 450M model rivals models at the 2 billion (2B) scale.

On the deployment side, the models have been adapted to a range of smart hardware and edge-device systems-on-chip (SoCs) for offline deployment in edge scenarios such as in-vehicle cabin understanding, document scanning, and industrial inspection. Liquid AI has made the model weights available for download on Hugging Face.
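To give a rough sense of why these sizes suit edge hardware, the sketch below estimates how much memory the weights alone would occupy. It assumes the nominal parameter counts implied by the model names and a few common numeric precisions. The article does not say which precisions Liquid AI ships, and the figures exclude activations, caches, and runtime overhead.

```python
def weight_size_gb(num_params, bytes_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts taken from the model names (assumption:
# the exact counts may differ slightly from these round figures).
MODELS = {
    "LFM2.5-VL-1.6B-Extract": 1.6e9,
    "LFM2.5-VL-450M-Extract": 450e6,
}

# Bytes per parameter for some common precisions (illustrative only).
PRECISIONS = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def footprint_table():
    """Return {model: {precision: size_in_gb}} for every combination."""
    return {
        model: {
            name: weight_size_gb(params, nbytes)
            for name, nbytes in PRECISIONS.items()
        }
        for model, params in MODELS.items()
    }


for model, sizes in footprint_table().items():
    line = ", ".join(f"{name}: {gb:.2f} GB" for name, gb in sizes.items())
    print(f"{model}: {line}")
```

Under these assumptions, the 1.6B model needs about 3.2 GB for 16-bit weights and the 450M model about 0.9 GB, with proportionally less at lower precisions.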
