
olmOCR-7B-0225-preview
olmOCR-7B-0225-preview is a document image recognition model fine-tuned from Qwen2-VL-7B-Instruct, used for efficiently converting documents into plain text.
- Accepts a single-page document image as input, rendered so that its longest edge is 1024 pixels
- Generates high-quality text output by combining the page image with document metadata in the prompt
- Lets users build the prompt manually when they want to customize inference
- Supports batch processing for handling large document collections efficiently
- Compatible with multiple document formats, including PDF and image files
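
The 1024-pixel longest-edge requirement above can be illustrated with a small sizing calculation. This is a minimal sketch, not the toolkit's own rendering code: the helper name `fit_longest_edge` is hypothetical, and it only computes the target dimensions that a renderer or image library would then be asked to produce.

```python
def fit_longest_edge(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    """Scale (width, height) so the longest edge equals `target`, keeping the aspect ratio.

    Hypothetical helper for illustration; it is not part of the olmOCR toolkit.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Round to whole pixels and never let the short edge collapse to zero.
    new_width = max(1, round(width * target / longest))
    new_height = max(1, round(height * target / longest))
    return new_width, new_height


# A US Letter page rendered at 200 DPI is 1700 x 2200 pixels.
print(fit_longest_edge(1700, 2200))  # (791, 1024)
```

Pages smaller than the target are scaled up by the same rule, so every input ends up with a longest edge of exactly 1024 pixels.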
Product Details
olmOCR-7B-0225-preview is a document recognition model developed by the Allen Institute for AI (Ai2) that converts document images into editable plain text. It is fine-tuned from Qwen2-VL-7B-Instruct and so combines visual and language processing capabilities, which makes it suitable for large-scale document processing tasks. Its main advantages are efficient processing, accurate text recognition, and flexible prompt generation. The model is intended for research and educational purposes, is released under the Apache 2.0 license, and emphasizes responsible use.