Aya Vision 8B

Aya Vision 8B

An 800 million parameter multilingual visual language model that supports OCR, image description, visual inference, and other functions

0
  • Supports 23 languages, including Chinese, English, French, etc., covering multiple language scenarios
  • Having strong visual language understanding ability, it can be used for tasks such as OCR, image description, visual reasoning, etc
  • Supports 16K context length, capable of handling longer text input and output
  • Can be directly used through the Hugging Face platform, providing detailed usage guidelines and example code
  • Support multiple input methods, including images and text, to generate high-quality text output

Product Details

CohereForAI's Aya Vision 8B is an 800 million parameter multilingual visual language model optimized for various visual language tasks, supporting OCR, image description, visual reasoning, summarization, question answering, and other functions. This model is based on the C4AI Command R7B language model, combined with the SigLIP2 visual encoder, supports 23 languages, and has a context length of 16K. Its main advantages include multilingual support, strong visual comprehension, and a wide range of applicable scenarios. This model is released in the form of open-source weights, aiming to promote the development of the global research community. According to the CC-BY-NC license agreement, users are required to comply with C4AI's Acceptable Use Policy.