AI company Cohere has unveiled a new collection of multilingual models at the India AI Summit. Known as Tiny Aya, the models are open-weight, meaning their trained parameters are publicly available for use and modification. They support over 70 languages and can run on standard devices such as laptops without an internet connection.
Developed by Cohere Labs, the models cater to South Asian languages including Bengali, Hindi, Punjabi, Urdu, Gujarati, Tamil, Telugu, and Marathi.
The primary model has 3.35 billion parameters, a relatively compact size by current standards. The company has also introduced TinyAya-Global, a version optimized for instruction-following across a broader set of languages, suited to applications with wide language coverage requirements. The family further includes regional variants: TinyAya-Earth for African languages, TinyAya-Fire for South Asian languages, and TinyAya-Water for Asia Pacific, West Asia, and Europe.
Cohere stated that this approach enhances each model's linguistic understanding and cultural sensitivity, resulting in systems that resonate more naturally with the communities they serve. Despite their regional focus, all Tiny Aya models maintain extensive multilingual capabilities, serving as adaptable foundations for further research and development.
The models, trained on a single cluster of 64 Nvidia H100 GPUs, are aimed at researchers and developers building applications for speakers of diverse languages. Because they can run directly on a device, they enable offline translation. Cohere says the models are designed for efficient on-device use, requiring less computing power than many comparable models.
In linguistically rich countries like India, this offline functionality can unlock a variety of applications without the constant need for internet access.
The models are available on Hugging Face, a widely used platform for sharing AI models, as well as on the Cohere Platform, and developers can download them for local deployment. The company is also releasing training and evaluation datasets on Hugging Face and plans to publish a technical report outlining its training methods.
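As a rough sketch of what local deployment could look like, the snippet below loads an open-weight model with the Hugging Face `transformers` library and wraps it in a simple translation helper. The repository ID used here is a placeholder, not a confirmed name; the actual identifiers will be listed on Cohere's Hugging Face page.

```python
# Sketch: running an open-weight model locally with Hugging Face transformers.
# NOTE: "CohereLabs/tiny-aya-fire" is a hypothetical placeholder repo ID --
# check Cohere's Hugging Face organization for the real model identifiers.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "CohereLabs/tiny-aya-fire"  # hypothetical identifier


def translate(text: str, target_language: str) -> str:
    """Prompt the model to translate text.

    After the first download, the weights are cached locally and
    generation runs entirely on-device, with no internet connection.
    """
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    prompt = f"Translate the following text to {target_language}: {text}"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)

    # Decode only the newly generated tokens, not the prompt itself.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A developer could call `translate("Good morning", "Hindi")` once the weights are downloaded; the same pattern applies to any of the regional variants.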