Egypt’s TokenAI Releases Open-Source LLM Horus, Outperforming Larger Global Models

Egyptian AI startup TokenAI, founded by developer Assem Sabry, has launched Horus 1.0-4B, a fully open-source large language model developed entirely in Egypt. The release marks a significant step for the country's local AI development scene, demonstrating that smaller, regionally focused models can compete with, and even surpass, their larger, globally developed counterparts on key performance metrics.

Quick Facts

  • Egyptian-built 4-billion-parameter model.
  • Outperforms Llama 3.1 and Gemma-2 on MMLU benchmark.
  • Fully open-source under an accessible MIT license.

Punching Above Its Weight on Global Benchmarks

Despite its relatively small size of 4 billion parameters, Horus 1.0-4B delivers impressive results. On the MMLU benchmark, which tests knowledge and reasoning across 57 academic subjects, Horus achieves a score of 88%. This places it well ahead of much larger models, including Qwen 3.5-4B (73%), Llama 3.1-8B (69%), and Google’s Gemma-2-9B (71%).

The model was trained on trillions of tokens and released this month in seven variants, making it adaptable for different hardware capabilities.

A Strong Contender for Arabic Language Tasks

Horus was specifically optimized for Arabic language and cultural contexts, and its performance reflects this focus. On the ArabicBench benchmark, it scores 67%, surpassing Qwen 3.5-4B (65%), Gemma-2-9B (60%), and significantly outperforming Llama 3.1-8B (40%). It also leads on ERQA, an Arabic question-answering benchmark, scoring 67% against Qwen’s 60%.

However, the developer acknowledges room for improvement in mathematical reasoning. On the AraMath benchmark, Horus scores 33%, trailing Qwen (40%) and Gemma (35%). Similarly, on the GSM8K grade-school math test, its 67% score is behind Gemma-2-9B’s 88% and Llama 3.1-8B’s 84%.

Designed for Accessibility and Real-World Use

A key advantage of Horus is its small footprint. TokenAI has released seven variants, from a full 16-bit version of around 8GB to a highly compressed 4-bit variant at just 2.3GB. This range makes the model accessible for researchers, developers, and startups with limited compute budgets, allowing deployment on everything from GPU servers to personal computers and edge devices.
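Those file sizes line up with simple back-of-envelope arithmetic: a 4-billion-parameter model stored at 16 bits per weight needs about 8 GB, and 4-bit quantization cuts that to roughly 2 GB before format overhead. The sketch below is illustrative arithmetic only, not TokenAI's packaging details:

```python
# Back-of-envelope estimate of checkpoint size at different quantization
# levels, for a 4-billion-parameter model such as Horus 1.0-4B.
# Real checkpoint files add metadata and per-block scaling overhead,
# which is why the published 4-bit variant (2.3 GB) is slightly larger
# than this lower bound.

def approx_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight-storage size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

PARAMS = 4e9  # 4 billion parameters

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{approx_size_gb(PARAMS, bits):.1f} GB")
# 16-bit: ~8.0 GB, 8-bit: ~4.0 GB, 4-bit: ~2.0 GB
```

This is why the compressed variants matter for the accessibility claim: the 4-bit build fits comfortably in the memory of a consumer laptop or edge device, while the full 16-bit build still only requires a single mid-range GPU.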

The model supports chain-of-thought reasoning and tool use and is available for download on Hugging Face. It can also be accessed through TokenAI’s proprietary neuralnode Python framework.

Building an Egyptian Open-Source AI Ecosystem

The release of Horus is a notable event in an ecosystem where homegrown, open-source models are rare, despite Egypt graduating approximately 60,000 technology students annually. It follows the February launch of Karnak, the government’s first national model, a 41-billion-parameter system.

TokenAI aims for Horus to become a foundational piece of an Egyptian open-source AI infrastructure. The startup is already planning to release a text-to-speech model called Replica, which will feature 20 voices across 10 languages, including Arabic, further expanding its suite of AI tools.

About TokenAI

TokenAI is a Cairo-based AI startup founded by developer Assem Sabry. The company focuses on developing open-source, multilingual large language models from scratch with a specific optimization for Arabic language and cultural contexts.

Source: Middle East AI News