Abu Dhabi’s Khalifa University has announced the launch of GSMA Open-Telco LLM Benchmarks 2.0, a significant new framework for evaluating the performance of large language models (LLMs) on real-world telecommunications tasks. Developed in collaboration with the GSMA Foundry community and hosted on the Hugging Face platform, the initiative aims to create an industry-wide standard for assessing AI capabilities in mission-critical network operations.
Addressing a Critical Industry Gap
The telecom sector is investing billions in AI, yet a wide gap remains between the capabilities of general-purpose AI models and the deep domain expertise required for complex network management. The new benchmarks address this shortfall by providing a systematic, evidence-based framework for telcos to evaluate AI models. This allows companies to move beyond vendor claims and make informed decisions about deploying AI for network automation, where errors can cause service disruption and substantial financial loss.
A Comprehensive Evaluation Framework
The benchmark rigorously assesses AI models across five complementary dimensions, covering 34 use cases submitted by global telecom operators. These dimensions test for a wide range of industry-specific skills:
- TeleYAML: Evaluates intent-based configuration generation, translating operator goals into standards-aligned YAML configurations for 5G core functions and network slicing (see the sketch after this list).
- TeleLogs: Assesses network troubleshooting skills, using synthetic data derived from real network traces to measure root-cause analysis capabilities.
- TeleMATH: Measures mathematical reasoning through 500 expert-curated, telecom-specific engineering problems.
- 3GPP-TSG: Tests the model’s comprehension of complex technical standards, drawing on documents produced by the 3GPP’s Technical Specification Groups.
- TeleQnA: Provides 10,000 multiple-choice questions to gauge knowledge of telecom terminology, research, and technical details.
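To make the intent-to-configuration task concrete, here is a minimal Python sketch of the kind of translation TeleYAML evaluates. The intent schema, field names, and helper function are illustrative assumptions, not the benchmark's actual format; only the Slice/Service Type values (eMBB=1, URLLC=2, mMTC=3) come from the 3GPP standard (TS 23.501).

```python
# Minimal sketch of an intent-to-configuration step, in the spirit of the
# TeleYAML task. The intent schema and YAML field names are illustrative
# assumptions, not the benchmark's actual format.
import yaml  # pip install pyyaml

# 3GPP-defined Slice/Service Types (TS 23.501): eMBB=1, URLLC=2, mMTC=3.
SST = {"embb": 1, "urllc": 2, "mmtc": 3}

def intent_to_slice_config(intent: dict) -> str:
    """Translate a structured operator intent into a YAML slice config."""
    service = intent["service_type"].lower()
    config = {
        "network_slice": {
            "name": intent["name"],
            "snssai": {"sst": SST[service]},  # Slice/Service Type
            "qos": {
                "latency_ms": intent["max_latency_ms"],
                "throughput_mbps": intent["min_throughput_mbps"],
            },
        }
    }
    return yaml.safe_dump(config, sort_keys=False)

# An intent an operator might express in natural language, shown here
# already parsed into structured form for brevity.
print(intent_to_slice_config({
    "name": "factory-automation",
    "service_type": "URLLC",
    "max_latency_ms": 5,
    "min_throughput_mbps": 100,
}))
```

In the benchmark itself, models must produce such configurations directly from free-text operator intents, a considerably harder task than this structured translation.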
Global Collaboration and UAE Leadership
This global initiative involves 15 leading mobile network operators, including AT&T, Deutsche Telekom, Orange, Vodafone, and the UAE’s du. Khalifa University’s 6G Research Centre plays a pivotal role, co-leading the Network Management & Configuration track alongside major technology partners. This track focuses on developing datasets like TeleYAML to automate the translation of operator intents into valid network configurations, a key challenge for the industry.
Initial Benchmark Results
Initial results show that frontier models such as GPT-5, Grok-4-fast, and Claude Sonnet 4.5 achieve the highest overall performance. GPT-5 led with an overall score of 65.55%, excelling in network troubleshooting and domain-specific Q&A. Domain-specific models nonetheless proved competitive on specialised tasks: AT&T’s customised Gemma model, for instance, outperformed all other systems in network troubleshooting scenarios. The results also showed that intent-to-configuration remains the hardest category, with even the top models scoring below 28%, underscoring the need for further innovation in network automation.
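For context on how headline figures such as GPT-5's 65.55% arise, an overall leaderboard score is typically an aggregate of per-track results. The sketch below assumes an unweighted mean over the five dimensions; the figures are invented placeholders, and the actual weighting may differ from the benchmark's methodology.

```python
# Illustrative aggregation of per-track scores into an overall score.
# Track names mirror the five dimensions above; the figures are invented
# placeholders, and the unweighted mean is an assumption on our part.
from statistics import mean

tracks = {
    "TeleYAML": 27.0,   # intent-to-configuration (hardest track)
    "TeleLogs": 71.0,
    "TeleMATH": 64.0,
    "3GPP-TSG": 68.0,
    "TeleQnA": 78.0,
}

overall = mean(tracks.values())
print(f"Overall: {overall:.2f}%")  # -> Overall: 61.60%
```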
About the GSMA Open-Telco LLM Benchmarks
The GSMA Open-Telco LLM Benchmarks is an initiative by the GSMA Foundry to establish a systematic, industry-wide framework for evaluating the performance of AI models on telecommunications-specific tasks. Co-led by Khalifa University, the project brings together mobile network operators, research institutions, and technology companies to develop robust standards for AI deployment in critical network operations, including configuration, troubleshooting, and automation. The benchmarks are publicly available on the Hugging Face AI community platform to foster transparency and collaboration.
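As a minimal sketch of how a practitioner might pull one of the public datasets, assuming the Hugging Face `datasets` library is installed; the repository ID shown is an assumption and should be verified on the benchmark's Hugging Face page:

```python
# Sketch: loading a public telecom benchmark dataset from Hugging Face.
# The repository ID below is an assumption; check the GSMA Open-Telco LLM
# Benchmarks page on Hugging Face for the exact dataset identifiers.
from datasets import load_dataset  # pip install datasets

ds = load_dataset("netop/TeleQnA")
print(ds)  # shows the available splits and row counts
```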
Source: Middle East AI News