LLM Fine-Tuning for Kikuyu
We adapt state-of-the-art models such as Llama 3.1 8B to Kikuyu using parameter-efficient fine-tuning (QLoRA and DoRA). Our pipeline transforms parallel translation data into instruction-response pairs and uses synthetic data generation to build conversational capability; a fine-tuning sketch follows the key areas below.
Key areas: Cross-lingual transfer, noise injection, synthetic data via GPT models
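To make the fine-tuning setup concrete, here is a minimal sketch using Hugging Face transformers, bitsandbytes, and peft. The model ID and the hyperparameters (rank, alpha, target modules) are illustrative assumptions, not our production configuration; the `use_dora` flag in recent peft releases switches the same low-rank adapters to weight-decomposed LoRA (DoRA).

```python
# Minimal QLoRA + DoRA sketch; hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed base checkpoint

# 4-bit NF4 quantization (the "Q" in QLoRA): weights stored in 4 bits,
# compute carried out in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)

# Low-rank adapters on the attention and MLP projections; use_dora=True
# enables weight-decomposed LoRA (DoRA) on top of the same adapter layout.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_dora=True,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here, the wrapped model trains like any causal LM, with the instruction-response pairs formatted into the base model's chat template.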
Speech-to-Speech Models
We are building end-to-end voice AI on the Llama-Omni architecture and the Mimi neural codec, targeting real-time, full-duplex conversation in Kikuyu with sub-500 ms latency; a codec round-trip sketch follows the key areas below.
Key areas: Mimi codec adaptation, tonal fidelity, WebRTC streaming
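The codec side of this stack can be exercised in isolation. Below is a minimal sketch assuming the public kyutai/mimi checkpoint and a recent transformers release with Mimi support; the silent test waveform is a stand-in for real Kikuyu audio.

```python
# Minimal Mimi round-trip: waveform -> discrete codec tokens -> waveform.
import torch
from transformers import AutoFeatureExtractor, MimiModel

model = MimiModel.from_pretrained("kyutai/mimi")
feature_extractor = AutoFeatureExtractor.from_pretrained("kyutai/mimi")

# One second of silence at Mimi's native rate (24 kHz) stands in for
# real Kikuyu speech here.
sampling_rate = feature_extractor.sampling_rate
audio = torch.zeros(sampling_rate).numpy()

inputs = feature_extractor(
    raw_audio=audio, sampling_rate=sampling_rate, return_tensors="pt"
)

with torch.no_grad():
    # Encode into discrete tokens (the representation an Omni-style LLM
    # reads and emits), then decode back to audio.
    encoder_outputs = model.encode(inputs["input_values"])
    audio_codes = encoder_outputs.audio_codes  # (batch, codebooks, frames)
    audio_values = model.decode(audio_codes)[0]  # reconstructed waveform

print(audio_codes.shape, audio_values.shape)
```

Adapting the codec to Kikuyu means checking that this round-trip preserves the language's tonal contrasts, which is why tonal fidelity appears as its own key area.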
Dataset Engineering
We work with the African Next Voices initiative (750+ hours of Kikuyu audio) and build synthetic instruction datasets to train robust conversational models; a noise-augmentation sketch follows the key areas below.
Key areas: InstructS2S pipelines, noise augmentation, MasakhaNER integration
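Noise augmentation is one of the simpler pieces to show. The sketch below mixes background noise into clean speech at a target signal-to-noise ratio; it is a minimal NumPy illustration, and the function name and parameters are ours rather than any particular library's API.

```python
# Minimal additive-noise augmentation at a target SNR (illustrative sketch).
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `speech` so the result has the requested SNR in dB."""
    # Loop or trim the noise to match the speech length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2) + 1e-10
    noise_power = np.mean(noise ** 2) + 1e-10

    # Scale noise so speech_power / scaled_noise_power == 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    mixed = speech + scale * noise

    # Guard against clipping before fixed-point export.
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed

# Usage: 16 kHz speech plus pseudo-random background noise at 10 dB SNR.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16_000).astype(np.float32) * 0.1
noise = rng.standard_normal(8_000).astype(np.float32)
augmented = mix_at_snr(speech, noise, snr_db=10.0)
```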
Inference & Deployment
We optimize for production using vLLM, AWQ quantization, and serverless GPU infrastructure to serve Kikuyu AI at scale with low latency; an inference sketch follows the key areas below.
Key areas: vLLM serving, AWQ quantization, RunPod serverless
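To make the serving path concrete, here is a minimal sketch of offline batch inference through vLLM's Python API. The AWQ checkpoint name is a hypothetical placeholder for a fine-tuned Kikuyu model; the flags follow vLLM's public interface.

```python
# Minimal vLLM inference against an AWQ-quantized checkpoint (sketch).
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/kikuyu-llama-3.1-8b-awq",  # hypothetical checkpoint name
    quantization="awq",            # load 4-bit AWQ weights
    gpu_memory_utilization=0.90,   # leave headroom for the KV cache
    max_model_len=4096,
)

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = ["Ũhoro waku?"]  # "How are you?" in Kikuyu
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

In production, the same checkpoint can be exposed as an OpenAI-compatible endpoint with `vllm serve`, which is the natural way to wrap it in a RunPod serverless worker.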