LLM Fine-Tuning for Kikuyu
We adapt state-of-the-art models such as Llama 3.1 8B to Kikuyu using parameter-efficient fine-tuning (QLoRA and DoRA). Our pipeline transforms parallel translation data into instruction-response pairs and uses synthetic data generation to build conversational capability; a fine-tuning sketch follows the key areas below.
Key areas: Cross-lingual transfer, noise injection, synthetic data via GPT models
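To make the fine-tuning setup concrete, here is a minimal sketch using Hugging Face transformers, bitsandbytes, and peft. The model ID and the hyperparameters (rank, alpha, target modules) are illustrative assumptions, not our production configuration; the `use_dora` flag in recent peft releases switches the same low-rank adapters to weight-decomposed LoRA (DoRA).

```python
# Minimal QLoRA + DoRA sketch; hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed base checkpoint

# 4-bit NF4 quantization (the "Q" in QLoRA): weights stored in 4 bits,
# compute carried out in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=bnb_config, device_map="auto"
)

# Low-rank adapters on the attention and MLP projections; use_dora=True
# enables weight-decomposed LoRA (DoRA) on top of the same adapter layout.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_dora=True,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here, the wrapped model trains like any causal LM, with the instruction-response pairs formatted into the base model's chat template.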
Speech-to-Speech Models
We are building end-to-end voice AI on the Llama-Omni architecture and the Mimi neural codec, targeting real-time, full-duplex conversation in Kikuyu with sub-500 ms latency; a codec round-trip sketch follows the key areas below.
Key areas: Mimi codec adaptation, tonal fidelity, WebRTC streaming
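The codec side of this stack can be exercised in isolation. Below is a minimal sketch assuming the public kyutai/mimi checkpoint and a recent transformers release with Mimi support; the silent test waveform is a stand-in for real Kikuyu audio.

```python
# Minimal Mimi round-trip: waveform -> discrete codec tokens -> waveform.
import torch
from transformers import AutoFeatureExtractor, MimiModel

model = MimiModel.from_pretrained("kyutai/mimi")
feature_extractor = AutoFeatureExtractor.from_pretrained("kyutai/mimi")

# One second of silence at Mimi's native rate (24 kHz) stands in for
# real Kikuyu speech here.
sampling_rate = feature_extractor.sampling_rate
audio = torch.zeros(sampling_rate).numpy()

inputs = feature_extractor(
    raw_audio=audio, sampling_rate=sampling_rate, return_tensors="pt"
)

with torch.no_grad():
    # Encode into discrete tokens (the representation an Omni-style LLM
    # reads and emits), then decode back to audio.
    encoder_outputs = model.encode(inputs["input_values"])
    audio_codes = encoder_outputs.audio_codes  # (batch, codebooks, frames)
    audio_values = model.decode(audio_codes)[0]  # reconstructed waveform

print(audio_codes.shape, audio_values.shape)
```

Adapting the codec to Kikuyu means checking that this round-trip preserves the language's tonal contrasts, which is why tonal fidelity appears as its own key area.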
Dataset Engineering
We work with the African Next Voices initiative (750+ hours of Kikuyu audio) and build synthetic instruction datasets to train robust conversational models; a noise-augmentation sketch follows the key areas below.
Key areas: InstructS2S pipelines, noise augmentation, MasakhaNER integration
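Noise augmentation is one of the simpler pieces to show. The sketch below mixes background noise into clean speech at a target signal-to-noise ratio; it is a minimal NumPy illustration, and the function name and parameters are ours rather than any particular library's API.

```python
# Minimal additive-noise augmentation at a target SNR (illustrative sketch).
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `noise` into `speech` so the result has the requested SNR in dB."""
    # Loop or trim the noise to match the speech length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2) + 1e-10
    noise_power = np.mean(noise ** 2) + 1e-10

    # Scale noise so speech_power / scaled_noise_power == 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    mixed = speech + scale * noise

    # Guard against clipping before fixed-point export.
    peak = np.max(np.abs(mixed))
    return mixed / peak if peak > 1.0 else mixed

# Usage: 16 kHz speech plus pseudo-random background noise at 10 dB SNR.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16_000).astype(np.float32) * 0.1
noise = rng.standard_normal(8_000).astype(np.float32)
augmented = mix_at_snr(speech, noise, snr_db=10.0)
```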
Inference & Deployment
We optimize for production using vLLM, AWQ quantization, and serverless GPU infrastructure to serve Kikuyu AI at scale with low latency; an inference sketch follows the key areas below.
Key areas: vLLM serving, AWQ quantization, RunPod serverless
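To make the serving path concrete, here is a minimal sketch of offline batch inference through vLLM's Python API. The AWQ checkpoint name is a hypothetical placeholder for a fine-tuned Kikuyu model; the flags follow vLLM's public interface.

```python
# Minimal vLLM inference against an AWQ-quantized checkpoint (sketch).
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/kikuyu-llama-3.1-8b-awq",  # hypothetical checkpoint name
    quantization="awq",            # load 4-bit AWQ weights
    gpu_memory_utilization=0.90,   # leave headroom for the KV cache
    max_model_len=4096,
)

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = ["Ũhoro waku?"]  # "How are you?" in Kikuyu
for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

In production, the same checkpoint can be exposed as an OpenAI-compatible endpoint with `vllm serve`, which is the natural way to wrap it in a RunPod serverless worker.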