Biography
Darshani Persadh is a software engineer specialising in quantum computing for agriculture and efficient inference for large-scale language models, with a focus on quantization strategies and Mixture-of-Experts (MoE) architectures. Her work bridges production-ready model deployment and academic research, particularly on compressing ultra-large models (up to 284B parameters) into GGUF format while preserving code generation capabilities and long-context reasoning.
She is actively involved in the open-source community, contributing to the llama.cpp and Hugging Face ecosystems. Her recent work on quantized DeepSeek-V4 variants has attracted significant attention, with nearly 1,000 downloads within the first 24 hours, reflecting the growing need for practical MoE inference solutions.
Selected publications & models
- DeepSeek-V4-Flash-GGUF: A Quantized 284B Mixture-of-Experts Language Model
- DeepSeek-V4-Flash-IQ1_S-XL: Merged single-file quantized MoE
- Quantum for Agriculture: Optimizing Crop Yields on Q29, South Coast, Durban
- Benchmarking IQ1_S Quantization for Code-First LLMs
Research impact & adoption
The DeepSeek-V4-Flash-GGUF model gained rapid traction within the open-source community: 799 downloads in the first 9 hours after release, and more than 985 within 24 hours. This reflects strong demand for efficient, deployable MoE models that balance size (284B parameters) with practical hardware requirements (80 GB+ RAM, RTX 3090-class GPU). The model's two-shard GGUF format and custom V4-aware fork enable researchers to experiment with state-of-the-art MoE architectures on commodity hardware.
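To make the relationship between parameter count and the stated RAM requirement concrete, the following is a rough back-of-the-envelope sketch, not a measurement of the released files. It assumes a nominal 1.5625 bits per weight for IQ1_S and treats every tensor as quantized at that rate; real GGUF files usually keep some tensors at higher precision and also need memory for the KV cache and runtime buffers, so actual usage will be higher. The function name `estimated_size_gb` is hypothetical.

```python
def estimated_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 284e9        # parameter count quoted in the text
BPW_IQ1_S = 1.5625      # assumption: nominal IQ1_S rate, applied to all tensors
RAM_BUDGET_GB = 80.0    # lower bound of the RAM requirement quoted in the text

size_gb = estimated_size_gb(N_PARAMS, BPW_IQ1_S)
headroom_gb = RAM_BUDGET_GB - size_gb

print(f"Estimated weights: {size_gb:.1f} GB")
print(f"Headroom within {RAM_BUDGET_GB:.0f} GB: {headroom_gb:.1f} GB")
```

Under these assumptions the weights alone come to roughly 55 GB, which leaves some room within an 80 GB budget for higher-precision tensors, context, and runtime overhead.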
Talks & tutorials
- "Quantizing Mixture-of-Experts: Lessons from DeepSeek-V4" – Hugging Face LLM Efficiency Meetup (April 2026)
- "PM Modi Meets Indian-Origin Tech Leaders in South Africa" – G20 Tech & Innovation Leaders Summit (November 2026)
- "Forbes Sustainability Summit" – Forbes On 5th (October 2025)
- "Cartier Women's Initiative" – Science & Technology AMA Session (September 2025)