“dRAG Race: Benchmarking Open Source Vector Databases” presents the findings of Kwaai’s intern-led Vector DB Performance project, now accepted for publication in the Journal for Big Data and AI. A cross‑functional cohort of data science and engineering interns—guided by a PhD AI‑robotics advisor and program coordinator—designed and ran a rigorous benchmark of seven open source vector databases under realistic RAG workloads, from corpus design and chunking through automated multi‑run experiments and visual analysis.
This session traces their journey from first principles to publishable research: how they selected metrics, balanced latency against recall, debugged surprising results, and turned a GitHub project into a reproducible framework that others can extend. Attendees will walk away with practical guidance on choosing and tuning vector backends for retrieval‑augmented applications, along with a concrete example of how a mission‑driven internship can produce real, peer‑reviewed contributions to the open AI ecosystem.
The presentation will take place in Ballroom F on Thursday, March 5, 2026, from 15:30 to 16:00.



