Newsroom

Most Recent

A New Look for Koel Labs

By Koel Labs

You might have noticed that we’ve given Koel Labs a fresh new look. This change is intended to better align us with our mission of pioneering inclusive speech technology, with our goals as a research-focused startup, and with our belief in openly sharing our work.

Announcement

The Underlying Intuition of Wav2Vec2’s Transformer

By Koel Labs

Wav2Vec2’s Transformer handles encoded audio features and aligns them to text. Building on our blog post about the feature extractor, this post dives into positional encodings tailored to audio and how CTC loss solves alignment without frame-level labels.

Technical Report

Technical Reports

The Underlying Intuition of Wav2Vec2’s Transformer

By Koel Labs

Wav2Vec2’s Transformer handles encoded audio features and aligns them to text. Building on our blog post about the feature extractor, this post dives into positional encodings tailored to audio and how CTC loss solves alignment without frame-level labels.

Technical Report
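
The post explains how CTC makes that alignment-free training possible. As a quick taste, here is a minimal, illustrative sketch using PyTorch's built-in CTCLoss; the tensor sizes are toy values and are not taken from the post or from Wav2Vec2 itself.

```python
# Minimal CTC loss sketch (PyTorch); sizes are illustrative only.
import torch
import torch.nn as nn

T, N, C = 50, 2, 32  # output frames, batch size, vocab size (blank at index 0)
log_probs = torch.randn(T, N, C).log_softmax(dim=-1)  # stand-in for model outputs

targets = torch.randint(1, C, (N, 10), dtype=torch.long)  # unaligned label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

# CTC marginalizes over every monotonic alignment between the T output frames
# and the 10 target labels, so no frame-level annotation is needed.
ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```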

The Underlying Intuition of Wav2Vec2's CNN

By Koel Labs

Nearly every explanation of the Wav2Vec2 architecture begins with the iconic diagram, but without extensive background it is hard to know what the cones labeled as the CNN are really doing. What does it actually mean to extract features from audio? Let's find a stronger visual intuition for this.

Technical Report
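
For a rough feel of what "extracting features from audio" means before reading the full post, here is a toy sketch: a stack of strided 1D convolutions that turns one second of raw 16 kHz samples into a much shorter sequence of feature vectors. The layer sizes are illustrative and are not the actual Wav2Vec2 configuration.

```python
# Toy 1D-CNN feature encoder; layer sizes are illustrative, not Wav2Vec2's.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=10, stride=5),  # wide first layer over raw samples
    nn.GELU(),
    nn.Conv1d(64, 64, kernel_size=3, stride=2),  # further temporal downsampling
    nn.GELU(),
    nn.Conv1d(64, 64, kernel_size=3, stride=2),
    nn.GELU(),
)

waveform = torch.randn(1, 1, 16000)  # one second of fake audio at 16 kHz
features = encoder(waveform)         # (batch, channels, frames)
print(features.shape)                # far fewer frames than input samples
```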

Building Open Source Hugging Face Leaderboards

By Koel Labs

Sometimes, the best machine learning models are hidden in plain sight. During our work on phonemic transcription, we stumbled upon a specialized ginic model that had been fine-tuned from Facebook's XLSR-53 model on the Buckeye corpus. This discovery proved significant: the ginic model performs 1.2x better than Facebook's, and iterating on their approach, our m... Read More →

Technical Report
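
For readers who want to poke at checkpoints like these themselves, here is a hedged sketch of running a phoneme-level CTC model from the Hugging Face Hub with the transformers library; the model ID and the silent audio are placeholders, not necessarily the exact checkpoint or data compared in the post.

```python
# Sketch: decode phonemes with a Wav2Vec2 CTC checkpoint from the Hub.
# The model ID is a placeholder; swap in the checkpoint you want to compare.
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "facebook/wav2vec2-xlsr-53-espeak-cv-ft"  # illustrative phoneme-CTC checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

audio = torch.zeros(16000)  # stand-in for one second of 16 kHz speech
inputs = processor(audio.numpy(), sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (batch, frames, vocab)

ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids))  # predicted phoneme string(s)
```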

A Deep Dive into Phonemic Transcription Metrics

By Koel Labs

The International Phonetic Alphabet (IPA) is like the Swiss Army knife of pronunciation—it gives us precise symbols to represent every sound humans make in language. In recent years, predicting these phonemic transcriptions from audio has become a popular machine learning task. But how do we calculate the accuracy of these models?

Technical Report
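
The post walks through these metrics in depth; the simplest starting point, phoneme error rate (PER), is just an edit distance between predicted and reference phoneme sequences, normalized by reference length. A minimal sketch (the example words are illustrative, not from the post):

```python
# Minimal phoneme error rate (PER) sketch: Levenshtein distance over phonemes,
# normalized by the length of the reference sequence.
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over phoneme tokens.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,        # deletion
                        dp[j - 1] + 1,    # insertion
                        prev + (r != h))  # substitution (or match)
            prev = cur
    return dp[-1]

def phoneme_error_rate(ref, hyp):
    return edit_distance(ref, hyp) / max(len(ref), 1)

# Toy example with IPA symbols: one substitution over three phonemes.
print(phoneme_error_rate(list("kæt"), list("kɑt")))  # ≈ 0.33
```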

Announcements

A New Look for Koel Labs

By Koel Labs

You might have noticed that we’ve given Koel Labs a fresh new look. This change is intended to better align us with our mission of pioneering inclusive speech technology, with our goals as a research-focused startup, and with our belief in openly sharing our work.

Announcement

Hello World! — Our Open Source Project Launch

By Koel Labs

At Koel Labs, our goal is to make pronunciation learning more accessible and inclusive. To represent the diversity of language and dialects, we're excited to announce that everything from model weights and training code to datasets, research papers, and the frontend UI is officially open source!

Announcement

Early Access

Be First in Line

We’re inviting a small group for early access to our research previews. Reserve your spot today.