11 June 2026

Introducing FFASR Leaderboard with Hugging Face

Hear from Industry Leaders

This webinar features speakers from Hugging Face, Treble, IBM, NVIDIA, Cohere, and Carnegie Mellon University.

Standard ASR benchmarks often miss the conditions that matter most in real-world use: background noise, competing speech, reverberation, distance from the microphone, and other acoustic effects.

The Far-Field ASR Leaderboard, built by Treble and Hugging Face, gives developers and users of ASR engines a more realistic way to evaluate model performance in the environments where speech technology is actually deployed.

Alongside the launch of FFASR, hear from leading voices in speech AI. Dr. George Saon of IBM will share perspectives on the challenges of far-field speech recognition and the importance of robust evaluation in noisy environments. Nithin Rao Koluguri, Senior Research Scientist at NVIDIA, will present NVIDIA’s work in ASR, discuss far-field robustness, and explore the Parakeet family of ASR models. Julian Mack from Cohere will present Cohere’s work in speech modeling, including Cohere Transcribe, and discuss the challenges of evaluating ASR systems in real-world conditions. Dr. Shinji Watanabe, Associate Professor at Carnegie Mellon University, will also share insights from his research in speech and audio processing.

Sign up to the webinar

Join the webinar to see how FFASR works, why far-field evaluation matters, and how the community can use it to benchmark ASR models under real-world acoustic conditions.

June 11th, 2026
- US Time slot: 9:00am PDT / 12:00pm EDT
- EU Time slot: 9:00am UTC / 11:00am CEST

Sign up to secure your place.

On the agenda for this webinar:

1. Introducing the Far-Field ASR Leaderboard

Hugging Face will introduce FFASR, explain how the leaderboard works, and show how ASR developers and users can evaluate models under realistic far-field conditions.

2. Why real-world acoustic evaluation matters

Treble will explain the acoustic data behind FFASR, why far-field conditions are essential for robust ASR, and how realistic simulations can reveal performance gaps hidden by traditional benchmarks.

3. How to use FFASR and go further

Learn how to submit models, interpret benchmark results, and explore how Treble’s far-field datasets and simulation platform can support model training, evaluation, and custom real-world test scenarios.

4. IBM: Why far-field ASR matters

Dr. George Saon of IBM will introduce IBM’s ASR work, discuss the challenges of far-field and noisy speech recognition, and share perspectives on how realistic benchmarking could influence the future of ASR model development.

5. NVIDIA: Building robust ASR for real-world deployment

Nithin Rao Koluguri, Senior Research Scientist at NVIDIA, will present NVIDIA’s work in ASR, discuss challenges around far-field robustness, and provide insights into the Parakeet family of ASR models.

6. Cohere: Real-world ASR performance and benchmarking

Julian Mack, Staff, Member of Technical Staff, Foundations at Cohere, will introduce Cohere’s speech modeling work, including Cohere Transcribe, and discuss the challenges of far-field ASR and the importance of robust benchmarking.

7. Shinji Watanabe: Advances in speech and audio research

Dr. Shinji Watanabe, Associate Professor at Carnegie Mellon University, will share insights from his work in speech and audio processing and perspectives on the future of ASR research.

Go beyond near-field data

For ASR models used in smart devices, meeting rooms, automotive systems, robotics, and other hands-free applications, clean near-field benchmarks only tell part of the story.

FFASR evaluates models against realistic far-field acoustic scenarios, helping the community understand where models perform well, where they fail, and how they compare under conditions that are difficult to reproduce manually.

Why FFASR matters

Easy to use

Submit a model through Hugging Face and view benchmark results on a transparent public leaderboard. FFASR makes advanced far-field evaluation accessible without requiring teams to build complex acoustic test environments themselves.

Comprehensive

The leaderboard evaluates ASR performance across realistic end-user far-field scenarios, including different levels of acoustic difficulty. Instead of reducing performance to one generic score, FFASR helps users understand how models behave across easier, moderate, and more challenging conditions.
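As a rough illustration of per-condition scoring, the sketch below computes a separate error rate for each difficulty tier instead of one pooled score. It assumes word error rate (WER) as the metric and uses hypothetical tier labels ("easy", "hard") and a hypothetical helper name, `wer_by_tier`; this page does not state which metric, tier names, or text normalization FFASR actually uses.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_tier(samples):
    """Return {tier: WER} for (tier, reference, hypothesis) triples.

    The tier labels are illustrative assumptions, not FFASR's categories.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for tier, reference, hypothesis in samples:
        ref_tokens = reference.lower().split()
        hyp_tokens = hypothesis.lower().split()
        errors[tier] += edit_distance(ref_tokens, hyp_tokens)
        words[tier] += len(ref_tokens)
    # Skip tiers with no reference words to avoid dividing by zero.
    return {tier: errors[tier] / words[tier] for tier in words if words[tier] > 0}


samples = [
    ("easy", "turn on the lights", "turn on the lights"),
    ("hard", "turn on the lights", "turn off the light"),
]
print(wer_by_tier(samples))  # {'easy': 0.0, 'hard': 0.5}
```

Reporting the tiers separately shows a model that is perfect on the easy utterance but gets half the words wrong on the hard one, a gap that a single pooled score of 0.25 would hide.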

Community-driven

FFASR extends existing evaluation approaches by adding realistic scenarios that have previously been difficult to test at scale. Community members can submit models, compare results publicly, and rely on a hidden test dataset designed to support impartial benchmarking.

Reachy's thoughts on far-field data

Available on Hugging Face

Hugging Face has become a central platform for the machine learning community, setting the standard for open, collaborative development in AI. Known for its model hub, evaluation tools, and commitment to open benchmarks, Hugging Face plays a key role in shaping how models are shared, tested, and improved across the ecosystem. The introduction of the Far-Field ASR Leaderboard continues this direction, bringing more realistic evaluation practices into the open.

FFASR Beta is available on Hugging Face for community members to evaluate their own models and for ASR users to compare the latest benchmark results.

Meet the speakers

Audio ML Engineer - Hugging Face

Dr. Eric Bezzam

Eric Bezzam is an audio ML engineer at Hugging Face. He received his PhD from EPFL, and previously worked at Snips, Sonos, DSP Concepts, and Fraunhofer IDMT. He was one of the main developers of pyroomacoustics.

Senior Product Manager for the Treble SDK

Dr. Daniel Gert Nielsen

Dr. Daniel Gert Nielsen is a specialist in numerical vibro-acoustics, with a PhD focused on loudspeaker modeling and optimization. His expertise spans acoustic simulation for communication devices and synthetic data generation for machine learning applications. With a strong background in numerical methods and audio technology, he plays a key role in shaping advanced acoustic modeling solutions at Treble.

Manager Speech Technologies - IBM Research, AI Language Technology

Dr. George Saon

George Saon received the Engineer Diploma from the Polytechnic University of Bucharest, Bucharest, Romania, in 1995, and the M.Sc. and Ph.D. degrees in computer science from Henri Poincare University, Nancy, France, in 1994 and 1997, respectively. He is currently managing the Speech Technologies Group at the IBM T. J. Watson Research Center and is responsible for the development of Granite Speech, a suite of advanced open-source LLM-based ASR and speech translation models. Since joining IBM in 1998, he has worked on a variety of problems spanning several areas of large vocabulary continuous speech recognition such as discriminative feature processing, acoustic modeling, speaker adaptation and large vocabulary decoding algorithms. He has authored or coauthored more than 150 conference and journal papers and holds several patents in the field of speech recognition. He was an Elected Member of the IEEE Speech and Language Technical Committee and became an IEEE Fellow in 2026.

Senior Research Scientist at NVIDIA

Nithin Rao Koluguri

Nithin Rao Koluguri is a Senior Research Scientist at NVIDIA, where he works on automatic speech recognition and speech-language modeling. He created TitaNet, a speaker-recognition architecture now widely used across the field (~1.5M downloads/month on Hugging Face), and co-built NeMo's first speaker diarization system. He led the billion-parameter scaling of FastConformer ASR and built the Parakeet V2 and V3 models, which are among the most performant ASR systems available and now widely used in production dictation apps. His current work centers on the modeling and data efforts behind NVIDIA's next-generation speech-language models, including the Parakeet and Nemotron SpeechLLM models. He holds a master's degree from the University of Southern California, where he conducted research at the Signal Analysis and Interpretation Laboratory (SAIL) under Prof. Shrikanth Narayanan, and has authored numerous papers at ICASSP, Interspeech, and other leading venues.

Staff, Member of Technical Staff, Foundations

Julian Mack

Julian Mack is a staff researcher working on multimodal LLMs and audio at Cohere. His primary focus is on speech modeling and audio understanding, and he led the development of Cohere Transcribe. He also contributed to Cohere's Command A Vision model release. Previously, at Myrtle.ai, he led a team training ultra low-latency ASR systems. He has a master's degree in Machine Learning from Imperial College London and a BA in physics from Cambridge University.

Associate Professor at Carnegie Mellon University

Shinji Watanabe

Shinji Watanabe is an Associate Professor at Carnegie Mellon University, where his research focuses on automatic speech recognition, speech enhancement, spoken language understanding, and machine learning for speech and audio processing. He previously held research positions at NTT Communication Science Laboratories, Mitsubishi Electric Research Laboratories (MERL), and Johns Hopkins University. He has authored over 500 papers in speech and audio processing and received multiple awards, including the Best Paper Award at Interspeech. He is a Fellow of both IEEE and ISCA and serves as a Senior Area Editor for IEEE Transactions on Audio, Speech, and Language Processing.
