Curated Audio and Acoustic Datasets for Audio AI
Direct Access to Large-Scale, Physically Accurate Spatial Audio Data
Treble Datasets give Audio AI teams immediate access to curated spatial impulse responses generated with our hybrid wave-based and geometrical acoustics engine. The data is physics-grounded, simulation-native, and designed for scalable machine learning workflows.
Controlled Acoustic Variation at Scale
Treble Datasets are organized into controlled room-volume categories to ensure balanced variation and solver fidelity at every scale. This enables robust training and validation across fundamentally different acoustic regimes.
Small Rooms
- 50 to 180 cubic meters
- 10,000 impulse responses
- Transition frequency up to 2 kHz
- Ambisonics order 16
- Strong modal behavior and low-frequency room effects
Designed for near-field speech applications, consumer electronics, and compact acoustic environments.
Medium Rooms
- 90 to 390 cubic meters
- 10,000 impulse responses
- Transition frequency up to 1.5 kHz
- Ambisonics order 16
- Balanced modal and reflective behavior
Suitable for meeting rooms, classrooms, and mid-scale architectural spaces.
Large Rooms
- 300 to 1600 cubic meters
- 10,000 impulse responses
- Transition frequency up to 1 kHz
- Ambisonics order 16
- Increased reflection complexity and spatial diffusion
Optimized for performance spaces, public environments, and large-volume acoustic modeling.
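The pairing above, with larger volumes getting lower transition frequencies, mirrors a well-known result in room acoustics: the Schroeder frequency, which estimates where a room's response shifts from discrete modal behavior to statistically dense reflections. The sketch below computes it from the classic approximation; the reverberation times used are illustrative assumptions, not values from the datasets.

```python
import math

def schroeder_frequency(t60_s: float, volume_m3: float) -> float:
    """Estimate the modal-to-statistical crossover frequency in Hz.

    Classic approximation: f_s = 2000 * sqrt(T60 / V), with T60 in
    seconds and V in cubic meters.
    """
    return 2000.0 * math.sqrt(t60_s / volume_m3)

# Assumed T60 values for illustration only.
for label, t60, volume in [("small", 0.4, 50.0),
                           ("medium", 0.7, 200.0),
                           ("large", 1.2, 1000.0)]:
    print(f"{label} room: ~{schroeder_frequency(t60, volume):.0f} Hz")
```

Larger rooms push the crossover lower, which is why the large-room category can use a lower wave-based transition frequency without sacrificing modal accuracy.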
Treble Datasets span diverse real-world environments relevant to modern Audio AI.
Current environment categories include:
- Meeting Rooms — 15,000 IRs
- Restaurants and Bars — 25,000 IRs
- Apartments — 35,000 IRs
- Bathrooms — 10,000 IRs
- Living Rooms — 15,000 IRs
- Bedrooms — 20,000 IRs
- Virtual Test Rooms — 100 IRs
- Trains — 5,000 IRs
- Car Cabins — 1,000 IRs
These environments span domestic, commercial, transportation, and controlled virtual spaces, enabling training and validation across realistic acoustic scenarios.
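A typical way to use impulse responses like these in a training pipeline is to convolve dry speech with an IR to synthesize reverberant audio. The sketch below shows this with `scipy.signal.fftconvolve`; the signals are synthetic stand-ins, since the actual dataset file layout is not shown here.

```python
import numpy as np
from scipy.signal import fftconvolve

def reverberate(dry: np.ndarray, ir: np.ndarray) -> np.ndarray:
    """Convolve a dry mono signal with a room impulse response.

    Truncates to the dry signal's length and peak-normalizes so
    augmented clips share a consistent level range.
    """
    wet = fftconvolve(dry, ir, mode="full")[: len(dry)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

# Synthetic stand-ins (real data would be loaded from the dataset).
rng = np.random.default_rng(0)
dry = rng.standard_normal(16000)   # 1 second of noise at 16 kHz
ir = np.zeros(4000)
ir[0] = 1.0                        # direct sound
ir[400] = 0.5                      # a single echo at 25 ms
wet = reverberate(dry, ir)
```

In practice each dry utterance would be paired with IRs sampled across the environment categories above, so the model sees the same speech under many acoustic conditions.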
Each dataset includes:
- Wave-based modeling of low- to mid-frequency behavior, including diffraction and room modes
- Phased geometrical acoustics for accurate high-frequency reflections
- Spatial receivers, so the dataset is ready to render into your specific microphone array geometry
- An example notebook showing how to work with the dataset
- Controlled variation across room size, geometry, materials, and source configurations
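Rendering a spatial (ambisonics) impulse response into a specific microphone geometry amounts to decoding the spherical-harmonic channels toward each microphone's look direction. The sketch below is a minimal plane-wave decode using real spherical harmonics in ACN ordering; it is an assumption-laden illustration that ignores normalization conventions (SN3D vs. N3D), array scattering, and regularization, and it does not represent Treble's own rendering tools.

```python
import numpy as np
from scipy.special import sph_harm

def real_sh(order: int, azimuth: float, zenith: float) -> np.ndarray:
    """Real spherical harmonics up to `order`, ACN channel ordering."""
    coeffs = []
    for n in range(order + 1):
        for m in range(-n, n + 1):
            y = sph_harm(abs(m), n, azimuth, zenith)
            if m > 0:
                coeffs.append(np.sqrt(2) * (-1) ** m * y.real)
            elif m < 0:
                coeffs.append(np.sqrt(2) * (-1) ** m * y.imag)
            else:
                coeffs.append(y.real)
    return np.array(coeffs)

def render_to_array(ambi_ir: np.ndarray, mic_dirs) -> np.ndarray:
    """Project an ambisonics IR (channels x samples) onto plane waves
    from each microphone direction (azimuth, zenith) in radians."""
    order = int(round(np.sqrt(ambi_ir.shape[0]))) - 1
    decode = np.stack([real_sh(order, az, zen) for az, zen in mic_dirs])
    return decode @ ambi_ir

# First-order example: a W-channel impulse rendered to a two-mic array.
ambi_ir = np.zeros((4, 8))
ambi_ir[0, 0] = 1.0
array_ir = render_to_array(ambi_ir, [(0.0, np.pi / 2), (np.pi / 2, np.pi / 2)])
```

At ambisonics order 16 the IRs carry 289 channels, which is what makes rendering into arbitrary array geometries possible after the fact.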
These are not approximated room acoustics. They are accurate, full-bandwidth hybrid simulations designed for Audio AI training and validation.
25% Lower Word Error Rate
Training the same multichannel speech enhancement model on hybrid wave-based simulation data generated with Treble reduced Word Error Rate by 25% compared with training on typical open-source simulation data. The model and inference setup were identical; the only difference was the acoustic training data.
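For reference, Word Error Rate is the standard ASR metric behind this comparison: the word-level edit distance (substitutions, deletions, insertions) between a transcript and the reference, divided by the reference length. A minimal self-contained implementation:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

A 25% relative reduction means, for example, a WER of 20% dropping to 15% with the model and inference pipeline held fixed.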
