HRTF Setup

This guide explains how to set up HRTF (Head-Related Transfer Function) spatialization in your Amplitude project for immersive 3D binaural audio. HRTF uses measured or simulated impulse responses from human ears to reproduce spatial audio over standard stereo headphones.

Prerequisites

Before you begin, ensure you have:

  • Amplitude Audio SDK v1.0 or later
  • An .amir HRIR sphere file (see Generating an AMIR file)
  • Headphones for testing (HRTF is optimized for headphone playback)

What is HRTF?

HRTF captures how sound waves interact with the human head, torso, and pinnae (outer ears) as they travel from a sound source to each ear. By convolving audio with these direction-specific impulse responses, Amplitude can create a convincing 3D audio experience using only two audio channels.

Amplitude supports multiple HRIR datasets:

| Dataset | Description | Source |
|---------|-------------|--------|
| IRCAM | LISTEN database with multiple subjects | IRCAM |
| MIT | KEMAR dummy head measurements | MIT Media Lab |
| SADIE | High-resolution spatial audio database | University of York |
| SOFA | Custom measurements in SOFA format | SOFA Conventions |

Step 1: Generate or Obtain an AMIR File

Amplitude uses its own optimized .amir format for HRIR data. You can generate one from supported datasets using the amir CLI tool:

# Convert a SOFA file to AMIR (model 3 = SOFA)
amir -m 3 input.sofa output.amir

# Convert IRCAM data to AMIR (model 0 = IRCAM)
amir -m 0 /path/to/ircam/ output.amir

# Resample to a target sample rate during conversion (model 1 = MIT)
amir -m 1 -r 48000 /path/to/mit_kemar/ output.amir

The amir tool will:

  1. Read the HRIR dataset
  2. Compute interaural time differences (ITD)
  3. Triangulate the sphere mesh
  4. Output an optimized .amir file for runtime use

Sample rate matching

For best performance, generate the .amir file at the same sample rate as your engine output (e.g., 48000 Hz). Mismatched sample rates will trigger runtime resampling.

Step 2: Configure the Engine

Add the HRTF section to your engine configuration file:

{
  "driver": "miniaudio",
  "output": {
    "frequency": 48000,
    "buffer_size": 4096,
    "format": "Float32"
  },
  "mixer": {
    "panning_mode": "BinauralHighQuality",
    "active_channels": 50,
    "virtual_channels": 100,
    "pipeline": "default.ampipeline"
  },
  "hrtf": {
    "amir_file": "data/sadie_h12.amir",
    "hrir_sampling": "Bilinear"
  }
}

Configuration Options

| Option | Values | Description |
|--------|--------|-------------|
| amir_file | File path | Path to the .amir HRIR sphere file. |
| hrir_sampling | NearestNeighbor, Bilinear | Sampling method for HRIR lookup. Bilinear is smoother but slightly more expensive. |

Panning Modes

Choose a binaural panning mode in the mixer configuration:

| Mode | Description | Use Case |
|------|-------------|----------|
| Stereo | Standard stereo panning | Speakers, no spatialization |
| BinauralLowQuality | Fast HRTF with lower accuracy | Performance-constrained devices |
| BinauralMediumQuality | Balanced quality and speed | Mobile devices |
| BinauralHighQuality | Full HRTF convolution | Desktop, high-end experiences |

Step 3: Configure Sound Objects for HRTF

Ensure your sound objects use HRTF spatialization:

{
  "id": 10,
  "name": "footsteps",
  "path": "sounds/footsteps.wav",
  "spatialization": "HRTF",
  "attenuation": 1
}

Available spatialization modes:

| Mode | Description |
|------|-------------|
| None | No spatialization; same gain for all channels. |
| Position | 2D position-based panning. |
| PositionOrientation | 2D position and orientation panning. |
| HRTF | Full 3D binaural spatialization using the HRIR sphere. |

Step 4: Set Up Listeners and Entities

HRTF spatialization depends on accurate listener and entity positions. Ensure your game updates these each frame:

// Update listener position and orientation
listener.SetLocation(playerPosition);
listener.SetOrientation(playerOrientation);

// Update entity (sound source) position
entity.SetLocation(enemyPosition);
entity.SetOrientation(enemyOrientation);

// Optional: set directivity for more realistic radiation patterns
entity.SetDirectivity(0.5f, 2.0f);

Step 5: Verify the Pipeline

The default pipeline includes HRTF processing through the AmbisonicBinauralDecoder node. If you are using a custom pipeline, ensure it includes the necessary nodes:

{
  "id": 1,
  "name": "hrtf_pipeline",
  "nodes": [
    { "id": 1, "name": "Input", "consume": [] },
    { "id": 2, "name": "Attenuation", "consume": [1] },
    { "id": 3, "name": "AmbisonicPanning", "consume": [2] },
    { "id": 4, "name": "AmbisonicRotator", "consume": [3] },
    { "id": 5, "name": "AmbisonicBinauralDecoder", "consume": [4] },
    { "id": 6, "name": "Output", "consume": [5] }
  ]
}

Runtime API

You can query and configure HRTF settings at runtime:

// Get the current HRIR sphere (returns std::shared_ptr<const HRIRSphere>)
auto sphere = amEngine->GetHRIRSphere();

// Check if the sphere is loaded
if (sphere != nullptr && sphere->IsLoaded())
{
    amLogInfo("HRIR sphere loaded with %u vertices", sphere->GetVertexCount());
}

// Query the current sampling mode
eHRIRSphereSamplingMode mode = amEngine->GetHRIRSphereSamplingMode();

Performance Considerations

| Factor | Impact | Recommendation |
|--------|--------|----------------|
| HRIR length | Longer IRs = more convolution CPU cost | Use 128–512 sample IRs for games |
| Vertex count | More vertices = smoother spatialization | 1000–5000 vertices is usually sufficient |
| Sampling mode | Bilinear is ~2× more expensive than NearestNeighbor | Use NearestNeighbor on mobile |
| Panning mode | BinauralHighQuality applies full convolution | Use BinauralMediumQuality for 20+ simultaneous sources |

Troubleshooting

| Issue | Cause | Solution |
|-------|-------|----------|
| No spatialization | spatialization set to None or Position | Change to HRTF in the sound asset |
| Flat sound | panning_mode set to Stereo | Change to BinauralHighQuality |
| Crackling | HRIR sample rate mismatches output rate | Regenerate the .amir file at the engine output sample rate |
| Front/back confusion | HRIR dataset lacks elevation resolution | Use SADIE or high-resolution SOFA data |

Next Steps