Intro to Audio in Unreal Engine 5
A high-level overview of key audio features in UE5
Intro
This is an introductory guide that highlights the core audio systems in Unreal Engine 5. It is essentially a literature review filtered through my experience with the engine over the last year. I will cover some pretty basic audio concepts because I believe there are subtleties that are often missed, but feature breakdowns will not be exhaustive.
Supplementary context will be inlined, but anything related to Unreal Engine will be laid out in the Links section. This will hopefully make this blog more useful to refer back to later.
If you’ve got 2 hours to spare and want to delve even further, Dan Reynolds’ Unreal Audio Essentials talk [1] is a fantastic resource.
Unreal - Engine
There is no decoupling Unreal’s audio systems from its wider engine, and we need to understand how Unreal operates to avoid common pitfalls. Let’s take a rapid-fire look at some fundamentals.
Architecture
Starting with a Hello World style example » if you drag a Sound Wave asset into a level and press the play icon to start PIE, you should hear that Sound Wave playing. To achieve this, Unreal has automatically:
- Created an Actor
- Attached an Audio Component to it
- Set its Sound to the dragged in Sound Wave
- Started playback (due to Auto Activate being enabled)
With a single action, Unreal revealed some foundational concepts for us to investigate.
Actors are what Unreal terms ‘objects that can be placed in a level’. They are augmented with modular functionality by attaching Scene Components and Actor Components, where the former has a transform and the latter does not. And transforms are a set of values that define the location, rotation, and scale of an object. So in brief » Actors exist at some location in the level, Scene Components exist at some location on the Actor, and Actor Components simply house logic.
Hierarchy showing a Character type Actor with various Scene Components, some of which are nested within other Scene Components; as well as an Actor Component (shown under the separator)
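The relationship between Actors, Scene Components, and Actor Components can be sketched in plain Python. This is a toy model rather than Unreal code: the class names mirror the concepts above, but `world_locations` and the example values are made up for illustration, and the transform is reduced to location only (no rotation or scale).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Transform:
    # Location only, for brevity; a full transform also has rotation and scale.
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0

    def offset_by(self, other: "Transform") -> "Transform":
        return Transform(self.x + other.x, self.y + other.y, self.z + other.z)


@dataclass
class ActorComponent:
    # Houses logic only: it has no transform.
    name: str


@dataclass
class SceneComponent(ActorComponent):
    # Positioned relative to its parent, and may nest other Scene Components.
    relative: Transform = field(default_factory=Transform)
    children: List["SceneComponent"] = field(default_factory=list)


@dataclass
class Actor:
    name: str
    world: Transform
    root: SceneComponent
    logic: List[ActorComponent] = field(default_factory=list)


def world_locations(actor: Actor) -> dict:
    """Walk the Scene Component tree and resolve each component's world location."""
    found = {}

    def visit(component: SceneComponent, parent_world: Transform) -> None:
        here = parent_world.offset_by(component.relative)
        found[component.name] = (here.x, here.y, here.z)
        for child in component.children:
            visit(child, here)

    visit(actor.root, actor.world)
    return found


# A Character-like Actor: Capsule (root) > Mesh > Audio, plus one logic-only component.
mesh = SceneComponent("Mesh", Transform(0, 0, -90))
mesh.children.append(SceneComponent("Audio", Transform(0, 0, 160)))
capsule = SceneComponent("Capsule", children=[mesh])
character = Actor("Character", Transform(100, 200, 90), capsule,
                  logic=[ActorComponent("Movement")])
locations = world_locations(character)
```

The nested Audio component ends up with a world location derived from every parent above it, while the Movement component never appears in `locations` because it has no transform to resolve.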
Object Lifetime
Objects can be created statically (via the editor) or dynamically (via code). They can later be destroyed directly by calling their destroy function, or indirectly as a result of an upstream object being destroyed (such as a level ending). The important thing we need to think about is » what should happen to our sound if it’s attached to a Component that is being destroyed?
The image below shows a function that spawns and returns an Audio Component and plays the defined sound. Each pin I’ve highlighted plays a role in lifetime considerations.
‘Sound’, ‘Attach to Component’, ‘Stop when Attached to Destroyed’, and ‘Auto Destroy’ are important pins to note when considering audio lifetime
Sound lifetime comes in two flavors » oneshot or looped. A oneshot sound will stop when it finishes playing, but a looped sound needs to be stopped manually. This may seem obvious, but when we dig into MetaSounds later, we’ll find that we can have complex loop-containing MetaSounds that Unreal treats as oneshot.
Attach to Component sets which Component the Audio Component will be attached to (since an Actor can be made up of many Scene Components, we need to be specific). Stop when Attached to Destroyed then does what it says (referring to the Component in the above pin), and Auto Destroy controls whether the Audio Component should be destroyed when the sound stops.
There are many configurations we can set from the above, and our choices will depend on what we intend to achieve. It’s enough for now to simply be aware of these concepts.
If a sound stops suddenly, or continues playing longer than expected, it may be that its lifetime is set up incorrectly.
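To make the combinations easier to reason about, here is a small decision sketch in plain Python. It only encodes what is described above (it is not engine code), and the `SpawnedSound` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpawnedSound:
    looping: bool                          # oneshot (False) or looped (True)
    stop_when_attached_to_destroyed: bool  # mirrors the pin of the same name
    auto_destroy: bool                     # mirrors the Auto Destroy pin


def when_parent_destroyed(sound: SpawnedSound) -> str:
    """What happens to the sound when the Component it is attached to is destroyed."""
    if sound.stop_when_attached_to_destroyed:
        return "stops immediately"
    if sound.looping:
        return "keeps playing until stopped manually"
    return "plays to the end, then stops"


def component_destroyed_on_stop(sound: SpawnedSound) -> bool:
    """Auto Destroy decides whether the Audio Component is cleaned up when the sound stops."""
    return sound.auto_destroy


footstep = SpawnedSound(looping=False, stop_when_attached_to_destroyed=False, auto_destroy=True)
engine_hum = SpawnedSound(looping=True, stop_when_attached_to_destroyed=False, auto_destroy=False)
```

In this model the `engine_hum` case is the one to watch: a loop that is not told to stop with its parent has nothing left to stop it.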
Threading
If you’re not familiar with the concept of threads, you can think of them as domains. It’s not necessary to have a deep understanding of the topic, just know that parts of the audio engine are split across these distinct domains.
| Thread | Sync | Domain |
|---|---|---|
| Game Thread | With Video (FPS) | Blueprints, Audio Component, Garbage Collection |
| Audio (Logic) Thread | With Game Thread | Audio Logic, Render Candidacy |
| Audio Render Thread | Async | Audio Rendering, Mixing, DSP |
To illustrate how these domains are traversed » when we request an Audio Component (on the Game Thread) to Play, this request is sent to the Audio Thread for evaluation. If deemed successful, the Audio Render Thread processes the audio and sends it to the hardware.
Thread separation is why we can play music that persists uninterrupted between levels.
Audio Components
As you’ve likely gleaned by now » Audio Components are the objects that allow us to issue instructions to our sounds. They may seem like speakers or jukeboxes, but it’s better to think of them as remote controls. This is because:
- As alluded to in object lifetime, an Active Sound can outlive its Audio Component
- As discussed in threading, our actual sound lives on a different thread
Audio Components are of the Scene Component variety » an important point, as it’s the transform that holds the data needed to process 3D audio. Adding an Audio Component to an Actor’s Blueprint is simple: click the Add button in the Components Tab and select Audio. If we drag the Audio Component into the Event Graph, then drag from its output pin into some empty space, we can see the functions we can call.
A small selection of functions that can be called on Audio Component in Blueprints. Some other important functions include Play, Stop, and the various parameter setting functions
Components cannot be added to other Components in Blueprints as they can to Actors.
Below is an example of how to spawn an Audio Component from within a Component on an Actor. Saving the output to a variable is referred to as ‘caching’. This allows us to refer to this variable elsewhere in our Blueprint later.
Spawning and caching an Audio Component. This approach assumes we want to call this within a Component attached to an Actor. We can set ‘Attach to Component’ input pin to any publicly accessible Scene Component, but all Actors have a Root Component, so this is a safe starting point
Unreal - Audio
The diagram below is only concerned with the shape of the audio engine. How this all hooks into code is a little more complex. Also, this represents a sort of path of least resistance. For example, it’s possible to create feedback loops in many cases.
---
config:
flowchart:
curve: "linear"
nodeSpacing: 20
rankSpacing: 50
htmlLabels: true
---
flowchart
classDef tugger stroke-dasharray: 9,5,stroke-dashoffset: 250,animation: dash 2s linear infinite alternate-reverse;
att["Sound<br>Attenuation"] --> sg_snd
con["Sound<br>Concurrency"] --> sg_snd
cls["Sound<br>Class"] --> sg_snd
fx["Source<br>Effect"] src2fx@<--> sg_snd
class src2fx tugger
subgraph sg_snd [ <b>Sound Base</b> ]
snd_wav["Sound<br>Wave"] --> snd_ms["MetaSound"]
snd_bus["Source<br>Bus"]
end
subgraph sg_mod [ <b>Audio Modulation</b> ]
mod_param["Modulation<br>Parameter"] --> mod_cb["Control<br>Bus"]
mod_cbm["Control<br>Bus Mix"] --> mod_cb
end
sub_sub["Submix"]
sub_fx["Submix Effect"] sub2fx@<--> sub_sub
class sub2fx tugger
bus_bus["Audio Bus"]
sg_snd snd2bus@--> bus_bus
snd2bus@{ animate: true }
bus_bus bus2sub@--> snd_bus
bus2sub@{ animate: true }
sg_snd snd2sub@--> sub_sub
snd2sub@{ animate: true }
mod_cb -.-> sg_snd
mod_cb -.-> sub_sub
snd_bus sbus2sub@--> sub_sub
sbus2sub@{ animate: true }
Sound
A Sound Base, generally just referred to as a Sound, is a type of audio asset that can be played back. It is the base class for Sound Cues, Sound Waves, MetaSound Sources, and Source Buses. You won’t interact directly with the Sound Base class, but it can be preferable to use this type when creating public variables, as it offers the flexibility to swap between the above asset types without code changes.
Sound Cues
This is the legacy system for handling audio playback logic. MetaSounds have superseded them in Unreal Engine 5, so I won’t discuss them further. And unless you’re supporting a legacy project, you should avoid using them. Instead, you should use MetaSounds, because MetaSounds:
- Support all the same features as Sound Cues (and a lot more)
- Are generally more efficient
- Can be further optimized for different hardware using MetaSound Pages [2][3]
MetaSounds
MetaSounds come in Source and Patch varieties. The main difference is that Sources can be played and Patches are reusable code blocks. For brevity, I am only referring to the Source variety and simply calling them MetaSounds. The distinction is otherwise unimportant.
Despite being a flagship feature, some people seem to misunderstand what MetaSounds is. I’ve heard the question “how does MetaSounds compare to Wwise?”, but it’s a bit like asking how a Casio FX-CG100 stacks up to Excel. MetaSounds is only the source renderer.
To elaborate » MetaSounds is a graphical patching environment for controlling audio playback at a very low level. Where the authoring side of middleware abstracts away fine-grained control, MetaSounds instead offers deep customization. In practice, it’s more similar to PureData or Max, but tailored to game audio and tightly integrated with Unreal.
There are already great guides online that cover general usage [4][5], so I’m not going to retread that ground. Instead, I want to provide a general structure to help you approach MetaSounds with the right mindset.
Shape
First off, MetaSounds doesn’t function like Blueprints, even if it looks like it should (a closer parallel is Unreal Engine’s Material shaders). Blueprints can have their Tick function dynamically changed at runtime or disabled entirely. They can also run async code, and generally offer a level of flexibility that is ill-advised when creating DSP graphs.
On the other hand, a MetaSound has a fixed update rate for the duration of its playback, and is always updating as long as it’s active. This update rate is determined by its Block Rate, which is set to 100Hz by default. This is a really important point to keep in mind, as it underpins a lot of nuance when dealing with MetaSounds.
A key difference between MetaSounds and PureData/Max is that you cannot address individual samples in MetaSounds. You operate at the block level.
Sample (Audio) Rate
The primary function of a MetaSound is to generate Audio. For this we have a wave player and several synthesizers. Output signals from these nodes can be further processed by a suite of filters and effects, or modulated by other signals such as envelopes. These signals operate at Sample Rate » 48kHz by default.
Block (Control) Rate
To dynamically control audio, MetaSounds offers a slew of other data types including Bool, Int32, Float, and even entire objects. These types are processed at Block Rate » 100Hz by default.
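The relationship between the two rates is simple arithmetic, sketched below in plain Python using the default values quoted above. The function names are hypothetical and nothing here calls into Unreal.

```python
SAMPLE_RATE_HZ = 48_000  # default Sample (Audio) Rate quoted above
BLOCK_RATE_HZ = 100      # default Block (Control) Rate quoted above


def samples_per_block(sample_rate_hz: int = SAMPLE_RATE_HZ,
                      block_rate_hz: int = BLOCK_RATE_HZ) -> int:
    """Number of audio samples rendered in each block."""
    return sample_rate_hz // block_rate_hz


def block_duration_ms(block_rate_hz: int = BLOCK_RATE_HZ) -> float:
    """How long one block lasts, i.e. how often control values update."""
    return 1000.0 / block_rate_hz
```

With the defaults, each block holds 480 samples and lasts 10 ms, so a Float or Bool can change at most once every 480 samples of audio.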
The Trigger type is special because it is a control type that can address specific samples within a buffer. It is able to do this because it is effectively an array that stores an index when triggered. Most of the time, we’re simply adding 0 to the array and being done with it, intending the Trigger to fire at the beginning of a buffer. However, for something like the Wave Player’s On Nearly Finished output Trigger, the index of the final buffer of the wave file is stored. When coupled with a delayed variable (mentioned below), this can be used to enable sample accurate looping.
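One way to picture this is to treat a Trigger as a list of sample offsets within the current block. The sketch below is a conceptual model in plain Python, not the engine’s Trigger implementation. The helper names are hypothetical and the 480-sample block assumes the default rates.

```python
BLOCK_SIZE = 480  # assumes 48 kHz sample rate and 100 Hz block rate


def locate_in_block(sample_position: int, block_size: int = BLOCK_SIZE) -> tuple:
    """Convert an absolute sample position into (block index, offset within that block)."""
    return divmod(sample_position, block_size)


def split_block(trigger_offsets: list, block_size: int = BLOCK_SIZE) -> list:
    """Return the (start, end) sample ranges of one block, cut at each trigger offset."""
    cuts = sorted({o for o in trigger_offsets if 0 < o < block_size})
    edges = [0] + cuts + [block_size]
    return list(zip(edges[:-1], edges[1:]))


# A wave that is 1000 samples long ends partway through the third block.
end_block, end_offset = locate_in_block(1000)
segments = split_block([end_offset])
```

A trigger at offset 0 leaves the block whole, which matches the common case described above. A trigger at offset 40 cuts the block in two, which is the information a sample-accurate loop needs in order to restart mid-block.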
Input / Output
Inputs & Outputs are how we pass data between our MetaSound and the editor or game. We can also send and receive data between patches when nesting them. Values stored in these Inputs can be changed:
- Statically – directly in the MetaSound or by overriding values in Presets
- Dynamically – by setting parameters via code
Adding an Interface generates Inputs and Outputs that automatically get the specified data at runtime. These operate like Wwise’s or FMOD’s Built-In Parameters. For example, UE.Attenuation gets the distance between the listener and sound location in game units (cm by default).
Variables
Variables are a great tool for cleaning up MetaSounds and reducing the inevitable clutter that can result from visual scripting. MetaSounds are code after all, and clear descriptive naming goes a long way.
However, the true utility of variables comes from their ability to get a signal from one frame prior. MetaSounds does not allow infinite loops, so if we want to pass an output back into an input, we need to delay it by one frame.
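The delayed read can be sketched as a feedback loop in plain Python. This is a conceptual model only: it assumes ‘one frame’ means one MetaSound block, and `run_feedback` is a made-up name rather than a MetaSound node.

```python
def run_feedback(blocks: int, gain: float, seed: float = 1.0) -> list:
    """Feed each block's output back into the next block's input."""
    delayed = 0.0  # what the delayed variable holds: the value written one block ago
    outputs = []
    for n in range(blocks):
        x = seed if n == 0 else 0.0  # an impulse on the first block only
        y = x + gain * delayed       # reading the delayed value breaks the infinite loop
        outputs.append(y)
        delayed = y                  # written now, only visible on the next block
    return outputs


echo = run_feedback(blocks=4, gain=0.5)
```

Because the graph only ever reads the previous block’s value, each pass can be evaluated once from inputs to outputs, and the feedback still accumulates over time.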
Game
With the above established, let’s briefly look at getting [6]
Notes & Observations
Middleware uses the term ‘voice’ to refer to a unit of active audio. Unreal doesn’t have consistent verbiage for this. In Project Settings, Max Channels controls what Wwise would consider the Voice Limit; in Sound Concurrency, the term ‘voice-stealing’ is used; but in code, so-called voices are termed ‘active sounds’. Since channels can also refer to routing (such as stereo or 5.1), and voices can refer to human vocals, I’ll stick to using ‘active sounds’ when referring to this concept in Unreal.
1 Audio Component = 1 Sound Base = 1 Active Sound
This is different than in Wwise where one Game Object can be responsible for many voices. If you need to coordinate several distinct but related Audio Components, you will need to manage that in code. Alternatively, you can create a monolithic MetaSound.
It is possible for an Audio Component to play simultaneous sounds by enabling ‘Play Multiple Instance’, but this is intended for retriggering the same sound multiple times.
MetaSounds Do Not Have Local Scope
If you’ve created custom functions in Blueprints, you’ve likely used local variables before. These get reset when the function finishes i.e. when they ‘go out of scope’. This is not true in MetaSounds when nesting patches. Variables are only reset when a MetaSound stops, and Inputs are set to the most recent value stored in their Audio Component, or default if unset.
If a MetaSound behaves correctly when played directly, but doesn’t when nested within another MetaSound » it may need its internal states resetting.
Oneshots Are Not Virtualized
If a oneshot is evaluated and deemed inactive, it will be stopped. Only loops can be virtualized. Sound Waves can be manually defined as looping via their settings, but MetaSounds are inferred to be looping if they do not implement the UE.Source.OneShot Interface.
If your ambience stops when you go out of range and doesn’t restart when you get near » it is likely defined as a oneshot.
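The rules in this note can be written as a small decision sketch. It is plain Python that encodes only what is stated above. The function names and the string asset types are hypothetical, and real projects have further virtualization settings that are not modeled here.

```python
def is_treated_as_looping(asset_type: str,
                          wave_looping_flag: bool = False,
                          implements_oneshot_interface: bool = True) -> bool:
    """Sound Waves are flagged as looping manually; MetaSounds are inferred."""
    if asset_type == "SoundWave":
        return wave_looping_flag
    if asset_type == "MetaSound":
        return not implements_oneshot_interface
    raise ValueError(f"unhandled asset type: {asset_type}")


def when_deemed_inactive(looping: bool) -> str:
    """Only loops can be virtualized; a oneshot is simply stopped."""
    return "can be virtualized" if looping else "stopped"


ambience = is_treated_as_looping("MetaSound", implements_oneshot_interface=True)
outcome = when_deemed_inactive(ambience)
```

In the example, an ambience MetaSound that still implements the OneShot Interface is treated as a oneshot, so it is stopped rather than virtualized when the listener moves out of range.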
MetaSounds Are OneShot by Default
When you create a new MetaSound Source, you’ll find that the UE.Source.OneShot is implemented by default. Taking the above into account, you’ll need to decide whether that is valid for your sound. If you do stick with the default, just be sure that the UE.Source.OneShot.OnFinished output is triggered eventually. This can be tricky to get right for complex patches, so test thoroughly.
If sounds are cutting out unexpectedly, it could be that active but inaudible (i.e. orphaned) oneshots are maxing out your Active Sound budget. Audio Insights [7] and Unreal’s various console commands [8] should help you to root out these issues.
Wave Files Are Streamed by Default
Unreal stores the first chunk of audio for a given file in memory and streams subsequent data from disk. Storing the first chunk ensures fast playback without needing to load everything into memory. Behavior for when the chunk is loaded and how long it is kept in memory for can be defined on a Sound or Sound Class as follows:
| Loading Behavior | Tooltip |
|---|---|
| Load on Demand | Chunk is loaded when played or primed |
| Prime on Load | Chunk is loaded when asset is loaded |
| Retain on Load | Chunk is kept in memory |
| Force Inline | Non-streamed |
Use MetaSound Presets!
This is just a PSA to use presets. Right-click any MetaSound Source and select Create Preset, then open the Preset, enable Override Inherited Default on any inputs you would like to customize, and set new values as appropriate.
If you ever find yourself copy-pasting MetaSounds, consider whether a Preset would be more efficient.
Routing
Once we’ve got source audio, we can route it to various systems.
Buses
Buses are the means by which we can move audio around Unreal. They come in Audio Bus and Source Bus variants. The former simply carries audio and the latter provides an end-point to play Audio Buses back in game.
The only setting which can be changed on an Audio Bus is its channel count. Otherwise, it simply holds audio data.
Source Buses are similarly simple, with the only distinct setting being the Audio Bus input. Otherwise, they operate like any other sound. The point of this asset is to allow playing bussed audio in game » a feature which is used in Fortnite to play music diegetically via in-world speakers [9].
Submixes
This is the final link in the audio chain, representing the point at which audio signals are mixed down. The process parallels submixing in audio engineering practice. Submixes can receive inputs from Sounds or from other Submixes, both statically and dynamically.
Opening a Submix asset reveals a graph where you can define its relationship to other Submixes. For example, you can create a ‘bird’ Submix which routes into an ‘ambience’ Submix which finally routes into an output Submix.
---
config:
flowchart:
curve: "linear"
---
flowchart LR
Birds[🐦 Birds] --> Ambience[🍃 Ambience] --> Main[🔊 Main]
Insects[🪲 Insects] --> Ambience
Music[🎼 Music] --> Main
Another notable feature of Submixes is the ability for their outputs to be analyzed [10]. The data from this analysis is accessed in code, so it is aimed more towards driving gameplay from audio. You can send this data back into the audio domain, but you will incur some latency.
Submixes supersede Sound Classes as a means to mix audio in Unreal Engine 5.
Render Order
Understanding the audio render order is important for establishing proper gain staging. It also serves as a structured checklist for when we need to debug. Below is a diagram transcribed from Dan Reynolds’ talk [1] that visualizes the source render pipeline.
---
config:
flowchart:
nodeSpacing: 20
rankSpacing: 20
nodePadding: 100
---
flowchart LR
src[Sound<br>Base<br> ]
srcvol[Source<br>Volume<br> ]
prefsnd[Pre<br>Effect<br>Send<br><br>Bus<br>↓]
srcef[Source<br>Effect<br>Chain]
att[Distance<br>Attenuation<br> ]
poefsnd[Post<br>Effect<br>Send<br><br>Bus<br>↓]
subsnd[Submix<br>Send<br><br><br>Submix<br>↓]
spat[Spatial-<br>ization<br><br><br>Submix<br>↓]
src --> srcvol
srcvol --> prefsnd
prefsnd --> srcef
srcef --> poefsnd
poefsnd --> att
att --> subsnd
subsnd --> spat
And the above is only the first part of a larger render pipeline which is processed in the following order:
flowchart LR
Source[Source] --> Bus[Bus] --> Submix[Submix]
Render order isn’t something that usually requires much conscious thought. However, there are some cases where it might catch you out. For example, if you want to send a Submix signal to a Bus, you will incur 1 frame of latency. This is because the signal has to be queued up for the next render pass before it can be passed back down the chain.
I found this out when attempting to bus a 2D ambience Submix to play as a spatial sound emanating from a window when indoors. The source and bussed signals could not be layered as the buffer delay caused a comb filtering effect. I got the desired outcome by bussing directly from the MetaSound.
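The comb filtering follows from basic signal arithmetic: summing a signal with a copy of itself delayed by τ seconds cancels frequencies at odd multiples of 1/(2τ). The sketch below works this through in plain Python. The 1024-sample delay at 48 kHz is an assumed figure for illustration only, since the actual delay depends on the project’s buffer settings, and the function names are hypothetical.

```python
import cmath
import math


def comb_notches_hz(delay_samples: int, sample_rate_hz: int = 48_000, count: int = 3) -> list:
    """First few frequencies cancelled when a signal is summed with a delayed copy."""
    # Notches sit at (2k + 1) / (2 * delay_seconds), with delay_seconds = delay_samples / sample_rate.
    return [(2 * k + 1) * sample_rate_hz / (2 * delay_samples) for k in range(count)]


def summed_gain(freq_hz: float, delay_samples: int, sample_rate_hz: int = 48_000) -> float:
    """Magnitude of (dry + delayed copy) at one frequency: |1 + e^(-j*2*pi*f*tau)|."""
    delay_s = delay_samples / sample_rate_hz
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay_s))


# Assumption: the bussed copy arrives one 1024-sample buffer late.
notches = comb_notches_hz(1024)
```

Under that assumption the first notches land at roughly 23 Hz, 70 Hz, and 117 Hz, repeating every 47 Hz or so up the spectrum, which is why the layered result sounds hollow rather than simply late.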
Other Audio Essentials
Sound Classes
Sound Classes offer a means of categorizing sounds separately from how they are mixed. They can be used for batch applying settings, with two notable settings being Apply Ambient Volumes and Loading Behavior Override. The former relates to Audio Gameplay Volumes [11] and the latter sets how streaming/loading is managed (mentioned above under Notes & Observations).
You may want to use Sound Classes to ensure short frequently played sounds such as UI or impacts are always primed and ready.
There are a few more settings worth investigating, but I don’t consider Sound Classes to be a hugely important feature. Still, you can access the Sound Class of a sound in code, so you could augment them with custom functionality.
If you’re using Audio Gameplay Volumes [11] and find that transiting a volume isn’t affecting audio, the sound either doesn’t have the appropriate Sound Class set, or its Apply Ambient Volumes setting hasn’t been enabled.
Sound Attenuation
Sound Attenuation is responsible for enabling 3D audio playback, but it’s also where we can enable a few other features such as quick and cheap audio occlusion. It’s a well documented [12] system and it’s been around a while, so here’s the TLDR:
- Enable Volume Attenuation makes sound quieter over distance
- Enable Spatialization pans sound relative to the listener’s orientation
- Enable Air Absorption allows reducing high frequencies over distance to mimic how sound behaves in reality
Sound Concurrency
Sound Concurrency limits the number of active sounds playing at any given moment, which improves performance and ensures a cleaner mix. Like the above, it’s an old feature that’s intuitive and well documented [13]. The TLDR is:
- Max Count sets the max limit for all sounds which have this Sound Concurrency applied
- Resolution Rule determines how sounds are prioritized when the max count is exceeded
The rest of the features largely control the qualitative aspects of the virtualization/realization process.
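The interplay between Max Count and Resolution Rule can be sketched as a toy model in plain Python. It is not engine code: `ConcurrencyGroup` is a hypothetical class, and only two illustrative rules are modeled, loosely named after the ‘prevent new’ and ‘stop oldest’ behaviors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConcurrencyGroup:
    max_count: int
    resolution_rule: str  # "prevent_new" or "stop_oldest" in this sketch
    active: List[str] = field(default_factory=list)  # oldest first

    def request_play(self, sound_id: str) -> bool:
        """Return True if the new sound is allowed to play."""
        if self.max_count <= 0:
            return False
        if len(self.active) < self.max_count:
            self.active.append(sound_id)
            return True
        if self.resolution_rule == "prevent_new":
            return False            # the group is full, so the newcomer is rejected
        if self.resolution_rule == "stop_oldest":
            self.active.pop(0)      # steal the voice of the oldest active sound
            self.active.append(sound_id)
            return True
        raise ValueError(f"unknown rule: {self.resolution_rule}")


impacts = ConcurrencyGroup(max_count=2, resolution_rule="stop_oldest")
results = [impacts.request_play(name) for name in ("hit_a", "hit_b", "hit_c")]
```

With a Max Count of 2, the third request succeeds by replacing `hit_a`. Switching the rule to `prevent_new` would reject `hit_c` instead and leave the first two playing.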
I’m not sure why this cannot be set at the Sound Class level, but you could script this behavior yourself. Alternatively, you could use the Property Matrix [14] feature to batch apply Sound Concurrency.
Effects
These don’t seem to be well documented, but they are simple to set up and use. They come in Source Effect and Submix Effect variants, with the name implying where they are inserted.
Source Effects may seem less necessary now that MetaSounds exists, as many have MetaSound node equivalents, such as Filter, Delay, and Stereo Panner. However, some Effects (such as Convolution) do not have an equivalent. Additionally, it’s not possible to apply MetaSound nodes outside of MetaSounds, so Source Effects are still required for processing other types (such as Source Buses). And as highlighted in routing, signal can be sent to an Audio Bus pre and post Source Effect, which may be an important consideration in certain situations.
If you want your sounds to Doppler shift » add a Source Effect of type Simple Delay and ensure Delay Based on Distance is enabled.
Submix Effects are as important as ever as they allow us to apply EQ, dynamics processing, and reverb to entire groups of audio. This is better for performance and easier to manage than having loads of Source Effects.
Audio Modulation
On the surface, this system’s name is pretty self-explanatory, but it’s impressively deep and complex [15]. The potential applications for this system are broad, but the key use-case is dynamic mixing [16]. For example, you can:
- Hook up Submixes to user-facing volume sliders
- Create mix ‘snapshots’ that dynamically reflect gameplay state
- Process parts of the mix depending on time of day
Control Buses are the core of this system. They operate similarly to Audio Buses, but carry control data rather than audio (similar to a VCA in the analog world).
A Control Bus’s type is determined by its Modulation Parameter. This affects both what it can modulate and how it combines when several modulators are applied to a Control Bus simultaneously. For example, volume is additive, but filter cutoffs are determined by taking the most extreme value.
Control Bus values can be set at a global level, which is fine for simple always-active types such as time of day. But in most other cases, Control Bus Mixes are the way to go. These encapsulate and create relationships between Control Buses, with the added benefit of making code implementation cleaner.
Here’s a worked example of how these can be linked together:
You want to allow the user to adjust their volume so you set up a Control Bus Mix with Control Buses for Music, SFX, and Ambience, and their Modulation Parameters set to Volume. Later, you decide you need a ‘boss fight’ mix that increases the volume of Music and SFX, but decreases the volume of Ambience. Since you’ve already set up the Control Buses for these, you simply need to create a new Control Bus Mix, tag up the Control Buses, and set appropriate values.
Because the Modulation Parameters are Volume types, if the user sets the Ambience volume to -6dB, and the ‘boss fight’ Mix also sets it to -6dB, then the final adjustment will be -12dB. This is really neat! The values are not being overwritten, and instead work in an intuitive manner.
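The arithmetic behind this is worth spelling out: adding adjustments in decibels is the same as multiplying their linear gains. The sketch below checks that in plain Python. The helper names are hypothetical and this is not the Audio Modulation API.

```python
import math


def db_to_linear(db: float) -> float:
    """Convert a decibel adjustment to a linear amplitude gain."""
    return 10 ** (db / 20)


def linear_to_db(gain: float) -> float:
    """Convert a linear amplitude gain back to decibels."""
    return 20 * math.log10(gain)


def combine_volume_db(*adjustments_db: float) -> float:
    """Volume-type adjustments stack by summing their decibel values."""
    return sum(adjustments_db)


user_slider_db = -6.0  # the user's Ambience volume setting
boss_fight_db = -6.0   # the 'boss fight' Mix adjustment for Ambience
total_db = combine_volume_db(user_slider_db, boss_fight_db)
total_gain = db_to_linear(user_slider_db) * db_to_linear(boss_fight_db)
```

Both routes agree: `total_db` is -12 dB, and `total_gain` is the linear gain of -12 dB (about 0.25), so neither setting overwrites the other.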
If you attempt to use a modulator with an incompatible type, you will get the following warning:
ControlBusName (InputType), Expected: (OutputType)
To get around this you can use a Parameter Patch, which allows you to remap Control Buses to different types, as well as create custom curves.
Links
1. Unreal Audio Essentials (Abridged) - Dan Reynolds
   https://youtu.be/swS6Zch4m90
2. MetaSound Pages - Official Docs
   https://dev.epicgames.com/documentation/en-us/unreal-engine/metasound-pages-in-unreal-engine
3. MetaSound Pages Optimization Guide - Dan Reynolds
   https://youtu.be/MjoqY_cUUNQ
4. Introduction to MetaSounds - Dave Raybould
   https://dev.epicgames.com/community/learning/tutorials/BKPD/unreal-engine-introduction-to-metasounds
5. Understanding MetaSounds: A Technical Guide For New Designers - Dan Reynolds
   https://youtu.be/zS422olBeG0
6. MetaSounds - Output Watching - Official Docs
   https://dev.epicgames.com/documentation/unreal-engine/metasounds-reference-guide-in-unreal-engine#outputwatching
7. Audio Insights - Official Docs
   https://dev.epicgames.com/documentation/unreal-engine/audio-insights-in-unreal-engine
8. Audio Console Commands - Official Docs
   https://dev.epicgames.com/documentation/unreal-engine/audio-console-commands-in-unreal-engine
9. Unreal Engine Audio Systems In Fortnite - Seth Weedin
   https://youtu.be/MGcZM4luz5g
10. Real Time Analysis - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/overview-of-submixes-in-unreal-engine#realtimeanalysisinblueprints
11. Audio Gameplay Volumes - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/audio-gameplay-volumes-quick-start
12. Sound Attenuation - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/sound-attenuation-in-unreal-engine
13. Sound Concurrency - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/sound-concurrency-reference-guide?lang=en-US
14. Property Matrix - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/property-matrix-in-unreal-engine
15. Audio Modulation - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/audio-modulation-reference-guide-in-unreal-engine
16. Audio Modulation Quick Start - Official Docs
    https://dev.epicgames.com/documentation/unreal-engine/audio-modulation-quick-start-guide