
Q. Next-Gen Audio:
What key audio trends should the industry be monitoring and how will they change the viewer (and listener) experience?
Next-gen audio is being redefined by the rise of AI-generated voices, and the impact on the industry will be profound. AI can now produce voices of astonishing realism, with convincing emotional depth and tone, and even synchronized to lip movement. In the coming years, we will see AI-enabled dubbing take center stage in sports media, enabling the rapid localization of content for global audiences. The speed, scalability, and precision of AI will streamline production while enhancing viewer immersion through seamless, authentic-sounding translations in any language. This evolution will transform how fans engage with content, offering a truly global experience without compromising quality.
Big Tech is present on all the major sports broadcasts. Ball tracking, swing analysis, and statistics capture are adding to the video side of the broadcast and translate well to most viewing devices. Audio currently provides multichannel effects and consistent audio levels, but not all viewers can experience the best audio possible, given the range of devices and their audio-output capabilities. We feel the microphone can be more than a passive audio-capture device: mics can host a suite of sensors located close to the action and output a stream of data. Big Tech can analyze that data and store it in vast data pools for future searchability.
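To make the mic-as-sensor-platform idea concrete, here is a minimal sketch of what one telemetry sample from a sensor-equipped microphone might look like as a stream message. All field names and the `MicTelemetry` structure are illustrative assumptions, not any vendor's actual format.

```python
import json
import time
from dataclasses import dataclass, asdict

# Hypothetical telemetry packet a sensor-equipped broadcast mic might emit
# alongside its audio stream. All field names are illustrative assumptions.
@dataclass
class MicTelemetry:
    mic_id: str
    timestamp: float        # capture time, seconds since epoch
    spl_db: float           # sound-pressure level at the capsule
    temperature_c: float    # ambient conditions near the action
    accel_g: tuple          # 3-axis acceleration, e.g. for impact detection

def to_stream_message(packet: MicTelemetry) -> str:
    """Serialize one telemetry sample as a JSON line for downstream analytics."""
    return json.dumps(asdict(packet))

sample = MicTelemetry("goal-cam-left", time.time(), 94.2, 12.5, (0.1, 0.0, 9.8))
print(to_stream_message(sample))
```

Emitting one JSON line per sample keeps the stream trivially searchable later, which is the point of pooling the data.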
AI’s role in opening up audio workflows is worth watching closely. I spoke about AudioShake’s sound separation earlier; that’s relevant not just to current content but also in helping open up archival audio for new uses. Also, lots of exciting work is happening in automatic highlights generation, with the likes of Magnifi, and in using computer vision to help content owners search through their vast content libraries.
Trends in next-gen audio are enhancing immersion, personalization, and accessibility. Immersive-audio formats like Dolby Atmos provide 3D soundscapes, enriching viewer engagement across devices. Personalized audio lets users control elements like commentary or ambient sound, while AI-driven audio processing enables features like noise reduction and real-time translation, boosting clarity and inclusivity. New audio codecs (for example, EVS, LC3) optimize sound quality even on limited networks, making them ideal for 5G and mobile streaming. Binaural audio elevates VR/AR experiences with 360-degree sound, and cloud-based production supports remote live mixing and collaboration. And accessibility features — such as enhanced audio descriptions and speech-to-text innovations — are making content more inclusive, positioning audio as a crucial component of modern media experiences.
As athletes' voices and game effects are brought ever closer to the viewer, new narratives are being established and will continue to be realized. High-quality microphones, both IP+DSP-equipped and analog, paired with increasingly miniaturized, remote-controlled transmitters/receivers, are bringing these fresh audio perspectives to the forefront — and viewers just want more. In-person viewing experiences are already replicating the broadcast experience. The surfacing of previously low-level audio constituents is only going to increase — expect to hear more!
The rise of next-gen audio technologies is transforming how audiences experience content, delivering more immersive and dynamic soundscapes. Key trends include the adoption of spatial audio formats like Dolby Atmos, which creates three-dimensional sound that enhances cinematic, streaming, gaming, and live-event experiences. Accessibility innovations, such as advanced audio descriptions and dialogue-intelligibility tools, are improving inclusivity with clearer dialogue and personalized options tailored to viewer preferences. AI-driven tools are also revolutionizing workflows, enabling automated dialogue cleanup, noise reduction, and precise sound reconstruction. These advances boost efficiency and empower creators to push creative boundaries. At MPE, we leverage our Dolby Atmos Certified HE Studios to deliver cutting-edge sound design, staying ahead of AI and accessibility trends to craft soundscapes that inspire, immerse, and connect audiences.
Miked-player audio is rapidly expanding among professional sports leagues globally for creation of in-game video packages and social-media video clips. PlayerMic, RefMic, and CoachMic audio are fan favorites and have been the mainstay of Q5X's wireless microphone transmitters. Recently, there has been a trend toward live athlete audio, particularly in streamed events versus broadcast. Additionally, the use of IFB systems for conducting live athlete interviews has been increasing. This addition of live voices and immersive ambient venue sounds brings an entirely new dimension to the fan experience, placing fans beside the athlete they are viewing. Fundamental to all of this is the athlete's recognition that being miked drives a valuable connection to their fan base.
The industry is moving toward immersive audio experiences, such as spatial audio formats, that enhance viewer engagement. RTS’s intercom solutions — including the ODIN matrix — support advanced audio protocols, enabling production teams to manage complex audio setups that deliver rich, immersive soundscapes to audiences.
The trend toward immersive and personalized broadcasting is aimed at enhancing the listener experience by using objects and metadata to package content that can be customized by the user. This technology can be deployed in real-world scenarios, improving bandwidth usage and facilitating distribution of customizable content while maintaining original-production creative intent. Because of this, traditional broadcast facilities are doubling as postproduction facilities, with increased use of immersive and music-production functionality integrated into broadcast consoles. In a world with increasing amounts of content, we believe the listener requires enhanced experiences and personalization together with immersive workflows capable of delivering interactive content.
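The object-plus-metadata model described above can be sketched in a few lines: each audio object carries metadata (a label, a default gain, and whether the producer allows the user to adjust it), and the renderer applies listener preferences only where permitted, preserving creative intent. The field names here are illustrative assumptions, not the schema of any specific NGA standard.

```python
from dataclasses import dataclass

# Illustrative object-based audio model: each object pairs a sound element
# with metadata; personalization is applied at render time.
@dataclass
class AudioObject:
    label: str
    default_gain_db: float
    user_adjustable: bool   # producer decides what the listener may change

def render_gains(objects, preferences):
    """Resolve the playback gain for each object, honoring user overrides
    only where the producer flagged the object as adjustable, so the
    original creative intent is preserved elsewhere."""
    gains = {}
    for obj in objects:
        override = preferences.get(obj.label)
        if obj.user_adjustable and override is not None:
            gains[obj.label] = override
        else:
            gains[obj.label] = obj.default_gain_db
    return gains

mix = [
    AudioObject("commentary", 0.0, True),
    AudioObject("crowd", -6.0, True),
    AudioObject("music-bed", -12.0, False),  # locked by the producer
]
# The music-bed override is ignored because that object is not adjustable.
print(render_gains(mix, {"commentary": -20.0, "music-bed": 0.0}))
```

Shipping objects plus metadata rather than a single finished mix is also what enables the bandwidth and distribution benefits mentioned above: one set of objects can serve every personalized rendering.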
Next-generation audio (NGA) is about to reshape the media landscape through delivery of immersive, personalized experiences. The rise of NGA is a significant trend, offering enhanced dialogue intelligibility so that every word is heard with crystal clarity, from whispers to action scenes. This technology also provides customizable audio tracks, allowing users to select different commentators or language options. With more than a billion devices supporting NGA, the demand for extraordinary audio experiences is growing tremendously. Embracing NGA technology can help broadcasters meet this demand, transforming viewer and listener engagement with unparalleled sound quality.
Immersive audio continues to gain traction as broadcasters seek ways to enhance value and deliver premium experiences. UHD workflows, such as those supported by TSL’s PAM2-12G Audio Monitoring Unit, offer a future-proof option for managing 12G/UHD signal streams with precision across varied production setups, leveraging the reliable PAM platform. Similarly, IP-audio monitoring has seen significant growth, and the award-winning MPA1-MIX-NET-V-R has risen to meet these needs. Excelling in ST 2110-30 IP-audio monitoring, it offers intuitive multichannel mixing and ST 2022-7 redundancy. Built for adaptability, this system ensures seamless interoperability with third-party platforms and enables remote-monitoring capabilities.
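The ST 2022-7 redundancy mentioned above works by sending identical packet streams over two network paths and reconstructing one clean stream at the receiver: the first copy of each sequence number to arrive on either path is kept, so a loss on one path is covered by its duplicate on the other. The sketch below is a deliberate simplification — real implementations operate on timed RTP packets within a skew window — using plain `(sequence, payload)` tuples.

```python
def seamless_merge(path_a, path_b):
    """Simplified ST 2022-7-style hitless merge: keep the first copy of
    each sequence number seen on either path, producing one complete
    stream even when each individual path has losses."""
    seen = {}
    for seq, payload in sorted(path_a + path_b):
        if seq not in seen:
            seen[seq] = payload
    return [seen[seq] for seq in sorted(seen)]

# Path A drops packet 2; path B drops packet 4 -- the merged stream is
# still complete because each loss is covered by the other path.
path_a = [(1, "a1"), (3, "a3"), (4, "a4")]
path_b = [(1, "b1"), (2, "b2"), (3, "b3")]
print(seamless_merge(path_a, path_b))
```

This is why 2022-7 protection is "seamless": no retransmission is needed, so audio monitoring continues uninterrupted through a single-path failure.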
How do we provide the best spatial experience without complicating the production workflow, while maintaining high quality at every level, from immersive down to stereo?
The industry should monitor the rise of AI-driven tools, which are transforming audio production and enhancing audience experiences. These technologies automate workflows, optimize production efficiency, and elevate the quality of audio produced. Companies like Waves, a leader in audio-processing innovation, are driving these advances and shaping the future of next-generation audio tools.
Capturing audio immersively in the field is the next step in bringing the viewer/listener closer to the action. People want to feel like they are “inside” whatever event they are watching, whether it's a car race, baseball game, or concert. And although post-processing using immersive technology is good, such as with the many Dolby Atmos plug-ins on the market, the real heightened viewer experience comes from capturing the audio in an immersive format from the start. The resulting soundscape from using an immersive mic setup that effectively captures the event exactly the way the playback system is set up to reproduce it is a premium viewer experience that could set one broadcast head and shoulders above another.
Wohler sees an upgrade to the customer experience with the emergence of technologies like HDR and 4K from a video-quality perspective, as well as with immersive-audio technologies like Dolby Atmos and MPEG-H. These new technologies are designed to enhance the audio-visual experience for consumers of sporting events. Streaming platforms are able to customize viewer experiences and even ad presentations fully aligned with customer expectations. These new technologies demand aligned monitoring capabilities, and, keeping with our tradition at Wohler, we offer products to monitor immersive-audio signals like Dolby Atmos and MPEG-H, to complement our video-monitoring capabilities.