Audio Tech Focus: AI Has Plenty of Potential — and Potential Pitfalls — for Broadcast Sports
Most opine that AI has a place in production, but so do people
Story Highlights
—Artificial intelligence (AI) is either the greatest thing since sliced bread or the Y2K of the professional-audio sector. Or perhaps it could have an Oppenheimer effect: it could go either way, saving or destroying its creators while looking both menacing and alluring at the same time.
AI products designed for professional audio applications are already having an impact, such as Respeecher’s use for ADR (automated dialog replacement) in films and video. In music, television, and live event production, AI is being used to automatically mix audio — when it’s not being used to literally create it. In the process, it’s also threatening the employment of the carbon-based creators who increasingly deploy it.
More than anything, however, AI’s full potential for sound applications, including in broadcast and live sports production, remains ambiguous. A recent Sportico article about how the technology could be applied to FOX Sports’ recent Super Bowl production placed “AI” prominently in its headline, only to vaguely reference the use of machine learning — considered a subset of AI — at an indeterminate point in the future. AI has become its own meme, albeit a multibillion-dollar one.
SVG asked several audio gurus to assess AI’s potential impacts when it comes to the sound for broadcast sports. Here’s what they said.
The Need for Humans

Quintar’s Tom Sahara: “Companies will need to invest before AI will consistently produce tangible results.”
Tom Sahara, SVP, production technology, Quintar (a spatial-experience developer) and former VP, Turner Sports, sees both sides of the AI/audio coin. Its benefits include reducing the demands on an A1’s attention during games by, for instance, monitoring signal levels and applying level management in a deterministic, predictable manner or automatically mixing input sources for secondary uses, such as in-ear-monitors, translations and alternate languages. It can even improve existing auto-mixing processes by incorporating data from external and non-audio sources, such as tally, router activity, record-device status, and GPS.
In addition, automated lip-sync and delay adjustments can be stored on a channel-by-channel basis along with time, playlist/clip ID, physical location (GPS), router settings, and other metadata, enabling video sources with synchronization errors to be corrected without re-editing or constructing discrete workflows. Further, he says, IP-enabled audio devices will accelerate the advance of AI/ML (artificial intelligence/machine learning) because they sidestep A/D conversion, which is expensive and not easily integrated into legacy workflows.
On the other hand, Sahara observes, “there are myriad administrative, training, and support requirements that are not fully understood, and companies will need to invest before AI will consistently produce tangible results. For instance, training [AI-based] mixing and control agents to individual requirements can be expensive and time-consuming. We will have to see how DeepSeek-like approaches may affect this. And obtaining large numbers of training samples is difficult and can quickly exceed budget and time resources.”
More ominously, he adds, “Video hallucinations are easy to spot; however, audio is much more nuanced, making the verification process much more difficult. Humans still have to be involved.”
Chris Fichera, VP, U.S. operations, Calrec, is another observer who sees both sides of the AI coin, citing its ability to provide audio processing in real time to manage announcer commentary, crowd noise, effects, and on-field sound as well as to automate EQ adjustments and create immersive 3D mixes based on real-time data. He does note the possible danger in becoming too reliant on the automation capabilities in a fast-moving, unpredictable sports show.
However, he points out, those capabilities could help alleviate the looming loss of experienced A1s for broadcast sports, as retirements increase and the cohort continues to age out. “This could be very useful, particularly for an A1 with limited experience in doing a broadcast show.”
A Glass Mostly Full

AudioShake’s Suzanne Kirkland: “AI tools will enhance [human expertise], freeing audio professionals to focus on storytelling and fan engagement rather than labor-intensive cleanup.”
“Source separation, our bread and butter at AudioShake, is helping leagues and broadcasters navigate the complexity of live sports audio,” says AudioShake’s Suzanne Kirkland, “where crowd noise, commentary, and in-game sounds compete for attention. Our dialog-isolation model enhances transcription accuracy by isolating clear speech from multiple speakers in noisy environments, so that overlapping player, coach, and commentator dialog is captured with greater precision. That allows broadcasters to highlight what matters most, whether it’s action on the field or the sidelines.
“Music removal is another game-changer, helping teams and broadcasters avoid legal and monetization challenges,” she continues. “By stripping out copyrighted music while preserving speech and ambient sounds, our technology allows content to be shared more freely across platforms without risk of takedowns or licensing issues.”
However, AI is still not a magic bullet that will alone transform the industry. It won’t, she stresses, replace human expertise: “AI tools will enhance it, freeing audio professionals to focus on storytelling and fan engagement rather than labor-intensive cleanup. AI will help skim off the tedious work and give the people who know the fans and what they favor the opportunity to focus on creating and capitalizing on incredible content.”
It’s Already Happening

Salsa Sound’s Rob Oldfield: “More-advanced and more-efficient algorithms, coupled with hardware acceleration, have meant that real-time applications [of AI] are now possible.”
Salsa Sound’s Rob Oldfield points to some significant advances that have made deployment and development of the algorithms easier, as well as new approaches that have extended the scope of what can feasibly be achieved in real-time audio.
“Historically, AI for audio had been the preserve of non–real-time/offline applications,” he explains, “but more-advanced and more-efficient algorithms, coupled with hardware acceleration, have meant that real-time applications are now possible.”
Referencing the latency that AI’s processing can entail, he adds, “A great example of this is things such as automated closed captioning, translation, and revoicing, which [are] fast heralding new possibilities for accessible audio solutions, giving viewers access to multilanguage commentary feeds or audio description channels, which would have previously been too costly and personnel-intensive to produce at scale.”
Salsa Sound’s current plans include further development of autonomous mixing/production, Oldfield says. The UK-based company is also bringing to market a suite of automated quality-control tools that use machine learning to listen for features characteristic of specific audio faults or indicative of problems. These include detectors for wind noise, phase anomalies, glitches/pops, and other artifacts, as well as measures of sonic quality and speech intelligibility and keyword/language detection.
“There’s loads of things already possible and already happening with real-time audio AI,” he says, “but there is a lot more to come. It’s an exciting time to be in the live-sports industry.”
Be Careful What You Wish For

NBC Sports and Olympics’ Karl Malone: “I see AI for broadcast at the moment being ‘automated intelligence’ as long as there is someone leading it and not using [it] as a ‘set and forget.’”
“I see AI for broadcast at the moment being ‘automated intelligence’ as opposed to ‘intelligent,’” says NBC Sports and Olympics’ Karl Malone, citing Lawo’s KICK audio-mixing/ball-tracking technology currently deployed for soccer by the Bundesliga and FIFA. “I am a proponent of having some of our tasks in broadcast audio automated, as long as there is someone dedicated to the audio design of the production leading it and not using [it] as a ‘set and forget,’ because the ‘forgetting’ part is where we can run into problems.”
But the automated processes can have significant benefits. For instance, he suggests, they can be used to clean up announcer-mic channels in noisy sports venues or in the headsets of officials.
“And, when we get to more-personalized audio options for viewers of, say, motorsports,” he continues, “I can see automated or intelligent mixing of the audio stems from the A1 console into presentations matching the content. For example, choose an in-car camera and hear the ambience of that car, plus driver and crew communications, plus or minus program commentary. All those sources could be intelligently mixed using parameters that keep each presentation consistent with each other in mix quality, LKFS, and so on.”
Currently, Malone sees AI as another tool in the A1’s and sound supervisor’s toolbelt, albeit one more capable than just auto mixing and dynamic noise suppression. The future, however, might be a bit harder to predict, particularly as consumers come to expect more from their broadcast audio and as media companies look for ways to better engage them.
“Ultimately, the artificial-intelligence nature of products will evolve into the ability to mix full fields of play consistently,” he predicts. “But, as more content is required to be aired using the direct-to-consumer model, we in the audio community are going to have to start defining the parameters of any intelligent mixing processes we are interested in before the video-centric industry companies start releasing all the new shiny AI audio-mixing tools.”
In other words, he’s cautioning against the potential for the hyperbole around AI and broadcast sound to ultimately work against the quality of the very audio it purports to enhance.
Ray Bradbury might have agreed.