By Purvi Agrawal
December 12, 2025

Inside Sanas ASR-Optimized Noise Cancellation for Agentic AI

Noise Cancellation
From the Desk of Purvi Agrawal

Background noise is one of the biggest barriers to accurate voice AI. Traditional noise cancellation makes speech sound cleaner to humans but can make it less interpretable to machines: it strips out subtle phonetic cues that Automatic Speech Recognition (ASR) systems rely on, causing more errors instead of fewer.

Today we’re taking a deep dive into how Sanas addresses this challenge with ASR-Optimized Noise Cancellation: a noise cancellation system built specifically for machine understanding rather than human perception.

For anyone building or using voice AI systems, ASR-Optimized Noise Cancellation acts as a preprocessor that reduces mistranscriptions, expands accessibility, improves human-to-AI experiences, and strengthens real-world resilience in the noisiest environments.

In this article, we break down the core problem, explain how ASR-Optimized Noise Cancellation works, and show what it means for developers, enterprises, and end-users. You’ll also hear real audio samples and see transcript comparisons that demonstrate the impact in real-world conditions.

Curious how ASR-Optimized Noise Cancellation can strengthen your AI pipeline? Reach out to join the early access program.

The Challenge: Why Traditional Noise Cancellation Fails AI

Automatic Speech Recognition is the first link in many voice AI pipelines. When ASR mishears even a single word, every downstream step becomes less accurate and reliable — from natural-language understanding to text-to-speech.

To understand where noise cancellation fits in, it helps to look at a simplified voice AI pipeline.

Simplified Voice AI Pipeline

In a typical agentic AI system, three core models (Automatic Speech Recognition, Large Language Model, and Text-to-Speech) work together to enable two-way voice interaction.

  • ASR converts spoken language into text so the AI can understand it.
  • The LLM interprets meaning and decides how to respond.
  • TTS generates natural-sounding speech back to the user.

When background noise interferes — chatter, typing, traffic, machinery — the ASR system receives a degraded signal, leading to mistranscriptions. Once ASR makes an error, the rest of the pipeline propagates that error.

That’s the motivation for adding a noise cancellation module before ASR: to clean the audio and improve performance across the rest of the stack.
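The pipeline described above, with a noise cancellation step placed in front of ASR, can be sketched roughly as follows. This is a minimal illustration only: the `Denoiser`, `Recognizer`, `Responder`, and `Synthesizer` types and the `toy_*` functions are hypothetical stand-ins, not Sanas or vendor APIs.

```python
from typing import Callable

# Hypothetical stage signatures (assumptions for illustration only).
Denoiser = Callable[[bytes], bytes]
Recognizer = Callable[[bytes], str]
Responder = Callable[[str], str]
Synthesizer = Callable[[str], bytes]


def run_turn(audio: bytes, denoise: Denoiser, asr: Recognizer,
             llm: Responder, tts: Synthesizer) -> bytes:
    """Process one user turn: noise cancellation -> ASR -> LLM -> TTS."""
    cleaned = denoise(audio)  # preprocessor placed before ASR
    text = asr(cleaned)       # any error made here propagates downstream
    reply = llm(text)
    return tts(reply)


# Toy stand-ins so the sketch runs end to end.
def toy_denoise(audio: bytes) -> bytes:
    return audio.replace(b"#", b"")  # pretend "#" bytes are noise


def toy_asr(audio: bytes) -> str:
    return audio.decode("ascii")


def toy_llm(text: str) -> str:
    return f"You said: {text}"


def toy_tts(text: str) -> bytes:
    return text.encode("ascii")


with_nc = run_turn(b"he#llo", toy_denoise, toy_asr, toy_llm, toy_tts)
without_nc = run_turn(b"he#llo", lambda a: a, toy_asr, toy_llm, toy_tts)
print(with_nc, without_nc)
```

In the toy run, the "noise" byte left in place by the identity denoiser shows up unchanged in the final reply, which mirrors how an ASR error carries through the LLM and TTS stages.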

The Paradox: When Cleaning Hurts Clarity

The problem is that most noise cancellation systems were built for human perception, not machine comprehension.

They remove ambient sounds to make speech sound cleaner, but in doing so they also strip away the subtle details that ASR models rely on to distinguish similar-sounding words.

This mismatch happens because large ASR models are often trained on clean and real-world speech without denoising. When they encounter noise cancellation-processed audio, the acoustic patterns look unfamiliar, leading to mistranscriptions. Studies like Deepgram’s “Noise Reduction Paradox” have confirmed that conventional noise cancellation can actually reduce ASR accuracy, even when the audio sounds clearer to people.

That’s the core challenge Sanas set out to solve: designing noise cancellation that cleans the signal without destroying the features ASR depends on.

The Sanas Solution: ASR-Optimized Noise Cancellation Built for AI Understanding

To overcome this challenge, Sanas developed ASR-Optimized Noise Cancellation for Agentic AI. It's a system trained not just to remove background noise, but to understand how noise interacts with machine perception.

Unlike traditional noise cancellation that cleans audio for human perception, Sanas ASR-Optimized Noise Cancellation removes disruptive noise while preserving the phonetic and acoustic cues that ASR models rely on to understand speech accurately. The result is audio that might sound slightly less “perfect” to humans, but delivers dramatically lower Word Error Rates (WER) for AI systems.

How it works:

  • ASR-Optimized Training: The model is trained with a dual objective: suppress background noise while maintaining the frequency and energy patterns ASR models depend on for accurate transcription.
  • ASR-Agnostic Deployment: Sanas Noise Cancellation functions as an acoustic frontend, and can seamlessly plug into any ASR pipeline — open-source or commercial — without retraining or modifying existing infrastructure.
  • Real-World Optimization: The system is tuned for enterprise environments where speech clarity drives measurable outcomes, such as contact centers, healthcare, logistics, transportation, and automotive voice interfaces.

Sanas ASR-Optimized Noise Cancellation improves machine intelligibility so voice systems can respond accurately in more places, even when background noise is unavoidable.

In short: traditional denoisers improve how audio sounds; ASR-Optimized Noise Cancellation improves what AI can understand.

To make the results concrete, the Results section below includes audio examples from quiet, moderate, and loud environments that showcase how Sanas NC maintains intelligibility in challenging conditions.

Study Design and Evaluation

To measure the real-world impact of Sanas ASR-Optimized Noise Cancellation, our science team conducted a comprehensive evaluation across multiple speech-to-text systems and acoustic conditions. The goal was to determine whether Sanas ASR-Optimized Noise Cancellation could improve ASR accuracy, especially in noisy conditions and without regressing on already-clean audio.

Methodology at a Glance:

  • Position in the Pipeline: Sanas was applied before the ASR block, ensuring that the system received denoised yet acoustically intact audio input. This placement reflects how enterprises would deploy the system in production.
  • Datasets Evaluated:
    • Sanas Proprietary Dataset: Real-world recordings, balanced across accent, gender, pitch, speaking rate, and scenarios. This dataset was isolated from all training data to ensure an unbiased evaluation.
    • LibriSpeech “Clean Test” Dataset: A standard open-source benchmark used to confirm that Sanas ASR-Optimized Noise Cancellation maintains performance when input audio is already clean. This confirms that the model improves noisy speech without penalizing clean recordings.
  • Evaluation Metrics: We used Word Error Rate, the industry-standard measure of ASR accuracy. WER quantifies the percentage of words incorrectly recognized compared to a human-verified transcript, making it a practical indicator of downstream voice AI performance.
  • ASR Models Tested: To validate generalization, Sanas ASR-Optimized Noise Cancellation was evaluated across a mix of open-source and commercial ASR models, both offline and streaming:
    • Nova2: Deepgram Nova 2 (commercial)
    • AWS: AWS Transcribe (commercial)
    • Canary-1b-Flash (open-source)
    • ParakeetTDT_0.6bV2 (open-source)
    • Nova3: Deepgram Nova 3 (commercial)
    • DistilWhisper-Large V3.5 (open-source)
    • AWS-Str: AWS Transcribe Streaming (commercial)
    • Flux-Str: Deepgram Flux Streaming (commercial)
    • Nova2-Str: Deepgram Nova 2 Streaming (commercial)
    • Nova3-Str: Deepgram Nova 3 Streaming (commercial)

This cross-model approach demonstrates that Sanas ASR-Optimized Noise Cancellation is ASR-agnostic: it improves transcription accuracy regardless of which ASR engine is used, giving enterprises flexibility to use their preferred infrastructure.

By evaluating Sanas NC across multiple ASR systems, diverse noise conditions, and a mix of clean and real-world datasets, the study ensures that improvements reflect practical, end-to-end voice AI performance rather than laboratory edge cases. This approach shows whether the model strengthens ASR where it matters most: in unpredictable environments and across varied ASR architectures.
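For readers who want to reproduce the metric, Word Error Rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcript. The sketch below is a minimal implementation under stated assumptions: transcripts are already normalized (lowercase, no punctuation) and are tokenized on whitespace. It is not the evaluation code used in the study, and the `wer` function name is chosen here for illustration. The sample phrases are taken from the second audio example later in this article.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "the system added a late fee automatically"
hypothesis = "system added an eight feet automatically"
score = wer(reference, hypothesis)
print(f"{score:.1%}")  # 1 deletion + 3 substitutions over 7 reference words
```

When scoring a whole dataset, a common convention is to sum the edit counts over all utterances and divide by the total number of reference words, rather than averaging per-utterance WER values.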

The Results: Audio Examples

Before you listen: Sanas NC is designed to make speech clearer to AI systems, which sometimes means keeping subtle sounds that human listeners might perceive as noise. If the audio doesn’t sound “cleaner,” that’s expected — and it’s exactly why the ASR results are dramatically better.

Below are two representative audio samples showing how Sanas ASR-Optimized Noise Cancellation improves intelligibility where generic solutions reduce it.

Each sample includes:

  • Oracle Transcript — Human-verified, ground-truth transcripts
  • Source Transcript — Deepgram Nova3 Streaming ASR output on raw audio with no noise cancellation applied
  • Generic BVC Transcript — Deepgram Nova3 Streaming ASR output using a non-Sanas background voice cancellation (BVC) tool
  • Sanas ASR-Optimized NC Transcript — Deepgram Nova3 Streaming ASR output after Sanas ASR-Optimized NC

In the first example, Sanas ASR-Optimized Noise Cancellation yields a much lower WER than the unprocessed source and a moderately lower WER than the generic solution.

Oracle Transcript:

it was in the spring of the year eighteen ninety four that all london was interested and the fashionable world dismayed by the murder of the honorable ronald adair under most unusual and inexplicable circumstances the public has already learned those particulars of the crime which came out in the police investigation but a good deal was suppressed upon that occasion since the case for the prosecution was so overwhelmingly strong that it was not necessary to bring forward all the facts only now at the end of nearly ten years am i allowed to supply those missing links which make up the whole of the remarkable chain the crime was of interest in itself but that interest was as nothing to me compared to the incons inconceivable sequence which afforded me the greatest shock and surprise of any event in my adventurous life

Source Transcript (22.9% WER):

it was in the spring of the year one eight nine four that all london was interested and the fashionable world dismayed by the mother of the honorable rona adair under most unusual and those particulars of the crime which came out in the police investigation but a good deal was suppressed upon the occasion since the case for the prosecution strong that it was not necessary necessary to bring forward all the facts only now at the end of nearly ten years am i allowed to supplement these basic things which make the crime was of interest in itself but that interest was as nothing to me compared to the incon inconceivable sequence which afforded me the greatest shock and of any event in my adventurous life let's create magic

Generic BVC (13.2% WER):

it was in the one eight nine four that all london was interested and the fashionable world dismayed by the mother of the honorable rona adair and the most unusual and inexplicable circumstances the public has already learned those particulars of the crime which came out in the police investigation but a good deal was suppressed upon the occasion since the case for the prosecution was so overwhelmingly strong that it was not necessary to bring forward all the facts only now at the end of nearly ten years number allowed to supply these basic wings which make up the board of the remarkable chain the crime was of interest in itself but that interest was as nothing to me compared to the incomes inconsistencyable sequence which afforded me the greatest shock and surprise of any event in my adventurous life

Sanas ASR-Optimized NC (7.5% WER):

it was in the spring of the year one eight nine four that all london was interested and the fashionable world dismayed by the murder of the honorable rona adair and the most unusual and inexplicable circumstances the public has already learned those particulars of the crime which came out in the police investigation but a good deal was surprised upon the occasion since the case for the prosecution was so overwhelmingly strong that it was not necessary to bring forward all the facts only now at the end of nearly ten years am i allowed to supply these missing links of which make up the whole of the remarkable chain the crime was of interest in itself but that interest was as nothing to me compared to the inconceivable sequence which afforded me the greatest shock and surprise of any event in my adventurous life

The result: audio that may sound less “filtered” to the human ear but delivers significantly higher transcription accuracy across multiple ASR systems.

This next example shows a failure mode that generic noise cancellation or background voice cancellation solutions occasionally exhibit: they suppress the desired speech itself. Sanas ASR-Optimized NC, by contrast, preserves critical phonetic cues and improves WER.

Oracle Transcript:

thank you for calling a b c services my name is c j how can i assist you for today i'm happy to help you with that can you please provide your account number so i can pull up your information thank you let me take a look at this account alright i see the issue it looks like there was an additional charge for the late fees would you like me to explain that i see that your payment was due on the first but it was processed on the third the system added a late fee automatically i'm happy to waive that for you i'll make adjustment for now and you'll see the corrected balance on your next bill

Source Transcript (15.7% WER):

thank you for calling services my name is c j how can i assist you for today i'm happy to help you with that can you please provide your account number so i can pull up your information thank you let me take a look on this account alright i see the issue it looks like there was an additional charge for the late fees would you like me to explain that i see that your payment was due from the first but it was processed on the third system added an eight feet automatically empathy waived that for for now and you'll see the corrected balance on your next page week

Generic BVC (79.3% WER):

thank you for calling services my name is c j how can i assist you for today i'm happy to have a good weekend alright okay would you like

Sanas ASR-Optimized NC (7.4% WER):

thank you for calling a b u services my name is c j how can i assist you for today i'm happy to help you with that can you please provide your account number so i can pull up your information thank you let me take a look at this account alright i see the issue it looks like there was an additional charge for the leak fees would you like me to explain that i see that your payment was due on the first it was processed on the third the system added a late fee automatically m c a t lead that for you i'll make adjustment for now and you'll see the corrected balance on your next page

This comparison highlights why generic noise cancellation technologies often hurt machine performance, and how ASR-optimized design makes all the difference.
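A WER change can be reported either as an absolute difference in percentage points or as a relative reduction against the baseline, and the two can look very different. The short sketch below works both out for the two samples above, reading the three WER figures of each sample in transcript order (Source, Generic BVC, Sanas ASR-Optimized NC). The `relative_wer_reduction` helper is a name introduced here for illustration, and the article does not state which convention its aggregate figures use.

```python
def relative_wer_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline WER that was removed."""
    return (baseline - improved) / baseline


# (Source WER, Sanas ASR-Optimized NC WER) in percent, from the samples above.
examples = {
    "example 1": (22.9, 7.5),
    "example 2": (15.7, 7.4),
}

for name, (source_wer, sanas_wer) in examples.items():
    absolute = source_wer - sanas_wer
    relative = relative_wer_reduction(source_wer, sanas_wer)
    print(f"{name}: {absolute:.1f} points absolute, {relative:.1%} relative")
# example 1: 15.4 points absolute, 67.2% relative
# example 2: 8.3 points absolute, 52.9% relative
```

These two samples were chosen as illustrations, so their reductions are not expected to match dataset-wide averages.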

In real-world settings like call centers, hospitals, or drive-thrus where background noise is unpredictable, even small reductions in WER can have major downstream impact.

Fewer misheard words mean better understanding, faster issue resolution, and higher customer satisfaction. For enterprises, that translates into measurable operational improvements.

The Results: Quantified Improvements

Across evaluations, Sanas ASR-Optimized Noise Cancellation improved transcription accuracy on noisy audio without degrading performance on clean speech. 

  • 5–30% average improvement in Word Error Rate (WER) across multiple ASR systems on noisy data, primarily over streaming ASRs.
  • No degradation in performance on clean audio (e.g. LibriSpeech Clean Test).
  • Consistent gains from low to high noise levels: from mild background chatter to severe industrial noise.
  • ASR-agnostic performance across open-source and commercial systems.

These results demonstrate that ASR-Optimized Noise Cancellation enhances machine comprehension exactly where conventional denoisers fail.

The following charts capture how ASR-Optimized Noise Cancellation behaves across noisy enterprise data and clean test sets.

The first chart below compares average WER across multiple ASR systems when tested on Sanas’ real-world proprietary dataset. Each pair of bars shows the same ASR model with no noise cancellation and with Sanas ASR-Optimized Noise Cancellation.

The Takeaway: Sanas ASR-Optimized Noise Cancellation delivers a consistent ~5–30% improvement in WER across different ASRs on noisy, real-world speech data.

Sanas ASR-Optimized Noise Cancellation Metrics on Noisy Audio Samples

The next chart confirms that Sanas ASR-Optimized Noise Cancellation doesn’t degrade clean input using the open-source LibriSpeech Clean Test dataset.

The Takeaway: Even when audio is already clean, transcription accuracy remains stable with no loss in WER.

Sanas ASR-Optimized Noise Cancellation Metrics on Public Clean Audio Samples

Sanas ASR-Optimized Noise Cancellation doesn’t just make speech sound cleaner; it helps AI systems understand better.

By preserving the acoustic details that speech recognition models rely on, it bridges the gap between human engagement and machine comprehension.

For enterprises, clearer audio means smarter systems. With Sanas Noise Cancellation as a preprocessor before ASR, organizations gain higher transcription accuracy, fewer miscommunications, and more consistent performance across accents and environments. Small improvements in Word Error Rate can drive meaningful gains in trust, efficiency, and customer satisfaction.

For developers and researchers, this ASR-agnostic framework enhances accuracy without retraining. It demonstrates a new approach to denoising — one designed not just for perception, but for downstream performance.

More broadly, this breakthrough redefines what “clarity” means in the age of agentic AI. By improving the quality of what AI hears, Sanas ensures that voice technology remains powerful, inclusive, and truly human-centered.

Curious how ASR-Optimized Noise Cancellation can strengthen your AI pipeline? Reach out to join the early access waitlist.
