Risk 07 of 13 · AI Risk Series

Voice and video impersonation: when the call sounds right

AI can clone voices and produce convincing video, so a familiar caller is no longer proof of identity. High-risk requests need verification through a known channel before action.

Where it comes from: Public audio and video material plus business context lets attackers imitate an owner, executive, or manager convincingly.
What the business loses: The old habit of trusting a familiar voice or face for time-sensitive financial or access requests.
What ends it: Callback to a known number, two-person approval, and a pre-established verbal code phrase for high-risk requests.

The risky moment can look like a bookkeeper answering a call from the owner.

The owner is travelling. The company is waiting on equipment for a job. The phone rings, the caller ID looks familiar, and the voice on the line sounds like the owner: same pace, same impatience, same way of saying the bookkeeper's name.

"I'm on a borrowed phone. Mine died. I need you to get a wire out before the supplier closes."

A familiar voice can feel like proof. High-stakes requests still need verification through a known channel.

Voice cloning and synthetic video have weakened one of the oldest business habits: recognizing a person by sound or face.

Staff are used to trusting a familiar voice, especially when the request appears to come from an owner, executive, manager, client, or long-time vendor. AI makes that habit dangerous for requests involving money, banking, payroll, credential resets, account access, or other irreversible actions.

The reliable signal is still the action being requested. If a call, voicemail, video, or recorded announcement asks someone to move money, change payment details, approve a credential reset, disclose access, or bypass a normal process, it needs verification through a known channel.

What the risk is

Voice and video impersonation uses synthetic audio or video to make a person appear to say or request something they never said or requested. The target is the human verification habit: "I know that voice" or "I saw their face on the call."

Voice cloning is the more immediate SMB risk because it is cheap, quick, and works over ordinary phone calls or voicemail. Short clips of clean audio from podcasts, webinars, sales calls, voicemail greetings, LinkedIn videos, social media posts, or recorded meetings can be enough for some tools to imitate a person's voice.

Video impersonation is less common in routine SMB fraud, but the risk is moving into normal business tools. A fake participant may appear on a video call. A pre-recorded "CEO announcement" may ask staff to take an unusual action. A live video impersonation may be paired with a compromised account, making the meeting invitation or chat thread look legitimate.

Treat voice and video recognition as weak evidence for high-risk actions. A strange call may be ordinary, and a familiar call may still require verification.

Common patterns include:

  • A voice call from an owner or controller asking for an urgent wire transfer.
  • A voicemail from an executive asking staff to approve a banking change.
  • Caller ID spoofing paired with a cloned voice.
  • A video call where a familiar-looking participant asks for account access or payment approval.
  • A pre-recorded executive video that tells staff to ignore the normal process for one urgent item.
  • A compromised real account used to schedule a meeting where the audio or video is controlled by the attacker.

The earlier article on phishing and payment fraud covered text-based fraud in email, chat, invoices, portals, QR pages, and landing pages. This article covers the same recognition problem when the apparent proof is a voice or face. The article on meeting AI covered legitimate meeting bots and recording sprawl, which is a different meeting risk.

How it happens in a normal SMB

A small Alberta industrial services company has an owner who is visible in the local market. He has appeared on a trade podcast, recorded short videos for LinkedIn, left voicemail greetings, and joined sales calls that were recorded by prospects.

That public and semi-public audio gives an attacker enough material to imitate him. The attacker also gathers business context from the company website, project announcements, job postings, and social media. The company is hiring technicians, working on a site outside Edmonton, and waiting on equipment for a customer deadline.

On Thursday morning, the bookkeeper receives a call. The caller ID shows the owner's name because the attacker has spoofed the number. The voice sounds right.

"It's me. I'm on a borrowed phone because mine died. We have a supplier issue on the Edmonton job. I need a wire sent before noon or they will release the equipment to someone else."

The caller knows the project name, the supplier category, and the owner's travel schedule. He sounds irritated in the way the owner sometimes sounds when a job is at risk. He says he will send the banking details by email and asks the bookkeeper to move quickly.

An email arrives two minutes later with wiring instructions. The name on the message appears to match the owner. The sender address is wrong, but the bookkeeper is on her phone and sees mostly the display name. She has also just heard the owner's voice.

The bookkeeper starts the wire process. The bank requires a second approval, so she messages the operations manager: "Owner called. Equipment hold. Need second approval on urgent wire."

The operations manager has also heard the owner talk about the delayed equipment. He approves in the banking portal. The bank workflow now has two approvals and still no independent check of the caller. Nobody calls the owner's known mobile number or uses the company's verification phrase because the voice seemed to settle the question.

The real owner calls later that afternoon about a different matter.

He has no idea what wire they are talking about.

The business now has a payment-fraud incident. The phone call started it, and the approval process failed when voice recognition was allowed to authorize an irreversible action.

The failure path

The failure path looks like this:

  1. An attacker collects voice or video material from public posts, webinars, voicemail greetings, sales calls, recordings, or social media.

  2. The attacker gathers business context: roles, projects, travel, vendors, deadlines, payment habits, or reporting lines.

  3. AI helps create a voice clone, voicemail, video clip, or live impersonation that feels familiar.

  4. The attacker makes a high-risk request: wire transfer, banking change, payroll routing, credential reset, account recovery, or process bypass.

  5. Staff rely on the familiar voice, face, caller ID, meeting invite, or apparent account.

  6. The action is taken without callback to a known number or another approved verification channel.

  7. Money, account access, confidential information, or business authority goes to the attacker.

  8. The business discovers the fraud when the real person denies the request or the expected payment, access, or approval fails.

The same lesson from the previous article applies here. The request type drives the control. Text, voice, and video can all look authentic enough to fool a busy person.

Voice and video make the pressure worse because they feel personal. A staff member may hesitate to challenge a familiar voice, especially when the caller sounds annoyed, rushed, or senior. That is why the verification rule has to be normal before the call happens.

Business consequence

The first consequence is often direct financial loss.

In the industrial services company, the wire went to an attacker. The business still needs the equipment, still owes the real supplier if an invoice exists, and now has to work with the bank, insurer, IT provider, and possibly legal counsel to preserve evidence and attempt recovery.

The financial loss may be larger than in a text-only fraud because a familiar voice can push staff past the hesitation they might feel with email. The staff member may also feel responsible because they "heard the owner" and acted quickly. That can create blame inside the company when the better fix is a stronger approval process.

Other consequences can follow:

  • Payroll or vendor banking details may be changed after a convincing call.
  • A credential reset may give the attacker control of email, payroll, banking, or vendor portals.
  • A fake executive video may cause staff, clients, or media to believe leadership said something damaging.
  • A video meeting may be used to pressure a staff member into sharing a screen, approving access, or skipping a normal step.
  • Insurance questions may focus on whether the business had callback, dual-approval, and social-engineering controls.
  • Staff may stop trusting phone and video communication for legitimate urgent work.

There is also an evidence problem. The business may not have a recording of the call. Caller ID may be spoofed. A video meeting may have been scheduled through a compromised account. The investigation may depend on phone records, banking logs, email headers, meeting metadata, chat messages, and staff recollection.

Controls that interrupt the failure path

The first control is a verification rule for high-risk voice and video requests.

Voice or video can start a conversation. A high-risk action still requires callback to a known number, second approval, or another approved verification step. Staff need permission to pause and follow the process even when the voice sounds like the owner.

Start here

  • Require callback to a known number before acting on high-risk voice or video requests.
  • Use two-person approval for wire transfers, banking changes, payroll routing changes, and credential recovery.
  • Use a pre-established verbal code phrase as a supplemental check, especially when normal callback is temporarily unavailable.
  • Treat caller ID, display name, and meeting invitation source as weak signals.
  • Document who verified the request, which known number or channel was used, and who gave second approval.
  • Give staff explicit permission to slow down any request that sounds urgent and irreversible.
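The documentation and two-person items above can be sketched as a small record that only counts as complete when an independent check and a separate approver are both present. This is a minimal illustration in Python, not a feature of any banking or payroll product. The field names, the channel labels, and the `is_complete` helper are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical channel labels; a real policy would define its own list.
KNOWN_CHANNELS = {"callback_number_on_file", "in_person", "code_phrase"}


@dataclass
class VerificationRecord:
    request: str                    # what was asked for, e.g. "urgent wire"
    requested_by: str               # who the caller claimed to be
    handled_by: str                 # staff member who received the request
    verified_via: Optional[str]     # channel used for the independent check
    second_approver: Optional[str]  # person who gave second approval


def is_complete(record: VerificationRecord) -> bool:
    """Return True only when the request was independently verified
    and approved by a second, different person."""
    verified = record.verified_via in KNOWN_CHANNELS
    two_people = (
        record.second_approver is not None
        and record.second_approver != record.handled_by
    )
    return verified and two_people


# The scenario from this article: two approvals, but nobody checked the caller.
scenario = VerificationRecord(
    request="urgent wire",
    requested_by="owner",
    handled_by="bookkeeper",
    verified_via=None,
    second_approver="operations manager",
)
print(is_complete(scenario))  # False
```

The scenario fails the check even though two people approved, because a second approval that rests on the same unverified call adds no independent evidence.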

Add where needed

  • Require authenticated meeting join, lobby control, and participant approval for finance, leadership, legal, HR, security, or client-confidential calls.
  • Use identity-at-join questions for high-stakes video meetings, based on facts an attacker cannot find publicly.
  • Verify pre-recorded executive videos through a second channel before staff act on any unusual request.
  • Keep public executive audio and video exposure in mind when setting verification rules for owners and leaders.
  • Train finance, payroll, HR, executive assistants, and operations staff on voice-clone scenarios tied to their actual workflows.
  • Review banking and payroll approval limits so no single call can trigger a large irreversible action.

The callback rule should use known records. If the caller says their phone died and provides a new number, treat that number as part of the request. Call the number already on file or use another approved channel that the business already had in place before the call.
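As a rough sketch of the callback rule, the lookup below returns only the number recorded before the call and ignores whatever number the caller offers. The `NUMBERS_ON_FILE` list, the `callback_number` function, and the phone numbers are hypothetical placeholders for illustration.

```python
from typing import Optional

# Hypothetical contact list maintained before any call happens.
# The numbers in this example are fictional placeholders.
NUMBERS_ON_FILE = {
    "owner": "+1-780-555-0100",
}


def callback_number(
    person: str, number_offered_by_caller: Optional[str] = None
) -> Optional[str]:
    """Return the number already on file for this person.

    The number offered during the call is deliberately ignored: it is
    part of the request being verified, not evidence about the caller.
    Returns None when nothing is on file, which means another approved
    channel is needed.
    """
    return NUMBERS_ON_FILE.get(person)


print(callback_number("owner", number_offered_by_caller="+1-780-555-0199"))
# +1-780-555-0100
```

The function accepts the offered number only so the example can show it being discarded. In practice the same idea applies to a paper contact sheet: the list is written before the call and does not change during it.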

The code phrase should be boring and protected. Avoid public slogans, pet names, sports teams, family details, or anything likely to appear online. Change it when staff who know it leave the company or when the phrase may have been exposed.

Policy rule this creates

Rule 07 of 13

Wire transfers, payment redirects, banking changes, payroll routing changes, credential resets, account recovery, and other irreversible business actions require verification beyond voice or video. The same rule applies to requests for credentials, MFA codes, recovery codes, or access details. High-risk voice and video requests must be verified through callback to a known number, in-person confirmation, a pre-established code phrase, or another approved channel before action is taken. Staff are expected to pause and verify even when the caller sounds or looks familiar.
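For teams that want to see the rule as a decision, here is one possible sketch in Python. The action names and check names are hypothetical labels taken from the wording of the rule, and `may_proceed` is an illustrative helper rather than part of any real approval system.

```python
# Hypothetical labels drawn from the wording of the rule above.
HIGH_RISK_ACTIONS = {
    "wire_transfer",
    "payment_redirect",
    "banking_change",
    "payroll_routing_change",
    "credential_reset",
    "account_recovery",
    "share_credentials_or_codes",
}

# Checks the rule accepts. A familiar voice, a familiar face, and
# caller ID are intentionally absent from this set.
ACCEPTED_CHECKS = {
    "callback_known_number",
    "in_person",
    "code_phrase",
    "other_approved_channel",
}


def may_proceed(action: str, checks_done: set) -> bool:
    """Apply the rule: a high-risk action needs at least one accepted check."""
    if action not in HIGH_RISK_ACTIONS:
        return True  # the rule only covers high-risk actions
    return bool(checks_done & ACCEPTED_CHECKS)


print(may_proceed("wire_transfer", {"familiar_voice", "caller_id"}))  # False
print(may_proceed("wire_transfer", {"callback_known_number"}))        # True
```

The decision depends on the type of action and the checks completed. How convincing the caller sounded is not an input.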

