Risk 10 of 13 · AI Risk Series

AI-generated scripts and automation: when the answer runs in your business

The risky moment can look like a shortcut that works.

When AI gives staff a script, command, macro, or workflow to run, the output stops being advice and becomes execution. Mistakes land directly in business data.

Where it comes from: Staff treat AI as expert technical authority and run what it produces against systems they do not fully understand.
What the business loses: Files, records, or data damaged by automation that worked on a six-file test but failed at scale.
What ends it: Qualified review before AI-generated scripts run against business systems, tests on representative copies, and a confirmed rollback path.

Someone has a repetitive task: rename files, clean a spreadsheet, merge exports, update records, move documents, reconcile invoices, or build a small automation. They ask AI for help. The AI gives them a script, a command, a macro, a browser extension, a low-code flow, or installation steps.

The instructions are clear. The explanation sounds confident. The first test appears to work.

Then the person runs it against real business data.

That is the point where AI stops being advice and becomes execution.

If the script is wrong, the package is malicious, the command hits the wrong folder, or the automation behaves differently at scale, the damage lands directly in the business.

What the risk is

This risk is staff treating AI as expert authority on technical work and running what it produces without review by someone qualified to assess what it will actually do in the business's environment.

The output may be:

  • A PowerShell, Bash, Python, or JavaScript script.
  • A command copied into Terminal, PowerShell, Command Prompt, or an admin console.
  • An Excel macro, Office script, Google Apps Script, or SharePoint automation.
  • A Power Automate, Zapier, Make, or similar low-code workflow.
  • A browser extension, add-in, or desktop utility the AI recommends installing.
  • AI-generated application code shipped into a customer-facing system.
  • A package installation command such as pip install or npm install.

AI writes plausible technical instructions. Those instructions become dangerous when staff execute them against systems they do not fully understand.

AI can generate code without knowing:

  • Which folder contains production data.
  • Which files have backups or version history.
  • Which records are legally or contractually important.
  • Which applications use hidden IDs, links, metadata, or naming conventions.
  • Which user account has access to which systems.
  • Which package names are real and trustworthy.
  • Which command will behave differently on this endpoint, tenant, version, region, or configuration.

This is separate from the agentic AI risk. There, the AI takes the action itself. Here, a person takes the action by running the AI's output.

It is also separate from the developer workstation infrastructure risk. That article covers the persistent endpoint risk created when AI-driven work causes staff to install Python, Node.js, VS Code, package managers, browser automation, local model tools, or similar developer infrastructure. This article covers the immediate risk of executing the AI-generated output: the script, command, macro, installer, package, flow, or code.

Finally, it is separate from the wrong or mixed AI output risk, which covers AI output that is wrong as information. Here, the output is dangerous because it runs.

How it happens in a normal SMB

A small law firm decides to clean up a shared client folder before moving to a new document-management system. The folder contains years of Word documents, PDFs, scanned letters, signed agreements, draft filings, and client correspondence.

The office manager has been asked to rename old files into a consistent format:

YYYY-MM-DD - Client Name - Matter Number - Description

Doing it manually would take days. She asks the firm's sanctioned AI assistant for a PowerShell script that can rename the files automatically. She gives the AI a few examples of old filenames and the desired new pattern.

The AI returns a script and explains each step. It says the script will extract dates from filenames, preserve the client name, and rename each file into the new convention. The explanation sounds reasonable. The office manager has used AI successfully for email drafts, spreadsheet formulas, and policy templates, so she trusts the output.

She tests it on six copied files. The test works.

Then she runs it against the shared folder.

A non-malicious script can still be wrong in ways the test did not reveal.

The filenames vary in ways the sample did not show: two dates, no date, underscores, client short names, matter numbers that look like dates, and generic scanner names. The script handles the simple cases and fails silently on the edge cases.

Hundreds of files are renamed into ambiguous names. Duplicate outputs receive suffixes, some matter numbers disappear from filenames, and other documents move into the wrong year folder because the script picked the wrong date.

The office manager does not notice immediately. The script finishes, prints a success message, and the folder looks cleaner at first glance.

The problem shows up the next morning: a paralegal cannot find the right version of a filing package, another staff member finds a client letter under the wrong matter, and a partner realizes that several renamed files no longer contain the information needed to match them to the file record.

The business now has to stop normal work and reconstruct what happened.

The damage came from unreviewed automation running with the office manager's normal write access to a shared business folder.
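The edge cases in this scenario can be surfaced before any file is touched. The Python sketch below is a hedged illustration, not the script from the scenario: it builds a rename plan without renaming anything and sends filenames with zero or several candidate dates to a person for review. The function names, the date pattern, and the simplified target format (date plus whatever remains of the name) are assumptions made for the example.

```python
import re
from collections import Counter
from datetime import date

# Assumed pattern: YYYY-MM-DD, YYYY_MM_DD, or YYYYMMDD. An eight-digit matter
# number can match too, which is the ambiguity the plan is meant to expose.
DATE_PATTERN = re.compile(r"(?<!\d)(\d{4})[-_]?(\d{2})[-_]?(\d{2})(?!\d)")


def find_dates(filename: str) -> list[tuple[date, re.Match]]:
    """Return each match in the filename that is a real calendar date."""
    found = []
    for match in DATE_PATTERN.finditer(filename):
        year, month, day = (int(part) for part in match.groups())
        try:
            found.append((date(year, month, day), match))
        except ValueError:
            continue  # looks like a date (for example month 13) but is not one
    return found


def plan_renames(filenames: list[str]) -> dict[str, list[tuple[str, str]]]:
    """Build a rename plan without renaming anything."""
    plan, review = [], []
    for name in filenames:
        dates = find_dates(name)
        if len(dates) != 1:
            # Zero dates or several dates: a person decides, not the script.
            review.append((name, f"{len(dates)} candidate dates"))
            continue
        found, match = dates[0]
        rest = (name[:match.start()] + name[match.end():]).strip(" -_")
        plan.append((name, f"{found.isoformat()} - {rest}"))
    # Sources that map to the same target would overwrite each other or need suffixes.
    counts = Counter(target for _, target in plan)
    return {
        "plan": [pair for pair in plan if counts[pair[1]] == 1],
        "collisions": [pair for pair in plan if counts[pair[1]] > 1],
        "review": review,
    }
```

A reviewer would read the "review" and "collisions" lists before anything runs. On a folder like the one in this scenario, the size of those two lists is the warning the six-file test never gave.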

The failure path

The failure path looks like this:

  1. A staff member has a repetitive business task.

  2. AI produces a script, command, macro, installer, package recommendation, or automation flow.

  3. The output sounds competent and may work on a small sample.

  4. The staff member runs it against real business systems, shared data, client records, financial records, or production workflows.

  5. The AI output behaves incorrectly at scale, in edge cases, or in the specific business environment.

  6. Data is corrupted, moved, renamed, overwritten, exposed, deleted, duplicated, or changed in a way the business did not intend.

  7. The business discovers the issue through broken workflow, missing records, customer impact, accounting discrepancies, unusual access, or a failed downstream process.

  8. Recovery depends on backups, logs, version history, and whether anyone can reconstruct exactly what the AI-generated technical output did.

The dangerous part is confidence. Several successful AI-assisted shortcuts make the next shortcut feel routine. That is when staff stop treating the output as code and start treating it as instructions from an expert.

The risk is not limited to formal programming. A Power Automate flow that updates the wrong SharePoint library, a spreadsheet macro that strips leading zeros from account numbers, a Zapier workflow that sends customer records to the wrong destination, or an AI-recommended browser extension that reads every page can cause the same class of damage.
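The leading-zero case is easy to reproduce. The short Python sketch below uses made-up export data and a hypothetical helper name to show how converting account numbers to numeric values silently merges two different accounts.

```python
import csv
import io

# Made-up export: two distinct accounts that differ only in leading zeros.
EXPORT = "account,amount\n00123,50.00\n0123,75.00\n"


def load_accounts(text: str, as_number: bool) -> list:
    """Read the account column, either coerced to int (lossy) or kept as text."""
    rows = csv.DictReader(io.StringIO(text))
    return [int(row["account"]) if as_number else row["account"] for row in rows]


lossy = load_accounts(EXPORT, as_number=True)   # [123, 123]
safe = load_accounts(EXPORT, as_number=False)   # ['00123', '0123']
```

Nothing fails and nothing warns. The two accounts simply become one, and the error surfaces later in whatever system consumes the cleaned file.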

Business consequence

The first consequence is operational loss.

In the law firm example, staff must restore files from backup, compare restored files against the current folder, identify which work was lost after the backup point, and reattach documents to the correct matters. That can consume days of partner, paralegal, and administrative time. It can also create missed deadlines, awkward client conversations, and uncertainty about whether the recovered folder is complete.

Other consequences depend on what the automation touched:

  • Financial records can be duplicated, misclassified, deleted, or imported with the wrong fields.
  • Client records can be merged, overwritten, exposed, or attached to the wrong file.
  • Email automations can send the wrong attachment, or send to the wrong recipient list or customer segment.
  • CRM automations can change opportunity stages, owners, notes, or follow-up dates at scale.
  • Payroll or HR exports can be transformed in ways that are hard to detect until a downstream process fails.
  • Customer-facing code can introduce security defects, broken workflows, or misleading behaviour.
  • Package installation can pull malicious code from public registries, especially when AI recommends a hallucinated package name that an attacker has registered.

The evidence problem is practical. After the script has run, the firm may not know which files were touched, which records changed, which package was installed, which command was copied, or whether the AI chat still contains the exact output. If the user edited the script before running it, the final executed version may be lost.
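One way to reduce the evidence gap is to capture a record at the moment of execution. The Python sketch below is a minimal illustration under assumptions: the function name and record fields are hypothetical, and where the record is stored is left to the business.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_run_record(script_text: str, operator: str,
                     changes: list[tuple[str, str]]) -> str:
    """Return a JSON record tying the exact script that ran to the changes it made."""
    record = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "operator": operator,
        # The hash identifies the final executed version, including any manual edits.
        "script_sha256": hashlib.sha256(script_text.encode("utf-8")).hexdigest(),
        "script_text": script_text,
        "changes": [{"before": before, "after": after} for before, after in changes],
    }
    return json.dumps(record, indent=2)
```

Saving this record somewhere the script cannot write to means the firm can later answer which version ran, who ran it, and what it touched, even if the AI chat is gone.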

That makes this both a business-continuity problem and a security problem. The incident may be data corruption, unauthorized disclosure, credential exposure, malware installation, or a production defect, depending on what the AI-generated technical output did.

Controls that interrupt the failure path

Start with a simple rule: AI-generated technical output needs qualified review before execution, even when the AI explains it clearly.

Qualified review means IT, the MSP, a developer, or a designated technical reviewer who can read the script or automation, understand the system it will touch, and explain what will happen if it fails.

Start here

  • Require qualified review before AI-generated scripts, commands, macros, automations, installers, package commands, browser extensions, or code are run against business systems, shared resources, client data, financial data, regulated data, or production workflows.
  • Test on a representative copy before touching the live folder, live spreadsheet, live CRM, live accounting system, or live customer environment.
  • Confirm there is a current backup, version history, export, or rollback path before running any automation that can change or delete data.
  • Keep the final executed item: the script, command, macro, flow definition, package list, and AI prompt thread where practical.
  • Treat low-code automations as change-controlled systems when they can change client, financial, regulated, or operational data.
  • Limit write access to shared resources. A script can only damage what the user running it can write to.
  • Remove local admin rights from ordinary users so that software installation and system changes require an administrator.
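For file operations, a rollback path can be as simple as a manifest written while the change runs. The Python sketch below is one hedged way to do it for renames inside a single folder. The function names and the CSV manifest layout are assumptions, and it is meant to be tried on a representative copy first.

```python
import csv
from pathlib import Path


def apply_renames(folder: Path, plan: list[tuple[str, str]], manifest: Path) -> None:
    """Rename files inside folder, appending each completed rename to a CSV manifest."""
    with manifest.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for old_name, new_name in plan:
            source, target = folder / old_name, folder / new_name
            # Stop instead of overwriting: on some platforms rename replaces silently.
            if not source.is_file() or target.exists():
                raise RuntimeError(f"refusing to rename {old_name!r} to {new_name!r}")
            source.rename(target)
            writer.writerow([old_name, new_name])
            handle.flush()  # keep the manifest current if the run is interrupted


def rollback(folder: Path, manifest: Path) -> int:
    """Undo the renames listed in the manifest, newest first. Returns the count undone."""
    with manifest.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle))
    for old_name, new_name in reversed(rows):
        (folder / new_name).rename(folder / old_name)
    return len(rows)
```

The manifest also serves as the record of what was executed. It covers renames only. Anything that edits or deletes file contents still depends on backups or version history.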

Add where needed

  • Use package allowlists or controlled proxies for public registries such as PyPI and npm.
  • Do not install packages, tools, extensions, or add-ins just because AI recommends them.
  • Prefer managed automation platforms with logging over one-off scripts run from a user's workstation.
  • Require pull request, peer review, testing, and deployment controls for AI-generated code in customer-facing applications.
  • Disable or restrict Office macros, unsigned scripts, browser extensions, and add-ins by default.
  • Use separate test locations for file operations, import jobs, data cleanup, and bulk record updates.
  • Ask the MSP or technical reviewer to check destructive commands, recursive operations, wildcard paths, credential handling, external network calls, and package provenance.
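A package allowlist check can be sketched in a few lines. The Python example below is illustrative only: the allowlist contents and function names are hypothetical, it handles pip-style commands rather than npm, and it does not replace a controlled registry proxy. It only shows the idea of comparing what an AI-suggested command would install against what has been approved.

```python
import re
import shlex

# Hypothetical allowlist; a real one would be maintained by IT or the MSP.
APPROVED_PACKAGES = {"requests", "pandas", "openpyxl"}


def normalize(name: str) -> str:
    """Normalize a package name the way Python package indexes compare names."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unapproved_packages(command: str, approved: set[str] = APPROVED_PACKAGES) -> list[str]:
    """Return names from a pip-style install command that are not on the allowlist."""
    tokens = shlex.split(command)
    if "install" not in tokens:
        return []
    flagged = []
    for token in tokens[tokens.index("install") + 1:]:
        if token.startswith("-"):
            continue  # skip flags such as --upgrade; their values still get flagged
        # Drop version specifiers and extras: "pandas==2.2.0" becomes "pandas".
        name = normalize(re.split(r"[=<>!~\[;]", token, maxsplit=1)[0])
        if name not in approved:
            flagged.append(name)
    return flagged
```

For example, `unapproved_packages("pip install requests pandas==2.2.0 reqeusts")` returns `["reqeusts"]`, the kind of near-miss name that typosquatting and hallucinated package suggestions rely on. The check errs toward review: anything it does not recognize is flagged.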

The standard should scale with consequence. A formula on a personal scratch spreadsheet needs a lighter process than a script that touches client files, accounting data, payroll records, or production systems. AI can be useful for technical work, but execution needs review when the blast radius is real.

Backups help, but they may not show which work disappeared after the backup, which files changed later, which clients were affected, or whether downstream systems consumed corrupted data before the restore.
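A content snapshot taken immediately before the run helps close that gap. The Python sketch below is a minimal illustration with assumed function names. It hashes every file under a folder and compares two snapshots, so the business can list what went missing, what appeared, and what changed.

```python
import hashlib
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's path relative to folder to the SHA-256 of its contents."""
    return {
        path.relative_to(folder).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def compare(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """List paths that went missing, appeared, or changed content between two snapshots."""
    return {
        "missing": sorted(before.keys() - after.keys()),
        "new": sorted(after.keys() - before.keys()),
        "changed": sorted(
            path for path in before.keys() & after.keys() if before[path] != after[path]
        ),
    }
```

A renamed file appears once under "missing" and once under "new" with the same hash, so matching hashes across those two lists recovers the old-to-new mapping for files whose contents did not change. Reading every file is slow on large shares, so this suits a one-off check before and after a bulk operation.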

Policy rule this creates

Rule 10 of 13

AI-generated commands, scripts, macros, automations, installers, package-install commands, browser extensions, add-ins, and application code may not be run against business systems, shared resources, client data, financial data, regulated data, or production workflows without review by IT, the MSP, a developer, or a designated technical reviewer. Any automation that can change, delete, move, expose, or transmit business data must be tested on non-production data first and must have a documented rollback path. Public-registry package installation requires approved sources or allowlisting where available. Local admin rights are limited to roles that require them.

One of 13 rules for your AI usage policy

The rule above is one of 13 that make up a working AI Usage Policy. The SMB AI Policy Builder walks you through the full set of decisions and produces the policy, working documents, and a 90-day implementation plan.
