Confidential data entering AI: what happens after the prompt

Article 02 of 13

The risky moment is often ordinary.

Someone pastes a client email into an AI tool and asks for a summary. A manager uploads a draft contract and asks for cleaner wording. An accountant copies a financial statement into a chat window and asks for a plain-language explanation. A staff member uses an approved AI workspace because the business bought the right tier and told the team to use it.

The work may be legitimate, the tool sanctioned, and the account company-managed. The problem starts when the business assumes the data is gone because the tab is closed or because the vendor says the data is excluded from model training.

Once business content enters an AI tool, the vendor's data-handling architecture decides what happens next.

Retention settings, saved chats, memory features, subprocessors, support access, deletion options, and data residency all matter. Microsoft 365 controls, network filtering, and encryption inside the business no longer govern the data after it crosses into the AI vendor's systems.

The previous article, on Shadow AI, covered tools the business cannot see. This article covers a different problem: confidential data entering AI tools, including tools the business approved.

What the risk is

Confidential data entering AI means company, client, employee, financial, legal, or regulated information is entered into an AI system where the vendor's terms and technical design control what happens to it.

Approval is only the first question. The data-handling question is more specific:

What data can go into this AI tool, and what happens to that data after it enters?

The answer depends on the tool, the account tier, the contract, and the settings. A free personal account may have one set of terms. A paid individual plan may have another. A team plan may reduce training use but still retain chat history by default. An enterprise contract may offer stronger controls but still require configuration, review, and a retention decision.

For an SMB, the risk usually sits in a few places:

Saved chats and project workspaces keep client or company content long after the immediate task is done.
Memory or persistent-context features may store facts from previous conversations and reuse them later.
Vendor retention defaults may keep prompts, uploads, outputs, and logs longer than the business expects.
Subprocessors may handle storage, moderation, support, security, analytics, or model infrastructure.
Data residency may be unclear, especially when a product is sold locally but inference or support happens elsewhere.
Deleting one client's information may be difficult if it is mixed through months of general chat history.

None of this requires staff to be careless. It happens when the business gives staff a useful tool without deciding which data classes belong in it and without documenting the vendor's data-handling terms.

How it happens in a normal SMB

A small accounting firm pays for a team plan from a well-known AI vendor. The owner chose the team plan because the vendor says customer data from that tier is excluded from public model training. It was a responsible starting point and a clear improvement over staff using personal accounts.

The firm gives five staff members access. They use the AI workspace daily: summarizing client emails, drafting tax memos, reviewing financial statements, preparing plain-language explanations for owners, and drafting responses to Canada Revenue Agency (CRA) correspondence. Staff like it because it saves time and improves first drafts.

The tool becomes part of the firm's normal client work. A senior accountant creates one project for tax planning language. Another keeps a long-running chat for construction clients. The owner uses a saved thread to turn technical notes into client-ready explanations. Nobody is trying to hide anything. The business bought the tool, assigned seats, and told staff to use it.

The missing decision is data handling. The owner checked the training-use headline before buying, but the firm never documented the retention period, chat-history settings, memory features, deletion process, subprocessor list, or data residency position. The chat history stays on by default. Saved projects accumulate months of client information. Staff lack a simple rule for which client data can enter the tool.

The Shadow AI problem from the previous article is under control here. The unresolved question is what the vendor does with the data.

The failure path

The chain looks like this:

The business approves an AI tool and gives staff access.
Staff use it for real work because it saves time.
Client emails, financial details, HR material, contracts, or other confidential content enter prompts, uploads, saved chats, or projects.
The vendor's retention, memory, deletion, subprocessor, and residency rules govern that content.
The business has not documented those rules in a way it can explain to a client.
A client, insurer, buyer, regulator, or lawyer asks where the data went and whether it can be deleted.
The business can answer only in partial terms because it never mapped the tool to the data class.

The technical fact is simple: data entered into AI lives in the vendor's systems on the vendor's terms.

Vendor terms vary widely. Many sit in the middle: training use is addressed, while retention, memory, logs, support access, subprocessors, or deletion are still conditional. The business has to know which situation it is in before confidential data goes into the tool.

Business consequence

The first consequence usually shows up in client confidence.

The accounting firm's largest client is a regional construction company that represents about a third of annual billings. The controller calls one Friday before year-end and asks directly whether the firm uses AI tools on the company's information.

The owner says yes, in a controlled way. The firm uses a paid team plan.

The controller asks the follow-up questions:

Where does the AI vendor keep our information?
How long is it retained?
Is it used for training?
Can you delete our information from the AI tool if we ask?
Who at the vendor or its subprocessors can see our financial data?

The owner knows the tool has handled the construction company's work; staff use it daily. She chose the better tier because the vendor said the data was excluded from training. The unanswered items are retention, memory, processing location, subprocessors, and how to delete one client's information without deleting the firm's broader AI work history.

She tells the controller she will get back to her.

Over the weekend, she reads the vendor documentation. The answers are dense and conditional: settings that may be configurable, deletion options tied to whole chats or workspaces, logs that may persist for a period after deletion, a longer-than-expected subprocessor list, and residency language that is hard to match to the client's question.

On Monday, she sends a careful partial answer. The controller thanks her and asks for clearer assurance before the next engagement. The relationship continues with a different tone. The next renewal includes more security and vendor questions than usual, and the construction company asks two other firms how they handle AI data. When the work is reviewed later, the accounting firm is still invited to bid, but the AI answers have become part of a broader confidence problem.

The client's concern was the firm's inability to answer basic data-handling questions about a tool it used every day.

Other consequences follow:

Client questionnaires become harder to answer.
Contract terms around AI, data handling, deletion, or subprocessors become harder to meet.
Insurance and acquisition diligence ask questions the business cannot answer cleanly.
Future incidents become harder to scope because client data is spread through saved chats, projects, and vendor logs.
Staff may believe "no training" means "safe for all client data," which leaves retention and deletion questions unresolved.
Personal information may create privacy-law obligations if the AI use later becomes part of an incident.

For many SMBs, the first pain is commercial. A client loses confidence, a renewal becomes harder, or a bid tilts toward a firm that can explain its AI data handling with less scrambling.

Controls that interrupt the failure path

The first control is a data rule staff can understand. The business needs to decide which types of information may go into which AI tools.

Start here

Create a one-page AI data-use table: public, internal, confidential, regulated, and never-enter items.
For each approved AI tool, mark what is allowed, what is prohibited, and who can approve exceptions.
Review the vendor's data-handling terms before confidential or regulated data is allowed into the tool.
Treat passwords, API keys, tokens, recovery codes, and credentials as never-enter items.
Record the retention period, training-use status, memory settings, support access, subprocessors, deletion options, and data residency position.
Disable memory and persistent-context features by default until the business has reviewed the use case.
Set a chat-history purge cadence and make someone accountable for it.

Add where needed

Use endpoint DLP where available to detect confidential or regulated content moving into AI tools.
Use per-project or per-client workspaces only where the vendor supports deletion and ownership controls.
Tag conversations or projects by client or matter where the tool supports it.
Shorten retention settings where the vendor allows it.
Review subprocessor updates through a shared vendor-management mailbox.
Keep AI data-handling answers ready for client questionnaires and renewal conversations.

The table is the staff-facing control. It should tell a person whether the data in front of them can go into the AI tool they are using. If the rule requires staff to interpret vendor terms in the moment, it will fail.
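
One way to keep the table unambiguous is to write it down as a lookup that returns a single answer per tool and data class. The sketch below is a minimal illustration in Python, not a prescribed implementation: the tool names, the decision labels, and the entries themselves are hypothetical placeholders, and a real table would reflect the business's own review of each vendor's terms.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    REGULATED = "regulated"
    NEVER_ENTER = "never-enter"


# Hypothetical tools and decisions, for illustration only.
DATA_USE_TABLE = {
    "team-ai-workspace": {
        DataClass.PUBLIC: "allowed",
        DataClass.INTERNAL: "allowed",
        DataClass.CONFIDENTIAL: "ask-owner",
        DataClass.REGULATED: "prohibited",
    },
    "personal-free-account": {
        DataClass.PUBLIC: "allowed",
    },
}


def may_enter(tool: str, data_class: DataClass) -> str:
    """Return the table's decision; anything not listed is prohibited."""
    if data_class is DataClass.NEVER_ENTER:
        # Credentials and similar items are prohibited in every tool.
        return "prohibited"
    return DATA_USE_TABLE.get(tool, {}).get(data_class, "prohibited")


print(may_enter("team-ai-workspace", DataClass.CONFIDENTIAL))
```

Anything missing from the table comes back as prohibited, so a new tool or an unlisted data class fails closed until someone makes the decision.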

The tier decision needs its own review. A paid team plan may be the right minimum for ordinary business use, but the business should not assume the tier answers every question. Read the terms. Confirm the settings. Write down the answer in business language.

The deletion question deserves special attention. Before confidential client data enters an AI tool, the business should know whether it can delete a specific chat, project, client workspace, uploaded file, or memory entry. If the only practical deletion option is to purge an entire account or workspace, that limitation belongs in the decision record.
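
The decision record can be as simple as one structured entry per tool covering the seven data-handling items the business needs documented. The following sketch assumes a hypothetical record layout; the field names and example values are illustrative and do not describe any particular vendor.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class VendorDataHandlingRecord:
    """One entry per approved AI tool (hypothetical layout)."""
    tool: str
    tier: str
    retention_period: Optional[str] = None
    training_use: Optional[str] = None
    memory_settings: Optional[str] = None
    support_access: Optional[str] = None
    subprocessors: Optional[str] = None
    deletion_options: Optional[str] = None  # include the deletion level supported
    data_residency: Optional[str] = None


def undocumented_items(record: VendorDataHandlingRecord) -> list[str]:
    """List the fields that are still empty in the record."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]


# Example: only the training-use headline was checked before buying.
record = VendorDataHandlingRecord(
    tool="team-ai-workspace",  # hypothetical tool name
    tier="team",
    training_use="Excluded from public model training (example wording)",
)
print(undocumented_items(record))
```

A record with any undocumented item is a signal that confidential data should wait until the gap is closed.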

Policy rule this creates

Rule 02 of 13

Approved AI tools must have documented data-handling terms before confidential business data is entered, including retention, training use, memory, support access, subprocessors, deletion, and data residency. Client, employee, financial, legal, or regulated data may only be entered into tools approved for that data class. Passwords, API keys, tokens, recovery codes, and credentials must not be entered into AI tools. Saved AI content must be purged on a defined schedule.
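
A purge schedule is easier to keep when the overdue items can be listed mechanically. The sketch below assumes the business keeps, or can export, a simple inventory of saved chats and projects with a last-used date. The inventory format, the field names, and the 90-day window are assumptions for illustration, and no vendor API is implied.

```python
from datetime import date, timedelta


def chats_due_for_purge(chats, today, max_age_days):
    """Return inventory entries last used before the cutoff date."""
    cutoff = today - timedelta(days=max_age_days)
    return [chat for chat in chats if chat["last_used"] < cutoff]


# Hypothetical inventory maintained by the person accountable for purging.
inventory = [
    {"title": "Tax planning language", "client": "general", "last_used": date(2025, 3, 15)},
    {"title": "Construction clients", "client": "client-a", "last_used": date(2025, 6, 1)},
]

for chat in chats_due_for_purge(inventory, today=date(2025, 6, 30), max_age_days=90):
    print(chat["title"], chat["last_used"].isoformat())
```

Tagging each entry by client, where the tool supports it, also makes it possible to filter the same inventory when one client asks for deletion.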

Common questions about confidential data in AI

These are the questions that come up most often when a business starts deciding what client information can enter which AI tool.

Is it safe to put client information into ChatGPT, Claude, or other AI tools?

Client information can go into AI tools when the business has decided which tool is approved for which type of client information and has reviewed the vendor's data-handling terms for that data. Safety here is a property of three things together: the tool, the account tier or contract, and the specific type of information being entered. A paid business or team tier of a well-known AI tool is usually a reasonable baseline for low-risk drafting work. Confidential client information, regulated data, and credentials require more deliberate decisions, including retention period, memory settings, deletion options, subprocessors, and where the data is stored.

Does an AI vendor actually train on our business data?

Most paid business and team tiers of major AI vendors do not use customer data to train their public models, and the vendor terms usually say so explicitly. Training use is only one of several data-handling questions, and answering it on its own does not make the AI tool safe for confidential business data. The other questions are retention period, memory and persistent-context features, support access, subprocessor list, deletion options, and data residency. A business that has confirmed "no training" but not the other six is still relying on the vendor's defaults for most of what happens to its data.

Is paying for the business or team tier of an AI tool enough to make it safe for client data?

A paid business or team tier is usually a meaningful step up from staff using personal accounts, but the tier alone does not answer the data-handling questions that apply to confidential client data. The tier change typically improves training use, account ownership, and admin control. It does not change the business's responsibility to decide which client data may enter the tool, document the vendor's retention and deletion options, review subprocessors and support access, and configure memory and history settings. Confidential client data needs both the right tier and a documented decision about how that tier handles the data.

What types of business or client information should never go into an AI tool?

Passwords, API keys, tokens, recovery codes, multi-factor authentication codes, and any other credential should never be entered into an AI tool. Credentials are different from other confidential data because they grant access on their own. Once a credential has entered an external system, the safe response is to treat it as exposed and rotate it. Beyond credentials, the never-enter list depends on the tool and the business: regulated personal health information, client information under privilege, ownership and acquisition material, and confidential HR or payroll detail belong on the never-enter list for any AI tool the business has not specifically approved for that data class.
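
A lightweight check can catch some obvious credentials before text is pasted into an AI tool. The sketch below is illustrative only: the pattern list is a small hypothetical sample, it will miss many credential formats and may flag harmless text, and it does not replace endpoint DLP or the never-enter rule itself.

```python
import re

# Illustrative patterns only; not an exhaustive or authoritative list.
CREDENTIAL_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password assignment": re.compile(r"(?i)\b(password|passwd|pwd)\s*[:=]\s*\S+"),
    "bearer token": re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/-]{20,}"),
}


def find_credential_hints(text: str) -> list[str]:
    """Return the labels of any patterns that appear in the text."""
    return [label for label, pattern in CREDENTIAL_PATTERNS.items()
            if pattern.search(text)]


draft = "Can you tidy this note? Portal login password: example-only"
hits = find_credential_hints(draft)
if hits:
    print("Do not paste. Possible credentials found:", ", ".join(hits))
```

An empty result means only that none of the sample patterns matched, not that the text is free of credentials.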

Can the AI tool's memory feature leak information from one client's work into another's?

Memory features in AI tools can carry context from one conversation into later conversations within the same account, which means information from one client's work can surface in unrelated drafts, summaries, or answers later. The leak is usually subtle: a phrase, a number, a name, a client-specific framing, or a piece of context that the AI tool treats as background knowledge for the next request. The risk is highest in account-wide memory features that are on by default. The simplest control is to leave memory off until the business has reviewed the use case and decided whether persistent context is acceptable for the data classes the team works with. Per-client or per-project workspaces, where the tool supports them, are a stronger boundary than a shared memory across all client work.

If we delete a chat in our AI tool, is the data really gone?

Deleting a chat in most AI tools removes it from the user view, but vendor systems often retain copies for a defined period before final deletion. That period is set in the vendor's terms and is usually measured in days, so deletion is rarely immediate. Logs, abuse-monitoring records, and backups may follow a different schedule than the chat itself. Some tools support workspace-wide, project-wide, or account-wide deletion but not single-message deletion within a chat. A business that needs to delete one client's information should confirm, before the data enters the tool, that the vendor supports deletion at the level the business will need.
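
The arithmetic behind a deletion answer is simple once the vendor's periods are known. The sketch below assumes hypothetical post-deletion retention periods for three kinds of copies; the store names and day counts are placeholders and must be replaced with the figures in the vendor's own terms.

```python
from datetime import date, timedelta


def latest_persistence_date(deleted_on: date, retention_days_by_store: dict) -> date:
    """Return the last date any copy could persist after user deletion."""
    longest = max(retention_days_by_store.values())
    return deleted_on + timedelta(days=longest)


# Hypothetical periods, not taken from any vendor's documentation.
periods = {"chat copy": 30, "abuse-monitoring logs": 90, "backups": 60}

print(latest_persistence_date(date(2025, 1, 10), periods))  # 2025-04-10
```

The answer a client needs is driven by the longest-lived copy, which is often a log or backup and not the chat itself.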

Does it matter that the AI vendor is based outside Canada?

Where the AI vendor actually processes and stores the data matters more than where the vendor is headquartered. A vendor headquartered outside Canada may still process and store Canadian customer data in Canadian data centers, and a vendor headquartered in Canada may rely on subprocessors in other countries. The questions that actually answer the concern are where the data is processed, where it is stored, where backups live, and which subprocessors can access it. For most Alberta SMBs, Alberta's Personal Information Protection Act (PIPA) does not prohibit using AI vendors based outside Canada, but the business should document the data path and review any contractual commitments to clients about where their information is stored. Regulated industries and clients with explicit data-residency clauses in their contracts should treat this as a contract-by-contract decision rather than a general one.

What do we say when a client asks where their data goes after we used AI on it?

A clear answer names the AI tool, the account tier the business uses, the type of work the AI tool handles, and the vendor's documented data-handling terms for that work. The vendor's terms should be translated into business language the client can understand. Useful framing includes the retention period, the position on training use, the deletion options available to the business, the subprocessor list, the data residency position, and what the business can and cannot delete on request. A business that has documented these answers before the question arrives can respond with confidence. Documentation produced after the question arrives is usually slower, less complete, and visible to the client as catch-up work.

Turn this rule into a working AI policy

The Free AI Policy Kit turns the thirteen decisions from this series into editable documents: an AI usage policy, employee survey, tools register, incident checklist, and 90-day rollout plan.
