Clean Data: A Cyber Requirement for Industrial AI

The phrase “garbage in, garbage out” has been used for decades, but it has new weight in manufacturing AI. An AI model used for predictive maintenance, quality analysis, or process optimization relies on data from machines, sensors, historians, files, inspection systems, maintenance platforms, and production environments. If that data is compromised, manipulated, or introduced through an unsafe process, the model may produce outputs that lead teams in the wrong direction. During a session at IIoT World’s AI Manufacturing Day 2026, James Turner Jr., Mark Toussaint, and Itay Glick of OPSWAT explained why protecting the inputs feeding AI systems is as important as protecting the models themselves.

Data Poisoning Is a Manufacturing Risk

AI engines can be affected by data poisoning, where manipulated inputs produce results that appear legitimate but are based on corrupted information. A model could miss an equipment issue, flag the wrong asset for maintenance, misread process conditions, or support a poor production decision. The question manufacturers need to ask is how they would know that the output of an AI engine can be trusted if the engine, or the data feeding it, has been attacked.

Files Feeding AI Systems Need Scrutiny

Manufacturers often focus on live data from machines and sensors, but AI systems may also depend on files. These can include maintenance records, production reports, quality documentation, engineering files, configuration data, work instructions, compliance documents, or exported datasets.

Those files can become part of an AI workflow. They may be uploaded for analysis, used to train or tune a model, or combined with OT data to generate recommendations. If files are not scanned, sanitized, or validated, they can introduce risk into the AI environment.
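As a rough illustration of what a validation step in front of an AI workflow might look like, the Python sketch below rejects files whose type is not on an allowlist, that exceed a size limit, or whose SHA-256 hash does not match a value supplied by the sender. The allowlist, the size limit, and the function names are assumptions made for this example and were not described in the session. A check like this sits alongside malware scanning and sanitization and does not replace them.

```python
import hashlib
from pathlib import Path
from typing import List, Optional

# Hypothetical policy values; each site would define its own.
ALLOWED_SUFFIXES = {".csv", ".json", ".pdf"}
MAX_BYTES = 50 * 1024 * 1024


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: Path, expected_sha256: Optional[str] = None) -> List[str]:
    """Return the reasons to reject a file; an empty list means it passed."""
    problems = []
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"file type {path.suffix!r} is not on the allowlist")
    if path.stat().st_size > MAX_BYTES:
        problems.append("file exceeds the size limit")
    if expected_sha256 is not None and sha256_of(path) != expected_sha256.lower():
        problems.append("hash does not match the value supplied by the sender")
    return problems
```

A pipeline could call check_file on every incoming file and pass it on to scanning and ingestion only when the returned list is empty.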

A zero trust approach in manufacturing means scrutinizing everything, including the files feeding AI models.

This is especially relevant because file movement is common in manufacturing. Files move between vendors, engineers, maintenance teams, plant-floor systems, and enterprise platforms. If those files become part of an AI pipeline, manufacturers need to know where they came from, how they were checked, and whether they are safe to use.
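One way to keep track of where a file came from and how it was checked is to store a small provenance record next to it. The sketch below is a minimal Python example under assumed names: the ProvenanceRecord fields, the "vendor-portal" source label, and the check names are all hypothetical and would be replaced by whatever a given plant actually uses.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    file_name: str
    sha256: str       # fingerprint of the content at the time it was checked
    source: str       # who or what supplied the file
    checks: tuple     # names of the checks the file passed
    recorded_at: str  # UTC timestamp in ISO 8601 format


def record_provenance(file_name, content: bytes, source, checks) -> ProvenanceRecord:
    """Build a provenance record for a file that has just been checked."""
    return ProvenanceRecord(
        file_name=file_name,
        sha256=hashlib.sha256(content).hexdigest(),
        source=source,
        checks=tuple(checks),
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


def unchanged_since_recorded(record: ProvenanceRecord, content: bytes) -> bool:
    """Return True if the content still matches the recorded fingerprint."""
    return hashlib.sha256(content).hexdigest() == record.sha256


# Example with made-up values.
data = b"ts,vibration\n1,0.4\n"
record = record_provenance("pump_log.csv", data, "vendor-portal", ["antivirus", "schema"])
print(json.dumps(asdict(record), indent=2))
```

Before a file is used for training or analysis, unchanged_since_recorded can confirm that it has not been altered since it was checked.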

AI Can Help Defenders, but It Also Raises the Standard

AI can also help defenders analyze files, detect anomalies, identify vulnerabilities, and understand network behavior.

Predictive AI tools can look at files entering the OT network and quickly determine whether a file poses a risk before it is detonated, meaning executed in a sandbox for analysis. AI and machine learning can also analyze OT network traffic, learn normal patterns, and identify anomalies in traffic flow.
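The idea of learning normal traffic and flagging departures from it can be shown with a much simpler statistical stand-in. The Python sketch below is not the machine learning approach the speakers described. It assumes traffic is summarized as packets per minute, learns a baseline from the median and the median absolute deviation, and flags values that sit far outside it. The sample numbers and the threshold of 6.0 are illustrative assumptions.

```python
from statistics import median


def learn_baseline(samples):
    """Summarize normal traffic as (median, median absolute deviation)."""
    values = list(samples)
    center = median(values)
    spread = median([abs(x - center) for x in values])
    return center, spread


def is_anomalous(value, baseline, threshold=6.0):
    """Flag a value that lies more than `threshold` deviations from the median."""
    center, spread = baseline
    if spread == 0:
        # A perfectly flat baseline: any different value counts as a departure.
        return value != center
    return abs(value - center) / spread > threshold


# Made-up packets-per-minute counts from a period assumed to be normal.
normal_traffic = [120, 118, 125, 122, 119, 121, 124, 120]
baseline = learn_baseline(normal_traffic)

print(is_anomalous(123, baseline))
print(is_anomalous(400, baseline))
```

The median-based baseline is used here because a few unusual samples in the learning period shift it less than they would shift a mean and standard deviation.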

These capabilities improve speed and coverage. They do not replace the need for defined policies, clear data ownership, and repeatable validation processes.

Clean Data Requires a Clear Process

For industrial AI, clean data should not depend on informal habits or individual judgment. It needs a repeatable process.

Manufacturers should be able to answer:

What data sources are allowed to feed AI systems?
Which files can be uploaded or used in AI workflows?
How are files scanned or sanitized before use?
How is sensor or process data validated?
Who approves new data sources?
How are changes to data pipelines reviewed?
How does the organization detect manipulated, incomplete, or abnormal data?
What happens if a data source becomes unavailable or untrusted?

These questions help manufacturers move from informal AI experimentation to a more controlled industrial AI program.

The goal is not to make AI projects slower. The goal is to make the data behind them dependable enough for manufacturing environments, where decisions can affect uptime, quality, safety, and cost.
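Two of the questions above, how sensor data is validated and how incomplete or abnormal data is detected, can be made concrete with a few basic checks. The Python sketch below looks for missing samples, values outside an expected range, and readings that repeat for too long, which can indicate a stuck or replayed signal. The function name, the limits, and the sample readings are hypothetical, and a real process would need limits agreed with the engineers who own the asset.

```python
from typing import List, Optional, Sequence


def validate_readings(
    readings: Sequence[Optional[float]],
    low: float,
    high: float,
    max_flat_run: int = 5,
) -> List[str]:
    """Return a list of issues found in a series of sensor readings."""
    issues = []
    flat_run = 1      # length of the current run of identical values
    previous = None
    for index, value in enumerate(readings):
        if value is None:
            issues.append(f"sample {index}: missing value")
            previous, flat_run = None, 1
            continue
        if not low <= value <= high:
            issues.append(f"sample {index}: {value} outside [{low}, {high}]")
        if previous is not None and value == previous:
            flat_run += 1
        else:
            flat_run = 1
        # Report once, at the moment the run first exceeds the limit.
        if flat_run == max_flat_run + 1:
            issues.append(
                f"sample {index}: value repeated more than {max_flat_run} times in a row"
            )
        previous = value
    return issues


# Made-up temperature readings with one gap and one out-of-range value.
for issue in validate_readings([70.1, None, 250.0, 70.3], low=0.0, high=150.0):
    print(issue)
```

Checks like these do not prove that data is genuine, but they give a pipeline a consistent, reviewable point at which suspect data can be held back.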


This article is based on a session at IIoT World’s AI Manufacturing Day 2026, sponsored by OPSWAT. Speakers: James Turner Jr., Senior Solutions Engineer OT Cybersecurity; Mark Toussaint, Principal Product Manager; and Itay Glick, GM Hardware and OT Security, OPSWAT. Moderated by Tim Chase, Program Director, MFG-ISAC. AI tools were used to help summarize and organize the content. Reviewed and edited by the IIoT World editorial team.

Sponsored by OPSWAT. Editorially Independent.

FAQ

1. How should manufacturers protect AI data from cybersecurity threats?

Manufacturers should treat the data feeding AI systems as security-relevant inputs. That means scanning and sanitizing files before they enter AI workflows, validating sensor and process data, controlling which data sources are allowed, and monitoring for manipulated or abnormal data. Data poisoning, where compromised inputs produce misleading AI outputs, is a recognized risk in manufacturing environments that use AI for maintenance, quality, or production decisions.

2. What is data poisoning in manufacturing AI?

Data poisoning occurs when the data feeding an AI model is manipulated, corrupted, or introduced through an unsafe process. In manufacturing, this can cause a model to miss equipment issues, flag the wrong assets, misread process conditions, or support poor production decisions. The risk is that the AI produces outputs that appear legitimate but are based on corrupted information. Manufacturers need to validate data integrity as part of their cybersecurity program.

3. How should manufacturers validate files entering AI systems?

Files entering AI workflows should be scanned, sanitized, and validated before use. Manufacturers should know where each file came from, how it was checked, and whether it is safe for the AI environment. A zero trust approach means scrutinizing every file, whether it contains maintenance records, production reports, configuration data, or exported datasets. Dedicated scanning processes and clear ownership of data sources help reduce the risk of introducing unsafe files into AI pipelines.