Enabling free-flowing data pipelines in industrial IoT settings
Automation. Optimization. Efficiency. These are key concepts in industrial settings. As organizations incorporate more and more Industry 4.0 technology and practices into their operations, the ability to realize these concepts effectively boils down to data. Industrial equipment generates a lot of data, but creating value from that data has always been a challenge. Proprietary data protocols and interfaces can inhibit the free flow of data between IT and OT components in an industrial setting. At best, the result is stasis; in reality, the improvements that so many modern tools and solutions promise fail to materialize because those tools can’t access the data they need. A way to liberate that data, dislodging the logjam so it flows freely into those modern tools, allows industrial operators to realize the advances they seek.
Industry 4.0 requires data
Simply having a lot of data doesn’t do anyone any favors. After all, data is data. The ability to act on data to derive insights is what makes it valuable. With the proper tools in place, more data generates more insights. Artificial intelligence (AI) and machine learning (ML) tools can be game changers in the industrial space. But to truly benefit from these technologies, organizations need data to train the models that power them. And those models require a lot of data. Seriously, a lot. And this is true on an ongoing basis. That’s one reason why having free-flowing data is so important.
Proprietary tools typically do a few things really well, but they can also create challenges. Because they are closed systems, interoperability immediately becomes an issue. One of the foremost challenges in achieving interoperability between time series databases and industrial equipment lies in the diverse and often proprietary nature of the IoT systems used in industrial settings. Manufacturers typically deploy a multitude of sensors and devices from different vendors, each with unique data formats and transmission protocols. Consequently, establishing a standardized communication framework becomes imperative for effective interoperability.
Interoperability is a critical aspect of modern data management strategies. The ability to use data in a nimble fashion enables businesses to harness the full potential of their data and drive transformative outcomes. Operators hoping that the vendors of these tools will add features to integrate with modern technologies likely have a very long wait ahead of them. Locking users into a closed system is great for the vendor, but it limits users to the development and innovation capacity of that single vendor.
Leveraging open source
Fortunately, adding open source tools to an OT stack can unlock the data siloed in these proprietary, legacy systems, creating much greater opportunity for interoperability. But this raises the question: what is the least invasive way to create usable, reliable, efficient data pipelines with open source tools? The answer will look different for every organization and tech stack, but it helps to think about the core processes that go into data management and to identify tools for each.
Devices and systems generate data, which then needs collecting. So, data collection is the first obstacle to overcome. This is also where the various device and firmware versions and protocols create potential challenges. An open source data collection tool, like Telegraf, is a one-stop shop in this situation. A plugin-based tool with 300+ plugins to choose from (and growing!), Telegraf has both input and output plugins for the most common industrial protocols, such as MQTT, Modbus, and OPC UA. Users can also write their own custom Telegraf plugins for unique or proprietary devices.
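As a rough illustration, a minimal Telegraf configuration that reads JSON telemetry from an MQTT broker and forwards it to InfluxDB might look like the sketch below. The broker address, topic pattern, organization, and bucket names are placeholders, not values from any particular deployment:

```toml
# Collect JSON telemetry from an MQTT broker
[[inputs.mqtt_consumer]]
  servers = ["tcp://broker.local:1883"]   # placeholder broker address
  topics = ["factory/+/telemetry"]        # placeholder topic pattern
  data_format = "json"

# Write the collected metrics to InfluxDB 2.x
[[outputs.influxdb_v2]]
  urls = ["http://localhost:8086"]
  token = "$INFLUX_TOKEN"                 # supplied via environment variable
  organization = "my-org"                 # placeholder org name
  bucket = "machine-data"                 # placeholder bucket name
```

Swapping the input for a different protocol (say, Modbus or OPC UA) is a matter of exchanging the input plugin block, which is what makes the plugin model a good fit for heterogeneous factory floors.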
After collecting data from devices, you need to store it somewhere. Many industrial environments use data historians. Like other legacy technologies, these systems often do their specific job well, but lack the connectivity, security, and interoperability that can free up data in the modern factory. A time series database can replace or augment a data historian, depending on your needs and use case. A purpose-built time series database, like InfluxDB, can provide a single datastore for the full range of time series data, whether that includes metrics, events, or traces. Built using open source solutions, InfluxDB is designed to integrate with virtually any system or tool.
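To give a concrete sense of how time series data lands in a datastore like InfluxDB, the sketch below builds a single point in InfluxDB line protocol by hand. The measurement, tag, and field names are invented for illustration; in practice a client library or Telegraf handles this encoding for you:

```python
import time

def to_line_protocol(measurement, tags, fields, ts_ns=None):
    """Render one data point as InfluxDB line protocol:
    measurement,tag=val field=val timestamp(ns)."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    parts = []
    for k, v in sorted(fields.items()):
        if isinstance(v, bool):
            # Booleans are written as lowercase true/false
            parts.append(f"{k}={str(v).lower()}")
        elif isinstance(v, int):
            # Integer fields carry an 'i' suffix; floats are written bare
            parts.append(f"{k}={v}i")
        else:
            parts.append(f"{k}={v}")
    field_str = ",".join(parts)
    ts = ts_ns if ts_ns is not None else time.time_ns()
    return f"{measurement},{tag_str} {field_str} {ts}"

# Hypothetical sensor reading from a CNC machine
line = to_line_protocol(
    "spindle", {"machine": "cnc-07", "line": "a"},
    {"temp_c": 61.4, "rpm": 12000}, ts_ns=1_700_000_000_000_000_000)
print(line)
# spindle,line=a,machine=cnc-07 rpm=12000i,temp_c=61.4 1700000000000000000
```

Note the nanosecond timestamp: the protocol's default precision is what lets full-fidelity, high-frequency equipment data keep its exact ordering.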
Just as important, it can ingest and query data in real time. Another area where legacy systems fall behind is their reliance on batched data. InfluxDB, by contrast, can ingest millions of data points per second and query them in real time. Operators know right away when something goes wrong. They don’t have to wait until their data historian processes the most recent batch of data, by which time the problem could already be much worse. InfluxDB also supports up to nanosecond precision, enabling users to take advantage of full-fidelity data. Remember those ML models that require all that data? A time series database provides the storage and performance that allow operators to utilize that data and build those models.
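The batch-versus-streaming difference can be sketched in a few lines. Assuming a hypothetical stream of temperature readings and an invented alarm threshold, the per-point check below flags a fault the moment the offending reading arrives, whereas a batch job would only surface it on its next scheduled run:

```python
# Hypothetical stream of (timestamp, temperature_c) readings
readings = [
    (1, 58.2), (2, 59.0), (3, 91.5),  # third point crosses the limit
    (4, 60.1),
]

TEMP_LIMIT_C = 85.0  # illustrative alarm threshold

def first_alert(stream, limit):
    """Evaluate each point as it arrives; return the first breach."""
    for ts, temp in stream:
        if temp > limit:
            return ts, temp  # alert immediately, no waiting on a batch
    return None

alert = first_alert(readings, TEMP_LIMIT_C)
print(alert)  # (3, 91.5)
```

In a real deployment the "stream" would be a live query or subscription against the database rather than a Python list, but the principle is the same: evaluation happens per point, not per batch.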
Actionable insights from this data come through data visualization. To see what your equipment is doing, an open source data visualization tool, like Grafana, provides a best-of-breed solution for querying data and creating dashboards. (InfluxDB natively integrates with Grafana.)
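A Grafana panel backed by InfluxDB is typically driven by a query. As a sketch, a Flux query like the following could chart average spindle temperature over the last 15 minutes; the bucket, measurement, and field names here are the same invented placeholders, not a prescribed schema:

```flux
from(bucket: "machine-data")            // placeholder bucket name
  |> range(start: -15m)                 // last 15 minutes of data
  |> filter(fn: (r) => r._measurement == "spindle")
  |> filter(fn: (r) => r._field == "temp_c")
  |> aggregateWindow(every: 1m, fn: mean)
```

Paste a query along these lines into a Grafana panel with an InfluxDB data source, and the dashboard refreshes as new points arrive.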
Put together, Telegraf, InfluxDB, and Grafana comprise the TIG stack, which has broad application across the IIoT/OT space. These open source tools can handle the volume and velocity of industrial data, in real time, to support Industry 4.0 initiatives. Their ability to integrate with a wide range of tools and technologies enables operators to collect more data and derive real value from it, fueling predictive analytics and maintenance, OEE, and automation. If Industry 3.0 centered on leveraging computers and automation in an industrial context, the next evolution in Industry 4.0 is using raw data to feed autonomous systems and train them to enhance manufacturing processes. The goal of automation at this level is to keep industrial production running efficiently and safely while minimizing downtime. Open source tools provide a cost-effective and extensible means to accomplish that.
About the author
Jason Myers is currently a Content Marketing Manager at InfluxData. He earned a PhD in modern Irish history from Loyola University Chicago. Since then, he has used the writing skills he developed in his academic work to create content for a range of startup and technology companies. When he’s not writing, you can usually find him playing music of some sort.