Data from SCADA systems is incomplete

Continuing with our data problem blog series, we tackle the topic of incomplete data, another challenge many businesses face when considering digital technologies like SCADA systems to improve their business processes.

Incomplete data is distinct from the problem of missing data. Missing data is information that companies don’t collect because it’s either unavailable or too expensive to gather. Incomplete data, on the other hand, should be coming in, but it’s not. Digital infrastructure managers have put all of the necessary infrastructure in place, but some data comes in with holes.

What causes incomplete data?

There are several reasons this might occur.

Network downtime, caching issues, and inconsistent power are some of the primary causes of incomplete data. However, the biggest challenge with this particular data problem is that it can be hard to determine the source of the issue. We often don’t know why data is incomplete because the information we need to solve the problem in the first place is missing.

In other words, incomplete data comes with the following catch-22: We need data to know what the problem is, but we don’t have the data as a result of the problem!

An obvious example of incomplete data would be when a monitoring device loses power and stops reporting data. Analysts don’t receive what they expect because the sensor has gone dark.

Additionally, there’s no way of knowing what’s wrong with the device since it has failed altogether. Is it damaged? Did the broader network fail? Is it a localized issue? Diagnosing incomplete data may require in-person visits to the field, which is time-consuming and burdensome for the team.

At a time when we are producing 2.5 quintillion bytes of data every day, catching incomplete data can be difficult. We are inundated with quantitative information today. Losing data points here and there can easily go unnoticed without the right protections in place.

Legacy SCADA systems, especially, suffer from the incomplete data problem.

Today, many industrial operators have to push the limits of SCADA system network capacity to support requests from office-based personnel. Analysts and supervisors may ask for more data or different types of data, which puts added pressure on a SCADA system architecture that may not be explicitly designed for digital technologies.

Just as cellular towers may drop calls during periods of heavy usage, SCADA systems will drop data as they attempt to keep pace with bandwidth demands.

In addition to inefficient caching practices, SCADA systems are also limited by network outages and power failures in remote environments. As we’ve highlighted, connectivity lapses for any reason can cause incomplete data.

Who is affected by incomplete data?

First and foremost, incomplete data affects analysts who need information for modeling and visualizations. Without having a complete picture of how physical assets are performing, they can’t draw conclusions confidently. Results are skewed until all data streams are restored.

Executive leaders also can’t make strategically sound decisions unless data gaps are plugged. They have to delay and potentially miss out on new opportunities. Or, they move forward, bearing the risk of the unknown.

Finally, field workers are impacted as they are the ones who have to determine what’s wrong. Given the ambiguous nature of the incomplete data problem, solving issues can take a long time and prevent field personnel from executing other higher-value activities.

What do we need to consider?

There are three variables to consider with respect to incomplete data:

Power
Network
Caching

When designing IIoT networks, include backup power plans from the beginning. That way, when primary power goes down, devices stay up and running on secondary sources.

To mitigate network failures, ensure that all components are secure, configured correctly, and well-suited for digital projects. Downtime related to network failures can have major negative impacts on business outcomes.

As mentioned above, inefficient caching across monitoring solutions can be detrimental. You must choose a caching technique that is appropriate for your specific application so that no data is lost in transmission between sensors and the analytics engine.
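To make the caching point concrete, here is a minimal sketch of one common approach, often called store-and-forward: the device keeps each reading in a local queue and removes it only after the upstream system confirms receipt. This is an illustration, not a description of any particular product. The class name StoreAndForwardBuffer, the send callback, and the max_size limit are all hypothetical names chosen for the example.

```python
from collections import deque
from typing import Callable, Deque, Dict

Reading = Dict[str, float]


class StoreAndForwardBuffer:
    """Hold readings locally until the uplink accepts them (hypothetical helper)."""

    def __init__(self, send: Callable[[Reading], bool], max_size: int = 10_000) -> None:
        # Assumption: send() returns True only when the upstream system
        # has acknowledged the reading.
        self._send = send
        self._queue: Deque[Reading] = deque()
        self._max_size = max_size
        self.dropped = 0  # readings evicted because the buffer was full

    def submit(self, reading: Reading) -> None:
        """Queue a new reading, then try to drain the backlog in order."""
        if len(self._queue) >= self._max_size:
            self._queue.popleft()  # evict the oldest reading, but record the loss
            self.dropped += 1
        self._queue.append(reading)
        self.flush()

    def flush(self) -> int:
        """Send queued readings oldest-first; stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # link is down, keep everything for the next attempt
            self._queue.popleft()  # remove only after a confirmed send
            sent += 1
        return sent

    def pending(self) -> int:
        """Number of readings still waiting to be delivered."""
        return len(self._queue)
```

The design choice worth noting is that the buffer counts what it has to discard. A cache that silently overwrites old readings turns a network outage into an invisible gap, while a dropped counter at least tells analysts how much is missing. A real deployment would likely also persist the queue to disk so that a power failure does not empty it.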

It’s important for industrial businesses to have checks in place that enable supervisors to identify incomplete data immediately. Otherwise, network or device failures can persist long after they should have been fixed. Incomplete data can be hard to replace, even with sophisticated machine learning algorithms or simulations.
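One simple check of this kind is to compare the spacing between consecutive readings against the interval the sensor is supposed to report at. The sketch below assumes each reading carries a timestamp and that the expected reporting interval is known. The function name find_gaps and the tolerance parameter are hypothetical, picked for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def find_gaps(
    timestamps: Sequence[datetime],
    expected_interval: timedelta,
    tolerance: float = 1.5,
) -> List[Tuple[datetime, datetime, int]]:
    """Return (gap_start, gap_end, estimated_missing_samples) for each gap.

    A gap is any spacing between consecutive readings that exceeds
    expected_interval * tolerance (tolerance is an assumed slack factor
    to absorb normal jitter).
    """
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        delta = current - previous
        if delta > limit:
            # Estimate how many readings should have arrived in between.
            missing = round(delta / expected_interval) - 1
            gaps.append((previous, current, missing))
    return gaps
```

For a sensor that should report every minute, readings at minutes 0, 1, 2, 5, and 6 would yield one gap running from minute 2 to minute 5 with two estimated missing samples. Note that this only catches holes between readings that did arrive. A device that has gone completely silent needs a separate check against the current time.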

Maintain a dashboard of all network components, including sensors, gateways, modems, and computers, that alerts personnel in real time if something is wrong. That way, maintenance teams can address problems quickly and minimize incoming gaps in data.
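The alerting logic behind such a dashboard can be quite small. As a rough sketch, assume the monitoring system records the last time each component reported in. Anything silent for longer than an agreed threshold gets flagged. The function name find_silent_devices, the last_seen mapping, and the max_silence threshold are hypothetical names for this example.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def find_silent_devices(
    last_seen: Dict[str, datetime],
    now: datetime,
    max_silence: timedelta,
) -> List[str]:
    """Return the IDs of devices that have not reported within max_silence."""
    return sorted(
        device_id
        for device_id, seen_at in last_seen.items()
        if now - seen_at > max_silence
    )


# Example with made-up device IDs: a sensor that reported a minute ago is
# healthy, while a gateway silent for half an hour is flagged.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "sensor-1": now - timedelta(minutes=1),
    "gateway-1": now - timedelta(minutes=30),
}
silent = find_silent_devices(last_seen, now, timedelta(minutes=10))
```

The threshold is a judgment call that depends on how often each component normally reports. Setting it per device type, rather than using one value for the whole network, would likely reduce false alarms.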

What’s at stake for your business?

Companies pursuing digital transformation need to be aware of the incomplete data problem. Those without active monitoring, SCADA systems, or effective asset tracking capabilities are at risk of losing valuable intelligence about their digitized assets.

Digital manufacturing and smart factory applications are hard to optimize unless teams recognize the importance of collecting and protecting every critical data stream.

Below are several questions to help you assess your exposure to incomplete data:

Do you have insight into the status of every device and component in your networks?
Can your analytics engine flag or recognize incomplete data fields?
Does your team have the technical expertise to address network failures rapidly?

At WellAware, we understand the many types of industrial data challenges that organizations face today. We help businesses run successful pilots and scale their IIoT deployments to full production with a suite of data solutions.

Ready to solve your incomplete data problem?

Get a Demo of WellAware Today.