Richard Hill

Judgement for AI-mediated work

Category: Analytics

  • We’re not ready yet for Industry 4.0

    Computer security is a topic that frequently emerges when talking with a manufacturing business that is curious about “digital transformation” or “Industry 4.0”.

    I find that a lot of SMEs are sceptical about Industry 4.0, in that they cannot see a practical way forward to realise the potential benefits. They are just about persuaded that their business transaction databases can be made “secure”, but any talk of connecting machinery to a network, so that real-time data can be recorded for processing, is a step too far.

    Their fears are not unfounded. Any business must protect its operations from the leakage, loss and misuse of its data. If we start bolting sensors onto factory plant, and connecting these sensors to other systems to enable greater efficiencies, we increase the number of opportunities for process data to be exposed. And process data is where the intellectual property (IP) of many manufacturers lies.

    A lot of effort has been expended in the development of secure communication networks and computer systems, but the information breaches that become public only serve to reinforce the security fears of SMEs.

    When your competitive edge is defined by your process IP, you will be especially motivated to protect it.

    A machine operator discussing their work over a beer at the local bar might reveal some information that gives a clue as to what the business does differently. But this is still relatively benign next to the situation where you have access to all of the data that is being generated by a factory.

    You can do a lot with data, and it becomes easier to recreate a scenario the more data you possess. So, if you can gain access to that data, you can act in a more informed way.

    However, Industry 4.0 is not just about collecting data. It is also about predicting the future using in-process forecasting; developing enhanced methods of visualising complex data to aid comprehension; using computational resources to automate and delegate physical actuation; and also to create models of the future that can be reasoned with to improve coordination, scheduling and resource utilisation.

    So, what do manufacturers need?

    Common installations of Industrial Digital Technologies (IDT) require sensors of various types, embedded systems to process the captured data, and associated networking infrastructure to transport the data to a centralised storage and processing facility, typically a cloud, which may reside off the premises.

    Making better use of the computational resources at the endpoints of networks has given rise to “Edge Computing” where more of the data processing is “pushed” towards the devices at the edge of a network. This has become possible since hardware is continuously becoming more capable and less costly to deploy.

    Edge Computing has much to offer manufacturers, particularly the ability to process data much closer to the process operation itself. However, while some processing can occur at the edge, more significant processing still requires more capable hardware, and if that hardware is provided via a cloud, an encrypted connection to that cloud will be required so that the analysis can be done.

    Since remote cloud resources are a source of security fears for manufacturers, there is a need to provide the capability to perform Industry 4.0 analytics and visualisation within the confines of an organisation’s firewall.

    How can the research community respond to this need?

    Microservices Architecture is one approach to the development of agile and robust software systems that may be suited to Edge Computing environments. As we place greater demands upon our systems, and ask them to perform functions that were not part of the original requirements specifications, there is a need for systems that can scale elastically, much as clouds do for utility computing.

    Such architectures may also support the development of capabilities that engender trust between smart objects. Manufacturing systems contain many physical objects, some of which interact with each other. If we want to automate the logistics of objects within a manufacturing value-chain, we shall need to ensure that there are workable trust mechanisms so that the correct interactions can take place.

    The certificate authority model of trust cannot scale for a world of smart objects, and this is a driver for research into multi-party authentication schemes, as well as distributed ledger approaches to data and identity provenance.

    Once we can confidently deliver insightful analytics and automation within a manufacturer’s firewall, I’m sure that the uptake of IDT will accelerate.

  • Applied analytics for Cyber Physical Systems

    From the National Institute of Standards and Technology (NIST) definition of a Cyber-Physical System:

    “Cyber-Physical Systems or “smart” systems are co-engineered interacting networks of physical and computational components. These systems will provide the foundation of our critical infrastructure, form the bases of emerging and future smart services, and improve our quality of life in many areas.”

    How do we know this?

    The broad answer to this question lies in the field of analytics.

    Analytics makes use of data, information technology, statistical analysis, quantitative methods, mathematical and computer-based models to assist the discovery of insight (typically patterns) so that we can make decisions based on fact. Business Intelligence, which has been a popular application for organisations to purchase, is an example of a function that requires data analytics for it to operate.

    Other examples of domains suitable for analytics include:

    • Customer Relationship Management (CRM);
    • Financial and marketing activities;
    • Supply chain management;
    • Human resource planning and monitoring;
    • Pricing decisions;
    • Sport science strategies – “marginal gains”

    Business analytics

    There is an explicit link between business analytics and the profitability and revenue generation of a business, as well as the strength of return to shareholders. The activities of analytics enhance the comprehension of data, enabling a business to remain competitive. At its most basic, analytics facilitates the creation of informative reports to assist decision making.

    There are four categories of analytics:

    1. Descriptive analytics – using historical data to understand the past; “what has happened?”
    2. Diagnostic analytics – using data to find the root cause of an event; “what made it happen?”
    3. Predictive analytics – evaluating historical performance to produce models that can predict behaviour in the future; “what will happen?”
    4. Prescriptive analytics – using optimisation techniques to direct actions and simplify decision making; “how do we make it happen?”

    For example, a retail organisation clears seasonal stock with a sale. The question is:

    “when do we reduce the price and by how much to maximise profits?”

    Descriptive analytics examines the historical data for similar products and reports prices, units sold, advertising campaigns, etc.

    Diagnostic analytics identifies key combinations of events and activities that produced recognisable behaviour.

    Predictive analytics creates a model to present a number of possible future scenarios.

    Prescriptive analytics finds the best sets of pricing and advertising to maximise sales revenue.

    Descriptive analytics simply tells the consumer what is happening and makes relationships visible, such as where a breakeven point might lie in relation to the current results. Such systems do not instruct a manager what to do, and have been the core of traditional Business Intelligence functions for some time. Typical examples are reports, Online Analytical Processing (OLAP) dashboards and other data visualisations.

    Decision modelling

    Central to the prediction of future performance is the construction of decision models. A model is often a mathematical abstraction or representation of a real system, idea or object that captures the most important features of reality. It can be described in writing or verbally, but for the purposes of automation is best described mathematically.

    This model is used to understand, analyse and facilitate decision making from data that will contain controllable variables (decision variables) and uncontrollable variables.

    Predictive decision models often incorporate uncertainty to help inform managers so that they can analyse risk. The aim is to predict future behaviour based on different scenarios, and is thus shaped by imperfect knowledge of what will happen, otherwise referred to as uncertainty. Risk is associated with the consequences of what actually happens.

    The models use algorithms for regression analysis, machine learning and neural networks, all of which are mature and have been tested in many different domains. One business example of predictive modelling is that of marketing. Used in conjunction with descriptive analytics visualisations, marketeers can interpret the outputs from predictive and prescriptive analytics so that they make the most informed decisions possible.

    Since we want to identify the “best” solution, we need to find the values of the decision variables that minimise cost (or maximise profit, etc.). This is known as optimisation, and is represented by an objective function. The constraints of the model represent the limits of the domain to be modelled, and the optimal solution is the set of values of the decision variables at the minimum (or maximum) point.
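
    As a toy illustration of these ideas, the sketch below maximises profit over a single decision variable (price), assuming a made-up linear demand curve; real models would be fitted to historical data, and a solver would replace the grid search.

```python
# Illustrative only: demand is assumed linear in price, demand = a - b * price,
# with invented coefficients. Price is the decision variable; the price range
# acts as the model's constraint.
def profit(price, unit_cost=4.0, a=1000.0, b=50.0):
    """Profit at a given price under the assumed demand curve."""
    demand = max(a - b * price, 0.0)   # units sold cannot be negative
    return (price - unit_cost) * demand

# Grid search over the feasible price range to locate the optimum.
prices = [p / 100 for p in range(400, 2001)]      # £4.00 to £20.00
best_price = max(prices, key=profit)
print(f"optimal price: £{best_price:.2f}, profit: £{profit(best_price):.2f}")
# prints: optimal price: £12.00, profit: £3200.00
```

    With these coefficients the objective function is a downward parabola, so the grid search and the analytic optimum agree; in practice a library optimiser (e.g. scipy.optimize) would be used.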

    Prescriptive analytics

    Commonly referred to as advanced analytics, prescriptive analytics uses predictive models to recommend an optimal set of activities. For instance, an organisation that uses scarce resources, or whose business involves perishable goods, will be interested in identifying the activities that can best manage its operations.

    Analytics-led activities enable us to tackle complex problems by providing individualised solutions. Each of the products and services modelled can be organised around the needs of individual stakeholders, and is suited to scenarios where the value of interactions with stakeholders is high.

    These approaches become more important as the volume of interaction between stakeholder agents increases; one pertinent example is the rise in IoT devices that are all individual actors in a complex CPS.

  • The importance of analytics

    Everybody is talking about analytics. Together with Artificial Intelligence (AI), it will apparently solve all of our business problems.

    Analytics sounds like analysis, so it is natural to make the comparison to try and understand what is different between the two.

    Analysis is defined as “the process of breaking a complex topic or substance into smaller parts in order to gain a better understanding of it”. In common parlance, analysis often means the act of using quantitative statistics to explain or discover something of interest.

    Analytics is explained as “the discovery, interpretation, and communication of meaningful patterns in data. It also entails applying data patterns towards effective decision making”.

    At first glance, there isn’t much of a difference between these two statements, which I am sure does not help people understand any distinction between the two terms.

    For me, analysis remains the core activity of reducing the complexity of data so that it can be comprehended. Analytics is much broader than this, as not only does it include the methods and tools required to create a platform for analysis to take place, it also includes the context in which the data to be analysed resides. Analytics is the scientific thinking and processes behind analysis, the whys and wherefores, and therefore analysis is a component of analytics.

    In the manufacturing domain at least, predictive analytics is very topical, and in the broader business domain in general, decision analytics is popular with enterprise software vendors.

    Predictive analytics is essentially forecasting, which itself is a mature statistical subject. It is a human trait to want to understand and plan for the future, and considerable research and experience has developed knowledge in this field.

    Being able to predict behaviour using a set of input variables enables transport operators to replace service items more economically than traditional planned maintenance schemes allow. Similarly, while two different items of plant may both utilise the same bearing type, the difference in work loading on each machine may result in different wear patterns and therefore different service lives.

    It is therefore more prudent to replace either bearing when it is predicted to be approaching failure, rather than on a pre-determined date. Such a scenario is described as predictive maintenance, and often includes the topic of condition monitoring.
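
    A minimal sketch of that idea, using hypothetical wear readings and an assumed failure threshold: fit a least-squares line to the readings and extrapolate forward to estimate when the threshold will be crossed.

```python
# Hypothetical wear readings (mm) logged every 100 machine-hours; the
# 0.8 mm failure threshold is an assumed value, not a real specification.
hours = [0, 100, 200, 300, 400]
wear = [0.10, 0.18, 0.27, 0.35, 0.42]

# Ordinary least-squares slope and intercept via the closed-form expressions.
n = len(hours)
mean_h = sum(hours) / n
mean_w = sum(wear) / n
num = sum((h - mean_h) * (w - mean_w) for h, w in zip(hours, wear))
den = sum((h - mean_h) ** 2 for h in hours)
slope = num / den
intercept = mean_w - slope * mean_h

# Extrapolate the fitted line to the assumed failure threshold.
FAILURE_THRESHOLD = 0.8
predicted_failure_hour = (FAILURE_THRESHOLD - intercept) / slope
print(f"replace bearing at ~{predicted_failure_hour:.0f} machine-hours")
```

    The same extrapolation applied to each machine’s own readings yields a different replacement point per bearing, which is the essence of condition-based replacement over a fixed schedule.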

    Businesses often want to identify segments in their customer base, in order to develop ideas for innovative products and services that might appeal to those customer types. This requires analysis that can take a collection of data and identify the characteristics that enable that data to be classified into discrete groups. This is referred to as decision analytics.
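
    As an illustrative sketch of such grouping, the following clusters made-up annual-spend figures into two segments with a naive one-dimensional k-means; a real segmentation would use many features and a library such as scikit-learn.

```python
# Invented annual spend (£) per customer; two segments are visible by eye.
spend = [120, 150, 130, 980, 1020, 140, 995, 160]

def two_means(data, iters=10):
    """Naive one-dimensional k-means with k=2 clusters."""
    c1, c2 = min(data), max(data)          # initial cluster centres
    for _ in range(iters):
        # Assign each point to its nearest centre...
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        # ...then move each centre to the mean of its group.
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

occasional, frequent = two_means(spend)
print(occasional)   # low-spend segment
print(frequent)     # high-spend segment
```

    Each discovered segment can then be examined for the characteristics that distinguish it, which is the input a marketing team needs.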

    Taken together, both predictive analytics and decision analytics are inherent parts of digital manufacturing, and are thus commonly referred to when discussing Industry 4.0.

    Inexpensive hardware is assisting the adoption of Industrial Internet of Things, and this is creating a deluge of data that needs to be analysed and visualised in ways that aid its comprehension.

    Analytics thus helps us not only understand the insight that lies within data, but it also assists how we cope with increased data volume by way of providing the tools, platforms and methods to manage and analyse that data.

  • Simple sensing is great value

    In a previous life I was a production manager. One of the frustrations of such a role is the feeling that more control and coordination could be exerted “if only I had the data for…”. Modern manufacturing plant often provides either local instrumentation, or the remote logging of its operational data, but if we want to think about integrating plant, and therefore think about optimising operations of the whole system, there is usually some basic information that is missing.

    In the late 1990s, transducers for sensing were expensive, and the computational resource required to deal with the sensing data was similarly difficult to justify. This situation was compounded by the general lack of network infrastructure, which was essentially pre-WiFi. Radio links were available, but they were a) costly and b) prone to interference or limited in transmission range.

    Fast forward to recent times, where:

    • transducers are cheap;
    • microprocessors are cheap, more than capable for signal conditioning and limited data storage, and easy to network;
    • network availability is pervasive via wired, WiFi, Bluetooth, 4G, etc. connections.

    What does this mean for the production/operations manager who still has the same answer of “if only I had …” when they are looking for ways to increase productivity?

    It means that we are in an age where it is cheap to experiment with sensing, and it is cheap to integrate the sensing that may already exist, but which is not being used for holistic decision making.

    There are still situations where manufacturing plant produces no data while it operates, and it is incumbent upon operators to count items in order to record data about operations.

    Let’s say that we want to gather some data from a production line. Products are produced and transported to a destination via a conveyor belt. We’ll assume that the products are identical, and that they all follow each other in single file. The production manager wants some indication of what is happening in realtime, plus a set of alerts when a significant event has occurred.

    So, we fasten a light source and a photocell, or some sort of proximity detector to the side of the conveyor belt. What do we record?

    In terms of data, we record a time and date stamp every time an object is detected. We assume that the conveyor belt moves only in one direction (it doesn’t reverse in some situations for instance), and that the sensor does not produce false positives (when it says that there is an object present, we can trust the statement).

    As the production line operates, objects move along the conveyor and the sensor produces data in the form of a stream of time and date stamps, which might look like this:

    03-09-2019 10:57:23

    03-09-2019 10:57:29

    03-09-2019 10:57:35

    03-09-2019 10:57:41

    Let’s assume that we have connected our sensor to a small board such as an Arduino microcontroller, or even a Raspberry Pi. That board will enable the incoming signal from the sensor to be augmented with a timestamp and written to a file, or sent via a network connection to a PC where the data is recorded.
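
    A sketch of that logging step in Python; the detect() function here is a stand-in for a platform-specific sensor read (on a Raspberry Pi it would poll a GPIO pin, e.g. via the RPi.GPIO library), so the example runs anywhere.

```python
from datetime import datetime

def detect():
    """Stand-in for polling the photocell; returns True when an object passes.
    Replace with a real GPIO read on the target hardware."""
    return True

def log_event(log):
    """Append a 'dd-mm-YYYY HH:MM:SS' timestamp, matching the stream above."""
    if detect():
        log.append(datetime.now().strftime("%d-%m-%Y %H:%M:%S"))

events = []
log_event(events)
print(events[-1])   # e.g. 03-09-2019 10:57:23
```

    On a real board the list would instead be appended to a file or posted over the network, but the essential step is the same: attach a timestamp the moment the detection occurs.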

    What can we do with that data? With one sensor we can:

    • count the total number of objects produced;
    • measure the rate at which objects are produced;
    • identify events which occur that might interrupt the flow (e.g. system breakdown, changeover between product type, etc.)

    If we augmented the system with an additional sensor, placed either above or below the first sensor, we could also distinguish between two product types if each has a different height. But we shall keep to the simpler example of uniform products.

    Now that we have the data, we can produce very simple reports of production output and rate, while also creating alerts for when the objects appear not to be arriving.
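
    All three uses can be computed directly from the stream of timestamps shown earlier; the ten-second alert threshold below is an assumption chosen for illustration.

```python
from datetime import datetime, timedelta

# The timestamp stream from the sensor, with one deliberately late arrival
# appended to trigger an alert.
stamps = ["03-09-2019 10:57:23", "03-09-2019 10:57:29",
          "03-09-2019 10:57:35", "03-09-2019 10:57:41",
          "03-09-2019 10:58:05"]
times = [datetime.strptime(s, "%d-%m-%Y %H:%M:%S") for s in stamps]

total = len(times)                                 # objects produced
span = (times[-1] - times[0]).total_seconds()
rate_per_min = (total - 1) / span * 60             # production rate

# Flag any inter-arrival gap longer than the assumed threshold.
ALERT_GAP = timedelta(seconds=10)
alerts = [(a, b) for a, b in zip(times, times[1:]) if b - a > ALERT_GAP]

print(f"count={total}, rate={rate_per_min:.1f}/min, alerts={len(alerts)}")
# prints: count=5, rate=5.7/min, alerts=1
```

    In a live system the same logic would run over a sliding window of recent events rather than a fixed list.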

    Furthermore, the data is now being captured and stored in a way that can be synthesised with other such systems – adding more sensing to other parts of the plant will enable a more holistic view of the operations to be created.

    These first, tentative steps towards data capture are an important introduction to the modernisation of industry through digital manufacturing (or Industry 4.0). Whilst the latest examples of integrated manufacturing plants illustrate the possibilities of Cyber Physical Systems to coordinate, control and actuate physical systems on our behalf, there is still a fundamental reliance on the generation of data via sensing, its collection, processing, and subsequent reporting in a manner that humans can comprehend.

  • Dealing with IoT time

    One factor that motivates industrial organisations to adopt Internet of Things (IoT) technologies is the potential to be able to monitor, control and coordinate processes remotely. This opens up new possibilities for collaboration across manufacturing sites, and has the potential to have a massive positive impact on supply chains of collaborating entities, who may be separate business organisations, but who need to co-exist by serving the needs of each other.

    The degree of integration that is feasible between processes is in part determined by the frequency of monitoring that is required to optimise a process or group of processes.

    For instance, if a system is measuring the ambient environmental temperature, it is likely that an hourly update will produce more than enough data upon which to gain the insight needed for whatever decisions are taken.

    In contrast, a machine tool that is shaping components that form part of larger assemblies, may be reporting tool wear using loading on the drive motors, by way of electric current draw monitoring. In this case, the reporting of hourly data may be meaningless, and a much more frequent stream of reported data would be required to illustrate at which point in time the tool is starting to lose its sharpness and requires replacement.

    A second issue in relation to time exists if we consider the potential of collaboration across different geographies, which is feasible with the use of networked sensors (and is typically the basis of a more sophisticated Cyber Physical System or CPS).

    In such a case, the monitoring and reporting frequency of the data is of importance, but there is also an assumption that the data reported from each of the sensors is recorded at the same time. The synchronisation of internal clocks in embedded microprocessors can quickly become an issue if we want to integrate lots of data-producing devices together.

    Depending on the design of the data producing system, there may be different latencies that affect the time that data is reported, analysed and posted into a repository. For some processes this is more critical than others; the example of temperature recording is clearly more tolerant than that of the concurrent monitoring of machine tool wear.

    Our assumption is that we are building systems that sense so that we can do something interesting with the data, and this is usually some form of analytics. Whether the analytics is performed local to the source of the data, or perhaps more “downstream” upon a database or data warehouse, there is still the issue of data integrity: how do we protect against data being recorded that is inadvertently mis-labelled with time stamps that are not synchronised?

    Each situation needs to be considered on an individual basis, but it is generally prudent to record a sequence of activities post data capture, so that an analytics function can observe the chain of events that occurred after an event was reported. This might include recording the times that the data was sent, received, processed and archived for instance.
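
    One way to sketch such a post-capture audit trail is to wrap each reading in an envelope that accumulates a timestamp at every stage; the field names below are illustrative, not a standard.

```python
from datetime import datetime, timezone

def new_reading(device_id, value, device_time):
    """Envelope created at the edge; device_time is the (possibly
    unsynchronised) local clock of the sensor node, kept verbatim."""
    return {"device_id": device_id, "value": value,
            "device_time": device_time,
            "trail": [("sent", datetime.now(timezone.utc))]}

def record(reading, stage):
    """Append a stage ('received', 'processed', 'archived'...) with the
    server's own UTC time, so the chain of events can be audited later."""
    reading["trail"].append((stage, datetime.now(timezone.utc)))

r = new_reading("conveyor-1", 21.5, "03-09-2019 10:57:23")
record(r, "received")
record(r, "processed")
print([stage for stage, _ in r["trail"]])
# prints: ['sent', 'received', 'processed']
```

    An analytics function can then compare the device’s claimed time against the server-side trail and flag readings whose clocks appear to have drifted.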