Richard Hill


Category: Internet of Things

  • Model a business: simulate industrial processes


    There comes a point when your spreadsheet models of business processes just don’t cut it. You observe some complexity that is just too difficult to explore. You have questions that remain unanswered because you can’t do the analysis. One way of tackling this is to model a business so that you can simulate it and gain some useful insight.

    We are going to look at an everyday approach to creating a model of an industrial process or service. We shall consider how we can ask questions of the model and use that to improve our understanding.

    With this understanding we shall then look at building a simple tool to simulate the operation of the process and thus, model a business. This simulation will produce results that we can use to experiment with different scenarios so that when we go back to the business, we can take more informed actions.

    First, we need to understand the system.

    Understanding business operations to model a business

    Manufacturing facilities vary in both size and complexity. One factory might have two or three areas where different processes take place. Another factory might contain hundreds of machines.

    Each of the factories will have evolved to cope with the manufacture of different products, different mixes of order types, different customer demand profiles, varying quality of raw materials, unpredictable machine breakdowns and so on.

    The list of possible interruptions to a neatly ordered continuous flow of efficient output is endless.

    Shop floor supervisors manage these variations using their analytical skills and experience. At some point during the working week, they’ll be required to answer the following questions:

    1. When will order X be finished?
    2. How much stock is tied-up in the factory?
    3. What is the utilisation of the work centre?

    These questions might be asked by different stakeholders.

    Question 1 probably comes from the customer, via the sales department, perhaps because the order is late.

    Question 2 might come from purchasing who are concerned with re-order quantities for input materials. Or it might be the accountants who are assessing cashflow.

    Question 3 is certainly asked by the accountants, so that they can put a measure on the production potential of a factory. But it is also posed by the planners who want to find additional production capacity for more customer orders.

    In a smaller organisation these roles may be undertaken by the same person. In larger companies the functions will be separate departments. Whatever the size of the facility, the questions are the same. The answers are likely to be the same also.

    When faced with such questions, there are too many variables in play to make a reasoned judgement. Answers tend to start with “it depends”.

    Attempting to quantify the lateness of an order is dependent upon the jobs in front of the late order, the reliability of the process, whether the operator is working at peak performance, the quality of the tooling and raw materials, etc.

    If the process in question is fed, or feeds into other processes, the opportunities for error are compounded. This leads to the use of estimates which might be generous and therefore may build inefficiencies into how we manage the overall operations.

    What we need is a model of the facility. This model captures the essential characteristics of the business unit and lets us change some of those characteristics so that we can see what the effects of those changes might be.

    Our supervisor might have had an idea to reduce the batch sizes of their orders, but not felt able to try it out as their machine utilisation measures might drop.

    If something went wrong and an order was late, the change initiated by the supervisor might be cited as the cause of the reduction in output.

    But if that change could be applied to a model, that has no physical connection to the real facility, perhaps we could learn more about how the system behaves. If we understand the system better, we stand to make better decisions in the future.

    This practice is referred to as simulation, and it has long been the preserve of industrial mathematicians and operational research scientists. Their work creates a lot of value for organisations, building models that allow production personnel to experiment with different strategies.

    However, these mathematical approaches are often inaccessible and significant training is required to interpret the models.

    We can often obtain much of the benefit of simulation without the need for advanced mathematical skills, and this is the approach that we shall take in this article.

    Everything is a queue

    Let’s assume that you visit the local supermarket to buy a few items. You select your items and make your way to the checkouts to pay for the shopping.

    There are a number of checkouts in the supermarket but for some reason only one of the checkouts is operational. You are not the only customer in the store, and there are three other people already at the checkout, waiting in a queue to be served.

    When they have been through the checkout, it will be your turn to be served. Fig. 1 illustrates the scenario.

    Fig. 1. Supermarket queueing with one checkout.

    We are going to assume that each person and their shopping in the queue represents one job.

    Just for a moment, think about your answers to the following questions:

    1. When will your job (you and your shopping) be finished?
    2. How many jobs are there in the queue?
    3. What capacity of the checkout is utilised?

    You may recognise these questions from earlier. What was your answer to Question 1?

    Since we don’t know how long it takes to process any of the shopping, we would have to say “it depends”.

    It depends on how much shopping each person in the queue has; this might range from a hand basket to an over-laden trolley.

    Question 2 is a little simpler. We know that each person and their shopping is classed as a job, so we just count the number of jobs in the queue. If there are three jobs waiting in front of you, there must be a job in progress at the checkout, which makes four jobs.

    And then there is you, bringing the total to five.

    And what about Question 3?

    When thinking about utilisation we need to consider potential interruptions such as:

    • the checkout operator being changed at the end of a shift;
    • a request to the customer services supervisor for a price check because the barcode on an item is unreadable;
    • a power cut causing the till to stop working.

    If there is a queue of customers, and there are no disruptions to the actual process, we can assume that the checkout is kept busy. Once the queue empties (all the jobs have been processed), or there is an interruption, the checkout becomes idle and its utilisation starts to fall.
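    Utilisation here is just the fraction of the available time that the checkout spends serving. A tiny sketch, with an invented function name and invented numbers:

```python
def utilisation(busy_minutes, available_minutes):
    """Fraction of the available time actually spent serving customers."""
    return busy_minutes / available_minutes

# busy for 430 of a 480-minute shift
print(round(utilisation(430, 480), 2))  # just under 0.9, i.e. about 90% busy
```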

    Now that we have a basic representation of our supermarket checkout in place, let’s see how we can alter the performance of the system.

    The supermarket manager realises that if customers have to queue for too long they may become frustrated, or even leave the store without making a purchase. This is not good for business, so another checkout is opened up as in Fig. 2.

    Fig. 2. Supermarket queueing with two checkouts.

    Now, you approach the checkouts and find that there are two checkouts working. Each checkout is processing one job each, with a queue of one job waiting also. You are free to join either queue.

    Let’s assume that it takes the same amount of time to process each job. If that is the case, since both queues are shorter, you will have to queue for less time before your job is processed. The utilisation of the checkouts reduces however, unless there are more jobs arriving behind you.

    We can thus deduce that there is some form of relationship between the number of available checkouts, the number of jobs to be processed, and the overall time taken to process an individual job.

    If the supermarket manager had such a model, they could experiment with the optimum number of checkouts to service their customer demand patterns. This would help them allocate the correct number of checkout staff for busy periods, while reducing the instances of checkouts being idle during quieter times.

    The model would permit them to plan for seasonal adjustments in shoppers’ behaviours.

    But, if the model can be executed quickly, it could also be a tool to explore a scenario that is unfolding – such as a large influx of customers that were unexpected – and this is where modelling and simulation can become a powerful tool for the management of business operations.
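    To make this concrete, here is a minimal discrete-event sketch of the checkout scenario using only Python's standard library (this is not the simulation tool we use later, and all names and numbers are invented). Each customer is assigned, in arrival order, to whichever checkout frees up first:

```python
import random

def simulate_checkouts(n_servers, n_customers, arrival_mean, service_mean, seed=0):
    """Average queueing time per customer for a first-come-first-served queue."""
    random.seed(seed)
    t = 0.0                          # simulation clock, in minutes
    free_at = [0.0] * n_servers      # when each checkout next becomes free
    total_wait = 0.0
    for _ in range(n_customers):
        t += random.expovariate(1 / arrival_mean)            # next arrival
        i = min(range(n_servers), key=lambda k: free_at[k])  # earliest-free till
        start = max(t, free_at[i])                           # wait if it is busy
        total_wait += start - t
        free_at[i] = start + random.expovariate(1 / service_mean)
    return total_wait / n_customers

# customers arrive roughly every 10 minutes; each takes roughly 15 minutes
print(round(simulate_checkouts(1, 2000, 10, 15), 1))  # one till: waits balloon
print(round(simulate_checkouts(2, 2000, 10, 15), 1))  # two tills: far shorter
```

    With one checkout the demand exceeds the service capacity, so the queue grows and grows; opening a second checkout lifts the combined capacity above demand and the waiting time collapses.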

    Modelling an industrial process

    We shall now consider an industrial scenario. A joinery company produces wooden window frames. Each of the frames is cut from lengths of timber that are shaped and cut to length by a machine.

    The company receives orders of varying quantities of windows, which means that varying numbers of timber lengths are required from the first machine. The company only cuts timber lengths for orders and does not make products to put into stock.

    Each order is considered to be a job. Just as was the case with the shopper and their variable amount of shopping, each job can vary in size.

    Each job must then spend a certain amount of time waiting in a queue, before being processed by the machine. The total time that the timber is in the system is queueing time + processing time.

    Both the queueing time and the processing time are dependent upon the size of the respective order.

    We can see now that the model for creating lengths of timber window frame is essentially the same as our first supermarket model.

    We have jobs, a queue, and a processing station, where the actual work gets done. This scenario is illustrated in Fig. 3.

    Fig. 3. Queueing model for a single industrial machine.

    With the model in place, we can start to ask questions of it. For instance, what impact is a longer queue going to make on a) resource utilisation, and b) the overall time that a job spends in the system?

    A longer queue suggests that there will be less interruption to flow, so the utilisation will be higher.

    However, the longer the queue the more time that a particular job takes to be completed, so the delivery time is longer.

    The next stage is to build a simulation so that we can verify our thoughts.

    Designing a process simulation to model a business

    We now have an illustration of how we can model a single industrial process. That model is part of the initial specification of a simulation that we can execute. The simulation will execute a virtual production run, and that will give us an idea of how the model can perform.

    The simulation allows us to change different parameters of the model, without incurring the cost or disruption of moving physical plant around.

    So far, our model describes:

    • a process of material conversion, where lengths of timber are given a profile and then cut into shorter lengths that are suitable for window frames;
    • a single machine that performs the operations described above;
    • each job is processed one at a time. Multiple jobs cannot be processed simultaneously;
    • jobs arrive for processing and wait to be processed in a queue;
    • a job that has been processed is deemed to be complete and exits the system.

    We now need some more information to allow us to build the simulation.

    First, we should describe the rate at which jobs arrive for processing.

    Second, we need to specify the time taken to process a job.

    Third, we need to consider whether there is any variation in the size of a job. For this first example we shall assume that each job requires roughly the same amount of time to process. We shall explore variable job sizes later.

    There are many different simulation tools that can be used to build queueing models. We shall be using “Ciw” (which is Welsh for “queue”).

    Ciw is a simulation framework that uses Python and as such is free to acquire and use. Just type ‘ciw python’ into Google to find it.

    Within Ciw, there are three parameters that are of relevance to our industrial process model.

    1. arrival_distributions: this is the rate at which jobs arrive to be processed. We shall assume that jobs arrive approximately every 10 minutes, or six times per hour;
    2. service_distributions: this is the time that each job spends being processed, or the time taken to do the shaping and cutting to length of the timber by the server (machine). We shall assume that each job takes 15 minutes;
    3. number_of_servers: this represents the number of machines at a workstation. In our example, we have one machine, or one server.

    It is important to note at this point the difference between parameters that are static, and those that might vary.

    For instance, for a given simulation we can assume that the number of machines (servers) doesn’t alter, so we give it the value of 1 as we want to investigate the scenario with one machine.

    However, while we can say that jobs arrive at a rate of four times per hour, or every 15 minutes, that isn’t strictly realistic.

    Sometimes there are interruptions to the deliveries. A forklift truck might drop the timber when loading it from the lorry, or there may be a physical blockage preventing the wood being placed next to the machine.

    Similarly, the time taken to process the timber won’t always take 15 minutes. This is just an approximation that – on average – takes 15 minutes.

    Sometimes the timber might blunt the cutting blades of the machine and it will take longer to finish the operation.

    Conversely, when the tooling is new or freshly sharpened the machining time will be less than 15 minutes.

    We want our simulation to take account of these variances and we do this by specifying a distribution function. This tells the simulation to use a range of values, whose mean is the arrival rate that we are suggesting.

    So, for a mean inter-arrival time of 10 minutes, the simulation will generate a set of values that vary, with a mean of 10 minutes.

    This allows the simulation to be more realistic as it will take account of naturally occurring variations in waiting and processing times.
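    Ciw generates these values for us, but the idea can be seen with Python's standard library: random.expovariate takes a rate and returns exponentially distributed values whose long-run mean is the reciprocal of that rate.

```python
import random
import statistics

random.seed(42)
# 10,000 inter-arrival times at a rate of 1/10 per minute (mean of 10 minutes)
samples = [random.expovariate(1 / 10) for _ in range(10_000)]
print(round(statistics.mean(samples), 1))  # close to 10
```

    Individual values vary a great deal, which is exactly the realism we want, but their average sits close to the 10 minutes we specified.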

    We are now ready to build the simulation.

    Building the simulation in Ciw

    Create a new text file called:

    timber_conversion.py

    We shall enter some snippets of code now to quickly create a simulation to produce some results. Try not to worry about some of the details just yet as they will be explained later.

    What is important is to execute a simulation so that we can start to understand the timber conversion process better. First, we specify the arrival and service distributions, along with the quantity of servers:

    import ciw
    
    N = ciw.create_network(
        # jobs arrive every 10 minutes, or 6 times per hour
        arrival_distributions=[ciw.dists.Exponential(0.1)],
        # jobs take 15 minutes to process, which is 4 jobs completed per hour
        service_distributions=[ciw.dists.Exponential(0.067)],
        # the number of machines available to do the processing
        number_of_servers=[1]
    )

    You might have noticed that the value contained in

    [ciw.dists.Exponential(0.1)]

    does not seem to relate to an arrival rate of 6 times per hour. The distribution function expects a rate per unit of simulated time (minutes, in our case), so we take the arrival rate of 6 (arrivals per hour) and divide it by 60 (the number of minutes in an hour), giving 0.1.

    Similarly, for the service time, the rate of processing per hour is 4 and is represented as 4/60 = 0.067.
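    This conversion can be wrapped in a small helper. The function name is our own, but the arithmetic is exactly what the two parameters above require:

```python
def per_hour_to_per_minute(rate_per_hour):
    """Convert a per-hour rate into the per-minute rate the model expects."""
    return rate_per_hour / 60

print(per_hour_to_per_minute(6))            # 0.1, the arrival rate
print(round(per_hour_to_per_minute(4), 3))  # 0.067, the service rate
```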

    The next piece of code to add is:

    ciw.seed(1)
    Q = ciw.Simulation(N)
    # run the simulation for one shift (8 hours = 480 minutes)
    Q.simulate_until_max_time(480)

    This is an instruction to tell the computer to create a simulation and to run it for a simulated time of one shift (8 hours/480 minutes).

    That is all that is required to create the simulation. However, there are no instructions to tell the computer to report the results. The following program code does this:

    recs = Q.get_all_records()
    waitingtimes = [r.waiting_time for r in recs]
    servicetimes = [r.service_time for r in recs]
    avg_waiting_time = sum(waitingtimes) / len(waitingtimes)
    print('Avg. wait time: ', avg_waiting_time)
    avg_service_time = sum(servicetimes) / len(servicetimes)
    print('Avg. processing time: ', avg_service_time)
    print('Avg. machine utilisation %: ',
          Q.transitive_nodes[0].server_utilisation)

    There are three results that are reported (look for the ‘print’ keyword).

    First, the average waiting time in minutes for each job.

    Second, the average time taken to process each job in minutes.

    Finally, the average utilisation of the machine (server) as a percentage.

    When you execute your simulation you should see the following results in the console:

    Avg. wait time: 51.51392337104136
    Avg. processing time: 12.643780078229085
    Avg. machine utilisation: 0.9969939361643851

    This tells us that on average, a job took nearly 13 minutes to process and had to wait approximately 52 minutes in the queue. The machine was operating for most of the time (99.7% utilisation).

    This is excellent for a shopfloor supervisor who has to report the percentage of time that a machine spends idle.

    Hardly any downtime for the machine in this situation.

    However, let’s use the simulation to start investigating different scenarios.

    We shall now explore the effect of increasing the number of machines from one to two.

    Edit the following line to increase the number of servers (machines) to 2:

    number_of_servers=[2]

    If we execute the simulation again, we observe the following results:

    Avg. wait time: 8.79660702065997
    Avg. processing time: 14.249724856289776
    Avg. machine utilisation: 0.6827993160305518

    We can see that the addition of an extra machine has dramatically reduced the wait time from 52 minutes to around 9 minutes. The utilisation of the two resources has also fallen to 68%, meaning that machining resources are idle for approximately 32% of the shift.

    While there is a reduction in waiting time, and therefore the overall lead time to delivery of a product, there is the additional capital cost of extra plant. Depending on how the machine is operated, there may also be extra labour required to run both machines at the same time.

    The shopfloor supervisor has a conversation with the company owner and it is clear that there is no cash with which to purchase another machine. The next course of action is to try and increase the output of the timber conversion process.

    The service time is 15 minutes, which means that 4 jobs per hour are processed.

    What difference would it make if we could process 5 jobs per hour?

    Set number_of_servers back to [1], then edit the following line to reflect a service rate of 5 jobs per hour (5/60 ≈ 0.08):

    service_distributions=[ciw.dists.Exponential(0.08)]

    Here are the results:

    Avg. wait time: 26.20588722740488
    Avg. processing time: 11.300597865271746
    Avg. machine utilisation: 0.9485495709367999

    The machine utilisation has increased, but the waiting time is much less than it was with a service time of 15 minutes. This illustrates that there is a significant benefit to be had by making even small changes to the service time of a process.

    Such thinking is central to “lean manufacturing” techniques, where potential opportunities for the removal of waste are identified.

    There might be some different tooling that enables the timber to be cut at a faster rate, or there might be a better way of organising the material so that the cutting-to-length operation is optimised for the fewest cuts.

    Confidence

    Once we have built a simulation, it is important that we are confident that it represents the situation that we are modelling.

    If we look at the results we have observed so far, what do we notice about the average processing time?

    We have obtained three different values: 12.6, 14.2 and 11.3 minutes. This is a significant range of values and it suggests that the simulation might not be taking a sufficient number of scenarios into account.

    For a given scenario, there is a period when the simulation queue is empty, then only partially filled, before a steady state of operation is achieved. Similarly, towards the end of a simulation there will be a number of jobs that remain unfinished.

    When we report the statistics of how the process has performed, we are collecting the data for jobs that have been completed.

    Depending on the time required to ‘wind-up’ and ‘wind-down’ a simulation, there could be a disproportionate effect on the performance that we observe. This would decrease our confidence in the ability of the simulation to be used as a tool for experimentation.

    We deal with this in two ways. First, we run the simulation for a longer time and then report only the performance from the system once it is in a steady state of operation.

    For our 8 hour shift, we could add an hour before the start and at the end for warm-up and cool-down periods.

    Second, we can run the simulation many times, altering a number (called a ‘seed’) so that each run has some variation introduced into it.

    Create a new file called

    timber_conversion_2.py

    and enter the following code:

    import ciw
    
    N = ciw.create_network(
        # jobs arrive every 10 minutes, or 6 times per hour
        arrival_distributions=[ciw.dists.Exponential(0.1)],
        # jobs take 15 minutes to process, which is 4 jobs completed per hour
        service_distributions=[ciw.dists.Exponential(0.067)],
        # the number of machines available to do the processing
        number_of_servers=[1]
    )
    
    runs = 1000 # this is the number of simulation runs
    average_waits = []
    average_services = []
    average_utilisations = []
    
    for trial in range(runs):
        ciw.seed(trial) # change the seed for each run
        Q = ciw.Simulation(N)
        # run the simulation for one shift (8 hours = 480 minutes)
        # plus 2 hours (120 minutes) for warm-up and cool-down
        Q.simulate_until_max_time(600, progress_bar=True)
        recs = Q.get_all_records()
        # only keep jobs that arrived after the warm-up and before the cool-down
        waits = [r.waiting_time for r in recs
                 if r.arrival_date > 60 and r.arrival_date < 540]
        mean_wait = sum(waits) / len(waits)
        average_waits.append(mean_wait)
        services = [r.service_time for r in recs
                    if r.arrival_date > 60 and r.arrival_date < 540]
        mean_services = sum(services) / len(services)
        average_services.append(mean_services)
        average_utilisations.append(Q.transitive_nodes[0].server_utilisation)
    
    print('Number of simulation runs: ', runs)
    print('Avg. wait time: ', sum(average_waits)/len(average_waits))
    print('Avg. processing time: ',
          sum(average_services)/len(average_services))
    print('Avg. machine utilisation %: ',
          sum(average_utilisations)/len(average_utilisations))

    Execute the code and you will observe the following results:

    Number of simulation runs: 1000
    Avg. wait time: 115.69878479543915
    Avg. processing time: 14.87316389724181
    Avg. machine utilisation: 0.8560348867271905

    You can now edit the line ‘runs = 1000’ to change the number of times that the simulation executes.

    As the value of ‘runs’ increases, the statistics start to stabilise. This indicates that we can have confidence that the simulation is providing results that we can trust. This is regarded as good practice for the modelling and simulation of systems.
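    The stabilising effect can be demonstrated without Ciw. The sketch below (all names and numbers invented) averages the per-run means of exponential ‘processing times’ whose true mean is 15 minutes; the more runs we average, the closer the estimate settles to 15:

```python
import random
import statistics

def replicate(runs):
    """Average of per-run mean 'processing times' across many seeded runs."""
    run_means = []
    for trial in range(runs):
        random.seed(trial)  # a different seed for each run
        times = [random.expovariate(1 / 15) for _ in range(50)]
        run_means.append(statistics.mean(times))
    return statistics.mean(run_means)

for runs in (10, 100, 1000):
    print(runs, round(replicate(runs), 2))  # settles near 15 as runs grows
```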

    Conclusion

    We have looked at the application of queueing to the modelling of an industrial process. Our queueing model helps us understand the system better, and it also helps specify the various parameters that are important to include in our analysis.

    This specification can then be used with a simulation tool. We have used Ciw to quickly construct a simulation that represents our queueing model.

    As the simulation runs we collect summary statistics that can help us understand the inter-relationships between parameters such as job arrival rates, processing times and the number of resources available to do the work.

    We can then explore different scenarios by changing these parameters and this helps us understand what the limits of the system might be. Exploring different situations via simulations is an inexpensive and quick way to find the limits of a system, or to identify new possibilities.

    For example, you might want to find ways of increasing the output of a factory temporarily to complete a particular rush order for an important customer.

    You know that you can increase capacity by adding another shift or by buying new plant. But you might want to know how many additional operators you need to bring in to complete the extra work. You’ll also want to see how this might impact the rest of the orders for other customers.

    You might not be able to buy, install and commission new plant quickly enough, but a simulation can give you a good idea as to whether you should out-source some of the work or not.

    An example of using simulation strategically is to consider the potential impact of the sales team’s forecast for the next quarter; you could use this forecast to investigate the demands that would be made on your business resources and see what resources you might need.

    If you need to, you’ll be in a much stronger position to justify the acquisition of new plant or additional staffing.

    Model a business yourself

    Using the program code from above, experiment with different values.

    You can change the parameters for the number of simulation runs for instance, but you can also change the ‘shift length’; this refers to the amount of simulated time that the program executes.

    Simulation code allows us to try out different values quickly, to see what the different effects might be. This is convenient when we have a specific question to answer.

    However, we often need to perform deeper analysis of a simulation model, and in such cases it is useful to record the effects of our changes.

    Try to adopt good practice by recording the values that you change, noting the effects of these changes in a table. This habit will help you when your models increase in complexity.

    Some good questions to ask of this model could be:

    1. What is the effect on machine utilisation as the arrival rate of jobs declines?
    2. How would you find an optimum set of values to ensure that the system is balanced?

    When you start to build simulations, you quickly gain a deeper appreciation of the dynamics of systems. An important part of simulation is being able to discover, and then communicate the results of your simulation.

    Using the program code above plus the details available in Ciw documentation, develop some additional information to report.

    For example, it would be interesting to see what the average length of the queue is before the machine.

    This will then tell us what the total inventory that is being processed amounts to (Work in Progress, or WIP).

    The code above currently reports the average (mean) of a set of values. Enhance the reporting to include additional summary statistics such as standard deviation.
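    One way to do this is with the standard library’s statistics module. The per-run averages below are made-up placeholders standing in for the lists collected by the simulation:

```python
import statistics

# illustrative per-run average waiting times, in minutes
average_waits = [51.5, 48.2, 55.1, 50.7, 49.9]

print('Avg. wait time: ', round(statistics.mean(average_waits), 2))
print('Std. dev. of wait time: ', round(statistics.stdev(average_waits), 2))
```

    The standard deviation tells us how much the runs disagree with one another, which is a useful measure of how settled the simulation’s estimates are.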

     


  • Cyber-Physical Systems challenges: iSCI2020 Invited Talk

    Here is my invited talk about Cyber-Physical Systems challenges, for iSCI2020 in Guangzhou, China. It was pre-recorded due to the COVID-19 travel ban.

    Research Challenges for the Industrial Adoption of Cyber-Physical Systems

    Abstract: Interest in the ‘digitalisation’ of industry, specifically manufacturing, is driving the development of innovative technologies that make the exchange of data, and the inference of knowledge increasingly accessible. Large organisations are able to rapidly acquire and evaluate Cyber-Physical Systems technology, enabling new business models to be created. However, significant challenges exist with regard to the design, operation and evaluation of industrial CPSs in terms of their accuracy, calibration, robustness and ability to fail safely. Traditionally, such systems would have been designed using formal approaches, but the scale of CPS adoption is such that there is less reliance on the established methods of validation and verification. This talk explores some significant challenges for the CPS and associated software development research communities.

  • Production planning


    Experienced shop floor supervisors and production managers can often find themselves at odds with the production planning function in a manufacturing business. Tensions emerge between the organisational desire to trust the principles of Materials Requirements Planning (MRP), which is often the core module for determining works orders to be manufactured, and the hard-won experience of managing materials through workstations.

    The theory sounds fine; assign a lead-time to each component part, enter the due date for the finished product into the system, along with an order quantity, and the planning system will back-calculate the date by which the material is released to the factory.

    Factory supervisors would argue that the assumption that lead-times remain constant is fatally flawed. The planners might reply that if the production schedules were adhered to, everything would work as intended.

    Taking a scientific approach to production planning, by constructing simulations of manufacturing systems, we can observe that any variation in actual lead-times can wreak havoc on the performance of the overall system. A system where material is pushed into the factory as a consequence of a due date and fixed assumptions of process lead-times is extremely sensitive to line stoppages, operator absence and material shortages.

    Simulation illustrates that there is a direct relationship between lead-time and the quantity of Work-in-Progress. If the WIP increases, so does the lead-time. If you keep pushing material into the system because that is what the MRP software says, the WIP will increase, and therefore so will the lead-time. Orders that become overdue just get added to the list of works orders, and the cycle continues until other measures such as overtime are taken. Production planning can become a nightmare.
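    This relationship is captured by Little’s law: average WIP = throughput × lead-time, so lead-time = WIP / throughput. A tiny illustration with made-up figures:

```python
def lead_time_hours(wip_jobs, throughput_per_hour):
    """Little's law rearranged: average lead-time = WIP / throughput."""
    return wip_jobs / throughput_per_hour

print(lead_time_hours(20, 4))  # 20 jobs at 4 jobs/hour: 5.0 hours
print(lead_time_hours(40, 4))  # doubling the WIP doubles the lead-time: 10.0
```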

    The issue here is that MRP does not take into account the WIP levels for a given system. It assumes that lead-times remain constant, which in turn presupposes a constant level of WIP.

    Staff on the shop floor realise this, though it isn’t always that intuitive how to solve the problem.

    Kanban is often hailed as the solution, as part of a Lean implementation. WIP is explicitly controlled at each workstation in a Kanban line. The material cannot be released for processing until a Kanban card becomes available – it is pulled through the system rather than pushed as with MRP. As soon as Kanban is installed, a dramatic reduction in WIP is observed immediately, which is good news until a stoppage occurs. Lean systems use this “threat” to have everyone focus their attention on the stoppage to resolve the problem, with the aim of eradicating the stoppage permanently.

    However, this still doesn’t always rest easy with the shop floor supervisor, who only truly settles when the bottleneck process is kept running.

    In a manufacturing system, the bottleneck governs the output of the system as a whole, and should therefore be utilised as much as possible. The way to do this is to ensure that there is a suitably sized buffer of jobs in front of the bottleneck to keep it going. Starving the bottleneck is starving the factory of capacity.
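    The arithmetic behind this is simple: the whole line can never produce faster than its slowest station. A sketch with invented station names and cycle times:

```python
# minutes of processing per job at each sequential workstation (invented)
cycle_times = {'saw': 6, 'profile': 15, 'sand': 8}

# the bottleneck is the station with the longest cycle time
bottleneck = max(cycle_times, key=cycle_times.get)
best_throughput = 60 / cycle_times[bottleneck]  # jobs per hour, at best

print(bottleneck)        # profile
print(best_throughput)   # 4.0 jobs per hour
```

    Speeding up any other station changes nothing; keeping the ‘profile’ station fed is what protects the factory’s output.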

    The granular control of WIP at every workstation can therefore be too restrictive for some production lines, especially where setup times are lengthier or there are just more stoppages in general.

    Maintaining a constant level of WIP for the system as a whole, rather than between individual workstations, is the approach referred to as CONWIP. Since CONWIP does not control the individual transit of material between workstations, that material is permitted to flow freely within the factory.

    The emergent effect is that it queues at the entry to the bottleneck, which is exactly what the production manager wants. This keeps utilisation of the slowest process as high as possible, while still restricting the flow of new material into the system, which would adversely affect on-time delivery of finished goods.
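    The CONWIP release rule described above can be expressed very simply: new material enters only when a finished unit leaves, holding total system WIP at a fixed cap. A minimal sketch (the class name and cap are illustrative, not a standard API):

```python
class ConwipGate:
    """Release raw material only while total system WIP is below a cap."""
    def __init__(self, wip_cap):
        self.wip_cap = wip_cap
        self.wip = 0

    def try_release(self):
        # Admit a new job only if the system-wide cap allows it.
        if self.wip < self.wip_cap:
            self.wip += 1
            return True
        return False

    def job_finished(self):
        # A completed unit leaving the system frees one slot.
        self.wip -= 1

gate = ConwipGate(wip_cap=3)
print([gate.try_release() for _ in range(5)])  # [True, True, True, False, False]
gate.job_finished()
print(gate.try_release())  # True -- a finished job frees a slot
```

    Unlike per-workstation Kanban, nothing constrains where the admitted jobs queue inside the factory, so they naturally accumulate in front of the bottleneck.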

    WIP control is a fundamental concept for the management of a production facility. IIoT can help enable WIP management by monitoring the utilisation of the current system bottleneck, and controlling the release of new material into the system in response to natural variations in process cycle time. This is particularly important for manufacturing systems that need to deliver mass-customisation for customers.

  • IIoT: Technical vs. people skills

    IIoT: Technical vs. people skills

    You want your factory to be IIoT enabled. You’ve seen the videos and read the case studies. It’s obvious: IIoT technologies are central to your digital transformation.

    But where do you start? How do you start?

    The technologies of IIoT implicitly demand people with technical skills. While we travel through an early adoption phase, some plant can produce the data we need, but we’ll probably have to augment other plant so that it can do the same.

    IIoT lends itself to the technophiles; even though the barriers to entry are lowering, if you want to fasten a data reporting capability onto a machine tool, you’ll need to know what you are doing.

    If we consider the area where IIoT is flourishing currently – condition monitoring and predictive maintenance – then the domain is populated by technical people, with technical skills, doing technical things.

    Some installations are moving to a service-oriented model, where the manufacturer does not get involved with the IIoT at all. The IIoT installer takes care of data monitoring, analytics, reporting and communication, and merely produces processed data to be consumed by the client.

    If we want to transform a factory, we shall need to think much more broadly than a predictive maintenance solution. The complex interplay of multiple machines, material handling equipment, finishing plant, assembly lines, etc., suggests that there will be an imperative for the rapid up-skilling of existing staff to become more technical.

    But we know that technology projects often fail when the focus is on the technology itself. Of course, it is the potential of the technology that justified the transformation project in the first place; however, people are still central to the operations and they need to be brought along with the change if the change is to stick.

    So, IIoT implementation initiatives need a people-centric approach to lead the development of people-skills. Lean is a good way of approaching IIoT adoption as it focuses on the principles of efficiency, supporting the development of appropriate behaviours.

    With such behaviours in place, IIoT can be ‘relegated’ to a technology that serves what people really need.

  • Modelling robots and Cyber Physical Systems

    Modelling robots and Cyber Physical Systems

    Webots (https://cyberbotics.com) is a simulation environment for the design and modelling of robotic systems. Since robotics invariably results in some physical actuation, Webots can also be used to model Cyber-Physical Systems, and being open source, the software is free to use and experiment with.

    Webots is particularly suited to newcomers to robotic systems, though it can still be used more formally in industrial scenarios, via links to the Robot Operating System (ROS – https://www.ros.org).

    If you have a need to develop a robot, a physical control of a process, or you are just curious, Webots is a good place to start.

    I use Webots for teaching both robotic systems and cyber-physical systems, usually in the context of digital manufacturing/Industry 4.0. You can model entire systems at an abstract level, or focus on the detail of sensors interacting with each other.

    Some more reasons to use Webots can be found here: https://cyberbotics.com/#webots

    To get a working installation of Webots, you should visit the excellent documentation that is located at: https://cyberbotics.com/doc/guide/installation-procedure#installation-on-macos

  • IIoT Kanban – not so easy

    IIoT Kanban – not so easy

    Any factory floor supervisor knows that the more raw material/components/work-in-progress that is pumped into a manufacturing system, the longer the lead-time. Put another way, the order is delivered late, and subsequent orders cannot be planned with any certainty as you don’t know when they will be ready. This is not good for business.

    MRP attempted to deal with planning by assuming a fixed lead-time for each stage of production. This didn’t work either, though that doesn’t stop manufacturers persisting with MRP-based information systems, as they cannot find a better solution.

    Kanban, which originated in Japan, deals with work-in-progress levels by controlling them directly. Each Kanban card represents a unit of work to be produced. When you run out of cards, you cannot release more material into the system, even if doing so would increase the utilisation of a machine or reduce your wastage percentages.

    This discipline can feel counter-intuitive at first, especially when a machine is sitting idle and you just know that you could get ahead with one extra batch.
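    The card discipline amounts to a per-workstation token pool. A minimal sketch (the station name and card count are illustrative):

```python
class KanbanStation:
    """A workstation whose WIP is bounded by its pool of kanban cards."""
    def __init__(self, n_cards):
        self.free_cards = n_cards

    def start_job(self):
        # Material may only be pulled in if a card is free.
        if self.free_cards == 0:
            return False          # station full: wait, don't push
        self.free_cards -= 1
        return True

    def complete_job(self):
        self.free_cards += 1      # card returns to the pool

drill = KanbanStation(n_cards=2)
print(drill.start_job(), drill.start_job(), drill.start_job())
# True True False -- the third batch must wait, even if the machine is idle
```

    The refusal on the third batch is precisely the counter-intuitive moment described above: the rule deliberately forbids getting ahead.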

    The effects of too much WIP are elegantly demonstrated by a sausage machine. If you put too much sausage meat into the grinder, the sausages are uneven and misshapen. If you carry on adding meat, the grinder just blocks up.

    When you get the flow of material into the grinder balanced with the application of the sausage skin, everything works nicely.

    The consequences of letting WIP run rampant are clear when there are tangible, physical products. But what about processes that do not result in a product?

    What can we do with information ‘products’?

    Kanban is also developing a following with project managers who would like to apply similar principles: manage the workload at any given time to avoid blockages and breakdowns.

    More experience is required here, as it can be more challenging to estimate the amount of work in a given task than it is to know the machining cycle time of a particular component.

    But, the use of IIoT to capture data does help build a corpus of information upon which to predict future durations for tasks, irrespective of whether there is any physical product involved or not.

    A lot of the promise of digital manufacturing is the ability to delegate coordination, optimisation and decision-making to the machines. This means that the machines will have to be aware of their surroundings so that they can make judgements that do not violate the WIP protocols of a system.

    This means that digital manufacturing is more than IIoT equipment. It needs architectures and conceptual thinking to ensure that the necessary behaviours are in place to replicate and eventually replace human-centric interventions.

    We may need an agent-oriented view of our systems as these attempt to map behaviours into functionality, and a fundamental aspect of such behaviours is that they are communicated between agents socially.

    Kanban’s apparent simplicity actually disguises a set of complex behaviours that humans take for granted.

    The machines have a way to go yet. 

  • Planning for flexible manufacturing systems

    Planning for flexible manufacturing systems

    For decades there has been talk of the ability to design flexible manufacturing systems, that can accommodate variations in demand and remain efficient.

    The variation in demand might be volume; the goods may be seasonal and have one or more peaks per annum. Alternatively, the product configuration or type might change in response to customer demand, in which case the manufacturing system would have to change what it produces to meet that demand.

    A lot of manufacturing plant is not easy to move around, even though rearranging it would help the logistics of moving materials between different machines for different products. If we could arrange the machines in a particular configuration, we could mimic the benefits of a flow line, where the plant is organised to minimise material handling operations between each stage of production.

    And so we have the ‘job shop’, where plant is loosely organised so that there is space for the loading and unloading of raw materials and components, yet the machines are close enough to try and minimise the distance by which material is transported in between workstations.

    Some factories have products that make up a larger proportion of their orders, and the job shop is optimised around these items; others group their plant by general operation: cutting, drilling, fabrication, assembly, finishing, etc. What remains are pure job shops where the machines are used as and when for each individual order, and there is little opportunity to organise for additional efficiency.

    The emergence of new production technologies such as additive manufacturing presents new opportunities for how we might plan production.

    3D printers can manufacture a wide range of different products, from the same workstation. Such variety in production capability has not been witnessed before in an automated system. Prior to automation, a craft worker could manufacture many different products from their workstation. Attempts to produce multi-capability machining centres have extended the range of what can be produced, but this is nowhere near the potential variety of additive manufacture.

    So what does this mean for production planning?

    For the traditional flow line, the objective of planning was to optimise the balance between work-in-progress (WIP) holdings and the uncertainty of disruption caused by machine breakdowns or interruptions to material supply.

    For a job shop, the planning is more complex and is an attempt to manage the utilisation of each of the machines, whilst also minimising the lead-time to the customer.

    The key concept here is complexity. Flow lines may be lengthy and consist of many workstations, but the direction of material flow is known and throughput is governed by the slowest operation, the ‘bottleneck’. That might be the slowest machine, or a faster machine that has temporarily broken down. It is straightforward to a) identify bottlenecks in flow lines, and b) ‘chase’ them by resolving any issues.

    This is much more complex for a job shop, where the permutations of the order mix, combined with other uncertainties, compound to produce a manufacturing system that is difficult to optimise.

    In a lot of cases, job shops are loosely managed. At best they might be managed by simple rules of thumb, such as ‘keep Machine B running at all costs’ or ‘keep batch sizes between 10 and 25’. Such rules bring a degree of stability to a system, though there are always circumstances that upset the order, and these tend to be more problematic in job shop environments.

    Simulating a manufacturing system therefore sounds like a rational thing to do. If we can simulate the factory operations, we could:

    • understand what the plant utilisation, and capacity constraints of the system might be;
    • pre-empt where the bottlenecks might lie;
    • explore the impact on production of different plant configurations;
    • develop more optimal production schedules prior to commencing manufacture.
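    For a serial flow line, the first two of these questions can be answered with very little machinery: throughput is set by the slowest station, and every other station’s utilisation follows from that. A minimal sketch (station names and cycle times are illustrative):

```python
def flow_line_stats(cycle_times):
    """For a serial flow line, throughput is governed by the slowest
    station; each station's utilisation is its cycle time divided by
    the bottleneck's cycle time."""
    bottleneck = max(cycle_times)
    throughput_per_hour = 60.0 / bottleneck        # cycle times in minutes
    utilisation = [ct / bottleneck for ct in cycle_times]
    return bottleneck, throughput_per_hour, utilisation

# Three stations: cut (4 min), drill (6 min), finish (3 min)
b, tph, util = flow_line_stats([4.0, 6.0, 3.0])
print(b)     # 6.0 -- drilling is the bottleneck
print(tph)   # 10.0 parts per hour
print(util)  # finishing sits idle half the time
```

    A job shop needs a proper discrete-event model to answer the same questions, which is why the reluctance discussed below tends to concentrate there.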

    For flow lines this is relatively straightforward. But there is often a reluctance to create simulation models for job shops as they are believed to be too complicated. How can all of the possible combinations be captured to create a meaningful representation of the system to be optimised?

    Simulation is a relatively inaccessible subject for most manufacturers. It does require specialist knowledge, but what most people do not realise is that the knowledge is easy to acquire and put to use.

    Manufacturers with flow lines have generally made use of simulation to good effect. But job shops are not universally difficult to simulate, particularly if the number of different workstations is limited.

    Even in more complex scenarios, it can be useful to model and simulate part of the overall system, particularly if you are attempting to see the effect of a specific intervention.

    One of the difficulties of simulating manufacturing systems can be the lack of detailed process information. Cheap IIoT devices can sort that situation quickly. Once the data is captured, a simple simulation model can start to help you understand just how flexible your system needs to be, and what the impact of that flexibility on your plant resources might be.

    Additive manufacturing workstations combine flexibility of configuration with an almost flow-like relationship to other operations. Many items can be almost completely manufactured in one process step. This simplifies the approach to modelling, meaning that simulation is going to be an important part of manufacturing in the future.

    To conclude:

    • planning is an important activity for manufacturing, especially in a volatile environment;
    • simulation can help us understand the complexities of a manufacturing system without the cost and hassle of experimenting in real-life;
    • flow lines are often straightforward to simulate, and historically we have not put as much faith into the modelling of job shops;
    • we don’t have to simulate an entire manufacturing system to derive real benefit;
    • inexpensive IIoT equipment can capture process data that is valuable for simulation;
    • advanced production technologies such as additive manufacturing simplify the complexities of simulations and production scheduling;
    • flexible manufacturing requires flexible simulation and planning.

    Some time ago I was speaking to a representative from a global Original Equipment Manufacturer (OEM), who said that 75% of their manufacturing was performed by Small and Medium-sized Enterprises (SME). As the OEMs strive to deliver more options for customers (mass customisation is one example), they in turn create a demand for their SME suppliers to be more flexible.

    Such flexibility can only be delivered by a digitally-enabled supply chain, and most likely means that manufacturing simulation is here to stay.

  • Scrub your IIoT data clean

    Scrub your IIoT data clean

    Your IIoT equipment is installed. You can see the devices on your network. Your log files are being written to when the machine runs. What next?

    Initial excitement can soon turn into dismay when you attempt to produce some initial reports from the data that you are gathering.

    Why are there so many missing values? What does this mean?

    Those readings look far too high. What is going on here?

    While frustrating, this is actually progress. The questions are forcing us to understand the process better, and that will eventually lead to improved insight. We can’t optimise a process if we don’t understand it.

    Since our intention is likely to be to make predictions about future performance using analytics, we need to be conscious that the statistical methods are only as good as the data they are fed.

    If we tell our statistical model that temperature readings of 35C are OK, then we must expect that the predictions will accommodate this. If the readings should not be that high, we are setting our analytics capability up to fail.

    So, to be sure that any subsequent processing of the IIoT data is correct, we need to make sure that our data is ‘clean’. That means no errors, missing values, inconsistencies, etc., within the stream of data that is to be scrutinised.

    The process of data cleansing is essential for consistent post-processing. It starts with a set of rules that can be automatically applied to the data as it is produced. These rules enforce a level of quality that you understand, so that you are confident that any calculations that are based on that data are sound.

    For instance, data from sensors can be noisy. There may be out-of-range values that are irrelevant, or that indicate a problem with a sensor. Similarly, an operating hitch with the equipment may mean that a duplicate value is recorded for the same event.
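    Rules like these – reject out-of-range readings, drop duplicate records, flag missing values – can be applied automatically as the data arrives. A minimal sketch (the record shape and thresholds are illustrative):

```python
def clean_readings(readings, lo, hi):
    """Apply simple cleansing rules to (timestamp, value) records:
    drop missing values, out-of-range values, and exact duplicates."""
    seen = set()
    clean, rejected = [], []
    for ts, value in readings:
        if value is None or not (lo <= value <= hi) or (ts, value) in seen:
            rejected.append((ts, value))   # keep rejects for inspection
            continue
        seen.add((ts, value))
        clean.append((ts, value))
    return clean, rejected

raw = [(1, 21.5), (2, 21.7), (2, 21.7),   # duplicate event
       (3, 350.0),                        # sensor spike, out of range
       (4, None)]                         # missing value
good, bad = clean_readings(raw, lo=-10.0, hi=40.0)
print(good)  # [(1, 21.5), (2, 21.7)]
print(bad)   # [(2, 21.7), (3, 350.0), (4, None)]
```

    Keeping the rejected records, rather than silently discarding them, is what lets you ask the “why is this missing?” questions discussed below.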

    It might be that a process that requires human input may have contained an error. Or, the system itself may have corrupted some of the data during its storage or transmission to other computational nodes.

    It is therefore important to look closely at the data that is produced from an IIoT device, and to evaluate how relevant it is to the objective of your analysis.

    An important task is to provide a means by which the data can be visualised, as this significantly aids comprehension. A scatter plot is an effective way of illustrating values that are outliers, which may not have been as obvious in tabular data.

    Missing data is often an interesting phenomenon. It immediately raises the question of why the value is absent in the first place, and we need to ascertain the difference between a genuine absence of a value (because nothing was happening) and an error in the dataset. Errors in the dataset can cause issues with subsequent processing, so it might be useful to interpolate a value and have it inserted automatically so that the record is complete. This is something that is best decided in the presence of a domain expert (usually the machine operator), in order to qualify what is the appropriate thing to do.

    Clean data is important and the presence of dirty data can scupper the successful implementation of IIoT equipment. Try to see it as part of the process of understanding your system, and it will build your organisation’s capability to understand its processes much faster than any training course.

  • Simulation as experimentation

    Simulation as experimentation

    Resources are generally more constrained in an SME manufacturer than for a large corporate organisation; there isn’t the luxury of a research and development department, or an intelligence unit that deals with reporting, analysis and planning.

    This situation becomes much clearer for those who have experience of working within an SME. It is not that the staff cannot think innovatively, or solve problems for themselves. It just isn’t feasible to commit any time to experimentation as the actual cost to daily production is too high.

    SMEs are often preoccupied by the orders that need delivering now, and cannot halt production to ‘have a go’ at a potentially interesting idea.

    For owner-managers, there is a tension between the requirement to plan for the future and the pressing need to deliver the next order. Some SME manufacturers don’t plan too extensively and are at the mercy of the prevailing market conditions. They rely on an ability to be agile, to ‘duck-and-dive’ in response to external factors.

    Those manufacturers that do forecast generally apply it to sales, and then use historical experience to translate this into approximations for stock-holding and subsequently the demands that might be placed upon the factory.

    Theory tells us that having access to more data improves the quality of our decisions. Well, this is true up to a point. Too much data, especially raw, granular data, requires too much effort (and know-how) to get it into a condition where it can be useful.

    IIoT technologies have data production and sharing as a functional priority, and while this might give the production supervisor some new insights about how the plant operates, the volume of data that is accumulated will soon become too much to comprehend.

    As such, the ‘experience’ model of production management cannot scale to accommodate the tidal wave of data that a digital transformation can produce.

    Indeed, many of my discussions with SME staff are about how they can make better use of the data that they already have. The discussion starts with the company wanting to see how they can embrace Industry 4.0, and we end up exploring how they are using the data that is being produced by their existing plant.

    For instance, how are the log files from Machine X being used for planning? In the majority of cases, the data is being saved, and that is the end of it.

    Some of the operations data is waiting to be tapped into as a valuable source of information, such as the electrical power signatures that all plant produce. Hidden in those signatures is a wealth of behavioural, condition, and performance data that only requires current sensors to detect.

    But, aside from taking the plunge and actually deploying some IIoT equipment, there is a general reluctance to disrupt the manufacturing schedule as it would appear to create too much of a financial risk.

    So, while the owner-manager realises that they need to strategise for the future, they might only restrict their planning activity to an accounting view. Management accounts provide an abstract means of modelling a business, but for a manufacturer this might not be enough if they are attempting to optimise to a finer degree.

    Essentially, the accounting method of modelling is a high-level simulation of how the business might react to external stimuli. What is needed, therefore, is a less abstract view of the factory, one that a) permits lower-level decisions to be taken about important processes, and b) translates the high-level accounting view into a more realistic set of reactions for the manufacturing system itself.

    Simulation is a topic that can be a big turn-off for busy people. It sounds academic, it will probably involve complex mathematics, and it will take too long to learn.

    However, while these statements may be true in some circumstances, there is a much lower barrier to simulation than a lot of people realise.

    Considerable insight can be quickly gained with a spreadsheet, and this is something that is on everybody’s desktop computer these days. Simulation libraries such as SimPy or ManPy can open up the power of simulation to those with only modest programming skills, though many people find that a spreadsheet is sufficient for their needs.
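    The kind of model that fits in a spreadsheet translates directly into a few lines of code. A minimal Monte Carlo sketch of an order’s lead-time through several stages, each with a (min, mode, max) triangular cycle time – all the stage figures are illustrative guesses, not real data:

```python
import random

def simulate_lead_time(stages, runs=1000, seed=42):
    """Monte Carlo sketch: total lead-time of an order through several
    stages, each with an illustrative (min, mode, max) triangular
    cycle-time distribution."""
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.triangular(lo, hi, mode) for lo, mode, hi in stages)
        for _ in range(runs)
    )
    # Median and 95th percentile of the simulated lead-times.
    return totals[runs // 2], totals[int(runs * 0.95)]

# Three stages, times in hours: (min, mode, max) per stage.
median, p95 = simulate_lead_time([(1, 2, 4), (2, 3, 6), (0.5, 1, 2)])
print(median, p95)  # p95 shows how variability stretches delivery promises
```

    The gap between the median and the 95th percentile is exactly the kind of “what if?” answer that a single fixed lead-time estimate hides.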

    Simulation improves the depth of your “what if?” questions, and also answers some of them, which in turn, enhances your understanding of the manufacturing operation as a whole.

    With a little practice, simulation can become a tool for evaluating various options, before you make a decision. This can really empower SMEs to ‘experiment’ with the introduction of IIoT, before they spend a penny!

  • Digital transformation: an approach for SMEs

    Digital transformation: an approach for SMEs

    All the hype from Industry 4.0 creates a lot of impetus for SMEs to ask how they can make it work for them. SMEs are focused on doing more with less, and are motivated to respond to any call to ‘reduce the productivity gap’.

    Management approaches such as lean manufacturing can achieve a lot with existing plant and machinery, and there are countless case studies that demonstrate what can be achieved with pencils, paper, data analysis training, and persistent leadership that builds a ‘can-do’ culture of continuous improvement.

    There is often a sense of frustration from SMEs who can see that they could benefit from a particular technology, yet the cost (either capital or operational) is prohibitive or that they just cannot afford the downtime to implement it.

    In such cases, it is not clear for an SME what the route forward for improvement is, and they can get stuck in a rut with no obvious solution.

    SMEs are sold a vision from the technology vendors that perhaps seems unattainable; the potential benefits of the technology can only be realised in a cooperative, supportive environment, and this is certainly not quick to build.

    So, what SMEs need is a framework that can explain what stages need to be in place so that their digital transformation ambitions are successful.

    This framework should contain three key aspects as follows:

    1. Understanding what capability is required. Technological know-how is important, but perhaps of more importance is understanding the business requirement for change first, and using this to drive a set of technological and environmental development requirements. Lean is a good way to both understand the existing situation and also to see where the next benefits will come from, and it is essentially a human-centric approach that yields tangible results.
    2. Once the business imperative is understood, we are in a much better position to evaluate what actual technological innovation is required.  A technology vendor might well encourage a large-scale programme of IIoT adoption, but this is exactly the sort of behaviour that makes SMEs sceptical of Industry 4.0, as the upfront costs are often too high. Choosing a process to transform, and the judicious use of simple sensors, localised analytics, and dynamic data visualisation can go a long way to actually realising the benefits that have been identified in the requirements phase.
    3. Look for opportunities to scale. This is perhaps the most exciting stage. Now that we have a few processes that have been improved, it is likely that a) we shall start observing other areas of the business that are becoming stretched, and b) we are also beginning to broaden our outlook and see new opportunities for growth that did not exist before, as a result of more collaborative operational activity. This is where we can now exploit the fact that we not only understand our processes better, but we can also start to think about automation and the intelligent delegation of process monitoring to the machines.

    The outputs from each of the above steps tend to provide additional insight that motivates continued development. And perhaps more importantly, each decision to purchase equipment is justified by a specific problem that is to be addressed.

    Digital transformation is often described as a ‘top-down’ approach, where executive leadership needs to ‘buy-in’ and support the agenda.

    In SMEs the executive is often the workforce, and they need a bottom-up approach to make it work!

  • Lean + IIoT = skills gap?

    Lean + IIoT = skills gap?

    Central to lean manufacturing is the empowerment of people. Lean can’t work unless there is a pervasive culture that will experiment, implement and evaluate. The lean methods enable employees to focus on fault diagnosis and to propose innovative solutions that minimise the resources required to maximise throughput.

    A lean implementation will help build a good standard of information literacy amongst staff who might previously have had very little experience of data analysis. And this could be useful if a factory embarks upon enhancing its data capture with IIoT equipment.

    There is an added complication with IIoT/Industry 4.0/digital manufacturing data capture though; these initiatives assume the use of analysis techniques that are beyond the traditions of Statistical Process Control (SPC).

    Machine Learning (ML) is being touted by many software vendors as a silver bullet (which it isn’t), but it does have a place when it comes to developing insight from data streams that are both fast and large. ML models can help us understand some of the complexities of processes by providing different perspectives on the data; such models will probably include a number of inter-related processes, rather than the single operation that is represented by a lean SPC chart.

    SPC charts can often be challenging to comprehend until you have seen a few, or better still, plotted them yourself. They are an excellent way of using a data visualisation technique to communicate deeper understanding of a process.

    But using a Fast Fourier Transform (FFT) to make sense of streaming data requires a different set of skills, as does the selection of a random forest over a support vector machine (SVM) for classification. Hence the demand for the ‘data scientist’ who can take such decisions and help translate the complexity to us mortals.

    We are heading towards an era where there is a fundamental need for more sophisticated data analysis skills. Technologies such as IIoT are providing better, cheaper access to data. We can look beyond the confines of the individual manufacturing process and analyse operations at scale.

    However, realising this potential means that we need to address a skills gap that is rapidly emerging. A lot of Computer Science degree courses do not teach these skills, some of which are more often found in electrical/electronic engineering courses. Some Data Analytics courses are starting to appear, and these will only increase as businesses see the need to make better use of the data that their IIoT devices are churning out.

    So, lean might be a good way of introducing digital transformation technologies to a business. But this may also expose a need to develop advanced data literacy skills rather rapidly.


  • How is Industrial IoT (IIoT) different from IoT?

    How is Industrial IoT (IIoT) different from IoT?

    The Internet of Things is becoming more widely understood, at least in terms of the vision. Connecting ‘smart’ objects so that they can cooperate to achieve goals requires ever-decreasing units of computation to process the data that is exchanged. From a public awareness perspective, smart objects might range from Wi-Fi-controlled lightbulbs (which are readily available) through to autonomous vehicles (which are not quite ready yet).

    The scenario of a smart household refrigerator ‘talking’ to the other household appliances seems to be a rather contrived attempt to explain what is possible with IoT technology. Voice control technologies such as Apple’s Siri and Amazon’s Alexa are steadily becoming more integrated into people’s daily lives, as they enable voice commands to integrate with a range of other systems.

    Not everyone realises that their spoken word is being transported to a remote cloud service for processing, before actions are communicated to other systems and devices that deliver a (sometimes physical) response. Telling ‘Alexa’ to turn the lights on is an innocent activity that most people get used to very quickly.

    So, what is the Industrial Internet of Things? Surely there is more to it than voice control of the factory lighting?

    Voice control is an interesting aspect of any system as it heralds a new way for humans to interact with systems. Perhaps what is interesting about industrial systems is that there is more of an emphasis upon the control of physical systems; using interfaces to actuate and control physical objects such as machinery, for instance. You could argue that this applies to autonomous vehicles, the ultimate Cyber Physical System (CPS), and indeed it does, but Industrial IoT still remains more focused upon physical sensing and actuation, particularly when we think about the manufacturing industry.

    There are also interoperability challenges that are specific to industrial systems which IIoT needs to be able to accommodate.

    Unlike domestic IoT, standards for the interoperation of industrial systems have existed for some time, and though some of those standards may be proprietary, they need to be harnessed if the loftier goals of digital manufacturing/Industry 4.0 are to be achieved.

    Industrial systems are generally classified into two types:

    – Operational Technology (OT), where robust devices such as Programmable Logic Controllers (PLC) and Supervisory Control and Data Acquisition (SCADA) systems are deployed to ensure safety, accuracy, real-time capability and generally the satisfactory control of manufacturing operations;

    – Information Technology (IT) infrastructure that caters for ‘enterprise’ information systems requirements such as data storage, organisation, transaction processing, business reporting and visualisation, and data security.

    In many industries it is typical for there to be no direct link between OT and IT systems other than the human operators and supervisors who interface directly with the systems themselves. This ‘gap’ is the area of opportunity that Industry 4.0 is attempting to fill as it drives the adoption of bridging technologies that can enable automation, reasoning and subsequently large-scale delegation of tasks to the machines themselves.

    As we start to contemplate the coupling of OT and IT, established concepts in either field take on new meaning. For instance, ‘safety’ has a more serious connotation when the system is controlling a machine tool. The ‘safety’ of an Enterprise Resource Planning (ERP) system is perhaps more focused upon data leakage rather than a direct threat to the physical wellbeing of a human operator. When designing an embedded control system, a formal approach to the software design is an established practice. However, enterprise software systems have grown to such scales that formal methods are no longer used to develop such applications. This is a clear example of the potential tensions that need to be worked out for successful IIoT adoption.

    So, while IIoT has a lot of commonalities with the IoT, there are more domain-specific challenges to consider. At present, these challenges are focused upon interoperability; not only from a technical perspective of “how to make ‘X’ talk to ‘Y’”, but also ensuring that the needs of each system are preserved without any adverse effect. Security means different things for OT and IT, and their coupling must protect system users accordingly.

  • We’re not ready yet for Industry 4.0

    We’re not ready yet for Industry 4.0

    Computer security is a topic that frequently emerges when talking with a manufacturing business that is curious about “digital transformation” or “Industry 4.0”.

    I find that a lot of SMEs are sceptical about Industry 4.0, inasmuch as they can’t see a practical way forward to realising the potential benefits. They are just about persuaded that their business transaction databases can be “secure”, but any talk of connecting machinery to a network, so that real-time data can be recorded for processing, is a step too far.

    Their fears are not unfounded. Any business must protect its operations from the leakage, loss and mis-use of its data. If we start bolting-on sensors to factory plant, and connect these sensors to other systems to enable greater efficiencies, we have increased the number of opportunities for process data to be exposed. And process data is where the intellectual property (IP) of many manufacturers lies.

    A lot of effort has been expended in the development of secure communication networks and computer systems, but the information breaches that become public only serve to reinforce the security fears of SMEs.

    When your competitive edge is defined by your process IP, you will be especially motivated to protect it.

    A machine operator discussing their work over a beer at the local bar might reveal some information that gives a clue as to what the business does differently. But this is still relatively benign next to the situation where you have access to all of the data that is being generated by a factory.

    You can do a lot with data, and it becomes easier to recreate a scenario the more data you possess. So, if you can gain access to that data, you can act in a more informed way.

    However, Industry 4.0 is not just about collecting data. It is also about predicting the future using in-process forecasting; developing enhanced methods of visualising complex data to aid comprehension; using computational resources to automate and delegate physical actuation; and also to create models of the future that can be reasoned with to improve coordination, scheduling and resource utilisation.

    So, what do manufacturers need?

    Common installations of Industrial Digital Technology require sensors of various types, embedded systems to process the captured data, and the associated networking infrastructure to transport that data to a centralised storage and processing facility, typically a cloud, which may reside off the premises.

    Making better use of the computational resources at the endpoints of networks has given rise to “Edge Computing” where more of the data processing is “pushed” towards the devices at the edge of a network. This has become possible since hardware is continuously becoming more capable and less costly to deploy.

    Edge Computing does have much to offer manufacturers, particularly the ability to process data much closer to the operation that generates it. However, while some processing can occur at the edge, more significant processing still requires more capable hardware, and if that hardware is provided via a cloud, there will be a requirement for an encrypted connection to that cloud so that the analysis can be done.

    Since remote cloud resources are a source of security fears for manufacturers, there is a need to provide the capability to perform Industry 4.0 analytics and visualisation within the confines of an organisation’s firewall.

    How can the research community respond to this need?

    Microservices Architecture is one approach to the development of agile and robust software systems that may be suited to Edge Computing environments. As we place greater demands upon our systems, and ask them to perform functions that were not part of the original requirements specification, there is a need for systems that can scale elastically, much like clouds do for utility computing.

    Such architectures may also support the development of capabilities that engender trust between smart objects. Manufacturing systems contain many physical objects, some of which interact with each other. If we want to automate the logistics of objects within a manufacturing value-chain, we shall need to ensure that there are workable trust mechanisms so that the correct interactions can take place.

    The certificate authority model of trust cannot scale for a world of smart objects, and this is a driver for research into multi-party authentication schemes, as well as distributed ledger approaches to data and identity provenance.

    Once we can confidently deliver insightful analytics and automation within a manufacturer’s firewall, I’m sure that the uptake of Industrial Digital Technology will accelerate.

  • Some challenges for Cyber Physical Systems

    Cyber Physical Systems are one step towards our quest to build smart systems. Smartness implies convenience, and there are many compelling visions of the technology improving the quality of our lives.

    Smart phones are ubiquitous now, and there are trends developing whereby a smart phone is the only means of internet access that some demographics have. Smart home technology is essentially automation that, through inexpensive devices coupled to a pervasive network of broadband and 4G cellular access, is leading to greater uptake of lighting and heating control, for instance. Smart cities are perhaps the most visionary application of technology, as they integrate technologies that are available today with technologies that are developing rapidly, such as autonomous vehicles.

    Research to date has been roughly partitioned into Wireless Sensor Networks, Internet of Things and Cyber Physical Systems, with clear areas of overlap. It is clear though that there is a move towards convergence as it becomes more difficult to separate the technologies, since at one level any integration of physical sensing and actuation through a communication network is a cyber physical system.

    What is interesting is that we appear to have a hierarchy of systems developing, with the realisation that a CPS is not necessarily limited to a discrete set of components and capabilities that are brought together to solve a particular problem; a CPS can emerge because it has been enabled by a pre-existing network infrastructure for instance.

    So, if we have systems-of-systems, how do we:

    • verify that a solution is correct prior to execution; and
    • resolve any conflicts during execution?

    CPSs need to scale in an uncertain environment, as the density of data and interactions increases, along with the availability of computation through embedded “smart” devices.

    We also have hardware and humans in the loop, which also mandates that we model behaviour and emotions if we are to explore realistic representations of the eventual systems.

    We know that a CPS acts upon its environment, and that the environment is continually changing. How do we cater for secondary, unforeseen, unintended events?

    This poses challenges for human safety and how we provide the necessary controls that can respond in real-time. We also know from experience that services are not all created equally. Different design approaches and standards can lead to different results.

    Real-time control is a concern for the CPS research community, as the effects when a system does not respond instantaneously could be fatal. Such scenarios may happen at some time in the future, and depending upon the environment may cause exposure to hazardous situations such as chemicals, radiation or unknown phenomena.

    The effects may be different for a changing demographic of user, and might not appear as we would expect.

    One approach to the arrangement of CPSs could be to consider watchdog architecture, where the components and services are partitioned into:

    • verification: we have methods in place to verify and validate design models prior to implementation. Formal methods can assist with this.
    • conflict detection: we proactively include functionality within the design to seek out and detect behaviour that would cause conflict with either the system itself or with other interacting CPSs.
    • conflict resolution: the solution has the capability to be rational and optimises its decisions based upon experience (a knowledge base) and the desires of the system, while recognising any altruistic, shared goals of a given community.

    Scaling and density

    Such is the potential scale of deployment of CPSs, there are challenges that must be addressed relating to both the scale of a CPS’s influence, as well as the density of sub-components of CPSs and their communications infrastructure.

    For instance, how many individual devices might represent an agent or actor in an environment? If we assemble systems from sub-systems, who owns what in a service-based environment? This is particularly relevant to the stewardship of personal data. A related challenge is knowing when to share data to exploit an opportunity that will advance an agent’s goals.

    The need to be flexible and adapt to changing circumstances is important for an effective CPS. How do we reconfigure during execution?

    All of this functionality will place a greater demand on messaging for sensing and actuation, and introduces the possibility of greater interference and cross-talk within the networks. Since these are increasingly likely to have an impact on critical functions, there will be safety implications to consider.

    Runtime complexity

    Earlier, we considered a watchdog architecture as one potential approach to governing a CPS. Managing this in real time adds complexity: not only does the design have to take account of the inherent relationships between conflict detection, resolution and verification, but conflict detection, safety analysis and re-validation also need to occur during execution.

    This emphasises the dynamic nature of a CPS; it must react to environmental stimuli, whilst considering its internal goals. It has an obligation to not do harm, and must therefore continually re-assess its actions.

    Software engineering approaches include testing as a stage, or as an interactive part of the development process. In the case of a CPS, the testing needs to be a continuous function if the underlying design model is to retain its key principles. For instance, sensor readings can be emulated to test a particular function. If a set of readings produces test results that are outside of scope, then the CPS has to decide whether to adapt or to retain its current configuration. One significant challenge is how we can produce models that include an element of “conscience” for a CPS, to steer its reasoning when faced with dilemmas.

    Data challenges

    Big Data has brought the characteristics of volume, velocity, veracity and value to the fore. A CPS needs to decide what data is required for a particular purpose. An interface with humans is likely to include the capability to process natural language, to extract concepts from streamed inputs, and to perform information searches, all in real-time. In addition, the CPS will also need to be able to distinguish between information that is trusted and that which is not.

    Complexity

    Normal human behaviour (whatever that is) is complex. Humans can cope with datasets that are incomplete, by using past experience, reasoning, or even guessing. How can we develop such capabilities in software? This is a key challenge for the control theory communities.

  • Cyber Physical Systems and Agency

    Cyber Physical Systems and Agency

    One of the attractions of Cyber Physical Systems thinking is that it has the potential not only to automate tasks that humans don’t or can’t do, but also to enable task delegation.

    From the human manager’s perspective, we can describe goals to staff (the “what”) and expect them to achieve the goal, without having to describe the “how”. The assumption in this case is that the staff member being delegated to has the capability to achieve the goal and knows “how” to do it.

    In effect, when we manage humans we rely on, and often exploit the agency that individuals have. This agency allows them to take decisions in response to a situation, without the delegator becoming involved.

    If we think of a human organisation as a domain for a CPS, there are a set of challenges that people deal with on a day-to-day basis, such as:

    • an organisation typically has more than one employee, and there are numerous other stakeholders (suppliers, customers, regulators, etc.) that interact with the business. A human agent needs to be able to deal with decision-making that might be both collective and distributed;
    • the collection of stakeholders, who themselves have their own goals, agendas and capabilities, leads to a system that has complexities; there may be many ways to achieve something, and depending upon the stakeholders involved new possibilities may emerge as a result of new interactions;
    • successful delegation assumes trust, both in the ability to achieve a goal and in the ability to achieve it in the most efficient way for a business whose resources are constrained;
    • systems that involve humans must suit the needs of humans. Humans interact in a variety of ways, including speech and visually, which means that the operational methods need to harness this to be successful.

    These issues pose quite a challenge for the design and specification of a CPS.

    How do we even approach the planning of such a system? How do we model an existing CPS, or a system that we want to convert to a CPS?

    How do we know that the result will behave as we expect it to?

    Multiagent systems

    We are, in effect, describing a multiagent system (MAS). A MAS is a collection of agents that interact with each other to achieve system or individual goals.

    A community of people is a MAS, such as a team, a department, a manufacturing plant, or even a supply chain.

    At the micro level, we are interested in how an individual agent goes about its business.

    At the macro level, we are concerned with how agents interact to achieve their desires, conscious that since each agent has agency, it is feasible that they do not all share the same set of goals.

    In contrast with traditional enterprise application design, a MAS does not have formal control encoded into the DNA of its agents; a MAS relies upon communication between agents as the enabler of action and achievement.

    What is an agent?

    The simplest explanation is that an agent is an entity that can sense and act upon its environment. Sensing and actuation are of course fundamental components of a CPS. The link between sensing and actuation is some form of intelligence.

    From Wooldridge and Jennings (1995):

    “An agent is a computer system that is situated in some environment, and that is capable of autonomous action in this environment in order to meet its design objectives”.

    So, intelligent agents:

    • have autonomy;
    • act with a specific purpose;
    • are situated in an environment.

    Autonomy is an essential characteristic if we want to delegate complex tasks to intelligent agents. We want our agents to act in response to unforeseen and unpredictable events, though we still want to retain ultimate control over an agent if things go wrong.

    An intelligent agent has the following properties:

    • reactive: it provides a timely response to the sensed environment;
    • proactive: the agent’s behaviour is directed towards a goal;
    • autonomous: the agent will take the initiative when required;
    • social: an agent will cooperate and coordinate with other agents;
    • intelligent: the agent will be able to reason between its senses and its own knowledge base.

    A rational agent is said to be able to balance reactive and proactive behaviours. We don’t want our agent to react to every situation, and conversely we don’t want an agent that procrastinates by constantly planning, though some human managers may recognise both of these behaviours in human staff!
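    The sense–decide–act loop behind these reactive and proactive behaviours can be sketched minimally in Python. This is an illustrative sketch only: the thermostat scenario, class name and thresholds are invented here, not drawn from any agent framework.

```python
# A minimal rational agent: reactive when the environment demands an
# urgent response, proactive (working towards its goal) otherwise.
class ThermostatAgent:
    def __init__(self, goal_temp):
        self.goal_temp = goal_temp   # the agent's proactive goal

    def step(self, sensed_temp):
        if sensed_temp > self.goal_temp + 5:
            return "alarm"           # reactive: urgent response to a stimulus
        elif sensed_temp < self.goal_temp:
            return "heat"            # proactive: make progress towards the goal
        else:
            return "idle"            # neither react nor plan: do nothing

agent = ThermostatAgent(goal_temp=20)
actions = [agent.step(t) for t in (18, 21, 27)]
print(actions)  # ['heat', 'idle', 'alarm']
```

Even this toy example shows the balance: the agent neither reacts to every reading nor plans indefinitely; it acts only when its senses or its goal require it to.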

    Agency as a design metaphor

    Agents and agency help us understand human societies at all levels, as they assist us to abstract away from the complex detail.

    Agents are perhaps the ultimate “black box” approach to encapsulation that is a feature of Object Oriented software development.

    In the same way that a capable employee can be re-deployed onto a different task, a software agent is reusable in different settings.

    Indeed, the new environment may not have been present when the agent was designed originally, and it is this level of adaptation that is required by a future CPS.

    MAS also assist in the modelling of interactions within a system, and game theoretic models of interaction can be explored. Such modelling can allow us to observe the effects of more sophisticated emergent behaviours between agents, such as coalition forming, bargaining and negotiation.

    In a sense, we take such behaviours for granted in human systems, but these are exactly the behaviours that a CPS must possess if we are to delegate any meaningful work to one.

    Agent Oriented Software Engineering (AOSE) is an approach to the modelling and design of systems using the properties of agency. One example of a situation that involves the interaction between two agent actors is illustrated in the Contract Net protocol below:

    [Figure: sequence diagram of the Contract Net protocol]

    This protocol defines the communicative acts that need to take place between the two actors (agents), and in itself creates a specification for the behaviours that each of the agents require to function correctly when negotiating a transaction.
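    A hedged sketch of the core Contract Net exchange (call-for-proposals, bid, award) is given below. The class and message names are invented for illustration and are not taken from the FIPA specification or any agent framework; a real implementation would also handle refusals, timeouts and rejection messages.

```python
# Sketch of the Contract Net exchange between a manager and contractors.
class Contractor:
    def __init__(self, name, cost):
        self.name, self.cost = name, cost

    def handle_cfp(self, task):
        # Respond to a call-for-proposals with a bid for the task
        return {"bidder": self.name, "task": task, "bid": self.cost}

class Manager:
    def negotiate(self, task, contractors):
        # 1. Announce the task to all contractors (call for proposals)
        proposals = [c.handle_cfp(task) for c in contractors]
        # 2. Evaluate the bids and award the task to the cheapest bidder;
        #    the remaining proposals are implicitly rejected
        winner = min(proposals, key=lambda p: p["bid"])
        return winner["bidder"]

contractors = [Contractor("mill-1", 40), Contractor("mill-2", 25)]
awarded_to = Manager().negotiate("machine a part", contractors)
print(awarded_to)  # mill-2
```

The protocol itself, rather than the agents’ internal logic, is what carries the coordination: each agent only needs to know how to respond to the messages it may receive.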

    Isn’t agency just Artificial Intelligence?

    There is a view that AI is limited to the study of the atomic elements that contribute to intelligence: reasoning, planning, learning and perceiving. Agency adds communication as a means by which those atomic components can be combined, and as such a MAS is effectively distributed AI.

  • Why cybersecurity is not secure enough for IIoT

    Why cybersecurity is not secure enough for IIoT

    Industry is in a constant state of flux as technologies are being developed, evaluated and deployed in order to create competitive advantage, increase productivity and efficiency, and to work towards a sustainable future.

    Ever-cheaper embedded devices and transducers, together with pervasive networking infrastructure, have resulted in the rapid uptake of equipment that is now being implemented at massive scale. Consumers are becoming more familiar with the term Internet of Things (IoT); the same technology is driving a revolution within the commercial environment, where it is known as the Industrial Internet of Things (IIoT).

    Since IIoT devices are being implemented across industry, the sheer increase in computational nodes that are inter-connected via a network inevitably increases the potential points of system vulnerability: more and more “back doors” into previously secure (albeit not connected) infrastructure. The value of sharing data may come at some cost.

    Security is therefore a pertinent issue for industry. A data breach from an organisation may leak valuable Intellectual Property (IP) to a competitor, with potentially disastrous consequences. A leak may expose confidential customer data, at the risk of jeopardising an organisation’s reputation. In the case of Cyber Physical Systems, there could be human lives lost.

    It is the development and adoption of new technologies and business models that is at the heart of these new vulnerabilities. Cloud computing has transformed the infrastructure of many organisations by enabling processing and storage to be outsourced to shared computing facilities in data centres, enabling computing to be an on-demand, elastic utility.

    Wireless communications enable data to be shared between devices where cables are either difficult to lay or their installation cost is prohibitive.

    Both of these developments illustrate why organisations need to increase their awareness of security control measures; some organisations get it wrong and suffer the consequences.

    Wireless devices can be disabled remotely, or perhaps more worryingly, can be used to “listen in” to the data that is being sensed. CPS can be taken over, and physical actuation compromised.

    Security systems to date have primarily relied upon authentication mechanisms that use a central authority to establish relationships of trust between known components. As the explosion of IIoT devices continues, such authentication systems cannot scale sufficiently and new methods – such as multiparty authentication – are viewed as one possible way of addressing this challenge.
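    To see why established approaches struggle, consider the simplest alternative to a central authority: a pairwise pre-shared key with challenge-response, sketched below (illustrative only, using Python’s standard `hmac` module). Every pair of devices needs its own provisioned secret, so the number of keys grows quadratically with the number of devices, which is precisely what breaks down at IIoT scale.

```python
import hashlib
import hmac
import secrets

# Pairwise challenge-response authentication with a pre-shared key.
def respond(shared_key: bytes, challenge: bytes) -> bytes:
    # The prover demonstrates knowledge of the key without revealing it
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    # Constant-time comparison avoids leaking information via timing
    return hmac.compare_digest(respond(shared_key, challenge), response)

key = secrets.token_bytes(32)        # provisioned to BOTH parties in advance
challenge = secrets.token_bytes(16)  # fresh nonce issued by the verifier
response = respond(key, challenge)   # computed by the device being verified
print(verify(key, challenge, response))  # True
```

Multiparty authentication and distributed ledger schemes aim to remove this per-pair provisioning burden, so that devices that have never met can still establish trust.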

    Machine-to-Machine (M2M) communication is a key factor within digital manufacturing and the Industry 4.0 movement. This enables more data to be collected at the source of a manufacturing process so that tighter integration and coordination can be exploited between collections of manufacturing plant. The Internet means that the physical location of plant does not affect its ability to be included within a system, and thus much more macro-level system optimisations are possible.

    The issue is that where there was once a recognised risk of a rogue operator or factory worker leaking process data to a competitor for personal profit, we now face the possibility that more detailed process data, which may describe an entire function of an industry rather than just one piece of plant, can be accessed remotely and silently. Thus, security is becoming a major concern for IIoT adopters.

    As such, cybersecurity from a pure information security perspective is somewhat limited in its effectiveness for IIoT, since it is concerned only with the protection of data. Because IIoT includes physical actuation as part of a control system, the security mechanisms have to take account of control mechanisms as well: it is feasible that an adversary may hack an IIoT system not to steal the data, but to “mess up” a process.

  • Simple sensing is great value

    Simple sensing is great value

    In a previous life I was a production manager. One of the frustrations of such a role is the feeling that more control and coordination could be exerted “if only I had the data for…”. Modern manufacturing plant often provides either local instrumentation, or the remote logging of its operational data, but if we want to think about integrating plant, and therefore think about optimising operations of the whole system, there is usually some basic information that is missing.

    In the late 1990s, transducers for sensing were expensive, and the computational resource required to deal with the sensing data was similarly difficult to justify. This situation was also compounded by the general lack of network infrastructure, which was essentially pre-WiFi. Radio links were available, but they were a) costly and b) prone to interference, or had poor transmission range.

    Fast forward to recent times, where:

    • transducers are cheap;
    • microprocessors are cheap, more than capable for signal conditioning and limited data storage, and easy to network;
    • network availability is pervasive via wired, WiFi, Bluetooth, 4G, etc.

    What does this mean for the production/operations manager who still has the same answer of “if only I had …” when they are looking for ways to increase productivity?

    It means that we are in an age where it is cheap to experiment with sensing, and it is cheap to integrate the sensing that may already exist, but which is not being used for holistic decision making.

    There still exist situations where manufacturing plant does not produce data while it operates, and it is incumbent upon operators to count items in order to record data about operations.

    Let’s say that we want to gather some data from a production line. Products are produced and transported to a destination via a conveyor belt. We’ll assume that the products are identical, and that they all follow each other in single file. The production manager wants some indication of what is happening in realtime, plus a set of alerts when a significant event has occurred.

    So, we fasten a light source and a photocell, or some sort of proximity detector to the side of the conveyor belt. What do we record?

    In terms of data, we record a time and date stamp every time an object is detected. We assume that the conveyor belt moves only in one direction (it doesn’t reverse in some situations for instance), and that the sensor does not produce false positives (when it says that there is an object present, we can trust the statement).

    As the production line operates, objects move along the conveyor and the sensor produces data in the form of a stream of time and date stamps, which might look like this:

    03-09-2019 10:57:23

    03-09-2019 10:57:29

    03-09-2019 10:57:35

    03-09-2019 10:57:41

    Let’s assume that we have connected our sensor to a small board such as an Arduino, or even a Raspberry Pi. That board will enable the incoming signal from the sensor to be augmented with a timestamp and either written to a file or sent via a network connection to a PC where the data is recorded.

    What can we do with that data? With one sensor we can:

    • count the total number of objects produced;
    • measure the rate at which objects are produced;
    • identify events which occur that might interrupt the flow (e.g. system breakdown, changeover between product type, etc.)
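    All three uses need surprisingly little code. The sketch below, in Python, works on the timestamp format shown earlier; the 30-second gap threshold for raising an alert is an arbitrary illustrative choice that would be tuned to the line’s normal cycle time.

```python
from datetime import datetime

# Example stream of detection timestamps, as produced by the sensor
STAMPS = [
    "03-09-2019 10:57:23",
    "03-09-2019 10:57:29",
    "03-09-2019 10:57:35",
    "03-09-2019 10:57:41",
]

def analyse(stamps, max_gap_seconds=30):
    times = [datetime.strptime(s, "%d-%m-%Y %H:%M:%S") for s in stamps]
    total = len(times)                               # objects produced
    elapsed = (times[-1] - times[0]).total_seconds()
    # Average production rate in objects per minute over the window
    rate = (total - 1) * 60 / elapsed if elapsed else 0.0
    # Flag any gap between consecutive detections exceeding the threshold
    alerts = [(a, b) for a, b in zip(times, times[1:])
              if (b - a).total_seconds() > max_gap_seconds]
    return total, rate, alerts

total, rate, alerts = analyse(STAMPS)
print(f"count={total} rate={rate:.1f}/min alerts={len(alerts)}")
```

For the four sample stamps this reports 4 objects at 10 per minute with no alerts; a stoppage would appear as an alert pair marking the start and end of the gap.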

    If we augmented the system with an additional sensor, placed either above or below the first, we could also distinguish between two product types if each has a different height. But we shall stay with the simpler case of uniform products.

    Now that we have the data, we can produce very simple reports of production output and rate, while also creating alerts for when the objects appear not to be arriving.

    Furthermore, the data is now being captured and stored in a way that can now be synthesised with other such systems – adding more sensing to other parts of the plant will now enable a more holistic view of the operations to be created.

    These first, tentative steps towards data capture are an important introduction to the modernisation of industry through digital manufacturing (or Industry 4.0). Whilst the latest examples of integrated manufacturing plants illustrate the possibilities of Cyber Physical Systems to coordinate, control and actuate physical systems on our behalf, there is still a fundamental reliance on the generation of data via sensing, its collection, processing, and subsequent reporting in a manner that humans can comprehend.

  • Managing software complexity

    Managing software complexity

    The construction of software for an application is a complicated process. We employ Software Engineers to develop software in a way that helps us arrive at robust solutions, and it is the training and experience of such engineers that we rely on.

    One of the many effects of the adoption of Internet of Things technologies is the realisation that the inter-connected-ness of physical objects with other objects, and human beings, is creating systems that are inherently complex.

    If we consider the scenario where a machine-tool is controlled by a microprocessor, usually referred to as an embedded system, the range of outcomes that the system must govern, whilst numerous, is conceivable to the software engineer and can be accounted for in the resultant program code for that application.

    If, however, we consider a situation where collections of machine tools, all of different types, are networked such that they can exchange information for the purposes of enhanced control, optimising resources and reducing wastage, the complexity of such a system is more challenging to fathom. If we then augment such a system with inputs from functions across a manufacturing supply chain, the scope of the complexity grows further still.

    The combination of localised sensing, data processing and analytics, data exchange, data fusion and aggregation, data storage and visualisation is what constitutes a system (perhaps a Cyber Physical System) that can offer considerable benefits for an organisation that seeks competitive advantage. But how do our software engineers deal with such a challenge?

    There has been a tradition of developing the craft of software engineering, whereby the use of methods and frameworks, when combined with real-world experience of software creation, has culminated in the development of the skills and knowledge that we recognise as befitting the role of a “software engineer”.

    A significant proportion of the software engineering role is the ability to deal with complexity in system design, but also to handle the effects of complexity after a system has been implemented, through updates that might be required as a result of new requirements, unforeseen requirements, and system design deficiencies, otherwise known as “bugs”.

    As we start to comprehend the potential impact of the IoT era, there is an emerging awareness of the need to be able to design and test – more exhaustively – software before it is deployed.

    The development of formal approaches to software development is something that has been an active research topic in academic arenas for many years, with its industrial application being generally limited to “safety-critical” systems such as nuclear power plants, aircraft control systems, etc.

    But we are now in the midst of a period where CPS are increasingly accessible, and as a consequence they are being introduced into application areas by individuals who a) are unfamiliar with formal approaches to software development, and b) are not software engineers, and therefore lack even the “craft” of software engineering.

    What do we mean by formal methods?

    Formal methods are an approach in which the ability to analyse a software design is an inherent part of the design process. This facilitates not only the construction of models that replicate the system to be developed (and thus use abstraction to manage the complexity), but also the testing and evaluation of a system model before a line of code is even written.

    The use of mathematical notation means that the specification of a system can be expressed precisely, but that it can also be formally reasoned against to test for inconsistencies.
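    To make this concrete, the following is a minimal sketch of one formal technique, explicit-state model checking, applied to a deliberately simplified and hypothetical machine-tool model (the states, transitions and safety property are all illustrative, not drawn from a real controller). The checker exhaustively explores every reachable state of the model and tests a safety invariant in each one, before any control code for the real machine is written:

```python
from collections import deque

# Hypothetical toy model: a machine tool that must never be "cutting"
# while its guard door is "open". A state is a (door, spindle) tuple.
INITIAL = ("closed", "idle")

def next_states(state):
    """Enumerate every transition the model allows from a state."""
    door, spindle = state
    moves = []
    if door == "closed":
        moves.append(("open", spindle))       # operator opens the door
        if spindle == "idle":
            moves.append((door, "cutting"))   # start a cutting cycle
    if door == "open":
        moves.append(("closed", spindle))     # operator closes the door
    if spindle == "cutting":
        moves.append((door, "idle"))          # cutting cycle finishes
    return moves

def invariant(state):
    """The safety property: never cutting with the door open."""
    door, spindle = state
    return not (door == "open" and spindle == "cutting")

def check_model(initial):
    """Breadth-first search of the whole state space, checking the
    invariant in every reachable state. Returns a counter-example
    state if the invariant can be violated, otherwise None."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in next_states(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None
```

    Here the checker immediately returns the counter-example `("open", "cutting")`, exposing a design flaw (the guard can be opened mid-cut) at the modelling stage rather than after deployment. Industrial-strength model checkers and theorem provers apply the same idea at far greater scale.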

    This formality offers considerable advantages for software development such as:

    • software specifications are more explicit and rigorous. This supports the goal of requirements engineering to ensure that the proposed system delivers what is needed;
    • program code is more rigorous, as there is a formal underpinning for each aspect of the software being developed. The software engineer knows that functions within the model have been tested against the specification, and has therefore already verified that the program functionality is correct. There is the additional benefit that the formal declaration of requirements, together with logical reasoning, means that some degree of testing can actually be automated;
    • system maintenance and future modification will be simpler, partly because such a system has been more thoroughly designed, but also because the underlying documentation is explicit and includes the reasoning for the inclusion of all functionality.

    The above are compelling arguments for a return to the thinking around formal methods, and how these can help us develop the next generation of IoT-inspired systems.

  • Dealing with IoT time

    Dealing with IoT time

    One factor that motivates industrial organisations to adopt Internet of Things (IoT) technologies is the potential to monitor, control and coordinate processes remotely. This opens up new possibilities for collaboration across manufacturing sites, and could have a massive positive impact on the supply chains of collaborating entities, which may be separate business organisations but need to co-exist by serving each other’s needs.

    The degree of integration that is feasible between processes is in part determined by the frequency of monitoring that is required to optimise a process or group of processes.

    For instance, if a system is measuring the ambient environmental temperature, it is likely that an hourly update will produce more than enough data upon which to gain the insight needed for whatever decisions are taken.

    In contrast, a machine tool that is shaping components for larger assemblies may report tool wear using the loading on its drive motors, by way of electric current draw monitoring. In this case, hourly data may be meaningless, and a much more frequent stream of readings would be required to show the point in time at which the tool starts to lose its sharpness and requires replacement.
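    A minimal sketch of how such a wear check might work, with entirely illustrative figures (the window size and 12.5 A threshold are assumptions, not vendor numbers): each new current sample updates a moving average, and the tool is flagged once that average climbs past the threshold.

```python
from collections import deque

class WearMonitor:
    """Flags tool wear from drive-motor current draw. A worn tool loads
    the motor, so a rising moving average of the current (in amps) is
    used as a proxy for the tool losing its sharpness."""

    def __init__(self, window=10, threshold_amps=12.5):
        self.samples = deque(maxlen=window)   # only the most recent readings
        self.threshold = threshold_amps

    def add_sample(self, amps):
        """Record one current reading; return True once the moving
        average indicates the tool needs replacing."""
        self.samples.append(amps)
        average = sum(self.samples) / len(self.samples)
        return average > self.threshold
```

    At, say, ten samples per second this detects the drift in current draw within moments of it emerging, where an hourly report would smear it away entirely.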

    A second issue in relation to time exists if we consider the potential of collaboration across different geographies, which is feasible with the use of networked sensors (and is typically the basis of a more sophisticated Cyber Physical System or CPS).

    In such a case, the monitoring and reporting frequency of the data is of importance, but there is also an assumption that the data reported from each of the sensors is recorded at the same time. The synchronisation of internal clocks in embedded microprocessors can quickly become an issue if we want to integrate many data-producing devices together.
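    One well-established way of estimating how far a device’s clock has drifted from a reference is the four-timestamp exchange used by NTP. The sketch below shows the standard offset and round-trip-delay calculations:

```python
def clock_offset(t1, t2, t3, t4):
    """NTP-style clock offset estimate between a device and a reference.
    t1 = request sent (device clock), t2 = request received (reference clock),
    t3 = reply sent (reference clock), t4 = reply received (device clock).
    A positive result means the device clock is behind the reference."""
    return ((t2 - t1) + (t3 - t4)) / 2

def round_trip_delay(t1, t2, t3, t4):
    """Network round-trip time, with the reference's processing time removed."""
    return (t4 - t1) - (t3 - t2)
```

    For example, a device whose clock is five seconds behind the reference, with one second of network delay in each direction, yields `clock_offset(100.0, 106.0, 106.5, 102.5) == 5.0`. The usual caveat applies: the estimate assumes the network delay is roughly symmetric in both directions.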

    Depending on the design of the data producing system, there may be different latencies that affect the time that data is reported, analysed and posted into a repository. For some processes this is more critical than others; the example of temperature recording is clearly more tolerant than that of the concurrent monitoring of machine tool wear.

    Our assumption is that we are building systems that sense so that we can do something interesting with the data, and this is usually some form of analytics. Whether the analytics is performed local to the source of the data, or further “downstream” on a database or data warehouse, there is still the issue of data integrity: how do we protect against recording data that is inadvertently mislabelled with unsynchronised time stamps?

    Each situation needs to be considered on an individual basis, but it is generally prudent to record a sequence of activities post data capture, so that an analytics function can observe the chain of events that occurred after an event was reported. This might include recording the times that the data was sent, received, processed and archived for instance.
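    One lightweight way of doing this is to carry a small provenance record with each reading. The sketch below is illustrative (the `Reading` structure and stage names are assumptions, not a standard): every handling step stamps the record, so an analytics function can later reconstruct the chain of events and compute the latency between any two stages.

```python
from dataclasses import dataclass, field

@dataclass
class Reading:
    """A sensor value plus the times of every step it passed through,
    so downstream analytics can judge whether its timestamps are trustworthy."""
    sensor_id: str
    value: float
    captured_at: float                       # device clock, seconds
    events: dict = field(default_factory=dict)

    def stamp(self, stage, t):
        """Record when a stage (e.g. sent, received, processed, archived) happened."""
        self.events[stage] = t

    def latency(self, start, end):
        """Elapsed time between two recorded stages, or None if either is missing."""
        if start in self.events and end in self.events:
            return self.events[end] - self.events[start]
        return None
```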

  • What is a Cyber Physical System?

    What is a Cyber Physical System?

    As technology develops and becomes more accessible, people find new ways of exploiting it to achieve new objectives. The manufacturing industry is a pertinent example of how technology can transform productivity and increase quality through improved repeatability, often using mechanisation and automation.

    The industrial revolution is recognised as a step-change in the technology that was utilised to increase efficiencies, by enabling steam engines to mechanise and, to some degree, automate production processes.

    Various subsequent technological developments, such as the use of electricity to facilitate automated production and assembly lines (Industry 2.0), followed by enhanced control through the use of embedded electronic systems and microprocessors (Industry 3.0), have led to the current thinking that we are now in an era referred to as Industry 4.0, where individually controlled physical systems are connected, by way of networking and the internet, to enable new ways of collaboration. This has given rise to the label of “Cyber Physical Systems”, or CPS.

    But what is a CPS?

    As the concept matures there are many variations as to how people might describe a CPS, but the following characteristics are common to a lot of definitions:

    • A CPS has tightly coupled computational and physical components that can reason and interact with their environment
    • A CPS is a conglomeration of software, embedded processing, and real-time sensing and actuation

    A Framework for Cyber-Physical Systems was released by the National Institute of Standards and Technology (NIST) Cyber-Physical Systems Public Working Group on May 26, 2016. The working group published a definition of a CPS as follows:

    “Cyber-Physical Systems or ‘smart’ systems are co-engineered interacting networks of physical and computational components. These systems will provide the foundation of our critical infrastructure, form the bases of emerging and future smart services, and improve our quality of life in many areas.”

    National Institute of Standards and Technology Cyber-Physical Systems Public Working Group, May 26, 2016

    diagram of NIST framework for Cyber Physical Systems

    Like any new technology, there is a temptation to use a label at every opportunity, and many systems have become re-badged as CPS.

    When we think about whether a system should be considered a CPS, we really need to think about the characteristics that distinguish a CPS from a system that merely uses sensors, does some processing, and is connected to a network.

    Depending on your perspective, there are some subtleties that help us identify a CPS. Some examples might be:

    Ubiquity – there is an expectation that a CPS facilitates ‘processing everywhere’, and that the processing is enhanced (or ‘distributed’) as a consequence of interconnectedness. As such there is a potential move from humans interacting with systems to enable the transfer of insight and inference between systems, towards fully distributed, autonomous systems that exchange pertinent knowledge on a need-to-know basis.

    Complexity – a CPS should enable new possibilities to be realised, and this should also account for unforeseen or un-planned emergent scenarios.

    Delegation and trust – we have already grown accustomed to using computers to automate boring, repetitive, or even dangerous tasks. Many manufacturing jobs that used to exist became extinct with the advent of technology. However, if we are to delegate responsibility to a CPS, we need to be able to trust that not only will it achieve the correct outcome (whatever ‘correct’ means), but that it will achieve the goal in a way that is acceptable to us. How we perceive a task, and how it should be completed, suggests that we shall probably expect a CPS to be modelled and designed in a more human-centric way than we have perhaps approached computer-controlled system design in the past. For instance, humans like to interact through speech and conversation, so a CPS may need to incorporate this capability into its interface. Similarly, a CPS may need to ‘see’ and recognise objects.

    We can now see that such systems are extremely complex, and the complexity is at a scale that cannot be comprehended easily without specific design tools to abstract the designers away from the minutiae. Key questions to consider are:

    • how do we plan to build a CPS?
    • how do we model an existing CPS?
    • how will we know that the system will behave as we expect it to?
    • how will we ensure it reacts in the way we want?

    To summarise, the following is a set of attributes that we might use to categorise a CPS:

    • there is a cyber capability (i.e. networking and computational capability) in every physical component;
    • the sub-systems are networked at multiple and extreme scales;
    • the system is complex at multiple temporal and spatial scales;
    • the components dynamically reorganise and reconfigure to meet existing and emerging goals;
    • control loops are closed at each spatial and temporal scale, and there may be a human in the loop;
    • their operation needs to be dependable and in certain cases certifiable as well;
    • the computation/information processing and physical processes are so tightly integrated that it is not possible to identify whether behavioural attributes are the result of computations (computer programs), physical laws, or both working together.

    As such, not every system controlled by a digital controller is a CPS; it may or may not be depending on the specifications and the design approach taken.