40,000 tags and not a single password: how we saved the factory from shutdown
Imagine having to replace an airplane engine mid-flight. That's roughly the situation we found ourselves in when we took on the localization of production systems at a Russian factory: a black box instead of a control system, access locked behind unknown passwords, no documentation, and no way to stop production. In short, the complete standard kit of post-2022 pitfalls.
I'm Ivan Balashov. At the integrator K2Tech I lead the digital manufacturing practice, implementing digital solutions in industry. In this post I'll share how we pieced together fragments of information about system configurations, implemented a Russian SCADA in place of a Western one, and migrated MES functionality, all without stopping production.
If you ever find yourself facing the task of import substitution for critically important production systems, this story may provide insight into typical pitfalls and ways to navigate them.
The cost of downtime: build a new plant
This was a painfully familiar situation: a large industrial enterprise was part of a foreign holding, and all support for the IT infrastructure and management systems was provided by Western specialists. Then the plant found itself alone with its production and management systems, without support, maintenance, or the ability to upgrade.
Production efficiency was declining, defect rates were rising, critical errors were appearing, and there was no timely data for sound decision-making; all of this created serious risks both for production and for the business as a whole. Take the furnace, the heart of the production process. One wrong step and the hot raw material solidifies, with irreversible consequences: the furnace can't simply be switched back on and reheated. The cost of an operational error with the furnace is six months of repairs (essentially dismantling the furnace, removing a multi-ton block of solidified material, and reassembling it) and a complete production halt for that entire period, with everything that implies for shipments, contracts, and…
The plant operated in this precarious state for some time, until management decided to switch to domestic solutions and regain control.
Import substitution on a live plant
The plant's SCADA system was running on inertia (which is generally tolerable as long as it runs stably and nothing new needs to be extracted from it): no updates, no technical support and, worst of all, no access to its configuration (there were no administrator passwords). By the start of the project it was barely hanging on: any failure, even a hardware one (a new license key could not be generated), could prove fatal, with no way to restore the system. Things were not much better at the MES level: the data in the systems diverged from reality, and finding the cause and rebuilding (restoring) the configuration was impossible, since users had view-only access and no rights to change anything.
The ride promised to be an exciting one: we effectively had to implement SCADA and MES from scratch in a live, running production environment. On top of that, there were no working documents, process descriptions, diagrams, or system configurations. Our specialists had to reconstruct the logic of the systems' operation and recreate the configurations almost blindly.
Credit must be given to the plant employees: their contribution, knowledge, and desire to regain control of their production cannot be overstated. This was a case where the integrator and the plant staff worked as a single team focused on a common task.
An audit with prejudice
Before building the new management system, we needed to understand where we could afford to experiment without the risk of bringing production down, and where that was strictly off limits. We began with a thorough investigation of the existing SCADA and of the objects it controlled.
We created a map of the system, covering both the visible and the less visible components, and analyzed how everything was interconnected at the network level. Production was divided into workshops with different functions, each containing specialized equipment for specific tasks. All of this equipment had to be connected to a new local network and to the new, domestically developed SCADA. We walked through the plant from top to bottom, studying each unit, its capabilities from a control standpoint, and the ways it could be connected.
Architecturally the project did not look unusual, but it came with its own set of complicating conditions: implement new infrastructure that met regulatory requirements for segmentation and security, move to domestic software, and at the same time preserve connectivity with the existing control system, about which we knew almost nothing. And... do all of it on a live plant.
A separate challenge was the peculiarities of some of the equipment to which we needed to connect. There were countless driver options for Windows, but finding drivers for the Russian operating system was quite the quest.
Fault tolerance and secure access
For the architecture of the plant's network, we implemented two closed isolated segments with no internet access and no access to the customer's industrial control system, as well as firewalls to control and filter traffic between the segments.
The main task was to make the closed control segments observable. We set up the internal infrastructure so that we could see all of the traffic: not a single packet escaped our view. This was necessary for complete monitoring.
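As an illustration only, here is a minimal sketch of how such visibility can be spot-checked from a monitoring host attached to a mirror (SPAN) port. The interface name and the use of scapy are my assumptions; the article does not say what tooling was actually used.

```python
# Minimal traffic-visibility check on a mirror (SPAN) port.
# Assumptions: the monitoring host receives mirrored traffic on "eth1"
# and scapy is installed; neither detail comes from the article.
from collections import Counter
from scapy.all import sniff, IP

talkers = Counter()

def count_packet(pkt):
    # Count source/destination pairs so we can see who is actually talking.
    if IP in pkt:
        talkers[(pkt[IP].src, pkt[IP].dst)] += 1

# Capture for 30 seconds and report which hosts were observed.
sniff(iface="eth1", prn=count_packet, store=False, timeout=30)

for (src, dst), n in talkers.most_common(20):
    print(f"{src} -> {dst}: {n} packets")
```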
The second key point was creating a secure perimeter. The security of the closed segments was ensured through a single entry point, a firewall cluster, through which we allowed only verified traffic. Access through it was granted only to the monitoring system, which works over an encrypted tunnel from the customer's office, and to several workstations inside the plant. Essentially, we built an analog of a demilitarized zone with strict access control, which satisfied the regulator's requirements.
We divided access between the network segments with firewall rules. First we agreed the IP plan with the customer, then did a detailed analysis of which data from the technological network was actually needed in the corporate segment. Based on that, we configured every rule by hand: what to allow and what to block. Perhaps the main peculiarity of the network was the need for precise time synchronization. In production accounting, and in production itself, down-to-the-second synchrony is critically important. Getting all the equipment and virtual machines synchronized to the customer's NTP added quite a bit of complexity, since only trusted, verified traffic is allowed between the segments.
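To show what a sub-second sync check can look like in practice, here is a minimal sketch that asks an NTP server for the local clock offset and flags excessive drift. The server address, the threshold, and the ntplib dependency are illustrative assumptions, not details from the project.

```python
# Spot-check of local clock drift against the plant NTP server.
# Assumptions: the server address and the 0.5 s threshold are illustrative,
# and the ntplib package is available; neither is stated in the article.
import sys
import ntplib

NTP_SERVER = "10.0.0.1"   # hypothetical address of the customer's NTP server
MAX_OFFSET = 0.5          # seconds; per-second accounting needs sub-second drift

def main():
    client = ntplib.NTPClient()
    try:
        response = client.request(NTP_SERVER, version=3, timeout=5)
    except ntplib.NTPException as exc:
        print(f"NTP query failed: {exc}")
        sys.exit(2)

    offset = response.offset  # local clock minus server clock, in seconds
    ok = abs(offset) <= MAX_OFFSET
    print(f"offset = {offset:+.3f} s -> {'OK' if ok else 'DRIFT TOO LARGE'}")
    sys.exit(0 if ok else 1)

if __name__ == "__main__":
    main()
```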
When choosing hardware, we focused on the customer's vendor list, as it was a mandatory requirement. However, we tried to select only the products we had confidence in from the list.
We first assembled the server rack and deployed SCADA, allowing the commissioning engineers to immediately begin work on recreating configurations from factory systems and employee knowledge. Next, we deployed MES and plunged headfirst into describing and forming connections.
When designing the solution, we built fault tolerance into the entire core from the start: servers, core switches, and firewalls. The server cabinet with the central equipment was connected to two independent power sources, and switches with redundant power supplies were chosen for the access nodes. This scheme significantly reduces the risk of downtime caused by sudden failures. Fortunately, the customer already had a guaranteed building power supply in operation, so no additional UPS units were needed.
Engineering
When the work with the data was in full swing, we turned our attention to engineering systems and low-voltage systems. Thanks to thorough preparation and the absence of typical problems such as non-structural walls, this stage took only one and a half weeks. During this time, we completely updated the infrastructure: installed more than 10 cabinets, laid 8 kilometers of fiber optic cable, and installed over 20 switches.
The topology is an ordinary star, implemented in a fault-tolerant scheme: we ran two cable branches to each installed cabinet. For cable laying we used the customer's existing engineering infrastructure; building new routes would have taken too much time, which we were already short of.
What is always worth looking at separately: does the customer have the physical ability to create a "copy" of the system to later migrate to it?
A little from everyone: communicating with the plant staff
We could not copy the old SCADA, and it would not have helped anyway: there were no tag descriptions (which tag is responsible for what), and from the names alone we could identify at best about 10% of the total. So we had to build the configuration manually, signal by signal, reconstructing what each tag means. Naturally, we had to go deeper than simply recreating the signal tree: together with the plant engineers we identified all the old calculated parameters, added new ones, and restored (or, where necessary, changed) the algorithms for computing them. We had to separate the truly important indicators from the useless ones, restore the logic behind how they are formed, and add the ones that were missing.
Sometimes restoring the calculation logic for a single indicator stretched over two or three weeks. Parameters were not always computed by an obvious formula: you assume a parameter is calculated one way, check it against the actual data, it doesn't match... and you go looking for another approach.
We carefully compared the calculation results in the new and old SCADAs, and in some cases, we considered the task solved only when the values completely matched. In other cases, the task was considered solved when we heard from the plant workers: "Here! Finally, it is calculated the way we need it!"
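The record-by-record comparison itself can be as simple as the sketch below, which diffs a calculated parameter exported from the old and the new SCADA. The CSV layout, file names, and tolerance are my assumptions rather than the project's actual tooling.

```python
# Compare a calculated parameter exported from the old and new SCADA.
# Assumptions: both exports are CSV files with "timestamp" and "value"
# columns; file names and the tolerance are illustrative only.
import csv

TOLERANCE = 1e-3  # acceptable absolute difference between old and new values

def load_series(path):
    with open(path, newline="") as f:
        return {row["timestamp"]: float(row["value"]) for row in csv.DictReader(f)}

old = load_series("old_scada_export.csv")
new = load_series("new_scada_export.csv")

mismatches = []
for ts, old_value in old.items():
    new_value = new.get(ts)
    if new_value is None:
        mismatches.append((ts, old_value, None))          # missing in the new system
    elif abs(new_value - old_value) > TOLERANCE:
        mismatches.append((ts, old_value, new_value))      # values diverge

print(f"compared {len(old)} points, {len(mismatches)} mismatches")
for ts, old_value, new_value in mismatches[:10]:
    print(ts, old_value, new_value)
```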
Tags are transforming…
The customer defined the initial volume of tags to be recreated in the new system, and when we entered the project we took that figure as a given. But once we were on-site, could see the system live, and could talk to the plant staff, it turned out that the actual number of tags was five times higher than initially stated. As a result, working with tags became perhaps the longest, most complex, and most critical stage of the project. It took us about five months to collect all the necessary information on more than 40,000 tags and map it correctly into the new system. We also had to revise the project budget, since additional licenses were required.
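With tens of thousands of tags, a running coverage report is handy for tracking which tags already have a verified description and which are still unidentified. The sketch below assumes a simple CSV inventory with hypothetical column names; it is not taken from the project.

```python
# Coverage report for the tag inventory: how many tags already have a
# verified description and which ones are still unidentified.
# Assumptions: the inventory is a CSV with "tag", "description" and
# "verified" columns; the file name and layout are illustrative only.
import csv
from collections import Counter

def tag_coverage(path):
    total = 0
    status = Counter()
    unidentified = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            total += 1
            if not row["description"].strip():
                status["no description"] += 1
                unidentified.append(row["tag"])
            elif row["verified"].strip().lower() != "yes":
                status["awaiting verification"] += 1
            else:
                status["verified"] += 1
    return total, status, unidentified

total, status, unidentified = tag_coverage("tag_inventory.csv")
print(f"{total} tags total")
for name, count in status.items():
    print(f"  {name}: {count}")
print("first unidentified tags:", unidentified[:10])
```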
We were luckier with MES: the situation turned out to be more predictable, although there were still surprises. During implementation the customer realized that the content of the reports needed to be revised relative to the initial requirements, and while that realization was unfolding, the number of reports to implement (and the complexity of some of them) fluctuated from X to 4X… For what it's worth, it finally settled at 4X-2 :)
By the way, there was a funny story with the monitoring system. Initially it was not included in the technical specifications, but common sense and operational experience quickly made the harsh reality clear: without monitoring we were helpless. The solution was elegant but tricky to implement: we connected to the existing monitoring system at the customer's headquarters. The difficulty was that we could not access the customer's system directly ourselves, and all our actions and change requests had to be carried out by people who had no relation to the project.
“Patchwork” migration
The day of the transition arrived, and we began the phased switchover to the new systems, gritting our teeth… shoulder to shoulder with the plant employees. We took individual production areas into autonomous mode and immediately connected them to the new system. During this period, part of the plant ran on the old system while part was already on the new one. The full transition took a day and a half and, importantly, happened without a single production stop.
At the equipment implementation stage, we worked with the customer to set up an L3 interface through which we pulled data from the OPC server of the old SCADA into the new one. The same interface was used during the migration: we disconnected elements from the old network and connected them to the new one in batches, and once the migration was complete the interface was decommissioned. The whole process looked almost like a movie scene: the switch, the cable, and the new SCADA in operation. We swapped cables and held our breath, waiting to see whether the link would come up in the new system within the critical 10–15 seconds. If the link appeared, the switchover had succeeded. It sounds simple, but bear in mind that we were switching section after section across the whole plant at different times. It required perfect coordination: some 200-250 points on the migration plan, each of which had to be executed with a stopwatch. We succeeded :)
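For a sense of what a cutover spot-check through such a bridge could look like, here is a minimal read of a single value from an OPC server. The article does not say whether the old server spoke OPC DA or OPC UA, nor which client was used, so the asyncua library, endpoint URL, and node id below are purely hypothetical.

```python
# Minimal sketch of pulling one value from an OPC server, the way the
# temporary L3 bridge could be exercised during cutover checks.
# Assumptions: the server speaks OPC UA and the asyncua package is used;
# the endpoint URL and node id are hypothetical, not from the article.
import asyncio
from asyncua import Client

ENDPOINT = "opc.tcp://10.10.0.5:4840"         # hypothetical old-SCADA OPC endpoint
NODE_ID = "ns=2;s=Furnace.Zone1.Temperature"  # hypothetical tag

async def read_once():
    async with Client(url=ENDPOINT) as client:
        node = client.get_node(NODE_ID)
        value = await node.read_value()
        print(f"{NODE_ID} = {value}")

if __name__ == "__main__":
    asyncio.run(read_once())
```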
These skills came in handy more than once: the systems subsequently had to be switched back and forth several times. The reason is easy to explain: the closed nature of the legacy systems, with nuances that could only be discovered by trying things out. Details that had not come up during the audit and the interviews would surface: special control points or rare operational scenarios. We would promptly roll back to the old system, improve the new one, and switch over again. It is fairly obvious that if your system is operated and configured by someone else and you have no administrative access to it, you certainly do not know all of its important features and quirks.
Project outcomes: complete control, stability, and potential for development
As a result, the plant received what it wanted — a fully controllable domestic control system with a transparent infrastructure and reliable protection. The domestic SCADA and MES have been operating seamlessly at the customer's site for almost a year now. Specialists receive all the necessary data, and the new reporting system allows for more accurate assessments of output and plans for increasing efficiency.
If the company's management decides to expand production capacity, scaling the control systems will not be a problem: it is now the plant's own clear and transparent system. The architecture is designed for scalability at both the physical and logical levels.
Although K2Tech regularly works on such tasks, projects like this are rare. When a single team has to carry the whole stack at once (MES, SCADA, engineering systems, the data transmission network) while replacing an existing, closed system, it is always a challenge.
So I would like to address companies planning a similar migration: don't put off the decision. Control systems degrade just as equipment wears out, and production management is a critically important element of any enterprise. The longer you delay the transition, the more serious the risks become and the harder the migration gets. And if you genuinely care about honest production figures and a real assessment of efficiency, that will be very hard to achieve without your own, fully managed system.
