Take control. How we took control of the factory from foreigners

In the spring of 2022, our team faced an unconventional task: we had to transfer the entire IT infrastructure of a large manufacturing enterprise from foreign management to Russian jurisdiction within two weeks. The continuity of the factory's operations was at stake: from factory machines to accounting and document management systems.

My name is Konstantin Kim, I am a network technology expert at the IT integrator K2Tech. In this article, I will explain how we, in the midst of chaos and rising panic, transferred all critically important systems of the client to Russia and took control of all the equipment. You will learn about the technical aspects of migrating Active Directory, restoring access to Cisco and Aruba network equipment, as well as about psychological support building trustful relationships with the client in a crisis situation.

We prepared for various scenarios of events — from ideal to worst-case. How everything turned out in practice — read below.

Situation — SOS

In 2022, against the backdrop of well-known events, foreign management decided to disconnect a very large Russian factory from the IT infrastructure. They informed employees about the cessation of operations in the Russian Federation, but for some reason forgot to explain how the transfer of control would take place.

Needless to say, such uncertainty caused serious concern among the Russian management of the factory — the enterprise could suddenly lose complete control over its own infrastructure. An urgent solution was required.

And so, two weeks before the critical hour, the management of the Russian factory turned to us for help. We had to ensure the continuity of production after the disconnection of foreign systems. Work began immediately.

Assessing the situation and making a plan

The project was implemented without direct interaction with foreign colleagues — we only worked with the Russian management and technical service of the client. The initial audit was conducted remotely through interviews and video conferences.

It turned out that the IT infrastructure and corporate systems of the enterprise were located on three sites: in the Microsoft cloud, in offices near production in Russia, and on servers in Europe. The local network operated on SD Access technology managed by controllers in Europe.

Critically important production systems were physically located in Russia but were completely dependent on the servers of the parent company. After the VPN channels were disconnected by the head office, the Russian enterprise would lose access to all systems—from production accounting to process management.

The Russian technical service of the customer performed basic operations under the management of the European head company: maintained the physical infrastructure, installed equipment, performed switching, and provided remote access. All further configurations and connections were carried out remotely by European specialists.

The technical specialists of the customer had an in-depth knowledge of the physical infrastructure: the location, connection, and composition of the equipment. This data became a reliable starting point for our work.

How the European colleagues would behave was still unclear, so we considered two scenarios for the development of events:

  1. In a favorable scenario, the European side would provide us with all access details to the equipment. We would conduct an audit of the infrastructure, check the functionality of the systems, and develop instructions for the local technical service on management and maintenance.

  1. In an unfavorable scenario, access would not be granted. We would have to restore the equipment configuration without initial data. The infrastructure includes solutions from different manufacturers, which creates additional risks: there could be a loss of licenses and disconnection of critical functions.Accordingly, all these risks need to be taken into account.

At the initial stage of the project, the customer’s management was very concerned about the situation. The main source of stress for them was not even the second scenario, but simply—uncertainty.

Therefore, an important part of the work became the informational support for the plant management. We presented a detailed strategy with well-developed action scenarios for various situations. We literally drew an action algorithm, like in a computer science textbook. This was very helpful.


Take control of our factory: reclaiming sovereignty and empowering local workers.

Getting to Work

The client's infrastructure included three key locations: a production complex in one of the regions, a finished goods warehouse in the Moscow region, and a central office in Moscow. All sites were connected by corporate data transmission channels to the head office in Europe. The cessation of interaction with the European side meant the severing of these connections. It was necessary to establish communication channels between the Russian sites directly.

A separate task was to restore the user authentication system. The existing scheme used domain authorization with a pair of login-password and digital certificates. To maintain its operability, a new local domain server was needed. Another important task was the restart of the corporate DNS server.

Considering the numerous critical dependencies, it was decided not to wait for disconnection from the head office. We began migrating key services to the Russian cloud. Within a week, we implemented most of the critically important infrastructure services on virtual machines and performed their initial setup. The following were restored or recreated:

  • domain structure DC-AD;

  • related infrastructure services;

  • printing services (print server);

  • document management system;

  • RDS terminal servers.

As well as database and application servers for 1C, on which the specialized contractor carries out the development and recreation of ERP + MES systems on platform solutions from 1C.

Is Everything Under Control?

Successes in deploying the infrastructure created a sense of control over the situation. The situation paradoxically changed: now we had to not calm the client down, but motivate them to take a more active part in the project.

This is certainly better than panic, but it was still not the time to relax. It is one thing to create a copy of the Active Directory servers, and another to work with the network infrastructure that was still under the control of European administrators.

We intensified the coordination efforts of all project participants. At the same time, the collection of technical information, clarification of details, and development of scenarios for restoring access to network equipment continued.

The client's infrastructure included Cisco and Aruba equipment equipped with a built-in Password Recovery service. In the worst case, we could use this mechanism to regain access to the devices. However, this method could lead to partial loss of configurations, so we considered it only as a backup option.

But the main task became gaining access to the core of the plant's local area network. The scale of the production area and the variety of network services made configuration recovery critically complex. The technical service had only a basic understanding of the IP addressing scheme, without a detailed logical topology of the network. Recovering such infrastructure blindly could take weeks.

Restoring access to the equipment

The European side provided partial access to the network equipment, but without the possibility of feedback and clarification of details. As a result, we received a third — intermediate scenario between positive and negative developments.

The obtained credentials allowed us to regain control of part of the infrastructure, but we were not given the passwords for the border routers.

We had to use that very Service Password Recovery. This is a standard recovery mechanism that most network equipment manufacturers have.

This reset allows the router to start without the current configuration and authentication parameters. The administrator gains the ability to work with the device's file system and, if they know where to look, extract the previous configuration from memory and use it for reconfiguration.

Thus, within three days after the shutdown of foreign servers, we:

  • gained control over the local network and Wi-Fi infrastructure;

  • restored access to border routers;

  • configured a secure network connection to the new Russian cloud infrastructure;

  • practiced the procedure for remote access to the core network through border routers for remote administration.

How everything functions now

As a result of the project, corporate resources have become available to all categories of users and contractors of the company. The migration process was transparent for end users, causing no difficulties in their work. Although this is still an interim state, localization efforts are ongoing.

The client's network infrastructure was initially designed with high requirements for fault tolerance and security. As a result, we did not need to implement additional security mechanisms during the restoration of access — efforts were focused on ensuring independence from foreign administrators. Consequently, the management of the network infrastructure has been transferred to local technical specialists.

Currently, the architecture of corporate systems is implemented in a hybrid model: the main data center is located at the client's regional site, while part of the services is hosted in the cloud infrastructure.

This solution ensures availability and fault tolerance, which are required for such operations and events:

  • Duplication of infrastructure systems across sites, allowing for sequential maintenance.

  • Geo-distributed data center, reserving against the loss of communication channels or the entire site.

  • Mitigation of virtualization platform failure and loss of authorization capabilities by related services through the existence of a hardware-independent data center.

  • Guaranteed local availability of authorization and authentication means for systems and users within the local site relative to them.

Summing up

In conclusion, I would like to highlight a few key points from our experience:

The migration of a large enterprise's infrastructure is possible even within extremely tight deadlines with proper organization of the process. Critical success factors include a competent team, clear role distribution, and the client's motivation to cooperate.

Practice has shown the effectiveness of a hybrid approach to building infrastructure. The combination of local resources with cloud solutions provides the necessary fault tolerance and allows for the rapid deployment of backup resources.

Dependence on a single vendor or cloud provider creates serious risks for the business. Therefore, when designing corporate infrastructure, it is advisable to think in advance about scenarios for autonomous operation and the possibility of quickly migrating critical services. We recommend having up-to-date documentation and established procedures for restoring access to equipment.

We hope our experience will help colleagues better prepare for similar situations.

Comments

    Relevant news on the topic "Network"

    Also read