Migrating a CDN from nginx to Angie: the RUTUBE case

Hello everyone! We have long planned to share a few real-world use cases for our open-source web server, Angie. Today we will look at how Angie is used in the infrastructure of a project as large as RUTUBE.

Together with the RUTUBE team, we went through the entire process, from goal setting to results. On the RUTUBE side, Dmitry Ivanov (@abvgd), head of the infrastructure operations department, helped us put the article together.

RUTUBE:

RUTUBE runs the nginx web server across a large number of services. However, we ran into two problems with nginx:

  1. A zoo of third-party modules. Open-source nginx does not provide all the functionality we need, so we additionally rely on kaltura for ds-origin, rtmp, the geoip2 module, brotli on frontend/innerend/cdn, vts on frontend/innerend/cdn, and so on. As a result, we risk ending up in a situation where some modules are incompatible with others.

  2. HTTP 4xx and 5xx errors were monitored in a very roundabout way: nginx → access_log → Rsyslog → Kafka → ClickHouse → metric. Some information was lost along the way; for example, it was hard to tell in which virtual host or location a problem had occurred. We wanted a different approach: collecting HTTP 2xx/4xx/5xx statistics per service, i.e. per server/location.

As a result, the RUTUBE team decided to test the hypothesis that Angie can help with these difficulties. It worked!

For reference, RUTUBE today means hundreds of millions of uploaded and distributed videos, 4,000 servers, 200 CDN servers, a presence in 25 cities, and 7 Tbps of traffic.

Migration from nginx to Angie

Angie Software:

RUTUBE uses Angie as an in-place replacement for nginx, so all files, configs, and paths remain where they were. Logs are still written to /var/log/nginx (if that path is defined in the config), the config itself is located at /etc/nginx/nginx.conf, and so on. The Angie service is also available under the name nginx, so to reload it, for example, you can run systemctl reload nginx. The command systemctl reload angie works as well. We described the general process of a seamless migration from nginx to Angie in our instructions.

RUTUBE:

In our infrastructure, a lot is built around nginx. Suffice it to say that our nginx configs are so large that they live in separate repositories with their own delivery pipelines. To avoid redoing everything everywhere, we chose to replace one web server with the other in place. To do this, we adapted the nginx configs for Angie by adding an nginx_angie switch and guarding constructs such as these with it:

error_log /var/log/nginx/error.log crit;
pid /var/run/nginx.pid;
{% if nginx_angie %}
load_module modules/ngx_http_brotli_static_module.so;
load_module modules/ngx_http_brotli_filter_module.so;
{% endif %}
events {

or

server_name example.com;
{% if nginx_angie %}
status_zone example_com;
{% endif %}

The transition itself was smooth. Now the entire RUTUBE CDN runs on Angie and can easily distribute 7 Tbps from almost 200 powerful servers.

Angie Software:

Our colleagues approached the migration wisely and professionally, not only laying safety nets where necessary but also using Angie's built-in features to their advantage.

The Angie team builds the modules itself, so users do not need to bother with this. Before switching from nginx to Angie, colleagues from RUTUBE took an inventory of all the modules in use and checked that they were available for Angie. The only thing missing was legacy GeoIP v1; only GeoIP v2 was available. This was solved very simply: they finally migrated to GeoIP v2 everywhere GeoIP v1 had been used.

As a result, the RUTUBE team, with the help of Angie, solved both problems: they dealt with the module zoo and monitoring.

Monitoring that could

Angie Software:

Monitoring with Angie is even easier. In the case of RUTUBE, a significant advantage of Angie over nginx turned out to be the availability of various metrics that can be viewed in the Angie Console Light web application or exported to Zabbix or Prometheus. For example, with nginx, RUTUBE colleagues could not monitor statistics per server/location. With Angie, they simply defined separate zones and now monitor each of them without any trouble.

By the way, we wrote a huge article on tekkix about Angie's monitoring capabilities (using Console Light and more).
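
As a rough illustration of what per-zone statistics make possible, the sketch below rolls per-status-code counters up into 2xx/4xx/5xx classes for each zone. It assumes the zone data has already been fetched from the Angie API as JSON and that each zone carries a `responses` object keyed by status code; the sample payload, the zone names, and the `summarize_zones` helper are hypothetical.

```python
from collections import Counter


def summarize_zones(zones):
    """Roll up per-code response counters into status classes per zone.

    `zones` is assumed to map a zone name to an object with a `responses`
    dict keyed by HTTP status code, e.g. {"200": 10, "404": 3}.
    """
    summary = {}
    for name, stats in zones.items():
        classes = Counter()
        for code, count in stats.get("responses", {}).items():
            classes[f"{str(code)[0]}xx"] += count  # "404" -> "4xx"
        summary[name] = dict(classes)
    return summary


# Hypothetical sample shaped like a per-zone statistics payload.
sample = {
    "example_com": {"responses": {"200": 950, "404": 30, "502": 20}},
    "static_example_com": {"responses": {"200": 400, "304": 100}},
}

for zone, classes in summarize_zones(sample).items():
    print(zone, classes)
```

Because the counters are cumulative, a monitoring system would normally store successive snapshots and alert on the rate of change rather than on the absolute values.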

RUTUBE:

Wherever Angie is installed, the Angie Console Light web application is installed as well. It turned out to be very useful. Console Light is a lightweight real-time activity monitoring interface that displays key server load and performance indicators. The console is built on the Angie API; activity monitoring data is generated in real time, unlike other monitoring solutions that rely on logs.

Angie's monitoring capabilities helped us:

  • find locations/virtual hosts that have had no traffic for a long time through API metrics;

  • find virtual hosts with parasitic activity (Sic!) through API metrics;

  • find providers that blocked our BGP announcements through API metrics;

  • obtain caching efficiency metrics; before the switch to Angie we only had indirect ones (such as disk reads and writes). Since then, we have significantly improved caching efficiency.
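
Two of the checks in the list above can be sketched in a few lines. The first compares two snapshots of per-zone request counters and reports zones whose counter did not move; the second derives a hit ratio from a cache's hit and miss counters. The field names (`requests.total`, `hit.responses`, `miss.responses`) are assumptions about the API's JSON layout, and the helper names and sample numbers are hypothetical.

```python
def idle_zones(previous, current):
    """Return zones whose total request counter did not change between snapshots."""
    idle = []
    for name, stats in current.items():
        before = previous.get(name, {}).get("requests", {}).get("total", 0)
        after = stats.get("requests", {}).get("total", 0)
        if after == before:
            idle.append(name)
    return sorted(idle)


def cache_hit_ratio(cache):
    """Share of hits among hits and misses, or None if there was no traffic.

    Simplified on purpose: other cache states (stale, expired, bypass, ...)
    are ignored here.
    """
    hits = cache.get("hit", {}).get("responses", 0)
    misses = cache.get("miss", {}).get("responses", 0)
    total = hits + misses
    return hits / total if total else None


# Hypothetical snapshots taken some time apart.
prev = {
    "example_com": {"requests": {"total": 1000}},
    "old_example_com": {"requests": {"total": 42}},
}
curr = {
    "example_com": {"requests": {"total": 1800}},
    "old_example_com": {"requests": {"total": 42}},
}
cache = {"hit": {"responses": 900}, "miss": {"responses": 100}}

print(idle_zones(prev, curr))
print(cache_hit_ratio(cache))
```

How long a zone has to stay unchanged before it counts as unused is a policy decision; the interval between snapshots is left to the caller.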

Moreover, with the help of Angie, we found a potential problem with the network filtering equipment that sits between the web server and the upstreams. It looked very festive, like New Year's lights: in the Angie console, all (or most) of the upstreams turned red at the same time and then immediately green again.

Angie Software:

We especially liked how colleagues from RUTUBE created an Angie by Zabbix agent template in Zabbix to replace the Nginx by Zabbix agent template. The template discovers HTTP zones and location zones through the API module. Note that location zones should be named without a hyphen ("minus" sign); otherwise low-level discovery (LLD) in Zabbix will not detect them (alternatively, the part containing hyphens should be quoted).
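
One way to avoid the hyphen problem is to derive zone names from host names mechanically. The helper below is a hypothetical sketch, not part of Angie or Zabbix: it replaces every character other than ASCII letters, digits, and underscores with an underscore, which yields names in the style of the example_com zone shown earlier.

```python
import re


def zone_name(host):
    """Turn a host name into a zone name free of hyphens and dots."""
    return re.sub(r"[^A-Za-z0-9_]", "_", host)


print(zone_name("example.com"))
print(zone_name("my-cdn.example.com"))
```

Note that this mapping is lossy: hosts such as a-b.example.com and a.b.example.com would produce the same zone name, so a real setup would need to check for collisions.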

Moreover, over time, colleagues from RUTUBE developed a replacement for this template based on the Zabbix HTTP agent: Angie by HTTP. Its advantage is that metrics are collected over HTTP, which means compression can be used. This matters because the metrics take up a significant amount of space when there are many virtual hosts and/or upstreams. In short, well done.
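
To get a feel for why compression matters here, the sketch below builds a synthetic metrics payload for a couple of thousand zones and compares its raw and gzip-compressed sizes. The payload shape and zone names are made up for illustration; real Angie API output and real compression ratios will differ.

```python
import gzip
import json


def build_payload(zone_count):
    """Build a synthetic per-zone metrics document (hypothetical shape)."""
    return {
        f"zone_{i}": {
            "requests": {"total": i * 17, "processing": i % 3},
            "responses": {"200": i * 15, "404": i, "502": i % 7},
        }
        for i in range(zone_count)
    }


raw = json.dumps(build_payload(2000)).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Metrics JSON is highly repetitive (the same keys recur for every zone), which is why it tends to compress well. Over HTTP, a client would typically ask for this by sending an Accept-Encoding: gzip header, provided the server is configured to compress these responses.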

One more thing worth noting: colleagues from RUTUBE repeatedly contacted us with suggestions for new Angie features. For example, they pointed out that the resolver metrics made it impossible to tell how many requests actually went out to the network. As a result, we added a separate cached metric that shows the number of responses served specifically from the resolver's cache.

Whenever possible, we try to integrate all reasonable and useful suggestions from the community. The experience of RUTUBE and the feedback from the company's engineers are very valuable to us. Be like the RUTUBE engineers: try Angie and share the results with others.
