Understanding CDN Cache Statuses
Hello, tekkix! Today we will delve into one of the key aspects of CDN operation – cache statuses. If you have ever looked into server response headers or analyzed CDN logs, you have probably encountered mysterious abbreviations like HIT, MISS, or EXPIRED. Let's figure out what they mean and why they are so important for understanding CDN operation.
How caching works in CDN
Before we get to cache statuses, let's recall how caching works in a CDN in general.
A CDN (Content Delivery Network) is a distributed network of servers located in various geographical locations. Its main task is to speed up content delivery to end users by placing the content closer to them.
When a user requests a file through CDN, the following happens:
1. The request reaches the CDN server nearest to the user (called an edge server or simply an "edge").
2. The edge checks whether the requested file is in its local cache.
3. If the file is in the cache and is up to date, the edge delivers it to the user.
4. If the file is not in the cache or is outdated, the edge requests it from the origin server, saves it to its cache, and delivers it to the user.
It is at step 2 that various cache statuses arise, which we will talk about today.
Cache statuses: HIT, MISS, and EXPIRED
Now let's analyze the main cache statuses and what they mean:
- HIT
HIT is the good guy of the CDN world. When you see HIT, it means that the requested file was found in the CDN cache and delivered to the user directly from there. This is the fastest scenario because it does not require contacting the origin server.
Example:
X-Edge-Cache: HIT
- MISS
MISS means the file is not in the CDN cache. This can happen for several reasons:
  - The file is being requested for the first time.
  - The file was in the cache but was removed due to inactivity.
  - The cache was manually cleared.
In the case of MISS, the CDN requests the file from the origin server, saves it in the cache, and delivers it to the user. This is slower than HIT because the CDN has to contact the origin server. However, subsequent requests for the file will be served from the cache, which speeds them up.
Example:
X-Edge-Cache: MISS
- EXPIRED
EXPIRED means that the file is in the cache but is outdated according to the caching settings. In this case, the CDN requests the current version from the origin server, updates the cache, and delivers the file to the user.
Example:
X-Edge-Cache: EXPIRED
There are also other statuses, which usually appear in much smaller numbers: UPDATING, REVALIDATED, and STALE.
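To make the three main statuses concrete, here is a minimal sketch of an edge that keeps files in memory with a fixed lifetime. The class name ToyEdge, the ttl parameter, and the fetch_origin callback are illustrative assumptions; a real CDN edge is far more involved.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    body: bytes
    stored_at: float  # time the entry was saved, in seconds


class ToyEdge:
    """Toy edge cache; ttl and fetch_origin are illustrative assumptions."""

    def __init__(self, fetch_origin: Callable[[str], bytes], ttl: float) -> None:
        self.fetch_origin = fetch_origin
        self.ttl = ttl
        self.cache: Dict[str, CacheEntry] = {}

    def get(self, url: str, now: float) -> Tuple[str, bytes]:
        entry = self.cache.get(url)
        # Fresh copy in the cache: serve it without touching the origin.
        if entry is not None and now - entry.stored_at < self.ttl:
            return "HIT", entry.body
        # No copy at all is a MISS; an outdated copy is EXPIRED.
        status = "MISS" if entry is None else "EXPIRED"
        body = self.fetch_origin(url)
        self.cache[url] = CacheEntry(body, now)
        return status, body


edge = ToyEdge(fetch_origin=lambda url: b"body of " + url.encode(), ttl=60.0)
statuses = [edge.get("/style.css", now=t)[0] for t in (0.0, 10.0, 75.0, 80.0)]
print(statuses)  # ['MISS', 'HIT', 'EXPIRED', 'HIT']
```

The first request fills the cache, the second is served from it, the third arrives after the 60-second lifetime has passed, and the fourth hits the refreshed copy.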
Module for working with large files
When we talk about a CDN, it is important to remember that it does not always deal with "files" as whole objects. When a large file is requested (for example, 1 TB in size), the user does not have to wait for the entire file to be downloaded to the edge server. For such situations, the NGINX module ngx_http_slice_module is used: it splits the file into parts and allows them to be passed to the user as they are received.
Without this module, users could face long delays while waiting for the entire file to load. It is therefore more accurate to speak of requesting a file and/or a part of it, which better reflects how the CDN really works.
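The arithmetic behind splitting a file into parts can be sketched as follows. This is only an illustration of byte ranges, not the module's actual implementation; the helper names slice_ranges and range_header and the tiny 1000-byte slice size are assumptions made for readability.

```python
from typing import Iterator, Tuple


def slice_ranges(file_size: int, slice_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first_byte, last_byte) pairs that cover the file."""
    if file_size < 0 or slice_size <= 0:
        raise ValueError("file_size must be >= 0 and slice_size must be > 0")
    for start in range(0, file_size, slice_size):
        # The last slice may be shorter than slice_size.
        yield start, min(start + slice_size, file_size) - 1


def range_header(first: int, last: int) -> str:
    """Format a byte range the way the HTTP Range header expects it."""
    return f"bytes={first}-{last}"


headers = [range_header(a, b) for a, b in slice_ranges(2500, 1000)]
print(headers)  # ['bytes=0-999', 'bytes=1000-1999', 'bytes=2000-2499']
```

Each part can then be requested and cached on its own, so every part carries its own cache status.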
Status REVALIDATED
Another status worth mentioning is REVALIDATED. It appears when the cached copy has expired and the edge server asks the origin server whether the file has changed since it was cached. If the origin returns a 304 Not Modified response (that is, validators such as the ETag header still match), the file (or part of it) stays in the cache for a new period, and there is no need to download it from the origin server again.
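The revalidation exchange can be sketched roughly like this. The origin_response function is a hypothetical stand-in for the origin server, and labelling the non-304 branch as EXPIRED is an assumption that follows the description of EXPIRED earlier in the article.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CachedFile:
    body: bytes
    etag: str


def origin_response(
    current_body: bytes, current_etag: str, if_none_match: Optional[str]
) -> Tuple[int, bytes]:
    """Hypothetical origin: 304 with an empty body when the ETag matches."""
    if if_none_match == current_etag:
        return 304, b""
    return 200, current_body


def refresh(
    cached: CachedFile, current_body: bytes, current_etag: str
) -> Tuple[str, CachedFile]:
    """Revalidate an outdated cached file against the origin."""
    code, body = origin_response(current_body, current_etag, if_none_match=cached.etag)
    if code == 304:
        # Nothing changed: keep the cached body for a new period.
        return "REVALIDATED", cached
    # The file changed: replace the cached copy with the new version.
    return "EXPIRED", CachedFile(body, current_etag)


cached = CachedFile(b"v1", '"abc"')
print(refresh(cached, b"v1", '"abc"')[0])  # REVALIDATED
print(refresh(cached, b"v2", '"def"')[0])  # EXPIRED
```

The benefit of the 304 path is that only headers travel between the origin and the edge, not the file body.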
Status STALE
The STALE status indicates that an outdated file was served from the cache because the edge server could not reach the origin server. This is possible thanks to the proxy_cache_use_stale directive in NGINX, which allows cached content to be served when there are problems accessing the origin.
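As a rough sketch of that fallback: the function below handles a request whose cached copy is already outdated. The name serve_outdated and the use of OSError to model an unreachable origin are assumptions for illustration only.

```python
from typing import Callable, Tuple


def serve_outdated(cached_body: bytes, fetch_origin: Callable[[], bytes]) -> Tuple[str, bytes]:
    """Serve a request for which only an outdated cached copy exists."""
    try:
        fresh = fetch_origin()
    except OSError:
        # Origin is unreachable: fall back to the outdated copy.
        return "STALE", cached_body
    return "EXPIRED", fresh


def broken_origin() -> bytes:
    raise ConnectionError("origin unreachable")  # ConnectionError is an OSError


print(serve_outdated(b"old", broken_origin))    # ('STALE', b'old')
print(serve_outdated(b"old", lambda: b"new"))   # ('EXPIRED', b'new')
```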
Why can there be many MISS?
A common question we hear from clients is: "Why do I have so many MISS in the statistics?" A high percentage of MISS can indeed look like a problem, but this is not always the case. Here are some reasons why there can be many MISS:
- Low website traffic in certain regions. If an edge receives few requests, files may be removed from its cache due to inactivity, or may never appear on some edges at all because no one has requested them from there yet.
- Frequent content updates. For example, the website owner may regularly clear the cache manually or use file versioning. Versioning is a way to update files without changing their names, by adding a parameter to the URL. For example, the file style.css may get the URL parameter ?v=20240909, where 20240909 indicates the update date (or any set of characters, at the client's discretion, that differs from the previous version). This parameter makes the browser and the CDN treat the file as new even though its name (style.css) has not changed. As a result, the old version of the file is ignored and the new version is loaded from the origin server, which increases the number of MISS requests. The content is still cached, but the cache has to be filled again each time the version parameter changes.
- Incorrect caching settings on the origin server, for example a cache lifetime that is too short or headers that prohibit caching. A related case is content generated uniquely for each request: the response can be cached, but the cached copy is of no use to other users, so the next user's request still results in a MISS.
- Large variety of content. If you have millions of unique files, not all of them can fit in the cache at the same time.
We also want to note that if you cannot manage the cache-control header on your side (for example, for some reason the files on your server are sent with cache-control: no-cache) but still want the files cached in the CDN, you can check the "ignore cache-control and Expires headers" boxes in our personal account.
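The effect of versioning on the MISS count can be illustrated with a toy cache. The sketch assumes the cache key is the full URL including the query string, which is why a new ?v= value looks like a new file; the lookup helper is hypothetical.

```python
from typing import Dict


def lookup(cache: Dict[str, bytes], url: str) -> str:
    """Return the cache status for url, keyed on the full URL (assumption)."""
    if url in cache:
        return "HIT"
    cache[url] = b"<body fetched from origin>"  # placeholder body
    return "MISS"


cache: Dict[str, bytes] = {}
urls = [
    "/style.css?v=20240908",
    "/style.css?v=20240908",
    "/style.css?v=20240909",  # new version parameter, same file name
    "/style.css?v=20240909",
]
versioning_statuses = [lookup(cache, u) for u in urls]
print(versioning_statuses)  # ['MISS', 'HIT', 'MISS', 'HIT']
```

Every release therefore produces one MISS per file per edge, after which requests go back to HIT.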
How to check the cache status for a specific file?
Checking the cache status for a specific file is quite simple:
1. Open the file through the CDN address in a separate browser window.
2. Press F12 to open developer tools.
3. Go to the Network tab.
4. Refresh the page.
5. Find the desired file in the list of requests and click on it.
6. In the response headers, look for lines like:
X-Edge-Cache: HIT
X-Edge-Ip: 192.168.0.1
Here X-Edge-Cache shows the cache status, and X-Edge-Ip shows the address of the specific edge that processed the request.
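The same check can be scripted. Below is a minimal sketch using only the Python standard library; the function names are assumptions, and so is the idea that the CDN answers HEAD requests with the same X-Edge-Cache and X-Edge-Ip headers (other providers use different header names).

```python
import urllib.request
from typing import Mapping, Optional, Tuple


def edge_cache_info(headers: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (cache status, edge IP) from response headers, case-insensitively."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    return lowered.get("x-edge-cache"), lowered.get("x-edge-ip")


def fetch_edge_cache_info(url: str) -> Tuple[Optional[str], Optional[str]]:
    """Send a HEAD request to url and read the same two headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return edge_cache_info(dict(response.headers.items()))


print(edge_cache_info({"X-Edge-Cache": "HIT", "X-Edge-Ip": "192.168.0.1"}))
# ('HIT', '192.168.0.1')
```

Calling fetch_edge_cache_info with the CDN address of a file twice in a row would typically show the status change from MISS to HIT if the file was not yet cached on that edge.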
If the content on the site has changed, but the CDN still returns the old version of the file (for example, the style.css file has been changed, but the CDN returns the old version because it is in the cache with HIT), the site owner can clear the cache. This can be done for the entire project or specific files, through the personal account or via the API. After clearing the cache, the request for style.css will return MISS, and the new version of the file will be loaded from the origin server.
Where to view the cache status history?
If you need more detailed statistics on cache statuses, you can refer to the CDN logs. In cdnnow! this is done as follows:
1. Go to the control panel of your project.
2. Go to the "Logs" tab.
3. Select the period you are interested in and download the logs.
4. In the downloaded file, find the column upstream_cache_status. It contains the cache status for each request.
In addition, the Personal Account provides detailed statistics on the number and percentage distribution of cache statuses (HIT, MISS, EXPIRED, and others). This data can be viewed broken down by days and hours, allowing you to assess the effectiveness of caching at different times.
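If you prefer to compute the distribution yourself from a downloaded log, a short script is enough. The sketch below assumes a tab-separated file with a header row that includes the upstream_cache_status column; the actual log layout may differ, and the uri column and sample rows are made up for the example.

```python
import csv
import io
from collections import Counter
from typing import Dict, TextIO


def cache_status_shares(log_file: TextIO, delimiter: str = "\t") -> Dict[str, float]:
    """Return the percentage of requests per cache status."""
    reader = csv.DictReader(log_file, delimiter=delimiter)
    counts = Counter(row["upstream_cache_status"] for row in reader)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: 100.0 * n / total for status, n in counts.items()}


# Made-up sample in the assumed format.
sample = io.StringIO(
    "uri\tupstream_cache_status\n"
    "/style.css\tMISS\n"
    "/style.css\tHIT\n"
    "/style.css\tHIT\n"
    "/app.js\tEXPIRED\n"
)
print(cache_status_shares(sample))
# {'MISS': 25.0, 'HIT': 50.0, 'EXPIRED': 25.0}
```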
For convenience, you can also configure your browser to immediately see cache statuses.
In the developer tools on the Network tab, you can add the 'Domain' column to sort requests by CDN domain. To display the X-Edge-Ip and X-Edge-Cache headers, copy the name of the header you are interested in, right-click on the column headers, select 'Response Headers', and then 'Manage header columns'. After that, click 'Add custom header' and paste the header name (e.g., X-Edge-Cache or another depending on your CDN provider). By checking the boxes to display this data, you can immediately see which files are cached and which are not.
Conclusion
Understanding CDN cache statuses is key to optimizing your website's performance. HIT, MISS, and EXPIRED are not just mysterious abbreviations, but important indicators of how your content delivery system is working.
A high HIT rate usually means your CDN is working efficiently. But don't be alarmed if you see a lot of MISS – this is normal for certain usage scenarios. The main thing is to understand the reasons and be able to interpret this data in the context of your specific project.
What has been your experience with CDN caching? We would be interested to read about it in the comments. Thank you for reading!