Software Supply Chain Security. Building processes with OSS

21:39
22.10.2024
Venut

Hello, tekkix! This post covers one way to use Open Source tools for Software Supply Chain Security. Colleagues in the field asked us to post a short overview here :)

Hello, tekkix! I am Mikhail Chereshnev, a DevSecOps specialist at Swordfish Security. Today we will talk about the importance of protecting the software supply chain — a critical aspect for any modern company. In this article, I will explain how we at Swordfish Security implement the Software Supply Chain Security (SSCS) strategy and show how Open Source solutions can help with this task.

This article is a brief text version of my presentation at the PHD2 conference. If you are interested in the topic, you can watch the full presentation via the links: [RuTube] / [YouTube].

What is a "Supply Chain"?

First, let's understand what a supply chain is in everyday life. Imagine you order a pizza through a mobile app. This chain involves different links: the retailer (pizzeria), the distributor (who delivers semi-finished products), and the ingredient manufacturer (farm or factory). However, in the world of DevSecOps, two new elements need to be added to this chain — the attacker and the consequences of their possible attack.

Now imagine that an attacker gains access to one of the links in this chain. For example, if they compromise the distributor, every pizzeria it supplies will be at risk. If the attacker infiltrates the factory where the ingredients are produced, the contamination can spread even further and affect the entire chain. Finally, if the "farm" (raw material supplier) is unreliable, for example because it uses bad seeds, the final product will suffer.

Now let's translate this example to the context of software development. In the software supply chain, key stages can be identified, such as:

Code and dependencies;
CI/CD systems (Continuous Integration / Continuous Delivery);
Artifact registries (where packages, containers, and other dependencies are stored);
Execution platforms (where the application is deployed and run).

Each of these links is susceptible to attacks. If an attacker gains access to at least one element, it can lead to serious consequences for the entire system. Therefore, protecting all stages of the supply chain is a top priority for any company striving to ensure the security of its applications and data.

Main threats in the software supply chain

Several key threats may arise in the process of ensuring the security of the software supply chain. Here are some of them:

Undeclared capabilities
These are hidden functions or capabilities that unexpectedly appear in the source code uploaded to public repositories, such as on GitHub. Such changes can be introduced without the developers' knowledge and serve to attack the system.
Disclosed organizational secrets
One of the common mistakes is the accidental publication of sensitive information in open repositories. These can be passwords, API keys, access tokens, or other critically important data that attackers can use to infiltrate the company's systems.
Vulnerabilities in artifacts
Problems can arise at the level of artifact management, such as packages, containers, or libraries. Insufficient verification of uploaded code, lack of information about who built it, and how it was modified during delivery create security risks.
Vulnerabilities in CI/CD systems
This aspect deserves special attention. CI/CD systems, which automate the building and deployment of applications, are still insufficiently studied from a security perspective, especially in the Russian segment. Until quality materials and methods for protecting CI/CD appear, this important aspect will remain at risk. Attackers can use vulnerabilities in CI/CD chains to infiltrate final systems.
Vulnerabilities of registries and execution platforms
Container registries, continuous integration systems, as well as execution platforms (e.g., Kubernetes) can also be subject to attacks.
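
As an illustration of the "disclosed organizational secrets" threat above, here is a minimal sketch of a line-based secret scanner. The three patterns are illustrative assumptions, not a complete rule set, and `find_secrets` is a hypothetical helper name. Real secret scanning tools use much larger, tuned rule sets and entropy checks.

```python
import re

# Illustrative patterns only (assumptions); real scanners use far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "hardcoded_password": re.compile(
        r"(?i)\b(?:password|passwd|secret|token)\s*[=:]\s*['\"][^'\"]{6,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for every line that matches a rule."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

A check like this is typically run before code reaches a public repository, for example in a pre-commit hook or an early CI/CD stage.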

Despite the existence of numerous recommendations, regulatory documents, and frameworks, many of them lack practical value for engineers. Therefore, it is important not only to follow regulatory requirements but also to develop technical solutions to enhance security at every stage.

Standards and Frameworks for Software Supply Chain Security (SSCS)

Many standards and frameworks have been developed to protect the software supply chain. Here are some of the most significant:

CIS (Center for Internet Security)
- CIS Software Supply Chain Guide: a guide to protecting the software supply chain.
- CIS Software Supply Chain Benchmarks: a set of benchmarks for assessing the security of the supply chain.
S2C2F (Secure Supply Chain Consumption Framework)
Includes practical requirements and tools that help organizations safely implement and consume open-source solutions.
SCITT (Supply Chain Integrity, Transparency and Trust)
An initiative aimed at ensuring transparency and trust in the software supply chain. SCITT sets standards regarding the authenticity of entities, evidence, policies, and artifacts in the supply chain.
OSC&R (Open Software Supply Chain Attack Reference)
- Attack Matrix for Software Supply Chain Security: an attack matrix for the software supply chain, allowing the systematization of potential threats and attacks.
- PBOM (Pipeline Bill of Materials): a specification that allows tracking all stages of software development and assembly to ensure security.

Now let's focus on three key frameworks that can help improve supply chain security right now:

SLSA (Supply-chain Levels for Software Artifacts)
This is a framework that offers a checklist of standards and activities to prevent supply chain interference, enhance integrity, and ensure the security of packages and infrastructure. More about how to use SLSA can be found here.
TUF (The Update Framework)
TUF provides a flexible structure and specification for protecting software update processes. This framework allows developers to integrate it into any update system to ensure package security and prevent update attacks. More about TUF can be read here.
in-toto
In-toto is a framework for ensuring the integrity of the software supply chain. It offers an extensible metadata standard for documenting all steps in the development process, thereby ensuring that each step is performed correctly and securely. Learn more about in-toto here.

What practices should be implemented to protect the supply chain?

Now let's move on to the practical part. To ensure the security of the software supply chain (SSCS), it is necessary to convince all participants in the process, whether people or systems, that the process can be trusted end to end. Broadly, it includes three key stages:

Collection of Evidence
We need a system that will collect and store all data confirming safety at each stage of the supply chain.
Publication and Storage of Evidence
The evidence must be published and stored for further verification. This is an important step as it ensures the transparency of the process.
Validation of Evidence
At the final stage, all collected data is checked to confirm its authenticity and reliability.

Process Evolution until 2022

Previously, the process included several steps such as:

Provenance — an artifact that records environmental changes during the build and indicates which product was created.
Attestation — a signed statement confirming that the artifact has passed all necessary checks.
Signature — confirmation of the origin of the artifact.
SBOM (Software Bill of Materials) — a list of all software components and their dependencies.

Simplified Scheme

Today, this process can be simplified to two key elements: signature and attestation. Within the in-toto framework, attestation can include any statement collected during the development process. For example, various DevOps and DevSecOps tools (such as CI/CD, secret scanning systems, DAST, SAST, etc.) can generate reports, and these reports can be included in the attestation as evidence.

Types of such statements or predicates in SLSA and in-toto may include:
- .cyclonedx
- .spdx
- .provenance
- .release
- .test-result
- .vuln
- .link
- .vsa
- .scai
- .runtime-trace

You can develop your own specification if you have an information security tool that makes decisions or generates any data. For example: https://swordfishsecurity.com/attestation/assessment-result/v0.1. You can read the full description of the in-toto specification here.
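
To make the idea of a custom predicate concrete, here is a minimal sketch of an in-toto Statement built as a plain Python dictionary. The top-level fields (`_type`, `subject`, `predicateType`, `predicate`) follow the in-toto Statement v1 layout, and the predicate type URI is the one from the example above. The artifact name, the artifact bytes, and every field inside `predicate` are invented for illustration; the real schema of that predicate is not shown in this article.

```python
import hashlib
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
# Custom predicate type URI taken from the article's example.
PREDICATE_TYPE = "https://swordfishsecurity.com/attestation/assessment-result/v0.1"


def make_statement(artifact_name, artifact_bytes, predicate_type, predicate):
    """Build an in-toto Statement that binds a predicate to one artifact digest."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{"name": artifact_name, "digest": {"sha256": digest}}],
        "predicateType": predicate_type,
        "predicate": predicate,
    }


# Hypothetical predicate body: these field names are assumptions for illustration.
statement = make_statement(
    "registry.example.com/app:1.0",
    b"example image contents",
    PREDICATE_TYPE,
    {"scanner": "example-sast", "verdict": "pass", "findings": 0},
)
print(json.dumps(statement, indent=2))
```

Reusing the standard envelope and changing only `predicateType` and `predicate` means existing in-toto tooling can still parse and route the statement, even if it does not understand the custom payload.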

How to make a correct statement?

To create a correct statement, you need to add information about the subject. In the in-toto specification, the subject is the artifact the statement describes, identified by its name and digest. If a signature confirming authorship is added to this statement, we get an attestation that meets the requirements of SLSA. The attestation ensures that the artifact has been verified and that the results belong to the verifying party.
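
The step from statement to attestation can be sketched as wrapping the statement in a signed envelope. The layout below follows the DSSE envelope used by in-toto (`payloadType`, `payload`, `signatures`) and its pre-authentication encoding. One loud assumption: HMAC-SHA256 with a shared key stands in for a real asymmetric signature, only so that the sketch runs on the standard library. The function names `sign_statement` and `verify_envelope` are hypothetical.

```python
import base64
import hashlib
import hmac
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type, payload):
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def sign_statement(statement, key, keyid):
    """Wrap a statement dict in a DSSE-style envelope."""
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    # ASSUMPTION: HMAC-SHA256 is a stand-in for a real asymmetric signature.
    sig = hmac.new(key, pae(PAYLOAD_TYPE, payload), hashlib.sha256).digest()
    return {
        "payloadType": PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode("ascii")}],
    }


def verify_envelope(envelope, key):
    """Return the decoded statement if any signature verifies, else raise ValueError."""
    payload = base64.b64decode(envelope["payload"])
    expected = hmac.new(key, pae(envelope["payloadType"], payload), hashlib.sha256).digest()
    for signature in envelope["signatures"]:
        if hmac.compare_digest(base64.b64decode(signature["sig"]), expected):
            return json.loads(payload)
    raise ValueError("no valid signature on envelope")
```

In practice the signing is done by a tool such as Cosign with a private key or a keyless identity, so that consumers can verify the attestation without sharing a secret with the producer.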

Technologies for protecting the software supply chain (SSCS)

We have already discussed the stages of generating and collecting evidence. Now let's move on to storing this data, which is an equally important part of securing the supply chain. The main questions to consider:

Where to store SSCS metadata?
How to safely distribute this data?
What metadata do our users need?
How can a consumer assess the safety of an artifact?
How to protect data from hacking?
Are existing SSCS data suitable for all types of artifacts?

These questions help develop requirements for a metadata repository: it must be immutable, protected from hacking, and independent of the artifacts themselves.

Data storage in SSCS: what to use?

There are already well-described structures for storing such data, many of which are based on the concept of a Merkle tree. This is a data structure that splits data into fragments, hashes each fragment, and then combines those hashes pairwise up to a single root hash. Changing any fragment changes the root hash, which is what ensures data integrity. It can be imagined as a lake, at the bottom of which each stone is a block of information.
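
The root hash computation described above can be sketched in a few lines. This version follows the hashing scheme of RFC 6962 (a 0x00 prefix for leaves, a 0x01 prefix for inner nodes, and a split at the largest power of two below the leaf count). Treat it as a sketch for understanding, not as the code of any particular transparency log; `merkle_root` is a hypothetical function name.

```python
import hashlib


def _sha256(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Merkle tree hash over a list of byte strings (RFC 6962 style)."""
    n = len(leaves)
    if n == 0:
        return _sha256(b"")
    if n == 1:
        return _sha256(b"\x00" + leaves[0])  # 0x00 prefix marks a leaf
    k = 1
    while k * 2 < n:  # k becomes the largest power of two strictly less than n
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    return _sha256(b"\x01" + left + right)  # 0x01 prefix marks an inner node


evidence = [b"sbom.json", b"provenance.json", b"sast-report.json"]
print(merkle_root(evidence).hex())
```

Because the root depends on every leaf, a consumer who trusts one published root hash can detect any later change to any stored piece of evidence.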

Among the interesting repositories for supply chain data are:

TLOG (Transparency Log Server)
A transparency log that records and stores supply chain metadata. An example implementation is Rekor.
OCI Registry/CAS (Content Addressable Storage)
A content-addressable storage that allows storing both artifacts and related evidence.
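
To show what "content-addressable" means here, the following in-memory sketch stores each blob under the SHA-256 digest of its own bytes and lets evidence be attached to an artifact by digest. The class and method names are hypothetical. The `attach` and `referrers` methods are only a loose analogy to how OCI registries link evidence to an artifact, not the registry API itself.

```python
import hashlib


class ContentAddressableStore:
    """In-memory store where each blob's key is the SHA-256 digest of its bytes."""

    def __init__(self):
        self._blobs = {}
        self._referrers = {}

    def put(self, data):
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest):
        data = self._blobs[digest]
        # Re-hash on read so silent corruption or tampering is detected.
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its digest")
        return data

    def attach(self, subject_digest, evidence):
        """Store evidence and record that it refers to an existing artifact."""
        if subject_digest not in self._blobs:
            raise KeyError(subject_digest)
        evidence_digest = self.put(evidence)
        self._referrers.setdefault(subject_digest, []).append(evidence_digest)
        return evidence_digest

    def referrers(self, subject_digest):
        """Return digests of all evidence attached to the given artifact."""
        return list(self._referrers.get(subject_digest, []))
```

Since the address is derived from the content, the same digest always names the same bytes, which is why signatures and attestations usually reference artifacts by digest and not by a mutable tag.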

Actions with the metadata repository

Two main actions are performed with a metadata repository: uploading data and retrieving evidence. Various systems are used for this, such as TLOG and CAS. For example, Rekor records entries through Trillian (a transparency log management system), which in turn can persist them in a database such as MariaDB.

Features of different storage systems

Each system has its own unique features, and their choice depends on the organization's goals. For example:

TLOG (e.g., Rekor):
- Stores only metadata.
- Does not use role-based access control (RBAC).
- Supports only data addition (no deletion).
- Independent of artifacts.
- Has a verifiable log and metadata map.
OCI Registry/CAS:
- Stores both artifacts and evidence.
- Supports RBAC (role-based access control).
- Allows creating and deleting artifacts.
- Compatible with OCI standards.

Validation of Evidence

The process of validating evidence is also quite simple and has already proven itself well in the industry. There are several classic systems for verification, which include:

CRI (Container Runtime Interface) — a container runtime interface through which keys can be loaded for artifact verification (e.g., Docker, Podman, and others).
Controllers — such systems as Gatekeeper, Kyverno, Kubewarden, and Sigstore/Policy Controller, which help enforce security policies.
Specialized utilities — for example, Notary, Rekor-CLI, Sigstore/Cosign, which provide convenient tools for signing and verifying artifacts.
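
The kind of decision a policy controller makes can be sketched as a small function: allow an image only if every required type of evidence exists for its digest. This sketch assumes the statements passed in have already had their signatures verified. The `admit` function and the choice of required predicate types are assumptions for illustration and do not reproduce the policy language of Kyverno, Gatekeeper, or any other controller.

```python
# Assumed policy: require SLSA provenance and a CycloneDX SBOM for every image.
REQUIRED_PREDICATE_TYPES = {
    "https://slsa.dev/provenance/v1",
    "https://cyclonedx.org/bom",
}


def admit(image_digest, statements, required=REQUIRED_PREDICATE_TYPES):
    """Return (allowed, missing) for an image identified by its sha256 hex digest."""
    covered = set()
    for statement in statements:
        digests = {s["digest"].get("sha256") for s in statement.get("subject", [])}
        # Only count evidence whose subject is exactly this image.
        if image_digest in digests:
            covered.add(statement["predicateType"])
    missing = sorted(set(required) - covered)
    return (not missing, missing)
```

Matching on the subject digest is the important part: evidence issued for a different build of the same image name must not count toward the decision.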

Solutions

As we mentioned earlier, software supply chain security (SSCS) includes three key stages: generation, storage, and validation of evidence. Let's consider what technologies and solutions can be used at each of these stages.

Generation of Evidence

For the generation of evidence, we can use a wide stack of DevSecOps tools. In our company, the main focus is on specifications such as in-toto and Notary, which allow for effective tracking and recording of artifact provenance. As data sources for generating evidence, we use DevOps and DevSecOps tools.

The key point here is the use of data types already defined in the SLSA framework. This standard was developed by engineers with many security aspects in mind, so its implementation saves time and resources. If you are thinking about creating a new standard, you should carefully consider whether it is justified in your situation, or if it is easier to use existing specifications.

Evidence Storage

There are many solutions for storing evidence. The most popular are OCI registries (container registries) that support storing artifacts and metadata. Examples of such registries include:

Yandex Container Registry
Nexus
Harbor

These solutions provide secure and centralized storage of data necessary to ensure the integrity of the supply chain.

Evidence Validation

For evidence validation, you can use both custom solutions and ready-made systems. Examples of custom solutions include:

Kyverno, Gatekeeper, Policy Controller: policy engines and admission controllers that check the compliance of artifacts with established security requirements.
CRI (Container Runtime Interface): an interface through which keys are loaded to verify containers, for example with Docker and Podman.
Cosign: a tool for signing and verifying container images and other software artifacts.

For those who do not want to develop their own systems, there are many ready-made platforms available. One of the most convenient options is a CNAPP (Cloud-Native Application Protection Platform), a platform for protecting cloud-native applications. These platforms automatically manage the processes of generating, storing, and validating evidence, covering the full cycle of supply chain protection.

You can also consider solutions such as CSP (Content Security Platform) and ASOC (Application Security Orchestration and Correlation), which collect and analyze security evidence at all stages of software development and deployment.

Conclusion (About our experience in building)

In this article, I discussed the key technologies, practices, and tools for protecting the software supply chain (SSCS). We talked about the importance of generating, storing, and validating evidence, and also looked at solutions such as in-toto, Notary, OCI registries, and validation platforms.

I would like to explain in detail how we apply these technologies at Swordfish Security, but walking through the entire process with 15-20 illustrations on tekkix would be excessive. Therefore, if you are interested in this topic, I recommend watching the full video of my presentation at the PHD2 conference [RuTube] / [YouTube]. In the video, I explain in detail how we implemented these tools and solutions in our projects and how they work.

Thank you for your attention!