Artificial Intelligence in Information Security: Increasing Business Efficiency and Profitability

Artificial intelligence is now a standard part of the modern business toolkit: its use is the norm, not the exception. AI handles classification, data analysis, and risk prediction tasks across many industries.

In banking, for example, AI is used to improve credit scoring. Algorithms trained on real data assess a client's creditworthiness from many factors: credit history, assets, place of work, and income level. This helps financial institutions balance client needs against lending risk and operate more efficiently.

Beyond financial tasks, AI can also be applied to cybersecurity. Can it improve overall protection, and can it affect profitability?

Artificial intelligence and machine learning (ML) technologies help analyze information, quickly adapt to changes, and improve the software delivery process. In the context of the Secure Software Development Lifecycle (SSDLC), the use of AI models opens up wide possibilities.

Regulatory requirements (GOST R 56939-2016) and those of the Central Bank of Russia (for example, OUD4) call for regular static and dynamic analysis of the software being developed. AI can play an important role in meeting these requirements.

Let's consider an example of its implementation in the continuous process of static code analysis (SAST). What can it offer here?

  1. Reduce the number of false positives: SAST tools sometimes detect potential vulnerabilities that do not actually exist. Machine learning methods improve detection accuracy by analyzing the context and specifics of the code.

  2. Find new vulnerabilities: traditional SAST methods are often based on known patterns and signatures. Machine learning, by contrast, can pick up hidden signs that point to new, previously unknown vulnerabilities.

  3. Adapt to languages: models trained on one programming language can be adapted to others.

Beyond static code analysis, ML algorithms can also process dynamic analysis (DAST) reports effectively.

Dynamic Analysis (DAST) is a method of application security testing that simulates external attacks and attempts to find vulnerabilities in a running application. DAST scanners analyze the application by interacting with it through the user interface or API, checking for known vulnerabilities and potential weak points. In large organizations, the number of daily scans can be high, and the number of security reports can reach hundreds or thousands.

This raises the task of clustering and classifying scan results. Clustering divides a large body of data into groups (clusters) based on shared characteristics or features. It is widely used in report analysis to simplify processing, reveal the structure of the data, and identify patterns and anomalies.
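As a rough illustration of clustering scan results, the sketch below groups finding descriptions by token overlap (Jaccard similarity) with a greedy single-pass scheme. The sample descriptions, the 0.5 threshold, and the function names are assumptions made for this example; the article does not say which clustering method was used in practice.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lower-case a finding description and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: number of shared tokens divided by number of distinct tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_findings(findings: List[str], threshold: float = 0.5) -> List[List[int]]:
    """Greedy single-pass clustering.

    Each finding joins the first cluster whose representative (its first
    member) is at least `threshold` similar; otherwise it starts a new
    cluster. Returns clusters as lists of indices into `findings`.
    """
    token_sets = [tokens(f) for f in findings]
    clusters: List[List[int]] = []
    for i, ts in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(ts, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters


# Hypothetical DAST finding descriptions.
reports = [
    "SQL injection in parameter id on /api/users",
    "SQL injection in parameter name on /api/users",
    "Missing X-Frame-Options header on /login",
    "Missing X-Frame-Options header on /profile",
]
groups = cluster_findings(reports)
print(groups)
```

With hundreds or thousands of reports, an engineer could then review one representative per cluster instead of every record. A production setup would more likely cluster vectorized text with a dedicated ML library.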

Classification assigns objects, entities, or concepts to specific categories based on shared features or characteristics. It organizes and structures the information in scan reports, making it easier to understand and use. It also makes it possible to rank the vulnerabilities found by criticality more effectively.
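A minimal sketch of classifying findings and then ranking them by criticality might look like the following. The keyword rules, category names, and severity weights are hypothetical stand-ins; in an ML setup a trained classifier would replace the keyword lookup.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keyword rules and severity weights, for illustration only.
CATEGORY_KEYWORDS = {
    "injection": ("sql injection", "command injection"),
    "xss": ("cross-site scripting", "xss"),
    "misconfiguration": ("missing header", "directory listing"),
}
SEVERITY_WEIGHT = {"injection": 3, "xss": 2, "misconfiguration": 1, "other": 0}


@dataclass
class Finding:
    title: str
    category: str = "other"


def classify(finding: Finding) -> Finding:
    """Assign the first category whose keywords appear in the title."""
    text = finding.title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return Finding(finding.title, category)
    return Finding(finding.title, "other")


def rank(findings: List[Finding]) -> List[Finding]:
    """Classify every finding, then sort from most to least critical."""
    classified = [classify(f) for f in findings]
    return sorted(classified, key=lambda f: SEVERITY_WEIGHT[f.category], reverse=True)


ranked = rank([
    Finding("Missing header: X-Content-Type-Options"),
    Finding("Reflected XSS in search field"),
    Finding("SQL injection in login form"),
    Finding("Verbose server banner"),
])
for f in ranked:
    print(f.category, "-", f.title)
```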

Machine learning methods can significantly improve the analysis of DAST and SAST reports, making it more efficient and accurate. Resource consumption falls, as does the likelihood of human error, such as an Application Security (AppSec) engineer missing a potential vulnerability.

Tasks solved using ML for automating report analysis can be as follows:

  • Reducing false positives: ML algorithms can analyze large volumes of data and identify patterns that indicate real vulnerabilities.

  • Detecting new vulnerabilities: ML can detect new vulnerabilities by analyzing reports and finding patterns that were previously unknown.

  • Automating analysis processes: ML can automate routine report analysis tasks, such as classifying vulnerabilities by severity or prioritizing fixes.

  • Risk prediction: ML can be used to predict potential risks associated with specific vulnerabilities.

  • Improving report quality: ML can help improve the quality of DAST reports by providing more accurate and informative data on identified vulnerabilities.

Using AI for information security tasks helps balance the quality of checks against the speed of development. IT products become more secure while teams keep the ability to respond quickly to market changes, without development downtime and the financial losses it brings. In other words, the faster security checks run, the faster new solutions, products, and features reach production.

Practical application of AI in information security: research

To avoid limiting ourselves to theoretical discussions about the benefits of using AI in information security, we conducted practical research. Our goal was to demonstrate with real examples how artificial intelligence can improve the efficiency of security analysis and speed up development processes.

In this study, we tested machine learning-based solutions for automated triage of security reports. We focused on two key AppSec practices: analysis of static analysis (SAST) reports and processing of dynamic scanning (DAST) results. Let's take a closer look.

Analysis of SAST reports using an offline model

The model was trained on reports from various tools that contained complete information about the vulnerable functions or methods. Experts had annotated the real findings for various vulnerability types in the source code. The model's task is to check findings with the status To Verify and move those it judges to be false positives to the status Not Exploitable. During training, an engineer reviews the results and corrects them where necessary, which improves the quality of the model.
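The To Verify to Not Exploitable workflow with engineer review can be sketched roughly as below. Everything here is an assumption made for illustration: the `SastFinding` fields, the 0.9 threshold, and `toy_score`, which stands in for a trained model and is not the model from the study.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

TO_VERIFY = "To Verify"
NOT_EXPLOITABLE = "Not Exploitable"


@dataclass
class SastFinding:
    rule: str
    snippet: str
    status: str = TO_VERIFY


def triage(
    findings: List[SastFinding],
    false_positive_score: Callable[[SastFinding], float],
    threshold: float = 0.9,
) -> List[SastFinding]:
    """Re-label results the model considers false positives.

    Only results in status To Verify are touched. Returns the changed
    results so that an engineer can review them.
    """
    changed = []
    for f in findings:
        if f.status == TO_VERIFY and false_positive_score(f) >= threshold:
            f.status = NOT_EXPLOITABLE
            changed.append(f)
    return changed


def apply_review(
    finding: SastFinding,
    engineer_status: str,
    training_labels: List[Tuple[str, str]],
) -> None:
    """The engineer's decision overrides the model and is kept as a training label."""
    finding.status = engineer_status
    training_labels.append((finding.snippet, engineer_status))


def toy_score(f: SastFinding) -> float:
    """Stand-in for a trained model: treats parameterized queries as safe."""
    return 0.95 if "?" in f.snippet else 0.10


findings = [
    SastFinding("SQL_Injection", 'cursor.execute("SELECT * FROM t WHERE id = ?", (uid,))'),
    SastFinding("SQL_Injection", 'cursor.execute("SELECT * FROM t WHERE id = " + uid)'),
    SastFinding("XSS", "render(template, ?)", status="Confirmed"),
]
changed = triage(findings, toy_score)
statuses_after_model = [f.status for f in findings]

labels: List[Tuple[str, str]] = []
apply_review(changed[0], TO_VERIFY, labels)  # the engineer disagrees with the model
```

Keeping the engineer's corrections as labels is what lets the review step feed back into later training rounds.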

Results:

  • Faster triage of potential vulnerabilities by AppSec engineers;

  • Reduced number of false positives;

  • Positive impact on time to market (TTM) when handling blocking defects that may turn out to be false positives.

The next step is to implement an online model for processing SAST results directly during scanning in the pipeline.

Analysis of dynamic scanning (DAST) reports

In this case, not only typical reports in PDF format were analyzed, but also the scanning traffic of the test object. A labeled dataset was prepared to train the model:

  • About 600 thousand examples, of which 15% are potential vulnerabilities;

  • Text data was cleaned and vectorized;

  • Additional features were added (time between server responses and other detectors);

  • A classifier (CatBoost) was trained and model quality was evaluated, with the following results: Precision: 0.99; Recall: 0.98; F1 score: 0.991; ROC AUC score: 0.99.
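The preparation steps listed above (cleaning, vectorization, extra features) could look roughly like the sketch below. The article does not say which cleaning rules or vectorizer were used, so the digit placeholder, the hashing trick, and the 16-bucket size are assumptions; the resulting rows are the kind of numeric input a gradient-boosting classifier could be trained on.

```python
import re
import zlib
from typing import List


def clean(text: str) -> str:
    """Lower-case, replace digit runs with a placeholder, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"\d+", "<num>", text)
    return re.sub(r"\s+", " ", text).strip()


def hash_vectorize(text: str, dim: int = 16) -> List[float]:
    """Bag of words via the hashing trick: each token increments one bucket.

    zlib.crc32 is used instead of the built-in hash() because the latter
    is salted per process and would give different buckets on every run.
    """
    vec = [0.0] * dim
    for token in text.split():
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


def build_features(response_text: str, response_delay_s: float, dim: int = 16) -> List[float]:
    """Text vector plus one extra feature: time between server responses, in seconds."""
    return hash_vectorize(clean(response_text), dim) + [response_delay_s]


# Hypothetical server response captured during a scan.
row = build_features("HTTP/1.1 500  Internal Server Error\nid=12345", 2.4)
print(len(row), row[-1])
```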

On test data, the model flagged 2,700 potentially vulnerable records in the scan reports, and expert evaluation confirmed that non-standard responses were identified with 96.7% accuracy.
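For readers less familiar with the quality metrics quoted above, the sketch below shows how precision, recall, and F1 (the harmonic mean of the two) are derived from confusion-matrix counts. The counts in the example are hypothetical and are not taken from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 flagged records confirmed by experts,
# 10 flagged records rejected, 30 real issues the model missed.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In these terms, the share of flagged records that experts confirm corresponds to precision measured on the reviewed sample.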

Summary

The results of our study were quite impressive. First of all, we found that the time required to analyze the scan results was significantly reduced, in some cases by an order of magnitude. This improvement allows AppSec engineers to process a much larger volume of reports and data in the same amount of time.

We also noted an increase in the accuracy of the analysis, which can be of great importance in preventing potential threats.

One of the most important results was the acceleration of the process of bringing new functionality into industrial operation. In today's world, where the speed of development and implementation of new products often determines the success of a company, this advantage is hard to overestimate.
