Why Traditional RAG Search is Inadequate for Security
Here is how traditional RAG search works (Google search and Perplexity essentially work this way): you take a large set of documents (the entire internet), chunk them, convert the chunks into embeddings, and store them in a vector database. When the user searches, you retrieve the n closest matching chunks from the embedding index. There are several other flavors of RAG, such as agentic RAG and graph RAG, but they all share the same basic architecture.
Traditional RAG Arch
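The pipeline described above can be sketched in a few dozen lines. This is a minimal toy, assuming a hash-based bag-of-words embedding in place of a real embedding model and an in-memory list in place of a real vector database; the names (`chunk`, `VectorStore`, etc.) are illustrative, not any particular library's API. It also shows the crux of the sourcing problem: official and unofficial text land in the same index, and retrieval ranks purely by similarity.

```python
import hashlib
import math

def chunk(text, size=50):
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text, dim=64):
    """Toy bag-of-words hash embedding (a stand-in for a real embedding model)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Dot product of two unit-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """In-memory stand-in for a vector database."""
    def __init__(self):
        self.items = []  # list of (embedding, chunk_text)

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, n=3):
        """Return the n chunks closest to the query embedding."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [text for _, text in ranked[:n]]

# Index documents from mixed, unvetted sources -- official text and a
# random blog post go into the same store with no provenance attached:
store = VectorStore()
for doc in ["PCI DSS requires strong passwords with twelve characters minimum",
            "A blog post claims eight characters is enough for PCI compliance"]:
    for c in chunk(doc):
        store.add(c)

results = store.search("PCI password length requirement", n=2)
```

Note that nothing in `search` can tell the official chunk from the blog chunk; whichever embeds closer to the query wins, which is exactly why verification against the source document falls back on the user.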
Problems with traditional RAG search
Sourcing
Unlike consumer search use cases, security and compliance work depends heavily on accuracy and sourcing. Acting on the wrong sources is highly consequential. In RAG search, you have little control over which documents RAG selects at inference time. Our compliance customers have complained again and again that their workflow is not complete until they take the results from a Google or Perplexity search and cmd+F through the official document to verify their accuracy.
For example, take a simple search for PCI password requirements. Since none of the answers come from official PCI documentation, a compliance engineer has to spend additional time searching the official documentation to verify whether those answers are correct.
Accuracy
Hand in hand with sourcing come accuracy issues. If the user query is well known and well represented in the internet corpus, then traditional RAG methods are fairly accurate, since the answer is sourced from multiple internet URLs.
However, accuracy falls apart exactly when it is needed most: when the question is not a well-known one. We will lay out the accuracy issues as we discuss the alternate architecture we use at Transilience AI.
Alternate Architecture
The alternate architecture we propose mirrors how a cybersecurity consultant approaches the problem. When asked a question, the consultant surgically reads the relevant documents (or already has knowledge of the official documents and answers) and answers only from the official, authoritative documents. A good consultant will also give you the control numbers and page numbers for the answer.
Transilience RAG Architecture
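The consultant behavior described above can be sketched as retrieval constrained to an official corpus whose passages carry citation metadata. This is an illustrative sketch, not the actual Transilience implementation: the corpus entries, the word-overlap scoring, and the page numbers are all placeholder assumptions (PCI DSS v4.0 control 8.3.6 does set a 12-character minimum, but the page numbers here are invented for the example).

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    source: str   # official document name
    control: str  # control number within that document
    page: int     # page number (illustrative values below)

# Hypothetical corpus limited to official, authoritative documents only.
OFFICIAL_CORPUS = [
    Passage("Passwords must contain a minimum of 12 characters.",
            source="PCI DSS v4.0", control="8.3.6", page=182),
    Passage("Accounts are locked out after no more than 10 invalid attempts.",
            source="PCI DSS v4.0", control="8.3.4", page=180),
]

def tokens(s):
    """Lowercase word set with basic punctuation stripped."""
    return set(s.lower().replace(".", " ").replace(",", " ").split())

def consultant_answer(query, corpus):
    """Return the best-matching official passage *with* its citation,
    or refuse to answer rather than fall back to unofficial sources."""
    q_words = tokens(query)
    best, best_score = None, 0
    for p in corpus:
        score = len(q_words & tokens(p.text))
        if score > best_score:
            best, best_score = p, score
    if best is None:
        return "No answer found in the official documents."
    return f"{best.text} [{best.source}, control {best.control}, p. {best.page}]"

answer = consultant_answer("minimum password characters", OFFICIAL_CORPUS)
```

The two design points the sketch captures are that every answer carries its control and page citation, and that a query with no match in the official corpus gets a refusal instead of a plausible-sounding answer from elsewhere.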
Let's test the accuracy of both approaches on a couple of deeper questions.
Example 1 - PCI Assessment Findings
Here is the official PCI documentation on assessment finding types. There are four possible assessment findings.
PCI Official answer
Perplexity answer - Perplexity gives a fifth possible type that is not an option
The Transilience cyber consultant answer gets it right.
Transilience Cyber Consultant Answer
Example 2 - RBI
Let's take the RBI requirements for co-operative banks. Here is a snippet from the official document.
Official snippet
Here is the Perplexity answer
Perplexity answer - inaccurate
Here is the answer from the Transilience cybersecurity consultant, with references -
Answer from Transilience Cyber Consultant.
We are introducing this architecture as a beta feature in the Transilience cybersecurity consultant application. Just like our other apps (vulnerability and threat intelligence) that power our backend, we are offering it as a free app for cyber professionals to use.
For commercial use, or to use it on your own custom documentation, please contact hello@transilience.ai
Transilience AI backend team
Smritika Sadhukhan, Venkat Pothamsetty
Frontend team
Muzaffar Hossain, Garima Sadhnani