
Notes on Retrieval-Augmented Generation (part 1)

I’ve been thinking recently about Retrieval-Augmented Generation (RAG) and vector databases, and I wanted to gather some notes towards a talk.

The question

Most of the applications I’m seeing for Generative AI seem to involve RAG. But I feel that the vector database is doing most of the interesting work here – and, in a lot of cases, an LLM-generated response is not the best ‘view’ of the data that is returned. I want to dig a little more into vector databases, how they work, and what can be done with them.

Panda Smith wrote, “If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves”. I want to learn more about this search part of RAG.


Notes from Wikipedia

Pulling out some notes from the Wikipedia article:

  • “[RAG] modifies interactions with a large language model so that the model responds to user queries with reference to a specified set of documents, using this information to supplement information from its pre-existing training data. This allows LLMs to use domain-specific and/or updated information.”
  • There are two phases – information retrieval and response generation
  • RAG was first proposed in the 2020 paper ‘Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks’ by Lewis et al.
  • The process involves the following stages:
    • Indexing – the documents to be searched are stored – usually by converting the data into vector representations
    • Retrieval – given a user query, a document retriever finds the most relevant documents for the query
    • Augmentation – the documents are put in a prompt for an LLM
    • Generation – the LLM generates a response based upon the prompt
  • The ‘chunking’ of the documents (how they are divided up into pieces to be stored) affects how good the responses are.
  • Risks of RAG
    • While RAG reduces hallucinations in the responses, it cannot eliminate them.
    • There is a danger of losing important context in the chunking phase.
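The four stages above can be sketched end to end in a few lines of Python. This is a toy illustration, not a real system: it uses a bag-of-words vector as a stand-in for a learned embedding, naive fixed-size word chunking, and cosine similarity for retrieval; all function and variable names are my own, and the final LLM call is left out.

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Naive chunking: split into overlapping fixed-size word windows."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Toy 'embedding': a bag-of-words count vector.
    Real systems use a learned dense embedding model here."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, index, k=2):
    """Retrieval: rank stored chunks by similarity to the query."""
    qvec = embed(query)
    ranked = sorted(index, key=lambda c: cosine(qvec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

def augment(query, chunks):
    """Augmentation: put the retrieved chunks into a prompt for an LLM."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Indexing: chunk each document and store (chunk, vector) pairs.
docs = ["RAG retrieves relevant documents and feeds them to an LLM.",
        "Vector databases store embeddings for similarity search."]
index = [{"text": c, "vec": embed(c)} for d in docs for c in chunk(d)]

query = "What do vector databases store?"
prompt = augment(query, retrieve(query, index))
# Generation would be: response = llm(prompt)
```

The chunking parameters (`size`, `overlap`) are exactly the knobs the notes flag as a risk: too small and the retrieved chunk loses surrounding context, too large and the similarity score gets diluted.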

