Categories
GenAI

Notes on RAG – Part 2

Continuing my research on RAG (part 1 here)

The week before last I worked on a simple example of RAG using Spring. This involved a lot of yak shaving, partly because Spring had updated its package structure since I last worked on my RAG demo. I also wanted to set up a local vector DB without using Docker, eventually settling on MariaDB. In the end I had a simple example that took a CSV, inserted the rows as embeddings into a vector database, and could run simple queries against it. I still need to tidy this up and upload it to GitHub.
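Until the Spring/MariaDB code is on GitHub, here is roughly the same ingest-and-query flow as a language-agnostic sketch in plain Python. The bag-of-words "embedding", its vocabulary, and the CSV rows are all made up for illustration; a real setup would call an embedding model and a vector store instead:

```python
import csv, io, math

def embed(text):
    """Toy 'embedding': bag-of-words counts over a tiny fixed vocabulary.
    A real pipeline would call an embedding model here."""
    vocab = ["rag", "vector", "database", "spring", "search"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# 'Ingest' a CSV: one embedding per row, stored alongside the raw text.
rows = list(csv.DictReader(io.StringIO(
    "id,text\n"
    "1,spring makes wiring a vector database easy\n"
    "2,search is the heart of rag\n")))
store = [(row["text"], embed(row["text"])) for row in rows]

def query(q, k=1):
    """Return the k stored rows most similar to the query."""
    qv = embed(q)
    ranked = sorted(store, key=lambda item: cosine(qv, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The vector database's job in the real version is just a scalable replacement for that brute-force `sorted` call.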

Interesting Links

  • I need to look into using vector databases for recommendation engines.
  • “If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves.” – link
  • Simon Willison gave a great talk about embeddings. This introduced the concept, showed how to do ‘vibes-based search’ against SQLite using llm, and talked about 2D visualisations.
    • “Being able to spin up this kind of ultra-specific search engine in a few hours is exactly the kind of trick that excites me about having embeddings as a tool in my toolbox.”
    • “A fascinating thing about RAG is that it has so many different knobs that you can tweak. You can try different distance functions, different embedding models, different prompting strategies and different LLMs. There’s a lot of scope for experimentation here.”
  • Adding semantic search to datasette
  • Lovely visualisation of 40 million Hacker News posts. This uses Uniform Manifold Approximation and Projection (UMAP) to reduce the vectors to a 2D geographic map (among other things). Some interesting applications of a massive dataset.
  • The llm tool can be used for image search using CLIP
  • This leads to someone searching images of faucets by image and phrase (which taps best represent the idea of ‘Bond villain’?)
  • According to ChatGPT, people have experimented with using vector databases to analyse recipes. There is a paper on this, ‘Learning Cross-modal Embeddings for Cooking Recipes and Food Images’, but I can’t find any details of applications built on it, experimental or otherwise.
  • Interesting discussion of search in RAG (look for the RAG section)
  • Text Embedding Models Contain Bias. Here’s Why That Matters – interesting Google paper from 2018
  • Using a vector database, SkyCLIP, and Leaflet to create a searchable aerial photograph
  • “This is why Retrieval-Augmented Generation (RAG) is not going anywhere. RAG is basically the practice of telling the LLM what it needs to know and then immediately asking it for that information back in condensed form. LLMs are great at it, which is why RAG is so popular.” link
  • Spring AI documentation on RAG
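One of the ‘knobs’ Willison mentions above is the distance function, and the choice genuinely matters. A small illustrative Python sketch (vectors made up): cosine distance ignores vector magnitude, while Euclidean distance does not, so the two can rank the same neighbours differently:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for same direction, up to 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (na * nb)

def euclidean_distance(a, b):
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0]
b = [3.0, 0.0]   # same direction as a, three times the magnitude
c = [0.0, 1.0]   # orthogonal to a

# Cosine treats a and b as identical; Euclidean does not.
assert cosine_distance(a, b) < 1e-9
assert euclidean_distance(a, b) == 2.0
# Under Euclidean distance, c is actually *closer* to a than b is.
assert euclidean_distance(a, c) < euclidean_distance(a, b)
```

Which behaviour you want depends on whether your embedding model's vector magnitudes carry meaning, which is exactly the kind of experimentation the quote is talking about.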

Notes on Retrieval-Augmented Generation (part 1)

I’ve been thinking recently about Retrieval-Augmented Generation (RAG) and vector databases, and I wanted to gather some notes towards a talk.

The question

Most of the applications I’m seeing for Generative AI seem to involve RAG. But I feel that the vector database is doing most of the interesting work here – and, in a lot of cases, an LLM-generated response is not always the best ‘view’ of the data that is returned. I want to dig a little more into vector databases, how they work, and what can be done with them.

Panda Smith wrote, “If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves”; I want to learn more about this search part of RAG.

Previous posts

Notes from Wikipedia

Pulling out some notes from the Wikipedia article:

  • “[RAG] modifies interactions with a large language model so that the model responds to user queries with reference to a specified set of documents, using this information to supplement information from its pre-existing training data. This allows LLMs to use domain-specific and/or updated information.”
  • There are two phases – information retrieval and response generation
  • RAG was first proposed in the paper ‘Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks’ in 2020
  • The process involves the following stages:
    • Indexing – the documents to be searched are stored, usually after converting the data into vector representations
    • Retrieval – given a user query, a document retriever finds the most relevant documents for the query
    • Augmentation – the documents are put in a prompt for an LLM
    • Generation – the LLM generates a response based upon the prompt
  • The ‘chunking’ of the documents (how they are divided up into pieces to be stored) affects how good the responses are.
  • Risks of RAG
    • While RAG reduces hallucinations in the responses, it cannot eliminate them.
    • There is a danger of losing important context in the chunking phase.
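The four stages above can be sketched end to end. This is a toy Python illustration, not real RAG tooling: word-overlap ranking stands in for vector similarity search, and `fake_llm` stands in for a model call. Note how the chunk size chosen in the indexing step directly controls what context the retriever can hand to the model – the chunking risk mentioned above:

```python
def chunk(document, size=5):
    """Indexing: split a document into fixed-size word chunks."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(chunks, query, k=1):
    """Retrieval: rank chunks by word overlap with the query
    (a stand-in for vector similarity search)."""
    qwords = set(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(qwords & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def augment(query, context):
    """Augmentation: put the retrieved chunks into the prompt."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

def fake_llm(prompt):
    """Generation: a real system would call an LLM here."""
    return f"[response generated from prompt of {len(prompt)} chars]"

doc = "RAG retrieves relevant documents and passes them to the model as context"
chunks = chunk(doc)
question = "what is passed to the model?"
context = "\n".join(retrieve(chunks, question))
answer = fake_llm(augment(question, context))
```

Everything interesting in a production system – embedding models, vector indexes, prompt templates – slots into one of these four functions.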

Interesting links