Revisiting Retrieval Augmented Generation
I’ve been kinda hot and cold on Retrieval Augmented Generation (RAG).
I rushed in and experimented early using an overview of Singapore law. After seeing other locals try to implement it, I scoffed. I dismissed it as “grab three relevant articles from my vector db and ask #ChatGPT to write an answer on it”. Now I am going to have a second go.
One of the problems I suggested with a simple implementation of RAG concerned the embeddings used to search:
The granularity of the embeddings made from your data store seems also to have a significant impact. An embedding that takes too wide or narrow a snapshot of your data might miss or jumble the point.
A straightforward way to be more flexible is to divorce the embeddings used for search from the context used for generating an answer.
Really love the concept of @LangChainAI Parent Document Retriever, as well as the cool diagram by @clusteredbytes! ❤️
So I took some time and refactored a sequence diagram to address the internal workflow, hope it helps~ https://t.co/Fv9JzPa3XH pic.twitter.com/51ISYHf5RE
— Harry Zhang (@zhanghaili0610) August 19, 2023
Once you have more flexibility with the context, there are various ways to play with what is passed to the large language model:
- As the tweet above suggests, the embeddings can be laser-focused on small chunks, while the whole parent document is passed as context.
- We don’t need to pass the whole document as is to the model. Instead, we can summarise it or even add new context to it.
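To make the idea concrete, here is a minimal sketch of searching on small chunks while returning the whole parent document. It is not LangChain's implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the sentence-level chunking is an arbitrary choice, and every name and sample text below is made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def split_into_chunks(text):
    """Small search units: here, one chunk per sentence (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_index(documents):
    """Embed each small chunk and remember the id of its parent document."""
    index = []
    for doc_id, text in documents.items():
        for chunk in split_into_chunks(text):
            index.append((embed(chunk), doc_id))
    return index


def retrieve_parents(query, index, documents, k=1):
    """Search over chunk embeddings, but return the full parent documents."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    parent_ids = []
    for _, doc_id in ranked:
        if doc_id not in parent_ids:
            parent_ids.append(doc_id)
        if len(parent_ids) == k:
            break
    return [documents[doc_id] for doc_id in parent_ids]


# Hypothetical sample data, standing in for real articles.
documents = {
    "contract": "A contract needs offer and acceptance. "
    "Consideration must also be present. Remedies include damages.",
    "tort": "Negligence requires a duty of care. "
    "The duty must be breached. The breach must cause loss.",
}
index = build_index(documents)
context = retrieve_parents("What is a duty of care?", index, documents, k=1)
```

The match happens on a single sentence, but `context` holds the entire parent text. At that point the parent could just as well be swapped for a summary or enriched with extra context before it goes to the model.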
Now that OpenAI models can accept 16,000 or more tokens, it may be useful to play with how the context is presented to the model.
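One way to play with this is to decide how many retrieved passages fit into the window before calling the model. The sketch below is only an illustration: `estimate_tokens` uses a crude rule of thumb (roughly 4 tokens for every 3 words), which is an assumption and not a real tokenizer, and both function names are hypothetical. For real work, count tokens with the tokenizer that matches your model.

```python
def estimate_tokens(text):
    """Rough assumption: about 4 tokens per 3 whitespace-separated words."""
    words = len(text.split())
    return (words * 4 + 2) // 3  # integer ceiling of words * 4 / 3


def pack_context(passages, budget):
    """Keep passages in ranked order until the estimated budget is used up."""
    chosen, used = [], 0
    for passage in passages:
        cost = estimate_tokens(passage)
        if used + cost > budget:
            break
        chosen.append(passage)
        used += cost
    return chosen, used


# Hypothetical ranked passages, best match first.
passages = ["one two three", "four five six seven eight nine", "ten"]
chosen, used = pack_context(passages, budget=12)
```

A passage that does not fit could be summarised instead of dropped, which is where the larger windows leave room to experiment.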
Currently, I am thinking of an “Ask a Judgement” Chatbot. Court judgements present an interesting kind of document: they are long, contain several sections, and are also interlinked in complex ways.
Smooshing a judgement down into a vector database is going to be interesting, but I also have to think about what a user might want to ask a judgement. It’s still worth a prototype! I am especially curious to see whether it resolves my qualms about how RAG does on legal documents.
Let’s keep building!
Love.Law.Robots. – A blog by Ang Hou Fu