RAG Contextual Compression Technique
A RAG Contextual Compression Technique is a RAG algorithm technique that condenses or filters retrieved documents using the context of the query, so that only query-relevant content is passed to the LLM.
References
2023
- Claude 2
- QUOTE: Contextual compression is another advanced RAG technique that uses a secondary smaller LLM to condense lengthy retrieved documents into more concise relevant context. This summarized content contains the key relevant facts without superfluous text, allowing for more condensed and efficient context to be passed to the primary LLM.
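The two-model pattern in the quote above can be sketched in plain Python. The quote names no specific API, so the "LLM" calls below are stand-in functions: the secondary compressor is simulated by keyword-overlap sentence filtering, and the primary model by a formatting stub.

```python
import string

def compress_with_small_llm(document: str, query: str) -> str:
    """Stand-in for the secondary, smaller LLM: keeps only sentences
    that share a content word (longer than 3 chars) with the query."""
    def words(text: str) -> set:
        return {w.strip(string.punctuation).lower()
                for w in text.split() if len(w) > 3}
    return ". ".join(s for s in document.split(". ")
                     if words(query) & words(s))

def answer_with_primary_llm(context: str, query: str) -> str:
    """Stand-in for the primary LLM, which now receives only the
    condensed, query-relevant context."""
    return f"Answer to {query!r} using context: {context!r}"

doc = ("Paris is the capital of France. "
       "The Seine flows through it. Croissants are popular.")
query = "What is the capital of France?"
condensed = compress_with_small_llm(doc, query)
print(answer_with_primary_llm(condensed, query))
```

A real implementation would replace `compress_with_small_llm` with a call to a smaller summarization model; the pipeline shape stays the same.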
2023
- https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression
- QUOTE: One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.
Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.
- To use the Contextual Compression Retriever, you'll need:
- a base retriever
- a Document Compressor
- The Contextual Compression Retriever passes queries to the base retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of documents and shortens it by reducing the contents of documents or dropping documents altogether.
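The pipeline described above (query → base retriever → document compressor) can be sketched without the LangChain dependency. The class below mirrors the role of LangChain's ContextualCompressionRetriever but is a toy, not the library API; the keyword-based retriever and compressor are illustrative stand-ins.

```python
from typing import Callable, List

class ContextualCompressionRetriever:
    """Toy version of the pattern: wraps a base retriever and a
    document compressor, as the LangChain docs describe."""
    def __init__(self,
                 base_retriever: Callable[[str], List[str]],
                 compressor: Callable[[List[str], str], List[str]]):
        self.base_retriever = base_retriever
        self.compressor = compressor

    def get_relevant_documents(self, query: str) -> List[str]:
        # 1. Pass the query to the base retriever.
        docs = self.base_retriever(query)
        # 2. Hand the initial documents to the document compressor,
        #    which shortens contents or drops documents altogether.
        return self.compressor(docs, query)

# Illustrative in-memory corpus and keyword-overlap base retriever.
CORPUS = [
    "RAG pipelines retrieve documents before generation. "
    "Long documents inflate token costs.",
    "Bananas are a good source of potassium.",
]

def keyword_retriever(query: str) -> List[str]:
    terms = set(query.lower().split())
    return [d for d in CORPUS if terms & set(d.lower().split())]

def drop_irrelevant(docs: List[str], query: str) -> List[str]:
    """Toy compressor: keeps only sentences sharing a term with the
    query, and drops a document wholesale when nothing survives."""
    terms = {t.strip("?.,").lower() for t in query.split()}
    out = []
    for d in docs:
        kept = [s for s in d.split(". ")
                if terms & {w.strip("?.,").lower() for w in s.split()}]
        if kept:
            out.append(". ".join(kept))
    return out

retriever = ContextualCompressionRetriever(keyword_retriever, drop_irrelevant)
print(retriever.get_relevant_documents("Why do RAG token costs grow?"))
```

In LangChain itself, the compressor slot would be filled by a real `DocumentCompressor` such as an LLM-based extractor, with a vector-store retriever as the base.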