Embeddings

Large Language Models (LLMs) can only scan a small amount of data at once, like 1000 words. But what about when you have a large PDF file of data? To tackle this problem, we use embeddings.

Embeddings create a numerical representation of text. This representation captures the meaning of the text, so similar words have similar vectors. A vector store is where we can store these embeddings.
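To make "similar words have similar vectors" concrete, here is a minimal sketch that compares vectors with cosine similarity. The three-dimensional vectors are made up for illustration; a real embedding model produces vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors (an assumption, not output from a real model).
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(toy_vectors["cat"], toy_vectors["kitten"])
sim_cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])
print(f"cat vs kitten: {sim_cat_kitten:.3f}")
print(f"cat vs car:    {sim_cat_car:.3f}")
```

With these toy values, "cat" scores closer to "kitten" than to "car", which is the property a vector store relies on when it searches for relevant text.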

When we want to store a large dataset or file, we first break it down into small chunks, then use an embedding model to get the vector for each chunk, and save these vectors in the vector store. Later, we can query the store to fetch the chunks most relevant to the query at hand and pass them to the model.
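The chunk, embed, store, and fetch steps can be sketched in plain Python. This is only an illustration under loud assumptions: `toy_embed` is a word-count stand-in for a real embedding model, and `ToyVectorStore` is a hypothetical in-memory class, not a LangChain API.

```python
import math
from collections import Counter

def split_into_chunks(text, chunk_size=8):
    """Break text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def toy_embed(text):
    """Stand-in embedding: a sparse word-count vector (not a real model)."""
    return Counter(word.strip(".,?!").lower() for word in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Hypothetical in-memory store holding (vector, chunk) pairs."""
    def __init__(self):
        self.entries = []

    def add(self, chunks):
        for chunk in chunks:
            self.entries.append((toy_embed(chunk), chunk))

    def search(self, query, k=2):
        """Return the k chunks whose vectors are most similar to the query."""
        query_vector = toy_embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vector, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

document = (
    "Embeddings map text to vectors. "
    "A vector store keeps those vectors for later search. "
    "Bananas are yellow and grow in warm climates."
)
store = ToyVectorStore()
store.add(split_into_chunks(document, chunk_size=8))
top_chunks = store.search("What colour are bananas?", k=1)
print(top_chunks)
```

The chunks returned by `search` are what would be placed into the prompt sent to the model. A word-count vector only matches exact words, whereas a real embedding model would also match on meaning.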

LangChain offers many integrations for such embedding providers.

There are three different methods to get answers:

  • Map Reduce
  • Refine
  • Map ReRank

Map Reduce

This method takes each chunk and sends it to the LLM together with the prompt to get an answer per chunk; these calls can be done in parallel. All the per-chunk answers are then passed to the model again to be summarised into a final answer. The drawbacks are that it makes many model calls, which can be expensive, and that the chunks are treated as unrelated to each other.
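The map reduce flow described above can be sketched as follows. `fake_llm` is a hypothetical stand-in for a real model call, and the prompt wording is an assumption; this is not LangChain's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

calls = []  # records every prompt sent to the stand-in model

def fake_llm(prompt):
    """Hypothetical stand-in for a real model call."""
    calls.append(prompt)
    return f"<reply to a {len(prompt.split())}-word prompt>"

def map_reduce(chunks, question, llm):
    # Map step: one independent prompt per chunk, so the calls can run in parallel.
    map_prompts = [f"Context: {chunk}\nQuestion: {question}" for chunk in chunks]
    with ThreadPoolExecutor() as pool:
        partial_answers = list(pool.map(llm, map_prompts))

    # Reduce step: one extra call that summarises the partial answers.
    reduce_prompt = (
        "Combine these partial answers into one final answer.\n"
        + "\n".join(partial_answers)
        + f"\nQuestion: {question}"
    )
    return llm(reduce_prompt)

chunks = ["first chunk of text", "second chunk of text", "third chunk of text"]
final_answer = map_reduce(chunks, "What is this document about?", fake_llm)
print(len(calls), "model calls for", len(chunks), "chunks")
```

The number of model calls is one per chunk plus one for the final summary, which is where the cost comes from on large documents.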

Refine

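Refine builds up an answer step by step: the answer produced from one chunk is passed along with the next chunk so the model can improve it. It tends to give long answers, and the answer generated is usually very relevant to the context, but it takes much longer because each call depends on the previous answer and so cannot run in parallel.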
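A minimal sketch of the refine loop, assuming a hypothetical `fake_llm` stand-in and made-up prompt wording (not LangChain's own prompts):

```python
calls = []  # records every prompt sent to the stand-in model

def fake_llm(prompt):
    """Hypothetical stand-in for a real model call."""
    calls.append(prompt)
    return f"answer v{len(calls)}"

def refine(chunks, question, llm):
    if not chunks:
        raise ValueError("refine needs at least one chunk")

    # The first chunk produces an initial answer.
    answer = llm(f"Context: {chunks[0]}\nQuestion: {question}")

    # Every later call sees the previous answer, so the calls must run in order.
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing answer: {answer}\n"
            f"New context: {chunk}\n"
            f"Refine the existing answer to the question: {question}"
        )
    return answer

chunks = ["first chunk of text", "second chunk of text", "third chunk of text"]
final_answer = refine(chunks, "What is this document about?", fake_llm)
print(final_answer)
```

Because each prompt embeds the previous answer, the loop cannot be parallelised the way the map step of map reduce can.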

Map ReRank

Caution

This method is still experimental

Map ReRank is another method where we make many calls to the model, one per chunk, asking for an answer together with a score. We then rank the answers and select the one with the highest score. This relies on the model knowing what the score should be.
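The ranking idea can be sketched as below. The "Answer: ... Score: ..." reply format, the `parse_reply` helper, and the `fake_llm` stand-in are all assumptions made for this illustration; a real model's scores are only as reliable as the model itself.

```python
import re

def fake_llm(prompt):
    """Hypothetical stand-in: confident only when the context mentions Paris."""
    if "Paris" in prompt:
        return "Answer: Paris\nScore: 95"
    return "Answer: I don't know\nScore: 5"

def parse_reply(reply):
    """Extract (answer, score) from a reply in the assumed format."""
    match = re.search(r"Answer:\s*(.*?)\s*Score:\s*(\d+)", reply, re.DOTALL)
    if match is None:
        return reply.strip(), 0  # unparseable replies get the lowest score
    return match.group(1), int(match.group(2))

def map_rerank(chunks, question, llm):
    candidates = []
    for chunk in chunks:
        # One call per chunk, each asked to answer and to score its own answer.
        reply = llm(
            f"Context: {chunk}\nQuestion: {question}\n"
            "Reply as 'Answer: <text>' then 'Score: <0-100>'."
        )
        candidates.append(parse_reply(reply))
    # Keep only the answer with the highest self-reported score.
    best_answer, best_score = max(candidates, key=lambda pair: pair[1])
    return best_answer, best_score

chunks = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Madrid is the capital of Spain.",
]
best_answer, best_score = map_rerank(chunks, "What is the capital of France?", fake_llm)
print(best_answer, best_score)
```

Unlike map reduce, no final combining call is made: the answers from the other chunks are simply discarded.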