LangChain RAG Workflow

Step 1: LLM Initialization

The user selects an LLM (Llama3, Mistral, or AceGPT:7b) from the Streamlit dropdown. The app then initializes the selected model through Ollama.
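
A minimal sketch of this step, assuming the langchain_community Ollama wrapper. The names MODEL_TAGS, resolve_model_tag, and init_llm are illustrative, and the Ollama tags (in particular the one for AceGPT:7b) are guesses that need to match what `ollama list` reports on the host.

```python
# Dropdown labels come from the text above; the Ollama tags are assumptions.
MODEL_TAGS = {
    "Llama3": "llama3",
    "Mistral": "mistral",
    "AceGPT:7b": "acegpt:7b",  # hypothetical tag, check your local Ollama install
}


def resolve_model_tag(choice: str) -> str:
    """Map a dropdown label to an Ollama model tag, rejecting unknown labels."""
    try:
        return MODEL_TAGS[choice]
    except KeyError:
        raise ValueError(f"Unsupported model: {choice!r}") from None


def init_llm(choice: str):
    """Create the LangChain wrapper for the selected Ollama model."""
    # Imported lazily so the mapping above is usable without LangChain installed.
    from langchain_community.llms import Ollama

    return Ollama(model=resolve_model_tag(choice))
```

In the Streamlit script this would typically be wired up as `choice = st.selectbox("Select LLM", list(MODEL_TAGS))` followed by `llm = init_llm(choice)`.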

Step 2: PDF Upload and Chunking

The uploaded PDF is loaded using PyPDFLoader. The document is then split into smaller chunks using RecursiveCharacterTextSplitter, so that retrieval can return focused passages rather than whole pages.
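
The loading and chunking step could look roughly like the sketch below. PyPDFLoader reads from a file path, so the uploaded bytes are first written to a temporary file. The helper names save_upload and load_and_split, and the chunk_size and chunk_overlap values, are assumptions for illustration and are not taken from the app.

```python
import tempfile


def save_upload(data: bytes, suffix: str = ".pdf") -> str:
    """Write uploaded bytes to a temporary file and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so the loader can open it by path afterwards.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
        return tmp.name


def load_and_split(pdf_path: str, chunk_size: int = 1000, chunk_overlap: int = 200):
    """Load a PDF (one Document per page) and split it into overlapping chunks."""
    # Imported lazily so save_upload works without LangChain installed.
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    pages = PyPDFLoader(pdf_path).load()
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,        # assumed value
        chunk_overlap=chunk_overlap,  # assumed value
    )
    return splitter.split_documents(pages)
```

With Streamlit, the bytes would come from the uploader, for example `chunks = load_and_split(save_upload(uploaded_file.getvalue()))`. The overlap keeps sentences that straddle a chunk boundary retrievable from either side.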

Step 3: Vector Store Creation

Document chunks are embedded using OllamaEmbeddings and stored in the Chroma vector database for future queries.
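
A sketch of the indexing step under stated assumptions: the function name build_vector_store is illustrative, and the embedding model "nomic-embed-text" is a placeholder, since the text does not say which Ollama model produces the embeddings.

```python
def build_vector_store(chunks, embedding_model="nomic-embed-text", persist_directory=None):
    """Embed document chunks with Ollama and index them in Chroma."""
    if not chunks:
        # A scanned or image-only PDF can yield no extractable text.
        raise ValueError("No chunks to index")

    # Imported lazily so the guard above works without LangChain installed.
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import Chroma

    embeddings = OllamaEmbeddings(model=embedding_model)  # assumed model name
    return Chroma.from_documents(
        documents=chunks,
        embedding=embeddings,
        persist_directory=persist_directory,  # None keeps the index in memory
    )
```

Passing a persist_directory lets the index be reused across app restarts instead of re-embedding the PDF each time.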

Step 4: Query Workflow

The user inputs a question. The retriever fetches the 5 most relevant document chunks by similarity. The QA chain then combines the retrieved context with the user query, formats the input for the LLM using a custom prompt (custom_qa_prompt), and generates a response via the selected LLM.
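
One way the query workflow might be assembled is sketched below, assuming a LangChain version that still provides RetrievalQA. The wording of the template is hypothetical, since the actual text of custom_qa_prompt is not given here, and build_qa_chain is an illustrative name.

```python
TOP_K = 5  # number of chunks the retriever returns, per the text above

# Hypothetical wording; only the {context} and {question} slots are required.
CUSTOM_QA_TEMPLATE = (
    "Use the following context to answer the question.\n"
    "If the context does not help, give a brief direct answer.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_qa_chain(llm, vector_store):
    """Wire the retriever, the custom prompt, and the LLM into one QA chain."""
    # Imported lazily so the template can be inspected without LangChain installed.
    from langchain.chains import RetrievalQA
    from langchain_core.prompts import PromptTemplate

    custom_qa_prompt = PromptTemplate(
        template=CUSTOM_QA_TEMPLATE,
        input_variables=["context", "question"],
    )
    retriever = vector_store.as_retriever(search_kwargs={"k": TOP_K})
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # concatenates all retrieved chunks into {context}
        retriever=retriever,
        chain_type_kwargs={"prompt": custom_qa_prompt},
    )
```

A question would then be answered with `build_qa_chain(llm, vector_store).invoke({"query": question})["result"]`.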

Step 5: Output Display

The app displays the answer. If relevant context is available, the response includes information extracted from the PDF; otherwise, a brief direct response is generated.
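
The branching described above can be illustrated with a small helper. This is only a sketch: the app may implement the fallback inside the prompt itself rather than in application code, and the function name build_llm_input and the prompt wording are hypothetical.

```python
def build_llm_input(question: str, context_chunks: list[str]) -> str:
    """Return a context-grounded prompt when chunks exist, else a direct prompt."""
    # Drop empty or whitespace-only chunks before deciding which branch to take.
    context = "\n\n".join(c.strip() for c in context_chunks if c.strip())
    if context:
        return (
            "Answer using the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )
    return f"Answer briefly and directly.\n\nQuestion: {question}\nAnswer:"
```

The resulting answer would then be shown in the page with a call such as `st.write(answer)`.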