Friday, May 22, 2026

AI Retrieval System RAG_Implementation Project

 

RAG (Retrieval-Augmented Generation) Implementation Using Google Gemini & FAISS

RAG (Retrieval-Augmented Generation) is one of the most important concepts in modern AI applications. It combines:

  • Retrieval systems (searching relevant information)

  • Large Language Models (LLMs) (generating intelligent responses)

Instead of depending only on the LLM’s training knowledge, RAG allows the AI to search custom documents and answer questions from them.

This project uses:

  • LangChain

  • Google Gemini API

  • HuggingFace Embeddings

  • FAISS Vector Database

to create a simple RAG chatbot inside Google Colab.


What This Project Does

The workflow is:

User Question
↓
Search Relevant Text Chunks
↓
Send Context + Question to Gemini
↓
Generate Accurate Answer

Example:

You provide Python tutorial text.

User asks:

What is Python used for?

The system:

  1. Finds relevant chunks related to Python usage

  2. Sends them to Gemini

  3. Returns Gemini's answer, which is based only on the retrieved context


Step-by-Step Explanation


1. Installing Required Libraries

!pip install langchain langchain-google-genai langchain-community langchain-text-splitters faiss-cpu sentence-transformers

Explanation

This command installs all required Python libraries.

Libraries Used

Library | Purpose
langchain | Framework for building LLM apps
langchain-google-genai | Connects Gemini AI with LangChain
faiss-cpu | Vector database for similarity search
langchain-text-splitters | Splits large text into chunks
langchain-community | Community integrations
sentence-transformers | Embedding models

2. Importing Required Modules

import os
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import PromptTemplate

Explanation

These imports provide the core functionality used in the rest of the notebook.

Modules

Module | Purpose
os | Access environment variables
ChatGoogleGenerativeAI | Gemini LLM integration
HuggingFaceEmbeddings | Converts text into vectors
FAISS | Stores vectors for searching
RecursiveCharacterTextSplitter | Splits large text
PromptTemplate | Creates custom prompts

3. Setting Gemini API Key

os.environ["GEMINI_API_KEY"] = "your_api_key"

Explanation

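This sets the Gemini API key as an environment variable, so the key does not have to be passed to every call.

Gemini requires authentication to access Google's AI models. The langchain-google-genai package documents GOOGLE_API_KEY as the variable it reads, so if your installed version does not pick up GEMINI_API_KEY, set GOOGLE_API_KEY instead.

Replace: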

"your_api_key"

with your actual API key.
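Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming you are fine with typing the key once per session: read it from the environment and fall back to a hidden prompt. The helper name `load_api_key` is hypothetical and not part of the original project or any library.

```python
import os
from getpass import getpass


def load_api_key(var_name="GEMINI_API_KEY"):
    """Return the key stored in `var_name`, prompting for it only if missing.

    `load_api_key` is an illustrative helper, not part of any library.
    """
    key = os.environ.get(var_name)
    if not key:
        # getpass hides the typed value so it never appears in cell output.
        key = getpass(f"Enter {var_name}: ")
        os.environ[var_name] = key
    return key
```

Calling `load_api_key()` once at the top of the notebook keeps the key out of the saved cells.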


4. Input Text

This project uses text data containing Python tutorial content, stored in a string variable named text.

This acts as the knowledge base for the RAG system.
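The walkthrough does not show the `text` variable itself. A placeholder along these lines would work; the wording below is invented for illustration and is not the tutorial content used in the original notebook.

```python
# Hypothetical sample knowledge base; replace with your own tutorial text.
text = """
Python is a high-level, general-purpose programming language.
It is used for web development, data analysis, automation, and AI.
Python has a large standard library and many third-party packages.
"""

# Quick sanity check before chunking: how much text is there to split?
print(len(text), "characters")
```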


5. Splitting Text into Chunks

splitter = RecursiveCharacterTextSplitter(
    chunk_size = 200,
    chunk_overlap = 50
)

Explanation

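Retrieval works best on short passages. A long document embedded as a single vector blurs many topics together, and sending the whole text to the LLM wastes context space.

So we split the text into smaller chunks.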

Parameters

Parameter | Meaning
chunk_size=200 | Each chunk contains at most 200 characters
chunk_overlap=50 | Up to 50 characters are shared between neighbouring chunks

Why Overlap?

Overlap preserves context continuity.

Without overlap:

  • sentences may break awkwardly

  • information may be lost
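To see what the two parameters do, here is a simplified sketch that cuts fixed-size windows with a shared overlap. It is only an approximation: the real RecursiveCharacterTextSplitter prefers to break on separators such as paragraphs and spaces, so its chunks will differ. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=200, chunk_overlap=50):
    """Split `text` into fixed-size windows that share `chunk_overlap` characters.

    Simplified stand-in: the real RecursiveCharacterTextSplitter prefers to
    break on separators such as paragraphs and spaces, so its chunks differ.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

With a size of 10 and an overlap of 4, each chunk starts 6 characters after the previous one, so the last 4 characters of one chunk reappear at the start of the next.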


6. Creating Documents

docs = splitter.create_documents([text])

Explanation

This splits the text and wraps each chunk in a LangChain Document object.

Each document contains:

  • page content

  • metadata

Example:

Chunk 1 → Python is a programming language...
Chunk 2 → Used in AI, automation...
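The shape of such a document can be sketched with a small dataclass. `SimpleDocument` and `make_documents` are hypothetical stand-ins, not LangChain classes. Note that `create_documents` leaves metadata empty unless you pass a `metadatas` list, so the source and chunk number below are an assumption about what you might want to record.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDocument:
    """Hypothetical stand-in mirroring the two fields described above."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def make_documents(chunks, source="python_tutorial"):
    # Attach the chunk position so a retrieved chunk can be traced back.
    return [
        SimpleDocument(page_content=chunk, metadata={"source": source, "chunk": i})
        for i, chunk in enumerate(chunks)
    ]


docs_demo = make_documents(
    ["Python is a programming language...", "Used in AI, automation..."]
)
print(docs_demo[1].page_content, docs_demo[1].metadata)
```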

7. Creating Embeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

Explanation

Embeddings convert text into numerical vectors.

Models compare numbers, not raw text.

So:

"Python programming"

becomes:

[0.234, 0.872, -0.192, ...]

These vectors help measure semantic similarity.

Model Used

all-MiniLM-L6-v2

A lightweight and fast sentence transformer model that produces 384-dimensional vectors.
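"Semantic similarity" is usually measured as the cosine of the angle between two vectors. The sketch below uses made-up 3-dimensional vectors so the arithmetic stays readable; the numbers are assumptions for illustration, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors; real embeddings from this model have far more dimensions.
python_programming = [0.9, 0.8, 0.1]
coding_in_python = [0.8, 0.9, 0.2]
cooking_recipes = [0.1, 0.2, 0.9]

print(cosine_similarity(python_programming, coding_in_python))
print(cosine_similarity(python_programming, cooking_recipes))
```

The first pair points in almost the same direction and scores close to 1, while the unrelated pair scores much lower.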


8. Creating FAISS Vector Store

vectorstore = FAISS.from_documents(docs, embeddings)

Explanation

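FAISS (Facebook AI Similarity Search) is a library that stores vector embeddings in an in-memory index built for fast similarity search.

It acts like a semantic search engine over your chunks.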

Now the system can:

  • search similar text

  • retrieve relevant chunks quickly
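Conceptually, the simplest kind of index compares the query vector against every stored vector and keeps the closest ones. The toy class below sketches that idea in plain Python using squared Euclidean distance. `TinyFlatIndex` is a hypothetical name, and real FAISS does the same work in optimized native code with many more index types.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinyFlatIndex:
    """Hypothetical toy index: stores vectors and scans all of them per query."""

    def __init__(self):
        self.vectors = []
        self.texts = []

    def add(self, vector, text):
        self.vectors.append(vector)
        self.texts.append(text)

    def search(self, query_vector, k=1):
        # Score every stored vector, then keep the k smallest distances.
        scored = sorted(
            (squared_l2(query_vector, vec), text)
            for vec, text in zip(self.vectors, self.texts)
        )
        return scored[:k]


index = TinyFlatIndex()
index.add([0.9, 0.8, 0.1], "Python is a programming language")
index.add([0.1, 0.2, 0.9], "Pasta should be cooked in salted water")
print(index.search([0.8, 0.9, 0.2], k=1))
```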


9. Creating Retriever

retriver = vectorstore.as_retriever(
    search_kwargs = {"k":3}
)

Explanation

The retriever searches the vector store for the chunks closest to the query.

Parameter

"k":3

means:

  • return the 3 most relevant chunks

when the user asks a question.
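The effect of k can be shown with a handful of hypothetical similarity scores. The chunk texts and scores below are invented for illustration, and `top_k` is not a LangChain function.

```python
import heapq

# Hypothetical similarity scores for five chunks (higher means more relevant).
chunk_scores = {
    "Python is used for web development": 0.91,
    "Python is used in AI and automation": 0.88,
    "Python was created by Guido van Rossum": 0.52,
    "Install Python from python.org": 0.47,
    "Lists are mutable sequences": 0.31,
}


def top_k(scores, k):
    """Return the k highest-scoring chunk texts, best first."""
    return heapq.nlargest(k, scores, key=scores.get)


print(top_k(chunk_scores, 3))
```

A larger k gives Gemini more context to work with but also lengthens the prompt and may pull in weakly related chunks, so 3 is a reasonable starting point for a small knowledge base.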


10. Formatting Retrieved Documents

def format_docs(docs):
    return "\n\n".join(
        d.page_content for d in docs
    )

Explanation

This function combines retrieved chunks into one formatted context string.

Example:

Chunk 1

Chunk 2

Chunk 3
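The function can be tried without a vector store. In this sketch, `SimpleNamespace` objects stand in for LangChain documents, which is an assumption made only so the example runs on its own.

```python
from types import SimpleNamespace


def format_docs(docs):
    # Same logic as above: join each chunk's text with a blank line between.
    return "\n\n".join(d.page_content for d in docs)


# SimpleNamespace objects stand in for LangChain documents in this sketch.
fake_docs = [
    SimpleNamespace(page_content="Chunk 1"),
    SimpleNamespace(page_content="Chunk 2"),
    SimpleNamespace(page_content="Chunk 3"),
]
print(format_docs(fake_docs))
```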

11. Creating Prompt Template

prompt = PromptTemplate(
    input_variables = ["context","question"],
    template = """
Use only the context below to answer the question.

context:
{context}

question: {question}

Answer:
"""
)

Explanation

This controls how instructions are sent to Gemini.

Important Part

Use only the context below

This reduces hallucination.

It instructs Gemini to answer only from the retrieved text. The model usually follows this instruction, but it is not a hard guarantee.
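To see what Gemini actually receives, the template can be rendered with plain `str.format`. This sketch uses an ordinary string instead of PromptTemplate so it runs without LangChain. The extra "say you don't know" line is an assumption added here, not part of the original prompt; it gives the model an explicit way out when the context has no answer.

```python
# Plain-string version of the template, used only to show the rendered prompt.
# The second instruction line is an illustrative addition, not from the original.
TEMPLATE = (
    "Use only the context below to answer the question.\n"
    "If the answer is not in the context, say you don't know.\n\n"
    "context:\n{context}\n\n"
    "question: {question}\n\n"
    "Answer:"
)

rendered = TEMPLATE.format(
    context="Python is used for web development and automation.",
    question="What is Python used for?",
)
print(rendered)
```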


12. Initializing Gemini Model

llm = ChatGoogleGenerativeAI(
    model = "gemini-2.5-flash"
)

Explanation

This initializes Google's Gemini Flash model.

Why Gemini Flash?

  • fast

  • lightweight

  • cheaper

  • good for RAG systems


13. Creating Main RAG Function

def rag_answer(query):
    r_docs = retriver.invoke(query)
    context = format_docs(r_docs)
    full_prompt = prompt.format(
        context=context,
        question=query
    )
    response = llm.invoke(full_prompt)
    return response.content

Step-by-Step Flow

Step 1

retriver.invoke(query)

Searches for relevant chunks.


Step 2

format_docs(r_docs)

Formats retrieved context.


Step 3

prompt.format()

Creates final prompt.


Step 4

llm.invoke()

Sends prompt to Gemini.


Step 5

Returns the generated answer (response.content).
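The same five steps can be exercised without an API key by passing in stand-ins for the retriever and the model. Everything named `Fake...` below, and the function `rag_answer_with`, is hypothetical; the sketch only assumes that both objects expose an `invoke` method, as the real ones do in the code above.

```python
from types import SimpleNamespace


class FakeRetriever:
    """Hypothetical stand-in: always returns the same two chunks."""

    def invoke(self, query):
        return [
            SimpleNamespace(page_content="Python is used for web development."),
            SimpleNamespace(page_content="Python is used for automation."),
        ]


class FakeLLM:
    """Hypothetical stand-in: records the prompt instead of calling Gemini."""

    def invoke(self, prompt):
        self.last_prompt = prompt
        return SimpleNamespace(
            content=f"(fake answer based on {len(prompt)} prompt characters)"
        )


def rag_answer_with(retriever, llm, query):
    # Same five steps as above, with the retriever and model passed in.
    r_docs = retriever.invoke(query)
    context = "\n\n".join(d.page_content for d in r_docs)
    full_prompt = (
        f"Use only the context below.\ncontext: {context}\n"
        f"question: {query}\nAnswer:"
    )
    response = llm.invoke(full_prompt)
    return response.content


fake_llm = FakeLLM()
print(rag_answer_with(FakeRetriever(), fake_llm, "What is Python used for?"))
```

Passing the dependencies as arguments makes it easy to check that the retrieved chunks and the question really end up in the prompt before spending any API quota.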


14. Asking Question

query = "What is python used for ?"
answer = rag_answer(query)

Explanation

The user asks a question.

The RAG system processes it and generates a response.


15. Printing Output

print("question :", query)
print("\n")
print("answer", answer)

Explanation

Displays:

  • question

  • generated answer


16. Final Output

Question:
What is python used for?

Answer:
Python is used for development, and its libraries help with a wide range of tasks, making development easier.

How RAG Works Internally

User Query
↓
Embedding Generated
↓
FAISS Similarity Search
↓
Relevant Chunks Retrieved
↓
Context Added to Prompt
↓
Gemini Generates Answer

Advantages of RAG

Advantage | Description
Reduces hallucination | Uses actual context
Supports private data | Can use custom documents
Real-time knowledge | New docs can be added
Faster than fine-tuning | No retraining needed
Scalable | Works with huge datasets

Real-World Applications

RAG is used in:

  • ChatGPT-style chatbots

  • AI search engines

  • company knowledge assistants

  • PDF question-answering systems

  • customer support bots

  • legal document assistants

  • medical AI systems


Common Improvements

You can improve this project by adding:

Upgrade | Benefit
PDF Upload | Read PDFs
Streamlit UI | Web interface
Chat History | Conversation memory
Better Embeddings | More accurate search
Pinecone/ChromaDB | Managed or persistent vector DB
Multi-document support | Multiple files
OCR | Read text from images

Final Understanding

This project demonstrates:

  • Generative AI

  • Semantic Search

  • Vector Databases

  • LLM Prompt Engineering

  • Embeddings

  • Retrieval Systems

which are core technologies behind modern AI systems like:

  • ChatGPT

  • Perplexity AI

  • GitHub Copilot

  • AI Search Engines

It makes a strong GenAI project for a portfolio and for interview discussions.

Wednesday, November 19, 2025

Understanding the Agentic Capabilities of Gemini 3 Pro

If you thought the jump from Gemini 1.5 to 2.5 was significant, Google’s latest release might just redefine your expectations entirely. Officially released on November 18, 2025, Gemini 3 Pro isn't just a "smarter chatbot"; it is an active partner designed to stop talking and start doing.

From coding interfaces on the fly to cutting out the excessive polite waffle we’ve grown used to, here is everything you need to know about Google’s most aggressive AI update yet.

Saturday, November 15, 2025

How to Use Your Android Mobile From Ubuntu Wirelessly Using Scrcpy

 Today, controlling your entire mobile phone directly from your Linux desktop is easier than ever. Whether you want to reply to messages faster, manage apps, or use your phone’s screen inside Ubuntu, scrcpy is one of the fastest and most reliable tools to do it.

Monday, November 3, 2025

How to Add ChatGPT, Microsoft Copilot & Perplexity AI to WhatsApp

Artificial Intelligence tools like ChatGPT, Microsoft Copilot, and Perplexity AI have become extremely popular, helping users with instant answers, writing assistance, and productivity tasks. The good news is that you can now use these AI chat assistants directly on WhatsApp in India by simply saving their official numbers and starting a chat. Below is a simple guide to add them and use them safely.

ChatGPT Go Is Now Free for One Year in India

 OpenAI has announced a special offer for Indian users: its popular ChatGPT Go plan will be available free for one year, giving everyone access to premium AI tools without paying a single rupee.

This initiative makes India the first country to receive a full-year complimentary upgrade, signaling OpenAI’s growing commitment to its fast-expanding user base in the region.

Tuesday, October 28, 2025

How to Convert VLC generated video screenshots into a Single PDF on Ubuntu

 Have you ever captured multiple screenshots from a video — maybe a tutorial, lecture, or online course — and wanted to combine them into one organized PDF?

In this guide, you’ll learn how to easily convert your VLC Media Player screenshots into a single, well-ordered PDF file using a simple command-line tool on Ubuntu.

This method is fast, reliable, and keeps your image quality intact — perfect for study notes, presentations, or archiving your learning material.

How to convert images to pdf using img2pdf in ubuntu operating system

