Thursday, June 11, 2026

Ship It: From Code to Kubernetes in One Git Push

 

In modern software development, writing code is only one part of the journey. The real challenge begins when you need to test it, package it, deploy it, monitor it, and keep it reliable under real usage. That is the problem a DevOps workflow is designed to solve. This project demonstrates the complete process in a practical, production-style way, starting from a developer’s first commit and ending with a live Kubernetes deployment that is monitored in real time. The entire setup runs locally on a laptop, uses free tools, and incurs no cloud cost.

What This Project Is About

“Ship It” is a hands-on DevOps project built to show how a simple application can move through a real delivery pipeline. The application itself is a small Flask-based task tracker, but the surrounding architecture makes it feel like a real production system. It includes automated testing, Docker image creation, Kubernetes deployment, infrastructure provisioning, and observability through Prometheus and Grafana. In short, it is a complete demonstration of how modern engineering teams move code safely from development to production.

Why This Project Matters

Many beginners can write an application, but very few understand the full process of delivering it reliably. This project solves common problems that developers face in real life. It removes the “works on my machine” issue by containerizing the app. It replaces manual deployment with a GitHub Actions pipeline. It provides visibility through monitoring dashboards. It also prevents infrastructure drift by defining everything as code. Even scaling is handled automatically with Kubernetes HPA. These are the same kinds of practices used in strong engineering environments.

Technology Stack Used

  • Python Flask for the application
  • Docker for containerization
  • Kubernetes using Kind for local cluster deployment
  • OpenTofu/Terraform for infrastructure as code
  • GitHub Actions for CI/CD automation
  • Prometheus for metrics collection
  • Grafana for visual dashboards
  • Helm for installing the monitoring stack

Project Architecture

The workflow begins when the developer makes a change and pushes the code to GitHub. GitHub Actions then runs automated tests first. If the tests pass, it builds the Docker image, pushes it to the container registry, and deploys it to Kubernetes. Once the application is running, Prometheus collects metrics and Grafana displays them on live dashboards. This creates a complete loop from code commit to deployment to monitoring.

The Flask Application

At the heart of the project is a simple Flask app. The app works as a task tracker and includes basic routes for viewing the app, listing tasks, creating tasks, deleting tasks, checking health, and exposing metrics. What makes it powerful is not the size of the app, but the way it has been prepared for production-style deployment. It includes Prometheus metrics from the beginning, which means it can be monitored properly once deployed.

The app also includes a test suite with seven pytest tests, showing that reliability starts early in the development cycle. Before anything is deployed, the pipeline verifies that the application behaves correctly. This is a very important DevOps habit because it prevents faulty code from reaching the cluster.

Docker: Making the App Portable

Docker is used to package the app into a consistent image so that it behaves the same way across different environments. The Dockerfile is designed with layer caching in mind, which improves rebuild speed. The base image is kept lightweight, requirements are installed first, and application code is copied afterward, so a code change does not force the dependencies to be reinstalled. The image also defines a Docker health check against the /healthz endpoint. Kubernetes ignores a Dockerfile HEALTHCHECK and relies on probes declared in the pod spec, so this check mainly serves local Docker and Docker Compose runs.

OpenTofu/Terraform: Infrastructure as Code

Instead of creating resources manually, the project defines Kubernetes infrastructure as code. OpenTofu provisions the namespace, resource quota, and limit range. This is a major DevOps advantage because the setup becomes repeatable, trackable, and version controlled. If the infrastructure ever needs to be recreated, it can be done consistently with a few commands.

Kubernetes Deployment

The app is deployed on a local Kubernetes cluster created with Kind. The deployment uses rolling updates, which means new pods come up before old ones are removed. That helps avoid downtime during updates. The service exposes the app internally, and the Horizontal Pod Autoscaler adds or removes replicas depending on CPU usage. This makes the system feel much closer to a real production environment.

Monitoring with Prometheus and Grafana

A strong DevOps pipeline is not complete without observability. This project includes Prometheus to collect time-series metrics and Grafana to show them visually. The dashboards can track request rate, latency, pod count, and error rate. That means you can actually see what is happening inside the application instead of guessing. This is one of the most valuable parts of the entire project because monitoring turns a basic deployment into a professional one.

CI/CD with GitHub Actions

One of the most important features of this project is its GitHub Actions pipeline. Every push to the main branch triggers a sequence of steps: testing, building, pushing the image, and deploying the app. The pipeline is designed so that each step depends on the previous one succeeding. If the tests fail, the deployment never starts. That makes the workflow much safer and much more realistic.

This project also shows a very useful DevOps lesson: good automation prevents bad releases. If a test fails, the broken version is blocked. If everything passes, the new version goes live. This is exactly how mature delivery systems protect production from errors.

What Makes This Project Special

What stands out most is that the entire system is built to mirror a professional environment while remaining accessible to learners. It uses free, local tools and still demonstrates key industry concepts such as CI/CD, containerization, orchestration, metrics, rolling deployment, autoscaling, and infrastructure as code. The project is not only a coding exercise, but a full DevOps learning experience.

Key Learning Outcomes

  • How code moves from a Git push to a live Kubernetes deployment
  • How automated tests protect the pipeline from bad changes
  • How Docker ensures consistent runtime behavior
  • How Kubernetes supports rolling updates and autoscaling
  • How Prometheus and Grafana provide real-time observability
  • How infrastructure as code improves repeatability and control

Why Beginners Should Build Projects Like This

Projects like this are extremely useful for students and early-career developers because they show how modern software systems actually work. Many people know how to write code, but fewer know how to deploy, monitor, and maintain it. By building something like this, you gain practical experience with tools that are used every day in real engineering teams. It also makes your portfolio far more impressive because it proves that you can think beyond the code editor. 

Final Thoughts

“Ship It: From Code to Kubernetes in One Git Push” is a strong DevOps project because it connects development, automation, deployment, and monitoring in one clear workflow. It is simple enough to understand, but complete enough to teach real engineering concepts. If you are learning DevOps, cloud-native deployment, or modern CI/CD practices, this is the type of project that can help you build both confidence and credibility.

In a portfolio or blog, this kind of project stands out because it shows hands-on understanding, not just theory. It demonstrates that you can build applications, package them properly, deploy them safely, and observe them in real time. That is what modern software delivery is all about.


📖 Overview

Ship It is a hands-on DevOps project that demonstrates a complete, production-grade software delivery pipeline from a developer's first commit to a live, monitored Kubernetes deployment — in under 60 seconds, on your laptop, for $0.

The tools and patterns used here (containers, CI/CD, infrastructure as code, autoscaling, metrics) are the same kinds that large engineering teams rely on in production. The only difference is that instead of running in the cloud, everything runs locally using Kind (Kubernetes in Docker).

What happens when you run git push?

git push origin main
      │
      ▼
┌─────────────────────────────────────────┐
│         GitHub Actions Pipeline          │
│                                          │
│  ① pytest        (~10s)  ✅ or ❌ STOP  │
│  ② docker build  (~15s)                 │
│  ③ push to GHCR  (~10s)                 │
│  ④ kubectl apply  (~4s)                 │
└──────────────────┬──────────────────────┘
                   │
                   ▼
      ┌────────────────────────┐
      │   Kubernetes (Kind)    │
      │                        │
      │  Rolling update        │
      │  Zero downtime         │
      │  HPA auto-scaling      │
      └───────────┬────────────┘
                  │
                  ▼
      ┌────────────────────────┐
      │  Prometheus + Grafana  │
      │                        │
      │  Live metrics          │
      │  Request rate          │
      │  p95 latency           │
      └────────────────────────┘

Total: ~49 seconds end-to-end

Why build this?

Problem                            Ship It Solution
"It works on my machine"           Docker image runs identically everywhere
Manual deployments break things    GitHub Actions pipeline with automated tests
No visibility into production      Prometheus + Grafana live dashboard
Infrastructure drift               OpenTofu IaC, with everything in Git
Traffic spikes cause outages       HPA auto-scales pods automatically

🏗 Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Developer Workflow                        │
│                                                                  │
│   Edit Code  →  Test Locally  →  git commit  →  git push        │
└─────────────────────────────┬───────────────────────────────────┘
                              │ webhook trigger
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    GitHub Actions (Free Tier)                    │
│                                                                  │
│  ┌──────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │  pytest  │───▶│ docker build │───▶│    kubectl apply     │   │
│  │  7 tests │    │  + push GHCR │    │  (rolling update)    │   │
│  └──────────┘    └──────────────┘    └──────────────────────┘   │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│              Kubernetes Cluster (Kind — 3 nodes)                 │
│                                                                  │
│  Namespace: ship-it  (managed by OpenTofu)                       │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                    Deployment                            │    │
│  │                                                          │    │
│  │   ┌──────────────┐      ┌──────────────┐                │    │
│  │   │     Pod 1    │      │     Pod 2    │  ← HPA: 2-5    │    │
│  │   │  Flask App   │      │  Flask App   │    replicas    │    │
│  │   │  Port 5000   │      │  Port 5000   │                │    │
│  │   └──────┬───────┘      └──────┬───────┘                │    │
│  └──────────┼────────────────────┼────────────────────────┘    │
│             │                    │                               │
│  ┌──────────▼────────────────────▼────────────────────────┐    │
│  │              Service (ClusterIP: 80 → 5000)          │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│  ┌──────────────────────┐  ┌──────────────────────────────┐     │
│  │   ResourceQuota      │  │        LimitRange             │     │
│  │   CPU:  500m max     │  │   Default: 100m CPU / 64Mi    │     │
│  │   RAM:  256Mi max    │  │   Limit:   200m CPU / 128Mi   │     │
│  │   Pods: 10 max       │  │                               │     │
│  └──────────────────────┘  └──────────────────────────────┘     │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼ /metrics scrape every 5s
┌─────────────────────────────────────────────────────────────────┐
│                   Observability Stack                            │
│                                                                  │
│   Prometheus ──────────────────────────▶ Grafana Dashboard      │
│   (time series DB)                       (live visualization)   │
│                                                                  │
│   Metrics: app_requests_total, app_request_latency_seconds       │
│   Panels:  Request rate · p95 Latency · Pod count               │
└─────────────────────────────────────────────────────────────────┘

📋 Prerequisites

Everything is free and local — no cloud account required.

Required Tools

Tool             Version   Purpose                  Install
Docker Desktop   Latest    Container runtime        docker.com
kubectl          v1.36+    Kubernetes CLI           Pre-installed with Docker Desktop
Kind             v0.23+    Local K8s cluster        kind.sigs.k8s.io
Helm             v3.21+    K8s package manager      helm.sh
OpenTofu         v1.12+    Infrastructure as Code   opentofu.org
Python           3.11+     Application runtime      python.org
Git              Latest    Version control          git-scm.com

WSL2 Setup (Windows users)

If you're on Windows, run everything inside WSL2 (Ubuntu recommended). Enable Docker Desktop's WSL2 integration:

  1. Open Docker Desktop → Settings → Resources → WSL Integration
  2. Toggle on your Ubuntu distro
  3. Click Apply & Restart

Verify everything is working:

docker run hello-world && kubectl version --client && kind version && helm version && tofu version

Verify your setup

echo "=== Docker ===" && docker --version
echo "=== kubectl ===" && kubectl version --client
echo "=== Kind ===" && kind version
echo "=== Helm ===" && helm version
echo "=== OpenTofu ===" && tofu version
echo "=== Python ===" && python3 --version
echo "=== Git ===" && git --version

⚡ Quick Start

Clone and run the entire stack in 5 commands:

# 1. Clone the repo
git clone https://github.com/manishsinghkuswaha/Ship-it-end-to-end.git
cd Ship-it-end-to-end

# 2. Create the Kind cluster
kind create cluster --config kind-config.yaml

# 3. Provision infrastructure with OpenTofu
cd terraform && tofu init && tofu apply -auto-approve && cd ..

# 4. Build and load the Docker image
docker build -t ship-it:local . && kind load docker-image ship-it:local --name ship-it

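# 5. Deploy everything (servicemonitor.yaml is rejected until the Phase 5 monitoring stack installs its CRDs; it is re-applied there)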
kubectl apply -f k8s/

Access the app:

kubectl port-forward svc/ship-it -n ship-it 5000:80
curl http://localhost:5000/

📁 Project Structure

ship-it/
│
├── app/                          # Flask application
│   ├── main.py                   # App + Prometheus metrics + routes
│   ├── requirements.txt          # Python dependencies
│   └── test_main.py              # 7 pytest tests
│
├── k8s/                          # Kubernetes manifests
│   ├── deployment.yaml           # 2 replicas, rolling update strategy
│   ├── service.yaml              # ClusterIP service (port 80 → 5000)
│   ├── ingress.yaml              # NGINX ingress controller
│   ├── hpa.yaml                  # HPA: 2-5 pods on CPU > 50%
│   └── servicemonitor.yaml       # Prometheus ServiceMonitor
│
├── terraform/                    # Infrastructure as Code
│   ├── main.tf                   # Namespace + ResourceQuota + LimitRange
│   └── variables.tf              # Configurable variables
│
├── .github/
│   └── workflows/
│       └── deploy.yml            # CI/CD: test → build → push → deploy
│
├── kind-config.yaml              # 3-node Kind cluster config
├── docker-compose.yml            # Local dev stack (app + prometheus + grafana)
├── prometheus.yml                # Prometheus scrape config
└── Dockerfile                    # Layer-cache optimised image

🔧 Phases

Phase 1 — The Flask Application

The app is intentionally simple: a task tracker with six endpoints, four for the tasks themselves plus a health check and a metrics endpoint. What makes it production-grade is the observability built in from the start.

Routes:

Method   Endpoint       Description
GET      /              App info (name, version, task count)
GET      /tasks         List all tasks
POST     /tasks         Create a task {"title": "..."}
DELETE   /tasks/<id>    Delete a task by ID
GET      /healthz       Health check endpoint
GET      /metrics       Prometheus metrics endpoint

Prometheus metrics exposed:

app_requests_total{method, endpoint, status}     # Counter
app_request_latency_seconds{endpoint}            # Histogram (p50, p95, p99)
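
To make the two metric types concrete, here is a minimal standard-library sketch of how a labelled counter and a histogram end up as text on /metrics. It is an illustration only: the real app presumably uses the prometheus_client library, the Counter and Histogram classes below are hypothetical stand-ins, the bucket boundaries are invented, and the endpoint label on the histogram is left out for brevity.

```python
from collections import defaultdict


class Counter:
    """Hypothetical stand-in for a labelled Prometheus counter."""

    def __init__(self, name, label_names):
        self.name = name
        self.label_names = label_names
        self.values = defaultdict(float)

    def inc(self, *label_values):
        self.values[label_values] += 1

    def render(self):
        lines = [f"# TYPE {self.name} counter"]
        for label_values, value in sorted(self.values.items()):
            labels = ",".join(
                f'{k}="{v}"' for k, v in zip(self.label_names, label_values)
            )
            lines.append(f"{self.name}{{{labels}}} {value}")
        return lines


class Histogram:
    """Hypothetical stand-in for a Prometheus histogram (no labels)."""

    def __init__(self, name, buckets):
        self.name = name
        self.buckets = sorted(buckets)
        self.counts = [0] * (len(self.buckets) + 1)  # last slot is +Inf
        self.total = 0.0

    def observe(self, seconds):
        self.total += seconds
        for i, upper in enumerate(self.buckets):
            if seconds <= upper:
                self.counts[i] += 1
                return
        self.counts[-1] += 1

    def render(self):
        lines = [f"# TYPE {self.name} histogram"]
        cumulative = 0
        # Buckets are cumulative: each "le" line counts everything at or below it.
        for upper, count in zip(self.buckets + [float("inf")], self.counts):
            cumulative += count
            le = "+Inf" if upper == float("inf") else str(upper)
            lines.append(f'{self.name}_bucket{{le="{le}"}} {cumulative}')
        lines.append(f"{self.name}_sum {self.total}")
        lines.append(f"{self.name}_count {cumulative}")
        return lines


request_counter = Counter("app_requests_total", ["method", "endpoint", "status"])
latency = Histogram("app_request_latency_seconds", [0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 2.0):
    request_counter.inc("GET", "/tasks", "200")
    latency.observe(seconds)

output = "\n".join(request_counter.render() + latency.render())
print(output)
```

The histogram only stores bucket counts, a sum, and a count. Percentiles such as p95 are estimated later, at query time, from those buckets.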

Run tests locally:

cd app
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pytest test_main.py -v

Expected output:

test_main.py::test_index                    PASSED
test_main.py::test_create_task              PASSED
test_main.py::test_get_tasks                PASSED
test_main.py::test_delete_task              PASSED
test_main.py::test_delete_missing_task      PASSED
test_main.py::test_create_task_missing_title PASSED
test_main.py::test_healthz                  PASSED
7 passed in 0.35s

Phase 2 — Docker

The Dockerfile is optimised for layer caching — the single most impactful Docker performance technique.

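# Slim base image: roughly 50MB vs 900MB for the full image
# (Dockerfile comments must start the line, so this cannot trail the FROM instruction)
FROM python:3.11-slim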

WORKDIR /app

# ✅ Copy requirements FIRST — this layer gets cached
# If requirements.txt hasn't changed, pip install is skipped
COPY app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# ✅ Copy code AFTER — this layer only re-runs when code changes
COPY app/ .

EXPOSE 5000

# ✅ HTTP health check for Docker / docker compose (Kubernetes ignores HEALTHCHECK and uses probes from the pod spec)
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD python3 -c "import urllib.request; urllib.request.urlopen('http://localhost:5000/healthz')"

CMD ["python3", "main.py"]
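
Why the instruction order matters can be shown with a small model of the build cache. The sketch below is a simplification and not Docker's actual algorithm: it assumes each layer's cache key is a hash of the parent key, the instruction text, and the content of any copied files. The layer_keys helper and the file contents are invented for illustration.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per build step.

    A key depends on the parent key, the instruction, and the copied file
    contents, so a change invalidates that step and every step after it.
    """
    keys = []
    parent = ""
    for instruction, copied in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for path in copied:
            h.update(path.encode())
            h.update(files[path].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


STEPS = [
    ("FROM python:3.11-slim", []),
    ("COPY app/requirements.txt .", ["app/requirements.txt"]),
    ("RUN pip install --no-cache-dir -r requirements.txt", []),
    ("COPY app/ .", ["app/requirements.txt", "app/main.py"]),
]

before = {"app/requirements.txt": "flask\n", "app/main.py": "print('v1')\n"}
after = {**before, "app/main.py": "print('v2')\n"}  # only the code changed

old, new = layer_keys(STEPS, before), layer_keys(STEPS, after)
reused = [a == b for a, b in zip(old, new)]
print(reused)  # [True, True, True, False]
```

Only the final COPY layer is rebuilt after a code change. If the code were copied before the pip install step, the install layer would be invalidated on every edit.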

Build and test:

# Build (first time: ~60s, subsequent: ~3s due to layer cache)
docker build -t ship-it:local .

# Run locally
docker run -d -p 5000:5000 --name ship-it-test ship-it:local

# Test all endpoints
curl http://localhost:5000/
curl http://localhost:5000/tasks
curl -X POST http://localhost:5000/tasks -H "Content-Type: application/json" -d '{"title": "Deploy to K8s"}'
curl http://localhost:5000/metrics | head -20

# Cleanup
docker stop ship-it-test && docker rm ship-it-test

Run full local stack (app + Prometheus + Grafana):

docker compose up --build -d

Service      URL                      Credentials
Flask app    http://localhost:5000
Prometheus   http://localhost:9090
Grafana      http://localhost:3000    admin / admin

Phase 3 — Infrastructure as Code (OpenTofu)

All Kubernetes infrastructure is defined as code. Nothing is clicked through a UI.

What OpenTofu manages:

Resource                     Purpose
kubernetes_namespace         Isolated namespace for the app
kubernetes_resource_quota    Hard limits: 500m CPU, 256Mi RAM, 10 pods max
kubernetes_limit_range       Default container limits if not specified

Create the Kind cluster:

kind create cluster --config kind-config.yaml

The kind-config.yaml creates a 3-node cluster (1 control-plane + 2 workers) with port mappings for ingress:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: ship-it
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 80
        hostPort: 8080
  - role: worker
  - role: worker

Provision with OpenTofu:

cd terraform

# Download Kubernetes provider
tofu init

# Preview what will be created — no changes yet
tofu plan

# Apply — creates namespace, quota, limitrange
tofu apply -auto-approve

Expected output:

Plan: 3 to add, 0 to change, 0 to destroy.
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

Verify:

kubectl get namespace ship-it
kubectl get resourcequota -n ship-it
kubectl get limitrange -n ship-it

Phase 4 — Kubernetes

Load the image into Kind (Kind can't pull from local Docker daemon by default):

docker build -t ship-it:local .
kind load docker-image ship-it:local --name ship-it

Apply all manifests:

kubectl apply -f k8s/

Watch pods come up:

kubectl get pods -n ship-it -w

Verify everything is running:

kubectl get all -n ship-it

Expected output:

NAME                           READY   STATUS    RESTARTS   AGE
pod/ship-it-5857b58f7-bkb7z    1/1     Running   0          40s
pod/ship-it-5857b58f7-lfqqn    1/1     Running   0          40s

NAME              TYPE        CLUSTER-IP       PORT(S)   AGE
service/ship-it   ClusterIP   10.96.146.238    80/TCP    40s

NAME                      READY   UP-TO-DATE   AVAILABLE
deployment.apps/ship-it   2/2     2            2

NAME                                          MINPODS   MAXPODS   REPLICAS
horizontalpodautoscaler.autoscaling/ship-it   2         5         2

Access the app:

# Port-forward (works on all platforms including WSL2)
kubectl port-forward svc/ship-it -n ship-it 5000:80

# In a new terminal
curl http://localhost:5000/
curl http://localhost:5000/tasks

Trigger a rolling update:

# Watch pods in one terminal
kubectl get pods -n ship-it -w

# Trigger rollout in another terminal
kubectl rollout restart deployment/ship-it -n ship-it

# Watch new pods come up before old ones terminate
# maxUnavailable: 0 guarantees zero downtime

Key manifest details:

k8s/deployment.yaml — rolling update strategy:

strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # One extra pod during update
    maxUnavailable: 0    # Never reduce below desired replica count
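
The effect of these two settings can be traced with a small simulation. This is a simplified sketch, not the Deployment controller's real logic: it assumes new pods become ready instantly, and the rolling_update function is a hypothetical helper written for this illustration.

```python
def rolling_update(desired, max_surge, max_unavailable):
    """Return the (old_ready, new_ready) pod counts after each rollout step."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = desired, 0
    min_available = desired - max_unavailable  # floor on ready pods
    max_total = desired + max_surge            # ceiling on total pods
    history = [(old, new)]
    while old > 0 or new < desired:
        # Surge: start new pods while there is room under the ceiling.
        new += min(max_total - (old + new), desired - new)
        # Drain: remove old pods without dropping below the floor.
        old -= min(old, (old + new) - min_available)
        history.append((old, new))
    return history


steps = rolling_update(desired=2, max_surge=1, max_unavailable=0)
print(steps)  # [(2, 0), (1, 1), (0, 2)]
print(min(old + new for old, new in steps))  # 2
```

With maxSurge: 1 and maxUnavailable: 0, the ready pod count never falls below the desired 2. In a real cluster this guarantee also depends on readiness probes, since a new pod only counts once it reports ready.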

k8s/hpa.yaml — auto-scaling:

spec:
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50    # Scale out when average CPU exceeds 50% of the pods' CPU requests
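
The scaling decision itself is simple arithmetic. The sketch below follows the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the default 10% tolerance. It deliberately ignores stabilization windows and pods that are not ready yet, and the desired_replicas function name is an assumption made for this example.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    """Estimate the replica count an HPA would ask for (simplified)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas / maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, wanted))


# Values from hpa.yaml: target 50%, between 2 and 5 replicas.
print(desired_replicas(2, 90, 50, 2, 5))   # 4
print(desired_replicas(2, 200, 50, 2, 5))  # 5 (capped at maxReplicas)
print(desired_replicas(4, 10, 50, 2, 5))   # 2 (floored at minReplicas)
print(desired_replicas(2, 52, 50, 2, 5))   # 2 (inside the tolerance band)
```

Utilization here is measured relative to each pod's CPU request, so the HPA can only compute it when the containers declare requests and metrics-server is running.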

Phase 5 — Prometheus + Grafana

Install the full observability stack via Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword=admin \
  --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false

Apply the ServiceMonitor (tells Prometheus to scrape our app):

kubectl apply -f k8s/servicemonitor.yaml

Access Grafana:

kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80

Open http://localhost:3000 — login with admin / admin

Key PromQL queries for dashboards:

# Request rate per endpoint (last 1 minute)
sum(rate(app_requests_total[1m])) by (endpoint)

# p95 latency per endpoint
histogram_quantile(0.95,
  sum(rate(app_request_latency_seconds_bucket[1m])) by (le, endpoint)
)

# Error rate (5xx responses)
sum(rate(app_requests_total{status=~"5.."}[1m]))

# Pod count in namespace
count(kube_pod_info{namespace="ship-it"})
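
The p95 query is an estimate, not an exact percentile. As a rough sketch of what histogram_quantile does with cumulative bucket counts, the function below finds the bucket that contains the requested rank and interpolates linearly inside it. It is a simplified reimplementation for illustration, and the bucket values in the example are invented.

```python
def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound,
    ending with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if upper_bound == float("inf"):
                # Nothing to interpolate towards: report the highest finite bound.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count


example = [(0.1, 50), (0.5, 90), (1.0, 100), (float("inf"), 100)]
print(histogram_quantile(0.95, example))  # about 0.75 seconds
```

Because of the interpolation, the result is only as precise as the bucket boundaries. Buckets chosen close to the latencies you care about give a more trustworthy p95.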

Generate load to see live metrics:

for i in {1..100}; do
  curl -s http://localhost:5000/ > /dev/null
  curl -s http://localhost:5000/tasks > /dev/null
  curl -s -X POST http://localhost:5000/tasks \
    -H "Content-Type: application/json" \
    -d "{\"title\": \"Task $i\"}" > /dev/null
  sleep 0.2
done

Phase 6 — GitHub Actions CI/CD

The pipeline lives in .github/workflows/deploy.yml and runs on every push to main.

Pipeline stages:

git push origin main
        │
        ▼
┌───────────────┐     ┌─────────────────┐     ┌───────────────────┐
│  1. Run Tests │────▶│ 2. Build & Push  │────▶│  3. Deploy        │
│               │     │                 │     │                   │
│  pytest -v    │     │  docker build   │     │  kubectl apply    │
│  7 tests      │     │  push to GHCR   │     │  rolling update   │
│  ~10s         │     │  SHA tag        │     │  ~4s              │
└───────────────┘     └─────────────────┘     └───────────────────┘
                                                      ↓
                                              ✅ New version live
                                                    ~49s total

Critical: Each job only runs if the previous one succeeded. If tests fail, Docker never builds. If Docker fails, Kubernetes never deploys.
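
The gating behaviour can be modelled in a few lines. This is a toy sketch of the idea, not GitHub Actions itself, where the same effect comes from declaring job dependencies with needs. The run_pipeline function and the lambda jobs are hypothetical.

```python
def run_pipeline(jobs):
    """Run jobs in order; after the first failure, skip everything else."""
    results = {}
    failed = False
    for name, job in jobs:
        if failed:
            results[name] = "skipped"
            continue
        ok = job()
        results[name] = "success" if ok else "failure"
        failed = not ok
    return results


# A failing test stage blocks build and deploy.
broken = run_pipeline([
    ("test", lambda: False),
    ("build", lambda: True),
    ("deploy", lambda: True),
])
print(broken)  # {'test': 'failure', 'build': 'skipped', 'deploy': 'skipped'}
```

Because the deploy stage never runs, the cluster keeps serving whatever image the last green pipeline deployed.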

The gatekeeper demo — trigger a pipeline failure:

# Add a failing test
cat >> app/test_main.py << 'EOF'

def test_intentional_failure():
    assert 1 == 2, "This blocks the deploy"
EOF

git add app/test_main.py
git commit -m "bug: introduce a failing test"
git push origin main

Watch the Actions tab — Build and Deploy never run. The cluster keeps serving the last good version.

Recovery:

# Remove the bad test
sed -i '/def test_intentional_failure/,+2d' app/test_main.py

git add app/test_main.py
git commit -m "fix: remove failing test"
git push origin main

All 3 jobs go green. Total recovery: ~49 seconds.


🎮 Demo Guide

Step-by-step commands for a live session:

Session startup (run 15 min before)

# 1. Verify cluster
kubectl get nodes

# 2. Verify pods
kubectl get pods -n ship-it

# 3. Port-forward app (Terminal 1 — keep running)
kubectl port-forward svc/ship-it -n ship-it 5000:80

# 4. Port-forward Grafana (Terminal 2 — keep running)
kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80

# 5. Verify app
curl http://localhost:5000/

# 6. Open browser tabs
#    - http://localhost:5000     (App)
#    - http://localhost:9090     (Prometheus, needs its own port-forward; none is started above)
#    - http://localhost:3000     (Grafana — admin/admin)
#    - GitHub Actions tab

Generate traffic for Grafana demo

for i in {1..50}; do
  curl -s http://localhost:5000/ > /dev/null
  curl -s http://localhost:5000/tasks > /dev/null
  curl -s -X POST http://localhost:5000/tasks \
    -H "Content-Type: application/json" \
    -d "{\"title\": \"Task $i\"}" > /dev/null
  sleep 0.2
done

Rolling update demo

# Watch pods in terminal
kubectl get pods -n ship-it -w

# Trigger update (in second terminal)
kubectl rollout restart deployment/ship-it -n ship-it

HPA scale demo

# Watch HPA
kubectl get hpa -n ship-it -w

# Generate heavy load (in second terminal)
for i in {1..200}; do curl -s http://localhost:5000/ > /dev/null; done

🔧 Troubleshooting

App not responding

# Restart port-forward
kubectl port-forward svc/ship-it -n ship-it 5000:80

# Check pod status
kubectl get pods -n ship-it
kubectl describe pod <pod-name> -n ship-it

Pods in CrashLoopBackOff

# Check logs
kubectl logs -n ship-it -l app=ship-it --previous

# Force restart
kubectl rollout restart deployment/ship-it -n ship-it

Image not found

# Rebuild and reload into Kind
docker build -t ship-it:local .
kind load docker-image ship-it:local --name ship-it
kubectl rollout restart deployment/ship-it -n ship-it

Grafana not loading

kubectl port-forward svc/monitoring-grafana -n monitoring 3000:80

# Get password if changed
kubectl get secret monitoring-grafana -n monitoring \
  -o jsonpath="{.data.admin-password}" | base64 --decode && echo

HPA showing <unknown> for CPU

# Install metrics-server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Patch for Kind's self-signed certs
kubectl patch deployment metrics-server -n kube-system \
  --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Full nuclear reset

kind delete cluster --name ship-it
kind create cluster --config kind-config.yaml
docker build -t ship-it:local . && kind load docker-image ship-it:local --name ship-it
cd terraform && tofu apply -auto-approve && cd ..
kubectl apply -f k8s/
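helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace --set grafana.adminPassword=admin \
  --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false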
kubectl apply -f k8s/servicemonitor.yaml

📊 Project Stats

Metric                    Value
Pipeline duration         ~49 seconds end-to-end
Test suite                7 tests in 0.35s
Docker image size         ~180MB
Kubernetes nodes          3 (1 control-plane + 2 workers)
Default pod replicas      2
Max pod replicas (HPA)    5
Cloud cost                $0
Tools used                8

🗺 What to Learn Next

Once you've mastered this pipeline, these are the natural next steps:

Topic                 Tool               Why
GitOps                ArgoCD             Declarative deployments from Git, the next evolution after GitHub Actions
Secret management     HashiCorp Vault    Don't put secrets in environment variables
Service mesh          Istio              mTLS, traffic splitting, canary deployments
Cloud Kubernetes      AWS EKS            Same patterns, production scale
Container security    Trivy              Scan images for CVEs in the pipeline
Log aggregation       Loki + Grafana     Broader observability: logs alongside the existing metrics

Friday, May 22, 2026

AI Retrieval System RAG_Implementation Project

 

RAG (Retrieval-Augmented Generation) Implementation Using Google Gemini & FAISS

RAG (Retrieval-Augmented Generation) is one of the most important concepts in modern AI applications. It combines:

  • Retrieval systems (searching relevant information)

  • Large Language Models (LLMs) (generating intelligent responses)

Instead of depending only on the LLM’s training knowledge, RAG allows the AI to search custom documents and answer questions from them.

This project uses:

  • LangChain

  • Google Gemini API

  • HuggingFace Embeddings

  • FAISS Vector Database

to create a simple RAG chatbot inside Google Colab.


What This Project Does

The workflow is:

User Question
      ↓
Search Relevant Text Chunks
      ↓
Send Context + Question to Gemini
      ↓
Generate Accurate Answer

Example:

You provide Python tutorial text.

User asks:

What is Python used for?

The system:

  1. Finds relevant chunks related to Python usage

  2. Sends them to Gemini

  3. Gemini answers using only retrieved context


Step-by-Step Explanation


1. Installing Required Libraries

!pip install langchain langchain-google-genai langchain-community langchain-text-splitters faiss-cpu sentence-transformers

Explanation

This command installs all required Python libraries.

Libraries Used

Library                     Purpose
langchain                   Framework for building LLM apps
langchain-google-genai      Connects Gemini AI with LangChain
faiss-cpu                   Vector database for similarity search
langchain-text-splitters    Splits large text into chunks
langchain-community         Community integrations
sentence-transformers       Embedding models

2. Importing Required Modules

import os

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import PromptTemplate

Explanation

These imports provide all core functionalities.

Modules

Module                            Purpose
os                                Access environment variables
ChatGoogleGenerativeAI            Gemini LLM integration
HuggingFaceEmbeddings             Converts text into vectors
FAISS                             Stores vectors for searching
RecursiveCharacterTextSplitter    Splits large text
PromptTemplate                    Creates custom prompts

3. Setting Gemini API Key

os.environ["GEMINI_API_KEY"] = "your_api_key"

Explanation

This sets the Gemini API key as an environment variable.

Gemini requires authentication to access Google's AI models.

You replace:

"your_api_key"

with your actual API key.


4. Input Text

The project uses text data containing Python tutorial content, stored in a variable named text.

This acts as the knowledge base for the RAG system.


5. Splitting Text into Chunks

splitter = RecursiveCharacterTextSplitter(
    chunk_size = 200,
    chunk_overlap = 50
)

Explanation

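Embedding models and LLMs accept only a limited amount of text at once, and retrieval works better on small, focused passages than on one huge document.

So we split the text into smaller chunks.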

Parameters

Parameter           Meaning
chunk_size=200      Each chunk contains at most 200 characters
chunk_overlap=50    Up to 50 characters are shared between neighbouring chunks

Why Overlap?

Overlap preserves context continuity.

Without overlap:

  • sentences may break awkwardly

  • information may be lost
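
To see what the two parameters do, here is a simplified sketch of fixed-size chunking with overlap in plain Python. It is not the algorithm RecursiveCharacterTextSplitter uses (the real splitter prefers to cut at separators such as paragraphs, lines, and spaces, so its chunks are often shorter than 200 characters). The function name chunk_text is hypothetical.

```python
def chunk_text(text, chunk_size=200, chunk_overlap=50):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# A 500-character sample made of repeating digits.
sample = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(sample)
```

With these settings each window starts 150 characters after the previous one, so the last 50 characters of one chunk are repeated at the start of the next.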


6. Creating Documents

docs = splitter.create_documents([text])

Explanation

This splits the text into chunks and wraps each chunk in a LangChain Document object.

Each document contains:

  • page content

  • metadata

Example:

Chunk 1 → Python is a programming language...
Chunk 2 → Used in AI, automation...

7. Creating Embeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

Explanation

Embeddings convert text into numerical vectors.

Models operate on numbers, not on raw text.

So:

"Python programming"

becomes:

[0.234, 0.872, -0.192, ...]

These vectors help measure semantic similarity.

Model Used

all-MiniLM-L6-v2

A lightweight and fast sentence transformer model.
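
Semantic similarity between two vectors is commonly measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors purely for illustration; real all-MiniLM-L6-v2 embeddings have 384 dimensions, and the numbers here are not produced by any model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real embeddings.
python_programming = [0.9, 0.1, 0.2]
coding_in_python = [0.8, 0.2, 0.3]
cooking_pasta = [0.1, 0.9, 0.1]

related = cosine_similarity(python_programming, coding_in_python)
unrelated = cosine_similarity(python_programming, cooking_pasta)
```

Here related comes out much higher than unrelated, which is the property the retriever relies on.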


8. Creating FAISS Vector Store

vectorstore = FAISS.from_documents(docs, embeddings)

Explanation

FAISS stores vector embeddings efficiently.

It acts like:

AI-powered search engine

Now the system can:

  • search similar text

  • retrieve relevant chunks quickly
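
For intuition only, the toy class below does what a vector store does at its simplest: it keeps (vector, text) pairs and scans all of them to find the nearest ones. FAISS uses optimized index structures and native code, so this is a sketch of the idea, not of FAISS itself. The class name TinyVectorStore and the 2-dimensional vectors are hypothetical.

```python
import math


class TinyVectorStore:
    """Toy in-memory store that scans every stored vector on each search."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((list(vector), text))

    def search(self, query_vector, k=3):
        """Return the k texts whose vectors are closest to query_vector (Euclidean distance)."""
        ranked = sorted(self._items, key=lambda item: math.dist(item[0], query_vector))
        return [text for _, text in ranked[:k]]


store = TinyVectorStore()
store.add([1.0, 0.0], "Python is used in AI")
store.add([0.9, 0.1], "Python automates tasks")
store.add([0.0, 1.0], "Pasta needs salted water")

results = store.search([1.0, 0.05], k=2)
```

The query vector sits next to the two Python entries, so those are the two texts returned.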


9. Creating Retriever

retriver = vectorstore.as_retriever(
    search_kwargs = {"k":3}
)

Explanation

The retriever searches the vector store.

Parameter

"k":3

means the retriever returns the 3 most relevant chunks each time the user asks a question.


10. Formatting Retrieved Documents

def format_docs(docs):
    return "\n\n".join(
        d.page_content for d in docs
    )

Explanation

This function combines retrieved chunks into one formatted context string.

Example:

Chunk 1

Chunk 2

Chunk 3
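
The behavior can be checked without LangChain. In the sketch below, Doc is a hypothetical stand-in for a LangChain Document that models only the page_content attribute.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document; only page_content is modeled."""
    page_content: str


def format_docs(docs):
    # Join the chunk texts with a blank line between them.
    return "\n\n".join(d.page_content for d in docs)


context = format_docs([Doc("Chunk 1"), Doc("Chunk 2"), Doc("Chunk 3")])
```

The blank line between chunks helps the model tell where one retrieved passage ends and the next begins.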

11. Creating Prompt Template

prompt = PromptTemplate(
    input_variables = ["context", "question"],
    template = """
Use only the context below to answer the question.

context:
{context}

question:
{question}

Answer:
"""
)

Explanation

This controls how instructions are sent to Gemini.

Important Part

Use only the context below

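This reduces hallucination.

Gemini is instructed to answer ONLY from the retrieved text. This is an instruction, not a guarantee, so answers should still be checked against the context.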
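
To make the substitution concrete, the sketch below fills the same two placeholders with plain str.format. The extra line telling the model to admit when the context has no answer is an assumption added for illustration and is not part of the original template; TEMPLATE and build_prompt are hypothetical names.

```python
TEMPLATE = """Use only the context below to answer the question.
If the context does not contain the answer, say that you do not know.

context:
{context}

question:
{question}

Answer:"""


def build_prompt(context, question):
    """Fill the two placeholders and return the final prompt string."""
    return TEMPLATE.format(context=context, question=question)


full_prompt = build_prompt(
    "Python is used in AI and automation.",
    "What is python used for ?",
)
```

Printing full_prompt before sending it to the model is a quick way to confirm that the retrieved context actually reached the prompt.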


12. Initializing Gemini Model

llm = ChatGoogleGenerativeAI(
    model = "gemini-2.5-flash"
)

Explanation

This initializes Google's Gemini Flash model.

Why Gemini Flash?

  • fast

  • lightweight

  • cheaper

  • good for RAG systems


13. Creating Main RAG Function

def rag_answer(query):
    r_docs = retriver.invoke(query)
    context = format_docs(r_docs)
    full_prompt = prompt.format(
        context=context,
        question=query
    )
    response = llm.invoke(full_prompt)
    return response.content

Step-by-Step Flow

Step 1

retriver.invoke(query)

Searches for relevant chunks.


Step 2

format_docs(r_docs)

Formats retrieved context.


Step 3

prompt.format()

Creates final prompt.


Step 4

llm.invoke()

Sends prompt to Gemini.


Step 5

Returns generated answer.


14. Asking Question

query = "What is python used for ?"
answer = rag_answer(query)

Explanation

The user asks a question.

The RAG system processes it and generates a response.


15. Printing Output

print("question :", query)
print("\n")
print("answer :", answer)

Explanation

Displays:

  • question

  • generated answer


16. Final Output

Question:
What is python used for?

Answer:
Python is used for development, and its libraries help with a wide range of tasks, making development easier.

How RAG Works Internally

User Query
↓
Embedding Generated
↓
FAISS Similarity Search
↓
Relevant Chunks Retrieved
↓
Context Added to Prompt
↓
Gemini Generates Answer
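
The whole flow can be traced end to end with a toy pipeline that needs no external libraries. Everything here is a simplified assumption: embed uses word counts instead of a neural model, retrieve scans a list instead of FAISS, and fake_llm is a placeholder for the Gemini call. The names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real system would use a neural model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Rank chunks by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def fake_llm(prompt):
    """Placeholder for the Gemini call; it only reports how much prompt it received."""
    return f"(model reply based on {len(prompt)} prompt characters)"


def rag_answer(query, chunks, llm=fake_llm):
    context = "\n\n".join(retrieve(query, chunks))
    prompt = (
        "Use only the context below to answer the question.\n"
        f"context:\n{context}\nquestion:\n{query}\nAnswer:"
    )
    return context, llm(prompt)


chunks = [
    "Python is used for web development and automation.",
    "Python libraries help with data analysis and AI.",
    "Bananas are rich in potassium.",
]
context, answer = rag_answer("What is Python used for?", chunks)
```

Only the two Python-related chunks end up in the context, and the unrelated chunk about bananas is left out, which is the filtering step that keeps the model focused on relevant text.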

Advantages of RAG

| Advantage | Description |
|---|---|
| Reduces hallucination | Uses actual context |
| Supports private data | Can use custom documents |
| Real-time knowledge | New docs can be added |
| Faster than fine-tuning | No retraining needed |
| Scalable | Works with huge datasets |

Real-World Applications

RAG is used in:

  • ChatGPT-style chatbots

  • AI search engines

  • company knowledge assistants

  • PDF question-answering systems

  • customer support bots

  • legal document assistants

  • medical AI systems


Common Improvements

You can improve this project by adding:

| Upgrade | Benefit |
|---|---|
| PDF Upload | Read PDFs |
| Streamlit UI | Web interface |
| Chat History | Memory |
| Better Embeddings | More accurate search |
| Pinecone/ChromaDB | Hosted or persistent vector DB |
| Multi-document support | Multiple files |
| OCR | Read images |

Final Understanding

This project demonstrates:

  • Generative AI

  • Semantic Search

  • Vector Databases

  • LLM Prompt Engineering

  • Embeddings

  • Retrieval Systems

which are core technologies behind modern AI systems like:

  • ChatGPT

  • Perplexity AI

  • GitHub Copilot

  • AI Search Engines

It is a strong GenAI project for a portfolio and for interviews.

Wednesday, November 19, 2025

Understanding the Agentic Capabilities of Gemini 3 Pro

If you thought the jump from Gemini 1.5 to 2.5 was significant, Google’s latest release might just redefine your expectations entirely. Officially dropped on November 18, 2025, Gemini 3 Pro isn't just a "smarter chatbot"; it is an active partner designed to stop talking and start doing.

From coding interfaces on the fly to cutting out the excessive polite waffle we’ve grown used to, here is everything you need to know about Google’s most aggressive AI update yet.

Saturday, November 15, 2025

How to Use Your Android Mobile From Ubuntu Wirelessly Using Scrcpy

 Today, controlling your entire mobile phone directly from your Linux desktop is easier than ever. Whether you want to reply to messages faster, manage apps, or use your phone’s screen inside Ubuntu, scrcpy is one of the fastest and most reliable tools to do it.

Monday, November 3, 2025

How to Add ChatGPT, Microsoft Copilot & Perplexity AI to WhatsApp

Artificial Intelligence tools like ChatGPT, Microsoft Copilot, and Perplexity AI have become extremely popular, helping users with instant answers, writing assistance, and productivity tasks. The good news is that you can now use these AI chat assistants directly on WhatsApp in India by simply saving their official numbers and starting a chat. Below is a simple guide to add them and use them safely.

ChatGPT Go Is Now Free for One Year in India

 OpenAI has announced a special offer for Indian users: its popular ChatGPT Go plan will be available free for one year, giving everyone access to premium AI tools without paying a single rupee.

This initiative makes India the first country to receive a full-year complimentary upgrade, signaling OpenAI’s growing commitment to its fast-expanding user base in the region.

Tuesday, October 28, 2025

How to Convert VLC generated video screenshots into a Single PDF on Ubuntu

 Have you ever captured multiple screenshots from a video — maybe a tutorial, lecture, or online course — and wanted to combine them into one organized PDF?

In this guide, you’ll learn how to easily convert your VLC Media Player screenshots into a single, well-ordered PDF file using a simple command-line tool on Ubuntu.

This method is fast, reliable, and keeps your image quality intact — perfect for study notes, presentations, or archiving your learning material.
