Projects in production AI and platform engineering
Selected work showing end-to-end execution: problem framing, architecture decisions, secure deployment, and operational readiness across AI and cloud platforms. Metrics reflect representative results from recent implementation and benchmark windows.
AI-Powered Chatbot (RAG System) (Flagship)
Production-focused RAG assistant built to answer domain-specific questions with grounded responses, secure access, and cloud deployment patterns suitable for enterprise workloads.
Tech Stack: Python, FastAPI, Azure OpenAI, RAG, Docker, Azure Container Apps
Highlights:
- Retrieval-augmented generation pipeline for grounded answers
- JWT authentication and rate limiting for secure API access
- Redis caching to reduce response latency and repeated token usage
Impact Metrics:
- p95 latency: ~1.4 s cached responses, ~2.8 s full RAG path
- Grounded-answer pass rate: ~81% on retrieval/citation quality checks
- Release cadence: 4–7 deploys/month via CI/CD with rollback support
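The cached-versus-full-path split in the latency numbers above comes down to a lookup before the expensive retrieval and generation step. A minimal sketch of that idea, assuming an in-memory TTL store stands in for Redis; the function names are illustrative, not taken from the project code:

```python
import hashlib
import time

CACHE_TTL_SECONDS = 300
_cache: dict[str, tuple[float, str]] = {}

def cache_key(question: str) -> str:
    """Normalize the question and hash it into a stable cache key."""
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def answer_with_cache(question: str, run_rag_pipeline) -> str:
    """Return a cached answer while it is fresh; otherwise run the full
    retrieval + generation path and store the result."""
    key = cache_key(question)
    hit = _cache.get(key)
    if hit is not None:
        stored_at, answer = hit
        if time.time() - stored_at < CACHE_TTL_SECONDS:
            return answer  # fast cached path
    answer = run_rag_pipeline(question)  # full RAG path
    _cache[key] = (time.time(), answer)
    return answer
```

Normalizing before hashing means trivially different phrasings of the same query ("What is RAG?" vs "what  is rag?") share one cache entry, which also cuts repeated token usage.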
DevOps Project (Flagship)
End-to-end platform engineering project that provisions Azure infrastructure, packages services for Kubernetes, and automates deployment through a repeatable CI/CD workflow.
Tech Stack: Python, FastAPI, Terraform, AKS, Helm, GitHub Actions, Azure
Highlights:
- Terraform-managed Azure infrastructure and AKS cluster setup
- Helm chart packaging for versioned Kubernetes releases
- GitHub Actions pipeline for build, validation, and deployment
Impact Metrics:
- Provisioning time: ~45 min initial environment, ~12 min incremental changes
- Pipeline reliability: ~95% success on routine, non-breaking builds
- Lead time to production: same day to 1 business day for approved changes
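The repeatable workflow reduces to a fixed command sequence: Terraform for infrastructure, then Helm for the release. A sketch of that sequence, assuming the directory, release name, and chart path shown here; these are illustrative placeholders, not the project's actual values:

```python
def provisioning_commands(tf_dir: str) -> list[list[str]]:
    """Terraform steps: init once, then plan/apply on every change."""
    return [
        ["terraform", f"-chdir={tf_dir}", "init"],
        ["terraform", f"-chdir={tf_dir}", "plan", "-out=tfplan"],
        ["terraform", f"-chdir={tf_dir}", "apply", "tfplan"],
    ]

def release_commands(release: str, chart: str, version: str) -> list[list[str]]:
    """Helm upgrade with --install so the same step covers first deploys,
    and --atomic so a failed rollout rolls back automatically."""
    return [
        ["helm", "upgrade", "--install", release, chart,
         "--set", f"image.tag={version}", "--atomic"],
    ]
```

Applying a saved plan (`apply tfplan`) rather than re-planning at apply time is what keeps incremental runs predictable, and `--atomic` is one way rollback support shows up in the release cadence metric.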
AI Quiz Platform (Flagship)
Microservices-based assessment platform designed to separate user, quiz, and results domains, with secure APIs and deployment workflows that support independent scaling.
Tech Stack: JavaScript, Node.js, Express, MongoDB, Docker, Microservices
Highlights:
- Three independently deployable services (user, quiz, results)
- Security controls with Helmet, CORS, and rate limiting
- Docker Compose orchestration for local and deployment workflows
Impact Metrics:
- API p95 latency: ~140–210 ms at 50–100 combined RPS
- 5xx rate under stress windows: typically below ~0.7%
- MTTR for common service failures: ~18–25 min using runbooks
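The rate-limiting control listed above typically follows a token-bucket scheme: each client gets a bucket that refills at a steady rate, and a request spends one token or is rejected. The services here are Node.js, but the idea is language-neutral; a minimal Python sketch for illustration:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: refill at a fixed rate, spend
    one token per request, reject when the bucket is empty."""

    def __init__(self, capacity: int, refill_per_second: float):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        # Top up the bucket based on elapsed time, capped at capacity.
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.last_refill = now
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Because the bucket starts full, short bursts are absorbed up to `capacity` while the sustained rate stays bounded, which is what keeps the 5xx rate low under the stress windows cited above.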
AI Image Analyzer
Computer vision web application for image inspection workflows, combining backend analysis services with a modern frontend and cloud-native deployment practices.
Tech Stack: Python, FastAPI, React, Computer Vision, PIL/OpenCV, Azure Container Apps
Highlights:
- Image analysis for color profiling, object detection, and face detection
- Drag-and-drop upload flow for quick interactive testing
- Frontend and API separation for maintainable architecture
Impact Metrics:
- Processing throughput: ~25–40 images/min on standard app-tier sizing
- Median end-to-end analysis time: ~1.8 s for typical image payloads
- Build-to-deploy time: ~8–14 min via GitHub Actions pipeline
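Color profiling, the first analysis listed above, usually means reducing an image to its most frequent coarse colors. A sketch of one common approach, assuming pixels arrive as (R, G, B) tuples (e.g. from Pillow's `Image.getdata()`); the bucket size is an illustrative choice, not the project's:

```python
from collections import Counter

def dominant_colors(pixels, bucket_size=32, top_n=3):
    """Quantize each channel into coarse buckets and count the most
    frequent buckets, approximating the image's dominant colors."""
    counts = Counter(
        tuple(channel // bucket_size * bucket_size for channel in px)
        for px in pixels
    )
    return [color for color, _ in counts.most_common(top_n)]
```

Quantizing before counting is what makes near-identical shades (e.g. (250, 10, 10) and (245, 8, 12)) collapse into one dominant color instead of fragmenting the histogram.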
AI Image Captioner
Multimodal captioning toolkit that compares BLIP-family models and supports both interactive and batch workflows for practical image-to-text generation tasks.
Tech Stack: Python, Gradio, Transformers, PyTorch, BLIP, BLIP-2
Highlights:
- Interactive Gradio web interface for real-time captioning
- Multiple model options for speed vs quality tradeoffs
- Batch caption generation for local image directories
Impact Metrics:
- Single-image caption latency: ~2.2–4.5 s depending on model selection
- Batch caption throughput: ~90–160 images/hour on commodity GPU tiers
- Caption quality: consistently strong on clear, single-subject imagery
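The batch workflow above boils down to walking a directory, filtering to supported image types, and applying whichever model was selected. A sketch of that loop, assuming a `caption_fn` callable wraps the chosen BLIP model; the model call itself and the extension set are illustrative:

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}

def batch_caption(directory: str, caption_fn) -> dict[str, str]:
    """Caption every supported image in a directory, keyed by filename.
    Sorting gives deterministic output order across runs."""
    results = {}
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() in IMAGE_EXTENSIONS:
            results[path.name] = caption_fn(path)
    return results
```

Keeping the model behind a plain callable is also what makes the speed-versus-quality tradeoff a one-argument swap: the same loop serves BLIP and BLIP-2 alike.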
AI Chat Assistant
Conversational AI interface focused on responsive user experience, context-aware dialogue, and lightweight deployment for fast iteration and demoability.
Tech Stack: Python, Gradio, Google Gemini API, Hugging Face Spaces
Highlights:
- Real-time conversational AI with Google Gemini
- Context-aware multi-turn conversations
- Streaming response generation
Impact Metrics:
- Initial token response time: ~0.8–1.3 s in normal traffic windows
- Session continuity: multi-turn context retained across typical chat flows
- Availability: stable demo uptime with lightweight operational overhead
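Streaming response generation and multi-turn context follow a common pattern: yield partial text as chunks arrive so the UI renders immediately, and append each completed turn to a history list that is sent with the next request. A sketch of both pieces, with a stub generator standing in for the Gemini streaming client; names are illustrative:

```python
def stream_reply(chunks):
    """Accumulate streamed chunks while yielding the growing partial
    text, so the UI can render the reply as it arrives."""
    partial = ""
    for chunk in chunks:
        partial += chunk
        yield partial

def build_history(history, user_msg, assistant_msg):
    """Append one completed turn; passing history back with the next
    request is what keeps the conversation context-aware."""
    return history + [("user", user_msg), ("assistant", assistant_msg)]
```

Yielding cumulative text (rather than raw deltas) matches how Gradio chat components typically expect streaming updates, and explains why the first visible token can land well before the full reply finishes.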