
From Load Balancers to LLMs

A technical blog by Udbhav Somani exploring the intersection of Distributed Systems, Backend Engineering, and AI Infrastructure.

About

This publication documents a "0 to 1" journey of a Backend Developer into the world of AI at Scale. The content focuses on the architectural "plumbing" of AI: how models are served, scaled, and orchestrated across distributed clusters.

Core Topics

  • Agentic AI: Research and implementation of autonomous agent systems.
  • Agentic AI at Scale: Orchestrating and scaling agent collaborations in production.
  • Distributed Inference: High-performance model serving and low-latency architectures.
  • Big Tech Paper Deep Dives: Simplified architectural breakdowns of engineering papers from Uber (Michelangelo), Netflix (Metaflow), Google, and others.

Target Audience

Backend Engineers, System Architects, and AI Infrastructure (MLOps) professionals.

Writing Style

  • Systems-first approach.
  • Focus on latency, throughput, and reliability over theoretical math.
  • Heavy use of architectural diagrams and infrastructure-as-code patterns.

Key Links

  • /about: The mission statement and roadmap.
  • /rss: XML feed for automated content discovery.