Organizations

@da-Circle @collective-qubits-society @token-maxing

badnikhil/README.md

Nikhil — Systems Programmer & Full-Stack Architect

"I count bits to save a clock. I don't use valgrind; I know where I left my pointers. But when it's time to ship, I make the Cloud bend to my will."

They say software is eating the world, but most of it is choking on bloated frameworks and Electron apps. I prefer staying close to the metal. I write code that respects the CPU, compiles fast, and runs faster.

I'm a systems programmer, open-source maintainer, and now an enterprise-grade full-stack architect. Whether I'm writing bare-metal OS kernels, optimizing CUDA matrix multiplications, or orchestrating distributed containerized microservices on Google Cloud—I build systems that scale.


🔧 The Workbench

Here is what's compiling on my machine:

  • The Todo: Distributed Task Orchestration. A massively over-engineered, full-stack microservice platform (FastAPI, React, Postgres, Redis). Features WebSockets, Celery background workers, and a full OpenTelemetry observability pipeline (Loki, Tempo, Prometheus, Grafana). Containerized and automated via GitHub Actions on a Google Cloud ARM64 VM.
  • WebAI: Autonomous AI Orchestration. A native VS Code AI agent that bypasses API rate limits by reverse-engineering frontend APIs and using headless browser session-state emulation. Features an AST-aware unified diff patching engine and a fault-tolerant, 15-iteration self-correcting runtime.
  • bindbc-cuda: Official Author & Maintainer. When I saw the D ecosystem relying on a dead CUDA library, I wrote a modern replacement from scratch for dcompute.
  • MyOS — Wrote a 64-bit operating system from the UEFI bootloader up. Custom memory allocator, ACPI, xHCI (USB 3.0). Listed in Awesome OS because apparently, kids these days find writing kernels "awesome".
  • CAMM (CUDA Matrix Multiplication) — Pushed register-level tiling and memory coalescing to hit 93% of NVIDIA's cuBLAS throughput.
  • API Dash (GSoC 2026): Gutting the core into pure Dart and building a multi-protocol (gRPC, MQTT, WebSocket) developer CLI, with lots of PRs spanning core architecture refactors and model migrations.
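
For readers who have not met tiling before, here is a minimal sketch of the idea behind the CAMM entry above, written as plain C on the CPU rather than as a CUDA kernel. It is not the CAMM code: the function names, the `TILE` value, and the loop order are assumptions chosen for illustration. The point it shows is the same one the GPU version relies on: work on a small block at a time, keep the reused operand in a register, and walk memory with unit stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge. Pick it so a few TILE x TILE float blocks
   fit in the cache level you care about. */
#define TILE 32

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Reference version: naive triple loop, C = A * B.
   All matrices are n x n, row-major. */
void matmul_naive(size_t n, const float *A, const float *B, float *C)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
    }
}

/* Tiled version: same result, but the loops are blocked so each
   TILE x TILE piece of A, B and C is reused while it is still hot. */
void matmul_tiled(size_t n, const float *A, const float *B, float *C)
{
    assert(n == 0 || (A && B && C));

    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0f;

    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = min_sz(ii + TILE, n);
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = min_sz(kk + TILE, n);
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = min_sz(jj + TILE, n);

                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        /* Reused across the whole inner loop, so the
                           compiler can keep it in a register. */
                        float a = A[i * n + k];
                        /* Unit-stride walk over a row of B and C. */
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

On a GPU the roles shift (a thread block stages tiles in shared memory and adjacent threads read adjacent addresses so loads coalesce), but the blocking arithmetic is the same. Calling `matmul_tiled(n, A, B, C)` with any `n`, including sizes that are not a multiple of `TILE`, should match `matmul_naive` up to floating-point summation order.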

⚙️ The Toolbelt

I don't chase trends. I use what works.

  • The Old Guard: C, C++, x86-64 Assembly. (Where the real work happens).
  • The Compute: CUDA. (If you aren't thinking in warps and shared memory tiles, you're just heating up the room).
  • The Full-Stack & Cloud: Python (FastAPI), TypeScript / React, Next.js, Docker, Nginx, Google Cloud (GCP). (Deploying containerized microservices and automated CI/CD pipelines at scale).
  • The Day Job: Flutter / Dart. (Because shipping extremely optimized e-commerce apps at Johar Basket and building production infra as CTO at Doordripp pays the bills).

📡 The Specs

  • ☁️ Full-Stack DevOps Architect — Mastering Docker, Nginx, CI/CD, and OpenTelemetry to build resilient, auto-scaling SaaS platforms.
  • 🔬 Google TPU Research Cloud — Accessing heavy iron for high-performance compute.
  • ☁️ AWS Certified Solutions Architect — I know how the "cloud" works (it's just someone else's Bash).
  • 🎯 GSoC 2026 Contributor — API Dash. Implementing low-latency multi-protocol support (gRPC, MQTT, WebSocket) and engineering a unified cross-platform CLI.

📊 Telemetry


"Any fool can write code that a computer can understand. Good programmers write code that humans can understand... and great programmers write code that doesn't waste my L1 cache or crash my Docker containers."

Pinned

  1. the-todo (Public)

    Sprinting years of senior engineering in a single todo app.

    Python

  2. CAMM (Public)

    CAMM: CUDA Accelerated Matrix Multiplication

    Cuda 2

  3. MyOS (Public)

    I just hate using people's code :\

    C 8 2

  4. file_organizer (Public)

    a simple tool to organize your messy folders

    C 5 1

  5. bindbc-cuda (Public)

    Dynamic and static D language bindings for the CUDA Driver API, built on the bindbc loader framework.

    D 3

  6. silicon-to-assembly (Public)

    An in-depth exploration of assembly language, focusing on the underlying execution model and computer architecture.

    11