

DeepSeek-R1 7B on OCI Ampere A1: Full CPU Inference Guide — No GPU Required
Every time someone mentions running an LLM in production, the first instinct is to reach for a GPU instance. A100s, H100s, L40s — the cost spirals fast, and for most enterprise inference workloads, you're paying for compute you don't need. I've been running inference experiments on OCI Ampere A1 instances for a while now, and the results keep surprising me in a good way. DeepSeek-R1 7B is the model that changed the conversation. Released by DeepSeek AI, this reasoning-optimis

Nikhil Verma
May 3 · 9 min read


Building a Graph RAG Pipeline for Medical Records: From Theory to Production
How Knowledge Graphs Transform AI-Powered Healthcare Search
I have been building RAG systems for a while now. And for most use cases, the standard approach works fine: chunk your documents, embed them, store them in a vector database, and retrieve the most similar chunks when someone asks a question. Simple. Scalable. Gets the job done. But then I tried to apply it to medical records. And it fell apart almost immediately. Not because the technology is bad. But because medica

Nikhil Verma
Apr 24 · 11 min read


Automating Cilium CNI Installation on OCI OKE with Pulumi and Python
In the realm of cloud-native applications, Kubernetes stands as the top choice for orchestrating containerized workloads. However,...

Nikhil Verma
Oct 8, 2025 · 9 min read