Learning gradient descent

Being unemployed feels a lot like working at an early-stage startup: limited resources, a short runway, and a clear goal (hopefully). The question is how you get there.

You prioritize.

So why spend time learning gradient descent? At first glance, it feels wasteful. I can already build features using Claude’s APIs. No one expects me to train transformers from scratch—I’m not competing for ML roles.

The point isn’t mastery; it’s fluency.

I want to reason intelligently about tradeoffs, limits, and failure modes, to know why things work, not just that they work. That fluency comes at a cost: I haven’t touched calculus in five years, so I’m brushing up on concepts like partial derivatives.
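To make that concrete, gradient descent is just “compute the partial derivatives, then step downhill against them.” Here’s a minimal sketch in plain Python, minimizing f(x, y) = x² + y²; the function, starting point, learning rate, and step count are all arbitrary choices for illustration:

```python
def grad(x, y):
    """Gradient of f(x, y) = x^2 + y^2: the partial derivatives (2x, 2y)."""
    return 2 * x, 2 * y

x, y = 3.0, -4.0   # arbitrary starting point
lr = 0.1           # learning rate (step size)

for step in range(50):
    dx, dy = grad(x, y)
    # Move in the direction of steepest descent: the negative gradient.
    x -= lr * dx
    y -= lr * dy

print(x, y)  # both approach 0.0, the minimum of f
```

Ten lines of code, but the interesting questions live in the parameters: too large a learning rate and the updates overshoot and diverge, too small and convergence crawls. Those are exactly the tradeoffs and failure modes I want to be fluent in.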

I’m time-boxing this side quest, but I’m confident the ROI will be there.