I Built an AI Agent That Works While I Sleep

I built an AI agent that works while I sleep. Here’s what I learned.

The Problem

I spend my days helping high-growth startups architect AI workloads on Azure. I’ve built and reviewed hundreds of AI implementations.

I wanted to answer a real question: can you build an AI agent that actually holds down responsibilities? Not answer questions — do work. Autonomously. Overnight.

So I built one for my own side projects — content production, community tech, the kind of work that piles up when your day job already has you at capacity.

What a Typical Morning Looks Like

8:00 AM — AI news briefing hits my phone. Curated, practitioner-focused. 3-5 stories with context on why they matter.
8:05 AM — GitHub notifications. PRs are already waiting. The agent scanned open issues overnight, wrote code, ran tests, and submitted pull requests.
8:15 AM — I review, approve, ship. Coffee in hand. No 3am coding sessions.

It also runs automated code reviews, monitors project health, and maintains its own memory across sessions so it never loses context.

The Multi-Model Architecture

This is where it gets interesting — and where I see customers get it wrong.

Claude for complex reasoning and strategic decisions
Azure OpenAI for image generation, transcription, and coding agents
Gemini Flash for lightweight automation (free)
Open-source models (GLM, Kimi) for bulk operations (also free)

Total cost so far: about $50.

The insight: match model capability to task complexity. A news briefing doesn’t need Opus. A code review does. This is the same architecture conversation I have with customers — except now I’ve lived it. The difference between $50 and $500 is routing, not capability.
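As a rough illustration of that routing idea (not the agent's actual code), the whole thing can start as a lookup table. The task names, tier names, and model labels below are hypothetical placeholders rather than real model identifiers, and sending unknown tasks to the cheapest tier is an assumption you may want to reverse.

```python
from enum import Enum


class Tier(Enum):
    REASONING = "reasoning"
    MEDIA_AND_CODE = "media_and_code"
    LIGHTWEIGHT = "lightweight"
    BULK = "bulk"


# Hypothetical task types mapped to the tier they need.
TASK_TIERS = {
    "code_review": Tier.REASONING,
    "strategic_decision": Tier.REASONING,
    "image_generation": Tier.MEDIA_AND_CODE,
    "transcription": Tier.MEDIA_AND_CODE,
    "news_briefing": Tier.LIGHTWEIGHT,
    "bulk_tagging": Tier.BULK,
}

# Placeholder labels standing in for whichever provider serves each tier.
TIER_MODELS = {
    Tier.REASONING: "claude",
    Tier.MEDIA_AND_CODE: "azure-openai",
    Tier.LIGHTWEIGHT: "gemini-flash",
    Tier.BULK: "open-source",
}


def route(task_type: str) -> str:
    """Return the model label for a task; unknown tasks use the cheap tier."""
    tier = TASK_TIERS.get(task_type, Tier.LIGHTWEIGHT)
    return TIER_MODELS[tier]


print(route("code_review"))    # claude
print(route("news_briefing"))  # gemini-flash
```

Keeping the table in one place makes the cost decision explicit: moving a task to a cheaper tier is a one-line change you can review like any other.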

The Guardrails

This part matters more than the automation.

It never pushes code live. Only pull requests. I review every change before it hits production.
Every skill ships with tests. AI writes code fast. Fast brittle code is worse than no code. TDD is non-negotiable.
The PR workflow keeps me in control. This isn’t about giving up oversight — it’s about shifting when you exercise it. Morning coffee beats 3am debugging.
Security is a whole separate conversation. More on that in a follow-up post.
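The first two guardrails can be sketched as a small gate the agent has to pass before it delivers anything. This is a minimal sketch under assumptions: the branch names, function names, and status strings are hypothetical, and real enforcement belongs on the hosting side as well (for example, branch protection rules on the repository), not only in the agent's own code.

```python
# Hypothetical set of branches the agent must never push to directly.
PROTECTED_BRANCHES = frozenset({"main", "master", "production"})


def is_push_allowed(branch: str) -> bool:
    """The agent may push only to non-protected (feature) branches."""
    return branch not in PROTECTED_BRANCHES


def plan_delivery(branch: str, tests_passed: bool) -> str:
    """Decide what the agent may do with a finished change."""
    if not tests_passed:
        return "blocked: tests failing"
    if not is_push_allowed(branch):
        return "blocked: protected branch"
    # The only success path is a pull request that waits for human review.
    return f"open pull request from {branch}"


print(plan_delivery("agent/fix-issue", tests_passed=True))
print(plan_delivery("main", tests_passed=True))
```

There is deliberately no code path that ends in a merge or a deploy; that step stays with the human reviewer.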

The Infrastructure

Now I see why Mac Minis are flying off the shelves. Everyone wants a local AI box.

But you don’t actually need one. I’m running this on a machine that was already sitting on my local network. Any old box with an internet connection works. If you want to run local models, you might need something beefier, but make sure the returns are worth the investment.

What I Didn’t Expect

Memory is the gap nobody talks about. The difference between a chatbot and something useful is continuity. My agent reads its own context every session — decisions, preferences, lessons learned. That’s the unlock.
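One simple way to get that continuity is plain files: read every note at the start of a session and append to them when something is worth keeping. The sketch below assumes a directory of Markdown notes; the function names, the file layout, and the dated-bullet format are hypothetical, not a description of how this agent stores its memory.

```python
from datetime import date
from pathlib import Path


def load_context(memory_dir: Path) -> str:
    """Concatenate every Markdown note in memory_dir, in filename order."""
    parts = []
    for note in sorted(memory_dir.glob("*.md")):
        body = note.read_text(encoding="utf-8").strip()
        parts.append(f"## {note.name}\n{body}")
    return "\n\n".join(parts)


def remember(memory_dir: Path, filename: str, lesson: str) -> None:
    """Append a dated bullet to a note so the next session can read it."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / filename).open("a", encoding="utf-8") as handle:
        handle.write(f"- {date.today().isoformat()}: {lesson}\n")
```

At session start, the string returned by `load_context` would be placed at the top of the prompt; during the session, `remember` records decisions and lessons for next time.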

Multi-model routing is the real skill. Everyone debates which model is “best.” The real question is which model is best for this task at this cost at this latency.
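That question can be written down as a small selection rule: keep only the models that are capable enough and fast enough, then take the cheapest. The capability scale, prices, and latencies below are made-up placeholder numbers for illustration, not measurements of any real model.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelOption:
    name: str
    capability: int           # 1 (basic) to 3 (strongest); made-up scale
    cost_per_call: float      # placeholder value
    typical_latency_s: float  # placeholder value


# Hypothetical candidates; replace with your own measurements.
CANDIDATES = [
    ModelOption("small", capability=1, cost_per_call=0.0, typical_latency_s=1.0),
    ModelOption("medium", capability=2, cost_per_call=0.01, typical_latency_s=3.0),
    ModelOption("large", capability=3, cost_per_call=0.10, typical_latency_s=10.0),
]


def pick_model(
    options: Sequence[ModelOption],
    min_capability: int,
    max_latency_s: float,
) -> Optional[ModelOption]:
    """Cheapest option that is capable enough and fast enough, else None."""
    eligible = [
        option
        for option in options
        if option.capability >= min_capability
        and option.typical_latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda option: option.cost_per_call, default=None)
```

When nothing satisfies both constraints, `pick_model` returns `None`, so the caller has to decide whether to relax the latency bound or to skip the task.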

The Aha Moment

I had my AI agent aha moment this weekend. Watching it come to life, reading its own SOUL.md to know who it is and checking its HEARTBEAT.md to know what needs attention, something clicked. This thing has identity. It has a pulse. Maybe AGI isn’t as far off as we think. 🙂

What I Haven’t Explored Yet

I’m just getting started. On the roadmap:

Optimizing my UniFi network: automated traffic analysis, device management
Home Assistant automation: letting the agent manage smart home routines
Security hardening: the whole trust model for giving an AI agent access to your infrastructure

Each of these deserves its own post. Stay tuned.

The Takeaway

Building this changed how I talk to customers about AI agents. The gap between “AI demo” and “AI that does real work” is mostly about memory, tooling, and trust — not model size.

Think about it as your first hire. Clear responsibilities. Feedback loops. Room to improve. And guardrails so you sleep well while it works.

I’m building this in public. If you’re thinking about AI agents for your workflow — or your team’s — I’d love to compare notes. What’s the first thing you’d hand off to an AI agent?