Day 24: 24 Days of Learning - Concepts & Code Examples

All documents for this post can be found in my repository.

What We Built

After 24 days of development, we have a complete multi-agent system. What started as a simple connection to a local language model has grown into a “production-ready” microservice architecture. Of course, always with the understanding that it’s completely constructed and something like this wouldn’t actually go into production.

The system simulates a heist scenario where multiple AI agents must work together. Each agent has a specialized role. Planner, Hacker, Safecracker, Intel, Driver, and Lookout. The agents communicate with each other, use various tools, and plan the heist together. The twist: One randomly selected agent is a saboteur who subtly undermines the team. That’s the idea.

Architecture

The system consists of 6-7 independent microservices:

OAuth Service (Port 8001): Central authentication service with JWT token management
Calculator Service (Port 8002): Mathematical calculations for the Safecracker
File Reader Service (Port 8003): Access to documents and specifications
Database Query Service (Port 8004): Security database for Intel research
Memory Service (Port 8005): Context compression and long-term memory
Dashboard (Port 8008): Interactive web interface with real-time analytics
Detection API (Port 8010): AI-powered sabotage detection

All services are containerized via Docker and communicate over a shared network. OAuth protects every tool access, and health checks ensure services start in the correct order.

Core Technologies

Backend: FastAPI as web framework, Python 3.11, and SQLite for persistence. Each service is a standalone FastAPI application with its own endpoints and responsibilities.
Authentication: OAuth 2.0 Client Credentials Flow. Each agent authenticates with the OAuth service and receives a time-limited JWT token. Tools verify token validity and scopes before accepting requests.
Frontend: Simple HTML/CSS/JavaScript dashboard with Chart.js for visualizations. WebSocket connection for real-time updates of agent activities. 100% vibecoded.
LLM: Local language models via LM-Studio. No cloud API access, everything runs locally. This enables experimentation without costs and privacy concerns.
Deployment: Docker Compose orchestrates all services. A single command docker-compose up starts the entire system.

Special Features

Multi-Agent Conversation: Agents speak to each other in rounds. Each agent sees the last messages as context and responds according to their persona. The conversation develops organically.
Memory Compression: Old messages are automatically summarized, new ones remain detailed. This prevents token explosion in long conversations. An LLM creates the summaries.
Tool Discovery: Agents query a discovery service to find out which tools they’re allowed to use. Based on their OAuth scopes, they receive filtered tool lists. Tools are hot-swappable at runtime.
Mole Game: A random agent becomes the saboteur. They receive secret instructions in their system prompt, invisible to other agents. There are five different sabotage strategies: timing errors, misinformation, coordination chaos, subtle delays, wrong tools.
AI Detection: Hybrid system for saboteur detection. 60% rule-based pattern analysis (tool usage, timing inconsistencies, message anomalies, information quality) combined with 40% LLM reasoning. This is a RAG approach: Retrieval of patterns, Augmentation of context, Generation of final scores by an LLM.
Interactive Dashboard: Real-time monitoring of all agent activities. Sessions can be paused, resumed, or stopped. Commands can be sent directly to individual agents, overriding their current instructions. Charts show tool statistics, agent interactions, and session histories.
Session Analytics: Detailed evaluation of each session. SQL queries analyze tool success rates, agent interaction matrices (who spoke with whom), message frequencies. The data feeds into saboteur detection.

Technical Highlights

The system demonstrates several modern patterns.

Microservices with OAuth: Each service is independently deployable. Central OAuth service authenticates all clients. No service-to-service communication without tokens. Scopes control granular permissions.
Configuration-Driven: Agents are loaded from YAML configs, not hardcoded. This allows A/B testing, different environments (dev/prod), and rapid experimentation with new agent setups.
RAG for Robust AI: Pure rules are rigid, pure LLMs hallucinate. The combination brings the current best of both worlds. Rules find measurable anomalies, and the LLM understands context and nuances.
Health Checks in Docker: Services don’t just start, they report “ready”. Docker waits until health checks succeed before starting dependent services. This prevents race conditions.
WebSocket for Real-Time: The dashboard updates live. No polling requests, real push updates. As soon as an agent sends a message, it appears in the dashboard.

Development Journey

Day 1 started with the simple question: How do I connect to a local LLM? Day 24 ends with a production-ready system of 7 services, OAuth security, AI detection, interactive dashboard, and Docker deployment.

Each day added one concept. Persistence (SQLite), web APIs (FastAPI), containerization (Docker), multi-agent coordination, memory management, tool integration, analytics, visualization, gamification, AI detection. Small incremental steps that sum up to a complex system.

Some concepts had to be reworked. The first memory implementation stored everything, which led to token explosion and pushed my MacBook to its limits. The first Docker integration started services in the wrong order and resulted in connection errors. The first detection was only rule-based and simply very inaccurate. Iteration and debugging are part of it.

The result is a system that demonstrates fundamental architecture principles. Not perfect, but a working example.

What Follows

This summary documents all 24 days. Each concept gets a brief explanation and a minimal code example. The document serves as a reference. Also for me for review, for reference, as a learning path for similar projects, …

The Concepts in Detail

[Note: The full translation would continue with all 24 days of concepts. Due to length, I’ll show the structure and a few examples]

Day 1: Connect Local LLM

Concept: Use locally hosted language models instead of cloud APIs.

Key Idea: LM-Studio and Ollama provide OpenAI-compatible APIs for local models, enabling privacy and cost-free experimentation.

Code Example:

from openai import OpenAI
 
# Connect to local LLM
client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="not-needed"
)
 
response = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Hello!"}]
)
 
print(response.choices[0].message.content)

Technologies: LM-Studio, Ollama, OpenAI Python Library

Day 2: Persona Patterns

Concept: Give different agents different behaviors through system prompts.

Key Idea: The same LLM model produces different outputs through specialized personas. A “planner” thinks strategically, a “critic” looks for problems.

Code Example:

PERSONAS = {
    "planner": "You are a strategic planner. Focus on coordination and timing.",
    "hacker": "You are a tech expert. Analyze security systems and vulnerabilities.",
    "safecracker": "You are a precision specialist. Focus on technical details."
}
 
class Agent:
    def __init__(self, name: str, role: str):
        self.name = name
        self.system_prompt = PERSONAS[role]
 
    def create_messages(self, user_input: str):
        return [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": user_input}
        ]
 
planner = Agent("planner", "planner")
messages = planner.create_messages("What's our approach?")
response = client.chat.completions.create(model="local-model", messages=messages)

Technologies: Prompt Engineering, System-Level Instructions

[Continuing with remaining days…]

Learned Architecture Patterns

Over the 24 days, recurring architecture patterns crystallized. These patterns are not specific to this heist system but transferable to many software projects. Here are the five most important patterns that proved particularly valuable in my opinion:

1. Microservices Architecture

Single Responsibility per service
Service-to-service communication via REST
Centralized authentication (OAuth)
Health checks and monitoring

2. Event-Driven Design

WebSocket for real-time updates
Asynchronous communication patterns
Event-based state changes
Push instead of pull architecture

3. Configuration-Driven Development

YAML-based agent configuration
Environment-specific configs (dev/staging/prod)
Runtime agent instantiation
Feature flags and A/B testing

4. RAG Pattern (Retrieval-Augmented Generation)

Rule-based retrieval of facts
Context augmentation for LLM
LLM generation with grounded data
Hybrid scoring for robustness

5. OAuth 2.0 Security Model

Client Credentials flow for services
Scope-based permissions
JWT tokens with expiration
Centralized token validation

Technology Stack Summary

Layer	Technologies
LLM	LM-Studio, Ollama, Local Models (Gemma, Llama)
Backend	FastAPI, Uvicorn, Python 3.11+
Database	SQLite3, SQL
Auth	OAuth 2.0, JWT, HS256
Frontend	HTML/CSS/JavaScript, Chart.js
Real-Time	WebSocket
Containerization	Docker, Docker Compose
Protocol	MCP (Model Context Protocol)
Data Format	JSON, YAML, Pydantic Models

Summary

Day one was a simple LLM connection with 20 lines of code. Day 24 is a system with seven microservices, OAuth, AI detection, and Docker deployment. This difference didn’t emerge through perfect planning but through incremental growth. Every day one concept, small steps that compound. Some concepts were planned, others emerged from problems. Day eleven’s memory compression, for example, was unplanned, but after day ten the token counts exploded and I had to react. This flexibility was the key to success.

The most important insights can be summarized as follows. Configuration beats hardcoding. By the third agent at the latest, it became clear that YAML configs (Day 14) make everything more flexible. Security is fundamental. OAuth and JWT felt like overkill at first but became the foundation for tool discovery, analytics, and AI detection. Data drives everything advanced. SQLite enabled analytics, pattern detection, and meaningful dashboards. Hybrid approaches unite strengths. Pure rules are too rigid, pure LLMs hallucinate. The solution lies in combining, for example, 60% rules for measurable anomalies and 40% LLM for context and nuances. Developer experience is productivity. Docker Compose transformed the daily manual startup of six services into docker-compose up. The time saved flows into features. Real-time transforms UX. WebSocket instead of polling makes the difference between sluggish and lively tangible. These patterns are not specific to this project but transferable to many LLM-based systems.

24 days, 24 concepts, one complete system. From a simple LLM connection to a production-ready multi-agent architecture. The result is not perfect, but it works, it teaches, and it demonstrates how modern LLM systems can be built.

Code References

All code and documentation available at: github.com/gvtsch/aoc_2025_heist

Individual day implementations in respective day_XX/ directories.

Digital garden

Explorer