The Problem
Mid-Atlantic Climate Systems runs a 22-vehicle HVAC fleet across three counties. Every morning, a dispatcher spent four hours manually assigning service calls to technicians based on geography, urgency, and parts inventory — a brittle process that broke down the moment a single truck got delayed or a job ran long. Fuel costs were $14,200/month and climbing. Idle time averaged 3.4 hours per technician per day.
The company evaluated cloud-based route optimization SaaS products. Three problems surfaced: (1) every provider required uploading customer addresses and service histories to the provider's cloud, (2) per-vehicle pricing would cost $900/month at the company's fleet size, and (3) no provider could integrate with the company's legacy on-premises inventory system without a custom API build that nobody wanted to quote.
The Deployment
StarBloom Consulting deployed a private AI stack on a single Dell Precision workstation the company already owned, a former CAD machine that had been sitting idle. The stack:
- Ollama running Qwen 2.5 Coder 7B (Q4_K_M quantized) for the optimization engine
- n8n for workflow orchestration — ingesting the morning dispatch CSV, technician GPS pings, and inventory levels from the on-prem SQL Server
- Open WebUI as the dispatcher’s interface — a chat-style prompt where they describe exceptions (“Truck 14 is delayed 90 minutes, redistribute his route”)
- A custom Python agent that runs a constraint-satisfaction solver every 15 minutes, re-optimizing all active routes against real-time technician positions and parts availability
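The re-optimization step in the last bullet can be illustrated with a small sketch. The case study does not describe the actual solver, so what follows is a simple greedy heuristic standing in for it: the most urgent job is placed first, and the nearest technician who carries the required part and has enough time left takes it. The class names, fields, and planar coordinates are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import hypot
from typing import Optional


@dataclass
class Technician:
    name: str
    x: float  # current position in arbitrary planar units (hypothetical)
    y: float
    parts: set[str] = field(default_factory=set)  # parts on the truck
    minutes_free: int = 480  # working minutes left today


@dataclass
class Job:
    job_id: str
    x: float
    y: float
    required_part: str
    duration_min: int
    urgency: int  # higher means more urgent


def assign_jobs(techs: list[Technician], jobs: list[Job]) -> dict[str, Optional[str]]:
    """Greedy pass: most urgent job first, nearest feasible technician wins."""
    assignment: dict[str, Optional[str]] = {}
    for job in sorted(jobs, key=lambda j: -j.urgency):
        # Hard constraints: the part must be on the truck and the job must fit.
        feasible = [
            t for t in techs
            if job.required_part in t.parts and t.minutes_free >= job.duration_min
        ]
        if not feasible:
            assignment[job.job_id] = None  # left for the dispatcher to resolve
            continue
        best = min(feasible, key=lambda t: hypot(t.x - job.x, t.y - job.y))
        best.minutes_free -= job.duration_min
        best.x, best.y = job.x, job.y  # the technician ends up at the job site
        assignment[job.job_id] = best.name
    return assignment
```

Running a pass like this on a 15-minute timer against fresh positions and inventory would approximate the behavior described, though a real constraint solver would also handle appointment windows and travel times on actual roads.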
The entire stack runs air-gapped. No API keys. No cloud billing. The workstation’s RTX A2000 GPU handles inference comfortably with a 4K-token context window, because the optimization payloads are compact structured JSON rather than long-form text.
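To make the structured-JSON point concrete, here is a hedged sketch of how a dispatcher exception might be sent to the local Ollama server. It uses Ollama's `/api/generate` endpoint with `format`, `stream`, and `options.num_ctx`; the model tag `qwen2.5-coder:7b`, the prompt wording, and the helper names are assumptions, not details from the deployment.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_exception_request(dispatcher_text: str, routes: dict) -> dict:
    """Build a request asking the model to return the exception as JSON."""
    prompt = (
        "Convert the dispatcher note into a JSON object with keys "
        "'truck' and 'delay_min'.\n"
        f"Current routes: {json.dumps(routes)}\n"
        f"Note: {dispatcher_text}"
    )
    return {
        "model": "qwen2.5-coder:7b",  # assumed tag for the model named above
        "prompt": prompt,
        "format": "json",  # ask Ollama to constrain output to valid JSON
        "stream": False,  # one response object instead of a token stream
        "options": {"num_ctx": 4096, "temperature": 0},
    }


def parse_exception_response(body: dict) -> dict:
    """Ollama puts the generated text under 'response'; decode it as JSON."""
    return json.loads(body["response"])


def send(payload: dict) -> dict:
    """POST the payload to the local server (needs a running Ollama instance)."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

A call such as `parse_exception_response(send(build_exception_request(note, routes)))` would then hand a small dictionary to the solver, which keeps the model's role limited to translating free text into structured input.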
Results
| Metric | Before | After | Delta |
|---|---|---|---|
| Morning dispatch time | 4 hours (manual) | 90 seconds (automated) | −99.4% |
| Monthly fuel cost | $14,200 | $8,804 | −38% |
| Idle time per technician | 3.4 hrs/day | 1.3 hrs/day | −2.1 hrs/day (−62%) |
| Missed appointment windows | 12–15/week | 2–3/week | −80% |
| Dispatcher overtime | 18 hrs/month | 0 hrs/month | −100% |
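The deltas in the table follow from a plain percentage-change formula. The short check below recomputes them from the Before and After columns; the function name is illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after (negative means a reduction)."""
    return (after - before) / before * 100


dispatch = pct_change(4 * 3600, 90)   # 4 hours in seconds vs. 90 seconds
fuel = pct_change(14_200, 8_804)      # monthly fuel cost in dollars
idle = pct_change(3.4, 1.3)           # idle hours per technician per day
missed_low = pct_change(12, 2)        # low end of the missed-window range
missed_high = pct_change(15, 3)       # high end of the missed-window range

for label, value in [
    ("dispatch time", dispatch),
    ("fuel cost", fuel),
    ("idle time", idle),
    ("missed windows (low)", missed_low),
    ("missed windows (high)", missed_high),
]:
    print(f"{label}: {value:.1f}%")
```

The missed-appointment figure is a range in both columns, so the change falls between roughly −83% and −80% depending on which ends are compared.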
The $950 deployment fee was recovered in the first two weeks of fuel savings alone. The company now runs the same stack for HVAC load calculations and parts inventory forecasting — both on the same hardware, at zero additional cost.
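As a rough check on the payback claim, the sketch below divides the deployment fee by the daily fuel saving. It assumes a 30-day month and that savings ran at the full after-deployment rate from day one, neither of which is stated in the case study.

```python
def payback_days(
    fee: float,
    monthly_before: float,
    monthly_after: float,
    days_per_month: int = 30,  # assumption: flat 30-day month
) -> float:
    """Days of savings needed to cover a one-time fee."""
    daily_saving = (monthly_before - monthly_after) / days_per_month
    return fee / daily_saving


days = payback_days(950, 14_200, 8_804)
print(f"Payback in about {days:.1f} days")
```

Under those assumptions the fee is covered in under a week, so the two-week figure leaves room for a slower ramp-up.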
What Made This Work
- The hardware already existed. The CAD workstation was idle. No CapEx required.
- The data never left the building. Customer addresses, service histories, and inventory data stayed on the SQL Server they’d been running for years.
- The interface is a chat box. The dispatcher doesn’t know what a constraint solver is. She types “Truck 14 delayed, fix the afternoon” and the system redistributes routes in 90 seconds.
- No recurring software fees. The model weights are openly licensed, and the orchestration layer is self-hosted and free for internal use. The only cost was deployment.
This is the pattern we repeat across every engagement: find the idle hardware, deploy the open-source stack, build the bridge between the terminal and the shop floor.