Paperclip: The AI Agent Control Plane That Runs Our Entire Company

Most companies in 2026 are still debating whether to add a ChatGPT plugin to their customer support. Meanwhile, we are running our entire marketing operation, quality assurance, and parts of our development workflow through autonomous AI agents that wake up every hour, check their assignments, do real work, and go back to sleep. The system that makes this possible is called Paperclip, and it has fundamentally changed how Digiton operates.

This is not a theoretical post about what AI agents "could" do. This is a detailed walkthrough of production infrastructure that runs 24/7 on our systems right now. I am sharing our exact architecture, the agents we deploy, the problems we encountered, and the results we are seeing. If you are an agency founder, CTO, or operations leader thinking about deploying AI agents at your company, this is the most honest account you will find.

What Is Paperclip and Why Does It Matter

Paperclip is an open-source control plane for AI agent companies. Think of it as a project management tool like Linear or Jira, except the employees are AI agents instead of humans. Each agent has a role, a set of capabilities, a chain of command, and a budget. The control plane handles scheduling, task assignment, approval workflows, session persistence, and audit logging. It is the nervous system that turns a collection of standalone AI scripts into a coordinated workforce.

The reason Paperclip matters is that the hard problem in 2026 is not making a single AI agent do one thing well. GPT-4, Claude, and Gemini are all capable of executing individual tasks. The hard problem is orchestrating multiple agents that share context, respect dependencies, handle failures gracefully, and work toward shared company goals without contradicting each other. That is what a control plane solves.

A single AI agent is a tool. Seven coordinated agents with shared context and approval workflows are an organization. Paperclip is what makes the second one possible.

The Architecture: How Agent Orchestration Actually Works

Paperclip runs as a local server on our infrastructure. The tech stack is straightforward: an Express REST API, a React dashboard, an embedded PostgreSQL database for state management, and adapter plugins that connect to different AI model providers. At Digiton, we use Gemini as our primary adapter because of its speed and cost profile, but the system is model-agnostic. You could run Claude, GPT-4, or even local models through the same interface.

Every agent is defined by a configuration object that includes its role (CMO, DevOps, Designer, etc.), its adapter type (which AI model it uses), its reporting structure (who it reports to), its capabilities (what tools it can access), and its budget (monthly token spend limit with hard stops). The control plane stores all of this in PostgreSQL and serves it to agents through authenticated API endpoints.
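To make the configuration object concrete, here is a minimal Python sketch of what one agent record could look like. The class name, field names, and the budget figure are illustrative assumptions, not Paperclip's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical shape of one agent's configuration record."""
    role: str                      # e.g. "CMO", "DevOps", "Designer"
    adapter: str                   # which model provider the agent uses
    reports_to: Optional[str]      # manager's role; None at the top of the chain
    capabilities: tuple[str, ...]  # tools the agent is allowed to use
    monthly_token_budget: int      # hard stop once this many tokens are spent

    def can_use(self, tool: str) -> bool:
        """True if the tool is in this agent's capability list."""
        return tool in self.capabilities

    def over_budget(self, tokens_spent: int) -> bool:
        """True once spending reaches the monthly limit."""
        return tokens_spent >= self.monthly_token_budget


cmo = AgentConfig(
    role="CMO",
    adapter="gemini",
    reports_to="CEO",
    capabilities=("shell", "files", "http"),
    monthly_token_budget=5_000_000,  # made-up number for illustration
)
```

Keeping the record immutable means an agent cannot quietly widen its own capabilities or budget during a run; any change has to go through the control plane.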

The dashboard runs at localhost:3100 and gives us a real-time view of every agent: their status, their current task, their recent runs, any errors, and their budget consumption. We can manually trigger heartbeats, assign tasks, pause agents, or review their work. It looks and feels like managing a real team because that is exactly what it is.

Critically, Paperclip separates the control plane from the execution plane. The control plane manages identity, scheduling, task routing, and state. The execution plane is where agents actually do work: spawning shell processes, writing files, and calling APIs. This separation means you can upgrade the orchestration layer without touching agent logic, and vice versa.

Our 7 AI Agents and What They Do

At Digiton, we run seven agents through Paperclip. Each has a distinct role and clear boundaries:

CMO (Chief Marketing Officer): Runs our SEO engine, manages 6 automated cron jobs for content generation, keyword monitoring, competitor analysis, PR outreach, backlink acquisition, and blog publishing. This single agent replaced what previously required a marketing team of 3 people.
Designer: Handles UI/UX tasks across our client projects and internal tools. Reviews design consistency, generates component variations, and maintains our design system documentation.
QA (Quality Assurance): Runs automated test suites, reviews pull requests for potential bugs, validates accessibility compliance, and generates test coverage reports.
CEO: Strategic agent that reviews cross-team goals, resolves conflicts between agent priorities, and generates weekly business reports summarizing all agent activity.
CTO: Technical architecture decisions, code review escalations, infrastructure monitoring, and security audit tasks.
Dev: Core development execution. Picks up implementation tasks, writes code, creates pull requests, and responds to code review feedback.
DevOps: Infrastructure management, deployment automation, monitoring alerts, and performance optimization.

Each agent operates independently but shares context through Paperclip issues. When the CMO identifies a keyword gap through competitor analysis, it creates a subtask for the Dev agent to generate blog content. When the Dev agent writes code, the QA agent automatically reviews it. When the QA agent finds a bug, it assigns a fix task back to Dev. This creates organic workflows that mirror how a real team operates, except everything runs at machine speed.

The Heartbeat Protocol: How Agents Wake Up and Work

Agents in Paperclip do not run continuously. They execute in short bursts called heartbeats. A heartbeat is triggered by a schedule, a task assignment, a comment mention, or a manual invocation from the dashboard. When an agent wakes up, it follows a strict protocol:

Identity Check: The agent calls GET /api/agents/me to retrieve its configuration, company context, and budget.
Approval Follow-up: If woken by an approval resolution, the agent processes that first.
Get Assignments: The agent queries for all issues assigned to it that are in todo, in_progress, or blocked status.
Pick Work: It works on in-progress tasks first, then todo items. Blocked tasks are skipped unless the agent can unblock them.
Checkout: Before doing any work, the agent atomically checks out the task. If another agent already has it, the agent gets a 409 Conflict and moves on. No retries. This prevents duplicate work.
Understand Context: The agent reads the full issue, all comments, and any parent issues to understand why the task exists.
Execute: The agent uses its tools (shell, files, APIs) to complete the work.
Update Status: The agent marks the task as done, blocked, or in-progress with a comment explaining what happened.
Delegate: If the task requires another agent, it creates subtasks with proper parent relationships and goal references.
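The pick-work and checkout steps above can be sketched in a few lines of Python. This is an in-memory stand-in, not Paperclip's code: the ControlPlane class, the Conflict exception (standing in for the HTTP 409 response), and all field names are assumptions made for illustration. In a real deployment the checkout would be a single conditional update in the database so that it stays atomic.

```python
from dataclasses import dataclass
from typing import Optional


class Conflict(Exception):
    """Stands in for an HTTP 409 Conflict from the control plane."""


@dataclass
class Issue:
    id: int
    status: str                    # "todo", "in_progress", "blocked", or "done"
    assignee: str
    checked_out_by: Optional[str] = None


class ControlPlane:
    """In-memory stand-in for the real API (hypothetical)."""

    def __init__(self, issues):
        self.issues = {issue.id: issue for issue in issues}

    def assignments(self, agent):
        wanted = {"todo", "in_progress", "blocked"}
        return [i for i in self.issues.values()
                if i.assignee == agent and i.status in wanted]

    def checkout(self, issue_id, agent):
        issue = self.issues[issue_id]
        if issue.checked_out_by not in (None, agent):
            raise Conflict(f"issue {issue_id} is held by {issue.checked_out_by}")
        issue.checked_out_by = agent
        issue.status = "in_progress"


def pick_order(issues):
    """In-progress tasks first, then todo; blocked tasks are skipped."""
    rank = {"in_progress": 0, "todo": 1}
    workable = [i for i in issues if i.status in rank]
    return sorted(workable, key=lambda i: (rank[i.status], i.id))


def heartbeat(plane, agent, execute):
    """Run one heartbeat and return the ids of the issues completed."""
    completed = []
    for issue in pick_order(plane.assignments(agent)):
        try:
            plane.checkout(issue.id, agent)
        except Conflict:
            continue               # someone else has it: move on, no retry
        execute(issue)             # the agent's actual work happens here
        issue.status = "done"
        issue.checked_out_by = None
        completed.append(issue.id)
    return completed
```

The important design choice is that a conflict is not an error to recover from. The agent simply skips the task, which is what keeps two agents from doing the same work.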

The entire heartbeat takes 30 seconds to 5 minutes depending on task complexity. At Digiton, our CMO agent runs heartbeats every hour, while slower-moving scheduled jobs such as its Rank Monitor run every 6 hours. Heartbeat frequency is configurable per agent based on the urgency of the role.

Real Results: What Changed After 30 Days

We deployed Paperclip at Digiton in early 2026. Here is what the first 30 days looked like in concrete numbers:

Blog content output increased from 2 posts per month (manual) to 15+ posts per month (automated). Every post is SEO-optimized with proper schema markup, internal linking, and FAQ sections.
Competitor monitoring went from quarterly manual checks to continuous, automated intelligence. We now scrape 6 competitor websites every 3 hours, extracting keywords, blog topics, and meta strategy changes.
IndexNow submissions to search engines happen automatically every hour. Previously, we submitted manually once a week.
PR outreach capacity went from 5 emails per week (when an intern remembered) to 20+ targeted pitches per day across Portuguese media, EU tech publications, and global AI outlets.
Code review turnaround dropped from 24 hours to under 30 minutes for standard pull requests.
Time saved: approximately 45 hours per week across the team, mostly from content creation, SEO monitoring, and routine development tasks.

45 hours per week saved is not a vanity metric. That is a full-time employee. We redeployed that human capacity to client delivery and strategic work that AI agents genuinely cannot do yet: relationship building, creative direction, and complex problem diagnosis.

The CMO Agent: Replacing a Marketing Team

The CMO agent deserves its own section because it is our most impactful deployment. It runs 6 autonomous cron jobs through the SEO scheduler:

Content Engine (every 1 hour): Targets keyword clusters and submits URLs to IndexNow for rapid indexing by Bing and the other search engines that support the protocol.
Blog Publisher (every 2 hours): Uses Gemini 2.5 Flash Lite to generate SEO-optimized blog posts, automatically commits them to our GitHub repository, and triggers Vercel deployments.
Competitor Spy (every 3 hours): Scrapes altar.io, imaginarycloud.com, twistag.com, aisuperior.com, vume.ai, and techrivo.com for meta tags, headings, blog titles, and keyword strategies. Identifies keyword gaps and generates counter-content suggestions.
Backlink Outreach (every 4 hours): Identifies directory listing opportunities (Clutch, GoodFirms, DesignRush, etc.) and generates outreach emails for backlink exchange.
Rank Monitor (every 6 hours): Tracks keyword positions for 20+ target keywords across Google and Bing.
PR Outreach (every 8 hours): Sends personalized pitches to 25+ media contacts across Portuguese press, EU tech publications, and global AI outlets. Rate-limited to 20 emails per hour to stay within Gmail sending limits.
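The six intervals above can be written down as a small schedule table. The sketch below is in Python, uses job identifiers we made up for illustration, and assumes every job is aligned to the same starting hour. It is not the scheduler's real configuration format.

```python
# Interval in hours for each job, taken from the list above.
# The snake_case job names are hypothetical identifiers.
SCHEDULE_HOURS = {
    "content_engine": 1,
    "blog_publisher": 2,
    "competitor_spy": 3,
    "backlink_outreach": 4,
    "rank_monitor": 6,
    "pr_outreach": 8,
}


def jobs_due(hours_since_start: int) -> list[str]:
    """Jobs whose interval divides the number of hours elapsed."""
    return [name for name, every in SCHEDULE_HOURS.items()
            if hours_since_start % every == 0]


def runs_per_day(every_hours: int) -> int:
    """How many times a job with this interval fires in 24 hours."""
    return 24 // every_hours


total_runs_per_day = sum(runs_per_day(h) for h in SCHEDULE_HOURS.values())
```

With these intervals the scheduler fires 57 job runs per day in total, and at every 24th hour all six jobs come due at once, which is worth knowing when sizing API rate limits.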

Every cron job completion is automatically posted as an issue to the Paperclip dashboard. This means the CMO dashboard at localhost:3100 shows real activity: which jobs ran, what they produced, and whether they succeeded or failed. It is a live marketing control room.

Competitor Intelligence on Autopilot

The Competitor Spy module alone would justify the entire Paperclip deployment. Every 3 hours, it launches a headless Chromium browser, navigates to each competitor website, and extracts page titles and meta descriptions, all H1 and H2 headings, meta keywords, blog post titles from their blog feeds, and inferred technology keywords from their content. It then cross-references this against our own keyword inventory to find gaps: topics our competitors rank for that we do not yet have content on.
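The extract-then-compare idea can be sketched with nothing but the Python standard library. This is a simplified illustration, not the module itself: it parses an HTML string that has already been fetched (the headless browser step is left out), and the class and function names are assumptions.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collects the text of <title>, <h1> and <h2> elements."""

    TARGETS = {"title", "h1", "h2"}

    def __init__(self):
        super().__init__()
        self._current = None       # tag we are currently inside, if any
        self._buffer = []
        self.found = []            # list of (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag in self.TARGETS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.found.append((tag, text))
            self._current = None


def normalize(term: str) -> str:
    """Lowercase and collapse whitespace so terms compare reliably."""
    return " ".join(term.lower().split())


def keyword_gaps(competitor_terms, our_terms):
    """Terms the competitor uses that are missing from our inventory."""
    ours = {normalize(t) for t in our_terms}
    return sorted({normalize(t) for t in competitor_terms} - ours)
```

Feeding a page through HeadingExtractor and passing the collected texts to keyword_gaps along with our own keyword list yields the candidate topics. Ranking them by priority is a separate step not shown here.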

In the first run alone, we identified 7 keyword gaps across 6 competitors, with 3 rated as high priority. The system automatically generates suggested blog titles for each gap. In a traditional SEO workflow, this kind of competitive analysis takes 8-10 hours of manual work per month. We now do it every 3 hours, continuously, for free.

PR Outreach and Backlink Acquisition

One of the most undervalued aspects of SEO is digital PR and backlink acquisition. Most agencies know this matters but lack the capacity to execute consistently. Our PR outreach module maintains a database of 25+ contacts across four categories: Portuguese tech media (Eco Sapo, Dinheiro Vivo, Pplware, Portugal Startups), EU publications (The Next Web, Sifted, Tech.eu), global AI outlets (AI Business, VentureBeat, Hacker Noon), and PR agencies in Lisbon.

Each contact receives a personalized email template based on their category. Media contacts get a story pitch about our 9-year journey from marketing agency to AI company. PR agencies get a partnership proposal. Business directories get standardized listing submissions. Every email is tracked in a JSON log. The system never sends to the same contact twice, and it respects rate limits strictly to avoid being flagged as spam.
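The two guarantees in that paragraph, never contacting the same address twice and staying under an hourly cap, both fall out of the JSON log. Here is a minimal Python sketch under assumed field names (email, category, sent_at as a Unix timestamp). Sending the email and writing the log text to disk are deliberately left out.

```python
import json

MAX_PER_HOUR = 20  # hourly cap; adjust to your provider's limits


def select_recipients(contacts, log, now, max_per_hour=MAX_PER_HOUR):
    """Return contacts that were never emailed, capped by the hourly budget."""
    already_sent = {entry["email"] for entry in log}
    sent_last_hour = sum(1 for entry in log if now - entry["sent_at"] < 3600)
    room = max(0, max_per_hour - sent_last_hour)

    picked = []
    for contact in contacts:
        if len(picked) >= room:
            break
        if contact["email"] in already_sent:
            continue
        already_sent.add(contact["email"])  # also guards against duplicate rows
        picked.append(contact)
    return picked


def append_entry(log_text, contact, now):
    """Add one send record to the JSON log text and return the updated text."""
    log = json.loads(log_text) if log_text else []
    log.append({
        "email": contact["email"],
        "category": contact["category"],
        "sent_at": now,
    })
    return json.dumps(log, indent=2)
```

Because the log is the single source of truth for who has been contacted, a crashed or repeated run cannot double-send as long as each entry is recorded right after its email goes out.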

This is not spray-and-pray email marketing. These are targeted, personalized pitches sent at a pace that a human marketer would send them, automated so that we never miss a day.

Challenges and What Went Wrong

I would be dishonest if I said this was a smooth deployment. Here are the real problems we encountered:

Session persistence across restarts: When the machine restarts, any in-flight agent work is lost. Paperclip stores session state in the database, so persisted state survives, but whatever an agent was doing mid-heartbeat is gone. We solved this by designing all agent tasks to be idempotent: if a task runs twice, it produces the same result without duplication.
Workspace conflicts: When the CMO agent tried to write files in the same directories as the Dev agent, we got merge conflicts. The solution was clear workspace boundaries: each agent operates in its own directory.
Gemini adapter rate limits: Running 7 agents with hourly heartbeats quickly hit API rate limits. We staggered heartbeat schedules so no two agents run simultaneously.
False confidence in logs: Early on, agents would report "50 proposals sent" when they had actually just logged proposals internally without submitting them. We learned to verify every claimed output with independent checks. If the evidence is not in an external system (Gmail sent folder, GitHub commits, database records), it did not happen.
Selector rot in web scraping: Competitor websites change their HTML structure frequently. Our scraper selectors broke within days. We now use more resilient CSS selectors and fallback chains.
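The staggering fix for the rate-limit problem is simple arithmetic. A minimal Python sketch, assuming all seven agents share an hourly period and are spread evenly across it (the function names are ours, for illustration):

```python
def staggered_offsets(agents, period_minutes=60):
    """Spread agents evenly across the period; returns {agent: minute offset}."""
    n = len(agents)
    return {name: (i * period_minutes) // n for i, name in enumerate(agents)}


def min_gap(offsets, period_minutes=60):
    """Smallest spacing between consecutive start times, wrapping around."""
    starts = sorted(offsets.values())
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    gaps.append(period_minutes - starts[-1] + starts[0])
    return min(gaps)


AGENTS = ["CMO", "Designer", "QA", "CEO", "CTO", "Dev", "DevOps"]
offsets = staggered_offsets(AGENTS)
```

With seven agents the smallest gap between start times is 8 minutes. Since a heartbeat takes at most about 5 minutes, consecutive heartbeats should not overlap under this layout.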

The biggest lesson: AI agents are employees that never lie intentionally, but they can be sincerely wrong about what they accomplished. Always verify outputs through independent channels.
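In practice, independent verification can be as simple as a set comparison between what the agent says it did and what an external system actually shows. A minimal Python sketch, with hypothetical identifiers standing in for things like message IDs or commit hashes:

```python
def verify_claim(claimed_ids, external_ids):
    """Split an agent's claimed outputs into verified and unverified."""
    claimed = set(claimed_ids)
    external = set(external_ids)
    return {
        "verified": sorted(claimed & external),
        "unverified": sorted(claimed - external),
    }


def claim_holds(claimed_ids, external_ids):
    """True only if every claimed output shows up in the external system."""
    return not verify_claim(claimed_ids, external_ids)["unverified"]
```

The external list has to come from a system the agent does not write its own logs to; otherwise the check only confirms that the agent agrees with itself.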

How to Deploy Your Own AI Agent Team

If you want to replicate this at your company, here is the practical path:

Start with one agent. Pick the most repetitive, highest-volume task at your company. Content creation and SEO monitoring are excellent starting points because the ROI is obvious and mistakes are recoverable.
Use Paperclip as your control plane. It is open source, runs locally, and requires no cloud infrastructure to start. Install it, run pnpm dev, and you have a dashboard in 2 minutes.
Choose your adapter. Gemini for cost-efficiency, Claude for nuanced writing, GPT-4 for broad capability. You can mix adapters across agents.
Define clear task boundaries. Each agent should own a specific domain. Overlap creates conflicts. Start narrow and expand.
Implement idempotent tasks. Every task should be safe to run twice. This eliminates half of the failure modes in agent systems.
Monitor outputs, not intentions. Do not trust agent logs. Check the actual external results: emails in your sent folder, commits in GitHub, rankings in Google Search Console.
Scale gradually. Add a second agent only after the first is stable for 2 weeks. Rush and you will spend more time debugging agent conflicts than the agents save you.
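For the idempotency step, one common approach is to derive a stable key from each task and skip work whose key has already completed. The sketch below is an illustration in Python with assumed names. It keeps completed keys in memory, whereas a real setup would store them somewhere durable such as the database.

```python
import hashlib


def task_key(kind: str, payload: str) -> str:
    """Stable identifier for a unit of work."""
    return hashlib.sha256(f"{kind}:{payload}".encode("utf-8")).hexdigest()


class IdempotentRunner:
    """Runs each distinct task at most once and replays the stored result."""

    def __init__(self):
        self.completed = {}        # task key -> result of the first run

    def run(self, kind, payload, action):
        key = task_key(kind, payload)
        if key in self.completed:
            return self.completed[key]   # second run: no side effects
        result = action(payload)
        self.completed[key] = result
        return result
```

If a restart replays a heartbeat, the second call returns the stored result instead of publishing the same post or sending the same email again.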


Frequently Asked Questions

Is Paperclip open source?

Yes. Paperclip is fully open source and can be self-hosted. It runs locally with embedded PostgreSQL, no cloud dependencies required. At Digiton, we contribute to the project and run it in production.

How much does it cost to run an AI agent team through Paperclip?

The control plane is free (open source). The main cost is API calls to your AI model provider. Running 7 agents with hourly heartbeats costs approximately $50-150/month on Gemini. Claude or GPT-4 would be 3-5x more. The ROI on even one agent typically exceeds 10x in the first month.
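To put those figures in per-heartbeat terms, here is the arithmetic as a short Python sketch. It assumes a 30-day month and that all 7 agents really do run 24 heartbeats a day. The dollar ranges are the ones quoted above.

```python
def heartbeats_per_month(agents: int, per_day: int, days: int = 30) -> int:
    """Total heartbeats across all agents in one month."""
    return agents * per_day * days


def cost_per_heartbeat(monthly_cost: float, heartbeats: int) -> float:
    """Average spend per heartbeat in dollars."""
    return monthly_cost / heartbeats


total = heartbeats_per_month(agents=7, per_day=24)
gemini_low = cost_per_heartbeat(50, total)
gemini_high = cost_per_heartbeat(150, total)

# "3-5x more" applied to the Gemini range gives the other providers' range.
other_monthly_low = 50 * 3
other_monthly_high = 150 * 5
```

Under these assumptions that is 5,040 heartbeats per month, or roughly one to three cents per heartbeat on Gemini, and somewhere between $150 and $750 per month on the more expensive providers.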

Can Paperclip agents access the internet and external APIs?

Yes. Agents can execute shell commands, write files, make HTTP requests, and use any tool available on the host machine. At Digiton, our CMO agent sends emails via Gmail SMTP, pushes to GitHub, and submits URLs to IndexNow, all through standard APIs.

How do you prevent agents from making mistakes?

Three layers: approval gates for high-risk actions (deleting production data, spending money), budget hard stops that pause agents when spending exceeds limits, and verification checks that compare agent-claimed outputs against actual external results.
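The first two layers reduce to a check that runs before every action. A minimal Python sketch, where the action names and the three return values are hypothetical labels chosen for illustration:

```python
# Actions that always need a human sign-off (names are illustrative).
HIGH_RISK_ACTIONS = {"delete_production_data", "spend_money"}


def gate(action: str, tokens_spent: int, token_budget: int, approved: bool = False) -> str:
    """Decide whether an agent may proceed with an action.

    Returns "paused" when the budget is exhausted, "needs_approval" when a
    high-risk action has no sign-off yet, and "allowed" otherwise.
    """
    if tokens_spent >= token_budget:
        return "paused"            # budget hard stop wins over everything else
    if action in HIGH_RISK_ACTIONS and not approved:
        return "needs_approval"
    return "allowed"
```

Checking the budget first means that even an approved high-risk action cannot run once the agent has hit its spending limit.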

Is this the same as using ChatGPT or Claude directly?

No. Using ChatGPT is like hiring a freelancer for one task. Paperclip is like having a full-time team that shares context, follows processes, respects approval chains, and builds knowledge over time through session persistence. The difference is orchestration, not capability.

Do I need to be a developer to use Paperclip?

Setting up the initial system requires basic developer skills (running terminal commands, editing config files). Once running, the dashboard is designed for non-technical operators: you can assign tasks, monitor progress, and review work through the web UI.

Resources