52 weeks of changelogs

For my entire career, there have been two types of work: boring and interesting. Boring tasks have fallen into two categories: automatable and unautomatable.

Chart showing the evolution of work automation constraints over time

For the last half-decade, I've been automating the boring parts of my work, recognizing that many automations have been limited by my free time, skill, or technology. There has been a relatively small overlap between boring and automatable work.

But these constraints no longer apply. In the last 6 months, I've had to challenge myself to rethink what's possible.

I'm continually trying to increase the amount of time I spend in an enthusiastic state, with the hypothesis that we produce our highest and best work when we're creating from a place of passion and love.

The great news is: AI is allowing us to do just that.

A great example is writing changelogs. This has been a largely unautomatable, recurring task with a majority of undifferentiated work and a sprinkle of skill.

Changelogs have been a thorn in my side since I became a DevRel, and I'd be lying if I said I didn't dread writing them.

In 2024/2025 we shipped 52 consecutive weekly changelogs. This is how we did it with the Claude Agent SDK. ¹ ²

Mindset

These changelogs aren't meant to ship without review. The AI drafts with me as the audience, not our users.

Each draft includes images, links to Slack threads, and internal context. Then I do the editorial work: crafting story and tone, mostly by cutting what's internal or irrelevant.

Early iterations were written for our users, but I found myself delegating critical judgment calls to an agent: what to omit, what to include, what to emphasize.

By framing myself as the audience, I'm first learning what we built, then communicating that to users as clearly as possible.

That's the differentiated skill I bring as a professional... and the entire point of using an agent: isolate undifferentiated work and delegate it.

Changelogs are important, and they do require skill. It's just that much of the work is boring; our goal is to isolate the part that is not.

Coding agents

Perhaps unsurprisingly, coding agents turned out to be the best fit for the task.

Marketing-specific (and even workflow-building) tools silo automation from your existing infrastructure.

Coding agents compose naturally with developer tools: MCP servers, GitHub, CI pipelines. As it turns out, content is just code.

I found the Claude Agent SDK to be a simple, declarative way to build multi-agent workflows. Combined with Skills and MCP servers, it provided a consistent, extensible foundation.

It was also shockingly fast to build & iterate on.

The experience makes me believe that content and coding agents will converge in 2026: either everyone will become an engineer, or generalist engineers will become content creators.

Context and tools

I used MCP servers to perform actions and Claude Skills for relevant context.

The system uses five MCP servers: two custom and three official.

Custom MCP servers

I wrote two custom MCP servers for deterministic operations:

Slack server: Fetches and structures messages from our feature announcement channel, including all responses, threads, and images
GitHub changelog server: Handles PR creation and frontmatter formatting

These are really just tools wrapped in the MCP interface. The SDK's create_sdk_mcp_server makes this easy and gives me deterministic control over what data gets extracted.
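To make "deterministic control" concrete, here is a minimal sketch of the kind of structuring a Slack tool might do before an agent sees any data. It is not the author's server: `structure_messages` is a hypothetical name, and the input is assumed to follow the Slack Web API message shape (`ts`, `thread_ts`, `text`, and `files` entries with `mimetype` and `url_private`). A function like this could then be registered as a tool and handed to create_sdk_mcp_server.

```python
from collections import defaultdict


def structure_messages(messages):
    """Group flat Slack-style messages into threads with replies and images.

    Hypothetical helper. Assumes each message dict has `ts`, an optional
    `thread_ts`, `text`, and optional `files` entries carrying `mimetype`
    and `url_private`.
    """
    by_thread = defaultdict(list)
    for msg in messages:
        # A reply carries its parent's timestamp in thread_ts; a root
        # message has no thread_ts or one equal to its own ts.
        by_thread[msg.get("thread_ts", msg["ts"])].append(msg)

    threads = []
    for root_ts in sorted(by_thread, key=float):
        ordered = sorted(by_thread[root_ts], key=lambda m: float(m["ts"]))
        images = [
            f["url_private"]
            for m in ordered
            for f in m.get("files", [])
            if f.get("mimetype", "").startswith("image/")
        ]
        threads.append(
            {
                "announcement": ordered[0]["text"],  # earliest message in the thread
                "replies": [m["text"] for m in ordered[1:]],
                "images": images,
            }
        )
    return threads
```

Because the grouping is plain code, the same input always yields the same structure, and the agent only has to decide what the content means.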

Official MCP servers

Three official MCP servers work without custom configuration.

GitHub: Has 90+ tools organized into selectable toolsets via headers (I use pull_requests and repos).
Documentation servers (Mintlify and Replit): Agents get context about Replit features and Mintlify components. They can reference docs and use correct syntax without me embedding it in prompts.

MCP is an open standard: any MCP server works with any LLM application that supports the protocol. We can reuse these strategies across workflows, repositories, and agents. ³

Skills

Skills came to prominence while I was building with the SDK. They provide domain expertise through progressive disclosure: load content on-demand rather than embedding it in prompts.

Skills in our workflow

Brand guidelines: Our brand guidelines, including tone, voice, and style
Changelog formatting: How to format a changelog by our standards
Documentation quality: The full text of a documentation quality guide, included verbatim
Media insertion: How to properly insert images and videos into a changelog

Skills solved one of the biggest issues I was having with the MVP: supplying the right context at the right time.
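Progressive disclosure can be illustrated without the SDK. The sketch below is a simplification, not how Claude Skills are implemented: it assumes each skill lives at `<skills_dir>/<name>/SKILL.md` with simple `key: value` frontmatter, and the function names are hypothetical. Only the short index stays in context; the full body is read when a task needs it.

```python
from pathlib import Path


def read_frontmatter(skill_file):
    """Return (metadata, body) from a file with simple `key: value` frontmatter."""
    text = Path(skill_file).read_text(encoding="utf-8")
    meta, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta, body.lstrip("\n")


def skill_index(skills_dir):
    """Cheap summary kept in context: one name and description per skill."""
    index = {}
    for skill_file in sorted(Path(skills_dir).glob("*/SKILL.md")):
        meta, _ = read_frontmatter(skill_file)
        index[meta.get("name", skill_file.parent.name)] = {
            "description": meta.get("description", ""),
            "path": skill_file,
        }
    return index


def load_skill(index, name):
    """Read the full skill body only when the task calls for it."""
    _, body = read_frontmatter(index[name]["path"])
    return body
```

The trade-off is one extra read per skill in exchange for prompts that stay small until the extra context is actually needed.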

Interestingly, the final SDK output runs through the command line, so the agent is executed as one long bash command. This has implications for cross-platform development. ⁴
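If a prompt does end up as a single command-line argument, a size check before launch could surface the problem on any platform. This is a sketch under assumptions: the helper names are hypothetical, and the 131,072-byte default is the Linux per-argument cap with 4 KiB pages, which may not match your target system.

```python
import os

# Assumed default: Linux caps one argv string at 128 KiB (with 4 KiB pages).
LINUX_MAX_ARG_BYTES = 131072


def fits_in_one_argument(prompt, limit=LINUX_MAX_ARG_BYTES):
    """Return True if the UTF-8 encoded prompt fits in a single argv entry."""
    # The limit is counted in bytes and includes the trailing NUL terminator.
    return len(prompt.encode("utf-8")) + 1 <= limit


def platform_arg_max():
    """Total argv + environment budget reported by the OS, or None if unknown."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms (for example Windows).
        return None
```

Running a check like this in CI on the deployment OS, rather than on a laptop, is one way to act on the "test there early" advice.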

From Matt to multi-agent

The agent architecture mirrors my human workflow:

Changelog Writer: Fetches Slack updates and drafts content
Template Formatter: Reformats to match the template structure
Review & Feedback: Reviews tone, quality, and accuracy
PR Writer: Creates a GitHub PR with formatted content

Introspecting your own workflows is a necessary prerequisite for building agents. The same way a product thinker must understand their own product, agent builders should understand their own workflows.

An early MVP chained prompts in sequence. No context sharing, no revision, no tools. The multi-agent rewrite fixed that. Agents share context, revise based on feedback, and compose with MCP servers and Skills.
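The difference between chained prompts and stages that share context can be sketched in plain Python. Everything here is a stand-in: no model is called, the stage functions are hypothetical, and the review rule is a toy. The point is only the shape: one shared state object, and a loop in which feedback flows back to the writer.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContext:
    """State every stage can read and update."""
    updates: list
    draft: str = ""
    feedback: list = field(default_factory=list)
    approved: bool = False
    pr_title: str = ""


def write(ctx):
    # Stand-in for the Changelog Writer: draft from updates, applying feedback.
    lines = [f"- {u}" for u in ctx.updates]
    if ctx.feedback:
        lines.insert(0, "# Changelog")
    ctx.draft = "\n".join(lines)


def format_to_template(ctx):
    # Stand-in for the Template Formatter: normalize trailing whitespace.
    ctx.draft = ctx.draft.strip() + "\n"


def review(ctx):
    # Stand-in for Review & Feedback: approve, or leave a note for the next pass.
    if ctx.draft.startswith("# "):
        ctx.approved = True
    else:
        ctx.feedback.append("Add a title heading.")


def prepare_pr(ctx):
    # Stand-in for the PR Writer: only acts on an approved draft.
    if ctx.approved:
        ctx.pr_title = f"Changelog draft ({len(ctx.updates)} updates)"


def run_pipeline(updates, max_rounds=3):
    ctx = SharedContext(updates=updates)
    for _ in range(max_rounds):
        write(ctx)
        format_to_template(ctx)
        review(ctx)
        if ctx.approved:
            break
    prepare_pr(ctx)
    return ctx
```

In a chained-prompt version, the reviewer's note would have nowhere to go; here it lands in `feedback` and changes the next draft.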

150 lines of Python

The entire multi-agent orchestration fits in one file: about 150 lines total. The rest is skills, docs, & servers.

The structure breaks down into five parts:

1. Configure MCP servers. Map server names to their configurations.
2. Define granular permissions. Each permission is a string that controls what actions an agent can take. This includes file operations (Read, Write, Edit) and MCP server tools (mcp__slack_updates__fetch_messages_from_channel).
3. Group permissions by agent. Each agent gets only the permissions it needs for its specific task. The changelog_writer can fetch from Slack and write docs. The template_formatter can only edit today's changelog file.
4. Define each agent. Each AgentDefinition includes a description (for routing), a prompt, a model choice, and its permission group. Anthropic's model lineup works well for this: Opus, Sonnet, and Haiku all have a role.
5. Run the orchestrator. The ClaudeSDKClient coordinates everything: routing between agents, enforcing permissions, and connecting to MCP servers.

Code example

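import asyncio
import os

from claude_agent_sdk import (
    AgentDefinition,
    ClaudeAgentOptions,
    ClaudeSDKClient,
    create_sdk_mcp_server,
)
from claude_agent_sdk.types import McpHttpServerConfig

# Defined elsewhere in the project: fetch_messages_from_channel (the custom
# Slack tool) and USER_PROMPT (the orchestrator's instructions).
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]  # assumed environment variable name

# 1. Configure MCP servers
MCP_SERVERS = {
    "github": McpHttpServerConfig(
        type="http",
        url="https://api.githubcopilot.com/mcp/",
        headers={
            "X-MCP-Toolsets": "pull_requests,repos",
            "Authorization": f"Bearer {GITHUB_TOKEN}",
        },
    ),
    "slack_updates": create_sdk_mcp_server(
        name="slack_updates",
        tools=[fetch_messages_from_channel],
    ),
}

# 2. Define permissions
permissions = {
    "read_docs": "Read(./docs/**/*.md)",
    "fetch_messages": "mcp__slack_updates__...",
    # Referenced by template_formatter below; the path pattern for today's
    # changelog file is elided here.
    "edit_changelog": "Edit(...)",
}

# 3. Group by agent
permission_groups = {
    "changelog_writer": [
        permissions["fetch_messages"],
        permissions["read_docs"],
    ],
    "template_formatter": [
        permissions["edit_changelog"],
    ],
}

# 4. Define agents
options = ClaudeAgentOptions(
    agents={
        "changelog_writer": AgentDefinition(
            description="Fetch and summarize",
            prompt="You are a changelog writer...",
            model="sonnet",  # Use Sonnet for complex tasks
            tools=permission_groups["changelog_writer"],
        ),
        "template_formatter": AgentDefinition(
            description="Format to template",
            prompt="Format the changelog...",
            model="haiku",  # Use Haiku for simple formatting
            tools=permission_groups["template_formatter"],
        ),
    },
    mcp_servers=MCP_SERVERS,
)


# 5. Run orchestrator (also uses Sonnet)
async def main():
    async with ClaudeSDKClient(options) as client:
        await client.query(prompt=USER_PROMPT)
        # Drain the response stream so the run completes before the client closes.
        async for message in client.receive_response():
            print(message)


asyncio.run(main())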

The Agent SDK abstracts away the complexity of multi-agent coordination, permission enforcement, and tool routing. You define what each agent does and what it can access.

Media handling

One of the most boring parts of changelogs is handling media: images, videos, GIFs, etc.

The custom Slack MCP server recursively crawls every message and thread, extracting all images and maintaining context about which feature they document. On average, each changelog pulls 15-20 images across 8-12 threads.

The system over-fetches initially, then CI handles cleanup: lint fails on unused images, compression reduces file sizes by 40-60%, and GIFs must stay under 2MB.
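A lint step like the one described might look roughly like this. It is a sketch, not the project's CI script: the function name, the directory layout, and the set of image extensions are assumptions, "2MB" is read as 2 MiB, and images are matched by file name only (so identically named files in different folders would be conflated).

```python
import re
from pathlib import Path

MAX_GIF_BYTES = 2 * 1024 * 1024  # assumed reading of the 2MB ceiling
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
IMAGE_PATH = re.compile(r"[\w./-]+\.(?:png|jpe?g|gif|webp)", re.IGNORECASE)


def lint_media(docs_dir, images_dir):
    """Return a list of problems: unreferenced images and oversized GIFs."""
    referenced = set()
    for doc in Path(docs_dir).rglob("*.md*"):  # matches .md and .mdx
        text = doc.read_text(encoding="utf-8")
        # Record the file name of anything that looks like an image path.
        for match in IMAGE_PATH.findall(text):
            referenced.add(Path(match).name)

    problems = []
    for image in sorted(Path(images_dir).rglob("*")):
        if not image.is_file() or image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if image.name not in referenced:
            problems.append(f"unused image: {image.name}")
        if image.suffix.lower() == ".gif" and image.stat().st_size >= MAX_GIF_BYTES:
            problems.append(f"GIF too large: {image.name}")
    return problems
```

In CI, a non-empty result would fail the job, which is what lets the agent over-fetch safely: anything it pulled but never used gets flagged before merge.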

Mintlify is especially nice because it handles image hosting directly from our GitHub repo and auto-deploys branches to a preview URL. Each changelog is a self-contained, previewable document that's ready for review immediately after it's generated.

Results

After 52 weeks in production, each changelog went from 2 hours to 10 minutes. More importantly, our users stayed informed. That consistency compounded into trust.

The system runs entirely on Replit with no infrastructure maintenance. ⁵ Running on the Claude API costs under $1 per changelog, or less than $52 a year, for a system that saved $15,000+ in labor.
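The figures above can be checked with a few lines of arithmetic. The inputs are the post's own numbers; the implied hourly rate is only what they would mean if the entire $15,000 came from the weekly time savings, which the post does not state.

```python
WEEKS = 52
MINUTES_BEFORE = 120           # "2 hours" per changelog
MINUTES_AFTER = 10             # "10 minutes" per changelog
API_COST_PER_CHANGELOG = 1.00  # upper bound: "under $1"
LABOR_SAVED = 15_000           # "$15,000+"

hours_saved = WEEKS * (MINUTES_BEFORE - MINUTES_AFTER) / 60
api_cost_ceiling = WEEKS * API_COST_PER_CHANGELOG
implied_hourly_rate = LABOR_SAVED / hours_saved

print(f"Hours saved per year: {hours_saved:.1f}")             # 95.3
print(f"API cost ceiling: ${api_cost_ceiling:.0f}")           # $52
print(f"Implied labor rate: ${implied_hourly_rate:.0f}/hour")  # $157/hour
```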

Generating changelogs also let me build flows for monthly updates, which I send via email, and for a 2025 Replit Roundup (forthcoming).

The pattern extends beyond changelogs. Any recurring content task with structured inputs (release notes, status updates, weekly digests) can follow this architecture. Want to pull from Linear instead of Slack? Add an MCP server. Need agents to understand your design system? Add a Skill. The architecture doesn't change. You just extend the tooling.

The differentiated work is editorial judgment. Everything else is delegatable. Automating it means more time creating from passion.

And increasingly, the tool doing the delegating is a coding agent, even when the task has nothing to do with code.

¹ I am aware this SDK is not 52 weeks old at the time of this writing. An earlier, much cruder version ran on several chained prompts.
² The source code is available on GitHub.
³ If you'd like to deploy your own, I've found Replit to be the best option.
⁴ Linux's limit is half of macOS's (~131k vs ~262k bytes); on Linux, ~131k is the per-argument cap (MAX_ARG_STRLEN) rather than ARG_MAX proper. The app worked on my Mac but failed silently on remote Linux when prompts exceeded the limit. Skills solved this by loading content on-demand. If you're deploying on Linux, test there early. Thanks to Thariq for the debugging assist.
⁵ You can import and deploy your own version by visiting https://replit.new/github.com/mattppal/shipping-szn