The Complete Guide to Local WordPress Development and Testing


A good local WordPress development setup is invisible. You clone a repo, run one command, and you’re editing code against a real WordPress site within a minute. Tests run on every save. Bugs reproduce instantly. Changes ship to a real staging environment with one click. The whole machine fades into the background and you spend your time on the work that matters.

A bad setup is the opposite. You spend half your morning fighting Docker, the other half discovering why the bug you fixed locally still happens on production. Friction compounds. By the time you’re three months into a project, every change costs you twice what it should.

This is a complete guide to setting up the good version. It covers every tool worth knowing in 2026 (and a few that aren’t), how to pick the right stack for what you’re doing, the configurations I actually use, the testing patterns that scale from solo work to a team with AI agents in the loop, and the gotchas you’ll hit along the way.

The framing throughout is AI-first. I run a team of AI agents shipping production WordPress code every day, and I’m building the platform that makes that work for everyone else at AgentVania. The tools and patterns that hold up when an AI agent is one of your developers are the same ones that make life easier for a human developer. If you build for the harder case, the easier case comes free.

1. Why local WordPress development matters

The case for local development is so old it barely needs making, but humour me — the constraints have shifted in 2026, and the modern answer isn’t the same as it was five years ago.

You need a local environment because:

Iteration speed. Editing files locally is hundreds of milliseconds faster than editing over SFTP or through wp-admin. Compounded across a day, that’s hours.
Reproducibility. A real WordPress site that mirrors production lets you reproduce bugs that don’t happen in your head and prove fixes that don’t happen by hope.
Testing. PHPUnit and Playwright don’t run against production. They run against an isolated WordPress install, which has to come from somewhere.
CI. The same environment definition that runs on your laptop should run in GitHub Actions. The local environment is the source of the CI environment, not a separate thing.
AI agents. This is the new one. An AI agent on your team writing code, running tests, and verifying its own work needs a programmatic environment to operate against. The local environment is now also the agent environment.

The last point reshapes which tools deserve serious consideration. A local environment that only works through a GUI excludes AI agents from your workflow by construction. That’s a choice — and increasingly a costly one.

2. What you actually need from a local environment

Eight requirements. The first four are universal. The last four are what an AI-first workflow adds; they happen to also be exactly what a productive human workflow wants.

1. A real WordPress install. Not a partial mock, not a stubbed runtime. The actual WordPress core, running against a real database (or close to one), serving real admin pages over HTTP.

2. Configurable WP / PHP versions. You should be able to point the environment at any WordPress core version (current stable, trunk, a specific tag) and any PHP version (8.1, 8.2, 8.3, 8.4) without reinstalling anything. Compatibility matrices are routine in plugin work.

3. WP-CLI access. Every meaningful operation in WordPress — activating plugins, importing content, updating options, creating users, exporting databases — has a WP-CLI command. Your environment must expose wp as a callable command against the running install.

4. Reset to clean. You should be able to destroy the entire environment and rebuild it from scratch in seconds. Persistent test cruft is the silent killer of testing discipline.

5. Environment defined in a file. Not configured through a wizard. Not stored in some per-machine app data folder. A config file committed to the repo, alongside the code, so anyone (human or agent) cloning the repo gets an identical environment from one command.

6. Every operation as a CLI command. “Create site,” “install plugin,” “import database,” “run tests,” “reset state” — all callable from a script. No required clicks anywhere.

7. Programmatic state inspection. You need to query “is this plugin active?”, “what does this option store?”, “what errors did the last cron run produce?” — without screenshotting an admin screen and parsing pixels.

8. Reproducible snapshots. Spin a fresh environment, run a test scenario, capture the result, tear it down. Repeatedly. Cheaply. The cost of a single test run sets the ceiling on how many tests you’ll actually run.

GUI-driven tools score 0 out of 4 on requirements 5–8 by default. Some have community workarounds that lift the score. None compare to tools that were built for the command line from day one. Keep this scorecard in mind as you read the tool landscape below.
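To make requirement 7 concrete, here is a minimal sketch of answering "is this plugin active?" from WP-CLI output instead of an admin screen. It assumes you have captured the output of wp plugin list --format=json as a string; the sample rows and the isPluginActive helper are hypothetical, for illustration only.

```typescript
// One row of `wp plugin list --format=json` (only the fields used here).
interface PluginRow {
  name: string;
  status: string; // e.g. "active", "inactive", "must-use"
  version: string;
}

// Parse the JSON that WP-CLI prints and answer "is this plugin active?".
function isPluginActive(wpCliJson: string, slug: string): boolean {
  const rows = JSON.parse(wpCliJson) as PluginRow[];
  return rows.some((row) => row.name === slug && row.status === "active");
}

// Hypothetical captured output, not taken from a real site.
const sampleOutput = JSON.stringify([
  { name: "my-plugin", status: "active", version: "1.2.0" },
  { name: "query-monitor", status: "inactive", version: "3.16.0" },
]);

console.log(isPluginActive(sampleOutput, "my-plugin")); // true
console.log(isPluginActive(sampleOutput, "query-monitor")); // false
```

The point is the shape of the check: structured output in, a boolean out, nothing for an agent to click or screenshot.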

3. The tool landscape

There are more local WordPress tools than most people realise, and the field has shifted significantly in 2025–2026. Here’s the complete picture, grouped by category.

Desktop GUI apps

Local WP (formerly Local by Flywheel). The friendliest GUI in the space and the tool most WordPress devs reach for first. Click “Create Site,” pick PHP and WP versions, you’re in. It bundles Mailpit, SSL, Live Link tunnels, cloud backups, and a blueprint system for saving site configurations. There’s no official CLI, but the community has filled the gap: salcode/wpcli-localwp-setup wires up WP-CLI directly, and a March 2026 AI agent skill wraps WP-CLI for LLM use. Excellent for human-first work, possible to drive from agents with extra effort.

DevKinsta. Kinsta’s free local dev tool. Polished UI, push-to-staging integration if you host on Kinsta. Docker under the hood, GUI on top. A reasonable pick if you’re already in their ecosystem. If you’re not, you’re using a hosting vendor’s tool for portability reasons, which is backwards.

WordPress Studio (the WP.com one, formerly Studio by WordPress.com). New since 2024, growing fast. Built on wp-now — WordPress compiled to WebAssembly with SQLite instead of MySQL, no Docker required. Spins up in seconds. Defaults to SQLite but configurable to MySQL by editing wp-config.php. The 2025 SQLite driver rewrite (AST-based, replicates MySQL’s information_schema) closed most of the compatibility gap. Has a polished GUI plus some CLI access. The fastest path to “WordPress, right now” without thinking about servers.

Docker-based tools

wp-env. WordPress’s official Docker-based environment, distributed as @wordpress/env on npm. Configured by a .wp-env.json file in your repo, so the environment is checked in alongside the code. Two commands: wp-env start, wp-env run cli wp .... Ships a pre-wired PHPUnit test environment via a tests-cli container. Less polished than the desktop GUIs but infinitely more scriptable. Purpose-built for plugin and theme development.

DDEV. Full-featured Docker dev environment, YAML config, multi-stack (WordPress, Laravel, Drupal, plain PHP, more). ddev wp runs WP-CLI inside the container. ddev pull syncs database and files from staging. Different projects can run different PHP/MySQL versions side by side without conflict. Works on macOS (with OrbStack, Docker Desktop, Colima, or Lima), Windows (WSL2), and Linux. The agency consensus in 2026, and the strongest choice for full WordPress sites where the unit of work is bigger than one plugin.

Lando. YAML-configured Docker dev environments. Powerful, flexible, supports many stacks beyond WordPress. Plays in DDEV’s space and has been there longer; DDEV has eaten most of its momentum in the past two years. Still well-loved at agencies that use Lando across non-WP projects too.

Custom Docker Compose. Some teams roll their own docker-compose.yml. Total control, total maintenance burden. The right answer when your stack is unusual (multisite plus WPML plus custom services plus specific PHP-FPM tuning) and the wrong answer for plain plugin dev.

For a deeper comparison of when to reach for Local WP versus the Docker family specifically, I covered that in Local WP vs Docker: When to Use Each for WordPress Development — this guide takes a broader view across the whole landscape, but the Local-WP-vs-Docker question is the most common branching point and that post zooms in on it.

WebAssembly tools

WordPress Playground. The same WASM-based engine as Studio, but shipped as a browser tool and a CLI (@wp-playground/cli). The killer feature is blueprints: a JSON file describing exactly which WP version, which plugins (from URLs or zips), which settings, which demo content. Send someone a blueprint URL and they get an identical WordPress site in their browser in five seconds. For bug reproductions and demos, nothing else comes close. Same SQLite caveat as Studio, same new driver improvements.

Legacy and specialised

VVV (Varying Vagrant Vagrants). Vagrant-based, heavier than the Docker options, older school. Still the recommended path if you’re contributing to WordPress core itself, where you need the exact reference setup the core team uses.

Trellis / Bedrock. The Roots stack. Composer-managed WordPress, Ansible-provisioned dev/staging/prod. Not a “spin up and click” tool but a whole methodology for treating WordPress like a real PHP project. Excellent if you’re already living that way, massive overkill if you’re not.

Laravel Valet (with the WordPress driver). Mac-only, lightweight, fast. Real PHP, real MySQL, no containers. Some Laravel devs who also do WordPress swear by it. Less common in the pure-WP world.

MAMP / XAMPP. Still around, still works. If you’re maintaining a 12-year-old site on PHP 7.4 and don’t want to learn Docker, fine. Otherwise, move on.

Cloud ephemeral

InstaWP. Cloud rather than local, but worth naming. Spin up a real hosted WP site in 30 seconds, default lifespan a couple of days, extendable. Has an API for scripted provisioning. Great for sending a colleague something to click on, and handy for content and marketing screenshots. Not a primary dev environment but it occupies the same mental slot for some tasks.

Helpers worth knowing

OrbStack. Not a WP tool, but the single biggest performance upgrade you can make to your local dev setup if you’re on a Mac. It replaces Docker Desktop with a lighter, faster, native engine. Free for personal use.

Install with Homebrew:

brew install --cask orbstack
open -a OrbStack

On first launch, grant the system permissions it asks for (network extension, virtualization entitlements — standard Mac dev tool stuff). If you’re migrating from Docker Desktop, OrbStack offers a one-click import of your existing containers, images, and volumes on first run.

After that, every Docker-based WP tool just works on top of OrbStack with zero config changes. wp-env’s npx wp-env start finds the Docker socket OrbStack exposes the same way it would find Docker Desktop’s. ddev start, lando start, any docker-compose up you run — they all sit on top of the OrbStack engine without needing to know. You don’t reconfigure your tools; you just have a faster engine under them.

What you gain: 3–5× faster startup on the same workloads, roughly 40% lower memory pressure, no licensing concerns for personal or small-team use, and a clean orb CLI for managing containers directly (orb ps, orb logs, orb shell <container>) if you want it. What you give up: nothing meaningful unless your workflow depended on a Docker Desktop-specific feature like the Docker Scout panel.

For wp-env specifically, this is the change that makes the “start a fresh test environment” cycle feel cheap enough to do constantly rather than something you avoid.

4. Picking your stack by use case

Run every tool above against the eight requirements, and the right answer becomes a function of what you’re trying to do. Here’s the mapping I use.

You’re developing a WordPress plugin or theme, especially one you’ll distribute on WordPress.org or sell commercially. Use wp-env. The environment definition lives in your plugin repo as .wp-env.json, the PHPUnit test infrastructure is pre-wired, and the same definition runs in GitHub Actions for free. Add OrbStack underneath for speed. This is what I use on every plugin I maintain.

You’re building a full WordPress site — custom theme, custom plugins, integrations with external services, the works. Use DDEV. wp-env is purpose-built for plugin/theme dev and gets cramped when the unit of work is bigger. DDEV gives you the broader stack support, the ddev pull workflow for syncing from staging, and the flexibility to add Redis, Elasticsearch, Mailpit, or whatever your stack needs as additional containers.

You’re doing client work that needs to feel like a real long-lived WordPress site you can come back to next month. Use Local WP. It’s optimised for exactly this. Persistent state, friendly UI, Push/Pull to WP Engine and Flywheel, no Docker tax.

You need to reproduce a customer-reported bug, or demonstrate that a bug is fixed, in a way you can share. Use a WordPress Playground blueprint. The reproduction is a single JSON file you can commit to a PR, send to a customer, or hand to QA. They open one URL and they’re looking at the same broken state you’re looking at. Round-trips on bug reports drop from days to minutes once you adopt this pattern. I cover the authoring details in §7.

You’re prototyping a theme or small plugin and don’t want to wait for Docker. Use WordPress Studio. Startup in seconds, polished GUI, real WordPress. Just know the SQLite caveat: if you’re testing something that depends on MySQL-specific behaviour, switch the Studio site to MySQL via wp-config.php, or fall back to wp-env.

You’re contributing to WordPress core. Use VVV. It’s the reference setup the core team itself uses. The other tools work for core contribution too, but VVV is the most aligned with how the core team thinks about the environment.

You’re at an agency running 20+ WordPress projects across multiple developers. Use DDEV as your standard. Commit .ddev/config.yaml to every project, everyone clones and runs ddev start, you have consistency across the team. The ddev pull workflow lets developers sync from the staging environment, which removes the “but my local doesn’t match the bug we’re seeing” friction.

You need to send a colleague a real hosted WordPress site to click on (not just local). Use InstaWP. Cloud-ephemeral, public URL, lifespan you control.

5. Setting it up — the AI-first stack, concrete

Here’s the configuration I use for a real production WordPress plugin with several add-ons. Adapt to your project; the structure transfers cleanly.

OrbStack

If you’re on a Mac and don’t have it yet:

brew install --cask orbstack
open -a OrbStack    # first launch, grant permissions when prompted

OrbStack provides the Docker engine. wp-env, DDEV, and anything else Docker-based will use it transparently.

wp-env for plugin development

Create .wp-env.json at the root of your plugin repo:

{
  "core": "WordPress/WordPress#6.8",
  "phpVersion": "8.2",
  "plugins": [
    ".",
    "../my-plugin-addon-feed-to-post",
    "../my-plugin-addon-full-text",
    "../my-plugin-addon-filtering"
  ],
  "themes": [],
  "config": {
    "WP_DEBUG": true,
    "WP_DEBUG_LOG": true,
    "WP_DEBUG_DISPLAY": false,
    "SCRIPT_DEBUG": true
  },
  "mappings": {
    "wp-content/mu-plugins/query-monitor-loader.php": "./wp-env/mu-plugins/query-monitor-loader.php"
  }
}

The plugins array mounts your plugin (from .) plus any sibling repos that depend on it. The core ref pins to a specific WordPress version; flip to WordPress/WordPress#6.9 for a different release, or to WordPress/WordPress#master (the development branch of that repository) to test against unreleased core.
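If you switch versions often, it can help to derive the config rather than hand-edit it. The sketch below is one possible way to do that, assuming the core and phpVersion keys shown above; the withVersions helper and the WpEnvConfig type are hypothetical names, not part of wp-env.

```typescript
// Loose shape of a .wp-env.json file; only the two keys we override are typed.
interface WpEnvConfig {
  core?: string | null;
  phpVersion?: string | null;
  [key: string]: unknown;
}

// Return a copy of the config pinned to a given WordPress ref and PHP version.
// The input object is left untouched so the committed file stays the baseline.
function withVersions(config: WpEnvConfig, wp: string, php: string): WpEnvConfig {
  return { ...config, core: `WordPress/WordPress#${wp}`, phpVersion: php };
}

const baseConfig: WpEnvConfig = {
  core: "WordPress/WordPress#6.8",
  phpVersion: "8.2",
  plugins: ["."],
};

const overridden = withVersions(baseConfig, "6.9", "8.3");
console.log(JSON.stringify(overridden, null, 2));
```

Writing the result to disk (or to an override file) is left out on purpose; the part worth copying is that every other key passes through unchanged.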

Run it:

npx @wordpress/env@latest start

That brings up two WordPress environments: a development site at http://localhost:8888 and a separate tests site (http://localhost:8889 by default) for PHPUnit. The first startup takes a couple of minutes (image pull); subsequent starts are seconds.

WP-CLI:

npx wp-env run cli wp plugin list
npx wp-env run cli wp option update timezone_string America/New_York
npx wp-env run cli wp post create --post_title="Test" --post_status=publish

Bootstrap script for repeatable state

The raw .wp-env.json gives you an empty WordPress. For real testing you want plugins activated, a known timezone (so timezone bugs reproduce), some demo content. Wrap that in a script:

#!/usr/bin/env bash
# wp-env/bootstrap.sh — idempotent setup after wp-env start

set -e
WP="npx wp-env run cli wp"

# Activate all mounted plugins
$WP plugin activate --all

# Non-UTC timezone so timezone bugs surface locally
$WP option update timezone_string America/New_York

# Theme
$WP theme activate twentytwentyfour

# Test page
$WP post list --post_type=page --field=ID | grep -q . || \
    $WP post create --post_type=page --post_title="Test page" --post_status=publish

# Seed sources, plugins, whatever else your dev workflow needs
# ...

echo "Bootstrap complete. Admin: http://localhost:8888/wp-admin (admin / password)"
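The property that matters in the script above is idempotency: running it twice should change nothing the second time. Here is a sketch of the same idea as a pure planning function, which is easier to test than shell. The SiteState shape and planBootstrap are hypothetical; the WP-CLI arguments mirror the commands in the script. One deliberate difference: it checks for the page by title, because a check for "any page" passes as soon as a default page such as Sample Page exists.

```typescript
// Snapshot of the bits of site state the bootstrap cares about.
interface SiteState {
  timezone: string;
  activeTheme: string;
  pageTitles: string[];
}

const desiredState: SiteState = {
  timezone: "America/New_York",
  activeTheme: "twentytwentyfour",
  pageTitles: ["Test page"],
};

// Return the WP-CLI argument lists needed to move `current` to `desiredState`.
// An already-bootstrapped site yields an empty plan.
function planBootstrap(current: SiteState): string[][] {
  const commands: string[][] = [];
  if (current.timezone !== desiredState.timezone) {
    commands.push(["option", "update", "timezone_string", desiredState.timezone]);
  }
  if (current.activeTheme !== desiredState.activeTheme) {
    commands.push(["theme", "activate", desiredState.activeTheme]);
  }
  for (const title of desiredState.pageTitles) {
    if (current.pageTitles.indexOf(title) === -1) {
      commands.push(["post", "create", "--post_type=page", `--post_title=${title}`, "--post_status=publish"]);
    }
  }
  return commands;
}

// Hypothetical state of a freshly created site.
const freshSite: SiteState = { timezone: "", activeTheme: "twentytwentyfive", pageTitles: ["Sample Page"] };

console.log(planBootstrap(freshSite).length); // 3
console.log(planBootstrap(desiredState).length); // 0
```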

Add npm scripts to your package.json so the team converges on the same commands:

{
  "scripts": {
    "wp:up": "npx @wordpress/env@latest start && bash wp-env/bootstrap.sh",
    "wp:reset": "npx @wordpress/env@latest destroy && npm run wp:up",
    "wp:cli": "npx @wordpress/env@latest run cli wp",
    "wp:test": "npx @wordpress/env@latest run tests-cli wp test"
  }
}

Now anyone (or any AI agent) running npm run wp:up after cloning the repo has the same environment you do.

DDEV alternative

For full-site projects, the equivalent setup:

brew install ddev/ddev/ddev
cd ~/sites/my-wordpress-site
ddev config --project-type=wordpress --project-name=my-site --docroot=public
ddev start
ddev wp core download --path=public
ddev wp core install --url=$(ddev describe -j | jq -r .raw.primary_url) \
    --title="My Site" --admin_user=admin --admin_password=password \
    --admin_email=admin@example.com --skip-email

DDEV’s ddev pull integrates with hosting providers to sync from staging, which becomes essential at agency scale:

ddev pull pantheon --skip-files
# or for custom syncs:
ddev pull provider --skip-files

Playground blueprint template

A starter blueprint for a bug-reproduction scenario:

{
  "$schema": "https://playground.wordpress.net/blueprint-schema.json",
  "preferredVersions": {
    "wp": "6.8",
    "php": "8.2"
  },
  "features": {
    "networking": true
  },
  "steps": [
    {
      "step": "login",
      "username": "admin",
      "password": "password"
    },
    {
      "step": "installPlugin",
      "pluginData": {
        "resource": "url",
        "url": "https://your-host/my-plugin-pr-build.zip"
      },
      "options": { "activate": true }
    },
    {
      "step": "runPHP",
      "code": "<?php require_once '/wordpress/wp-load.php'; update_option('myplugin_test_pending_count', 7);"
    },
    {
      "step": "setSiteOptions",
      "options": {
        "timezone_string": "America/New_York"
      }
    },
    {
      "step": "goTo",
      "url": "/wp-admin/index.php"
    }
  ]
}

Run it locally to verify:

npx @wp-playground/cli@latest server \
  --blueprint ./qa/blueprints/issue-707-pending-badge.json \
  --blueprint-may-read-adjacent-files \
  --port 9707

Open http://127.0.0.1:9707/wp-admin/ and the scenario is staged. More on the workflow patterns around blueprints in §7.

6. Testing patterns

A real local environment makes a half-dozen testing patterns cheap. Here’s what each is for and how to set it up against the stack above.

Unit tests with PHPUnit + Brain Monkey

For pure-logic tests that don’t need WordPress loaded, use Brain Monkey with PHPUnit. WordPress functions are stubbed; tests run in milliseconds against your composer autoload:

composer require --dev brain/monkey phpunit/phpunit

Brain Monkey lets you assert on add_filter/add_action calls without booting WordPress. These tests run constantly while you’re coding because they’re nearly free. They’re also the only tests an LLM-driven agent can usefully run in a feedback loop without spinning up an environment.

Integration tests via wp-env’s tests-cli container

For tests that need real WordPress, wp-env ships a separate tests-cli container with a clean WordPress test database:

npx wp-env run tests-cli wp test     # custom test runner
# or directly:
npx wp-env run tests-cli ./vendor/bin/phpunit

The test database resets between runs, so your tests can freely create posts, users, options. The first time you run this, your phpunit.xml.dist needs to load the wp-env test bootstrap:

<phpunit bootstrap="tests/bootstrap.php" colors="true">
  <testsuites>
    <testsuite name="integration">
      <directory>tests/integration</directory>
    </testsuite>
  </testsuites>
</phpunit>

And tests/bootstrap.php loads the WordPress test environment that wp-env provides:

<?php
$_tests_dir = getenv('WP_TESTS_DIR') ?: '/wordpress-phpunit';
require_once $_tests_dir . '/includes/functions.php';

tests_add_filter('muplugins_loaded', function () {
    require dirname(__DIR__) . '/your-plugin.php';
});

require $_tests_dir . '/includes/bootstrap.php';

Manual QA against the dev WordPress

The dev WordPress at http://localhost:8888 is the right place to click around and see your changes. Add Query Monitor as a mu-plugin via your .wp-env.json mappings so you can inspect database queries, hooks fired, and HTTP requests on every page load.

Cross-browser checks via Playwright

For UI checks that span Chrome, Safari, Firefox, and Edge — a standard QA matrix in the WordPress world — Playwright is the right tool:

npm install -D @playwright/test
npx playwright install

A test that verifies the admin sidebar renders correctly:

import { test, expect } from '@playwright/test';

test('admin sidebar has no horizontal scroll', async ({ page }) => {
  await page.goto('http://localhost:8888/wp-admin/');
  await page.fill('#user_login', 'admin');
  await page.fill('#user_pass', 'password');
  await page.click('#wp-submit');
  await page.goto('http://localhost:8888/wp-admin/admin.php?page=my-plugin-hub');
  const sidebar = page.locator('#adminmenu');
  const scrollWidth = await sidebar.evaluate(el => el.scrollWidth);
  const clientWidth = await sidebar.evaluate(el => el.clientWidth);
  expect(scrollWidth).toBeLessThanOrEqual(clientWidth);
});

The same script runs in CI. When you have an AI agent on the team, the agent can author Playwright tests for the changes it makes, run them locally, and attach screenshots to the PR.
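The browser matrix itself usually lives in the projects list of playwright.config.ts. The sketch below shows one plausible set of entries as plain data, with a small local type so it compiles without Playwright installed; in a real config you would pass the array to defineConfig. browserName and channel are Playwright options, the project names are my own labels, and WebKit stands in for Safari.

```typescript
// Minimal local type so the sketch compiles without installing Playwright.
interface BrowserProject {
  name: string;
  use: { browserName: "chromium" | "firefox" | "webkit"; channel?: string };
}

// One project per browser in the QA matrix.
const browserProjects: BrowserProject[] = [
  { name: "chrome", use: { browserName: "chromium", channel: "chrome" } },
  { name: "firefox", use: { browserName: "firefox" } },
  { name: "safari", use: { browserName: "webkit" } }, // WebKit approximates Safari
  { name: "edge", use: { browserName: "chromium", channel: "msedge" } },
];

console.log(browserProjects.map((p) => p.name).join(", ")); // chrome, firefox, safari, edge
```

Branded Chrome and Edge are separate installs from the bundled browsers, so if you use the channel entries you will likely also need npx playwright install chrome msedge on the machine running the tests.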

Visual regression and agent-driven screenshot review

A useful pattern that’s emerged once you have AI agents in the loop: instead of asserting on DOM structure, assert on what the page actually looks like. Playwright has built-in visual regression via expect(page).toHaveScreenshot() — first run captures the baseline image, subsequent runs fail if pixels drift beyond a threshold. The diff image lands in the test report, which the agent can attach to the PR as evidence:

test('Aggregator admin badge renders correctly', async ({ page }) => {
  await page.goto('http://localhost:8888/wp-admin/');
  // ... login + setup ...
  await expect(page.locator('#adminmenu')).toHaveScreenshot('admin-menu-with-badge.png');
});
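To show what "pixels drift beyond a threshold" means in practice, here is a deliberately simplified sketch of a pixel-ratio check over raw RGBA data. It is not Playwright's comparison algorithm, which is more sophisticated; it only illustrates the idea behind an option like maxDiffPixelRatio. The diffPixelRatio function and the sample buffers are hypothetical.

```typescript
// Fraction of pixels that differ between two same-sized RGBA buffers.
// A pixel counts as changed if any channel differs by more than `tolerance`.
function diffPixelRatio(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be same-sized RGBA data");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;
  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    let differs = false;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) differs = true;
    }
    if (differs) changed++;
  }
  return changed / pixels;
}

// Four pixels; only the last one changes (red channel 20 -> 90).
const baselinePixels = new Uint8Array([0, 0, 0, 255, 255, 255, 255, 255, 10, 10, 10, 255, 20, 20, 20, 255]);
const currentPixels = new Uint8Array([0, 0, 0, 255, 255, 255, 255, 255, 10, 10, 10, 255, 90, 20, 20, 255]);

const ratio = diffPixelRatio(baselinePixels, currentPixels);
console.log(ratio); // 0.25
console.log(ratio <= 0.01 ? "pass" : "fail"); // fail
```

A tight ratio catches real regressions but also font-rendering noise between machines, which is one reason to generate baselines in the same environment CI uses.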

The second pattern is even more useful: have the agent review its own screenshots using multimodal vision. Claude (and similar multimodal models) can read screenshots and judge whether the rendered UI matches the intent. The workflow:

Agent implements a UI change
Agent runs a Playwright script that captures key screenshots at each state
Agent passes those screenshots to itself (or to a reviewer agent) and asks “does this match what we expected?”
Agent attaches the screenshots to the PR along with its own assessment

This is what closes the loop on “the agent verified its own visual work.” Combined with the pixel-level regression catch from toHaveScreenshot(), you get both kinds of coverage: small drift caught by the baseline comparison, semantic correctness checked by the vision review.

You don’t need a video recording tool for this. Screenshots are easier to inspect than video frames, faster to capture, and what multimodal models read natively today.
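For the last step of that workflow, the agent needs to turn its per-screenshot judgements into something a reviewer can scan. A minimal sketch, assuming the vision review has already produced a verdict per screenshot; the ScreenshotVerdict shape, the file paths, and renderReview are all hypothetical.

```typescript
// One screenshot plus the reviewer model's judgement of it.
interface ScreenshotVerdict {
  file: string; // screenshot path relative to the repo
  expectation: string; // what the change was supposed to show
  matches: boolean; // whether the reviewer judged it a match
  note?: string; // optional explanation, mostly for mismatches
}

// Render the verdicts as a Markdown checklist for the PR description.
function renderReview(verdicts: ScreenshotVerdict[]): string {
  const lines = verdicts.map((v) => {
    const mark = v.matches ? "[x]" : "[ ]";
    const note = v.note ? ` (${v.note})` : "";
    return `- ${mark} ${v.file}: ${v.expectation}${note}`;
  });
  const failed = verdicts.filter((v) => !v.matches).length;
  const header =
    failed === 0
      ? "Visual review: all screenshots match"
      : `Visual review: ${failed} screenshot(s) need attention`;
  return [header, ""].concat(lines).join("\n");
}

const reviewText = renderReview([
  { file: "shots/menu.png", expectation: "badge shows 7", matches: true },
  { file: "shots/hub.png", expectation: "no horizontal scroll", matches: false, note: "sidebar overflows" },
]);
console.log(reviewText);
```

Unchecked boxes give the human reviewer an obvious place to start, and the agent's own notes travel with the evidence.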

Database state inspection

Two patterns. For ad-hoc queries, wp eval against the running container:

npx wp-env run cli wp eval 'global $wpdb; var_dump($wpdb->get_results("SELECT * FROM {$wpdb->prefix}options WHERE option_name LIKE \"myplugin_%\""));'

For ongoing development, install Query Monitor (wp plugin install query-monitor --activate) and use it from the admin bar. Both work for humans; only the WP-CLI path works for agents, which is fine — that’s the one that scales.
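For the agent path, structured output is easier to consume than var_dump. The sketch below assumes you query options with wp option list --search=... --format=json and parse the result; optionListArgs and parseOptions are hypothetical helpers, and the sample row is made up to match the option used in the blueprint earlier.

```typescript
// One row of `wp option list --format=json` with the default fields.
interface OptionRow {
  option_name: string;
  option_value: string;
}

// Build the WP-CLI arguments for listing options that share a prefix.
function optionListArgs(prefix: string): string[] {
  return ["option", "list", `--search=${prefix}*`, "--format=json"];
}

// Turn WP-CLI's JSON array into a name -> value lookup.
function parseOptions(wpCliJson: string): Record<string, string> {
  const rows = JSON.parse(wpCliJson) as OptionRow[];
  const result: Record<string, string> = {};
  for (const row of rows) {
    result[row.option_name] = row.option_value;
  }
  return result;
}

// Hypothetical captured output, for illustration only.
const optionsJson = JSON.stringify([
  { option_name: "myplugin_test_pending_count", option_value: "7" },
]);

console.log(optionListArgs("myplugin_").join(" ")); // option list --search=myplugin_* --format=json
console.log(parseOptions(optionsJson)["myplugin_test_pending_count"]); // 7
```

Passing the arguments as an array rather than one quoted string also sidesteps most of the shell-quoting trouble that nested PHP snippets invite.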

Snapshot / reset workflows

The fast cycle is wp-env start once, npm run wp:reset when you need a clean slate, wp-env stop when you’re done for the day. The reset is destructive — bring back the demo state with your bootstrap script.

For finer-grained snapshots within a session, dump and restore the database:

npx wp-env run cli wp db export /tmp/snapshot.sql
# ... do destructive things ...
npx wp-env run cli wp db import /tmp/snapshot.sql
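The export/import pair is easy to forget halfway through a destructive experiment, so it can be worth wrapping. This is a sketch of one way to guarantee the restore, assuming a runner function that executes WP-CLI arguments in the container; withSnapshot and RunWp are hypothetical names, and the fake runner below only records calls.

```typescript
// Something that runs `wp <args...>` against the environment.
type RunWp = (args: string[]) => void;

// Export the database, run the work, and always import the snapshot back,
// even if the work throws.
function withSnapshot<T>(run: RunWp, file: string, work: () => T): T {
  run(["db", "export", file]);
  try {
    return work();
  } finally {
    run(["db", "import", file]);
  }
}

// Fake runner that records calls instead of touching Docker.
const callLog: string[] = [];
const recordingRun: RunWp = (args) => {
  callLog.push(args.join(" "));
};

const answer = withSnapshot(recordingRun, "/tmp/snapshot.sql", () => {
  callLog.push("destructive work");
  return 42;
});

console.log(answer); // 42
console.log(callLog.join(" | ")); // db export /tmp/snapshot.sql | destructive work | db import /tmp/snapshot.sql
```

In real use the runner would shell out to npx wp-env run cli wp; injecting it keeps the restore logic testable without a running environment.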

7. Bug reproduction with Playground blueprints

This section is worth its weight in gold. WordPress Playground blueprints have changed how my team thinks about bug reports.

The old workflow

Customer reports a bug. Support writes back asking for the WP version, the PHP version, the active plugins, the theme, the steps to reproduce. Customer answers in pieces over three days. Developer eventually has enough to attempt a reproduction. Developer’s local environment doesn’t quite match, so the bug doesn’t reproduce. Two more rounds. Eventually the bug reproduces. The fix takes 20 minutes; the reproduction took six days.

The blueprint workflow

Customer reports a bug. Support sends them a Playground blueprint URL with the plugin and the steps pre-staged. Customer clicks, hits the same broken state, confirms “yes, that’s exactly what I see.” Total time: two minutes. Developer pulls the same blueprint, fixes the bug, ships a new blueprint in the PR that demonstrates the fix from a clean install. QA opens the new blueprint, verifies, signs off.

This works because a blueprint captures every dimension of the environment — WP version, PHP version, plugin versions, settings, content, demo data — in a single JSON file. The file is the reproduction.

Conventions I use

In every WordPress plugin repo, I add a qa/blueprints/ directory. One blueprint per bug class. Naming: issue-NNN-short-description.json where NNN is the GitHub issue number. Each blueprint contains a comment block at the top describing what scenario it stages and which bug it demonstrates.
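A naming convention only holds if something enforces it. Here is a small sketch that builds and validates file names in the issue-NNN-short-description.json pattern; blueprintFileName and BLUEPRINT_NAME are hypothetical helpers you might call from a lint step or an agent tool.

```typescript
// Matches names like issue-707-pending-badge.json.
const BLUEPRINT_NAME = /^issue-\d+-[a-z0-9]+(?:-[a-z0-9]+)*\.json$/;

// Build a blueprint file name from an issue number and a free-text description.
function blueprintFileName(issue: number, description: string): string {
  const slug = description
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  if (!Number.isInteger(issue) || issue <= 0 || slug === "") {
    throw new Error("need a positive issue number and a non-empty description");
  }
  return `issue-${issue}-${slug}.json`;
}

console.log(blueprintFileName(707, "Pending badge")); // issue-707-pending-badge.json
console.log(BLUEPRINT_NAME.test("issue-707-pending-badge.json")); // true
console.log(BLUEPRINT_NAME.test("pending-badge.json")); // false
```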

Every PR that has user-visible behaviour ships a blueprint. The PR description links it. Reviewers run it. If a reviewer’s experience doesn’t match what the blueprint shows, that’s a clear signal something is environment-dependent and needs investigation.

Blueprints reference plugin zips by URL. For public plugins, you can host them anywhere (GitHub releases, a CDN, the plugin’s own download URL). For private/in-development builds, you need the zip somewhere the Playground iframe can fetch. GitHub Actions artifacts are behind auth, which Playground can’t follow. Three workable patterns:

GitHub Releases, even for in-development versions. Create a pre-release, attach the build zip, blueprint references it.
Cloudflare R2 / S3 with a short-lived signed URL. Cheap, fast, ephemeral.
A small Cloudflare Worker redirector at a domain you control (we use qa.wprssaggregator.com for this), which points to the right artifact and lets you swap targets without changing the blueprint.
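The redirector in the third pattern is mostly a lookup table. Below is a sketch of just the resolution logic as a pure function, so it can be tested without deploying anything. The path keys and example.com targets are hypothetical, and this is not the code behind the domain mentioned above; inside a Worker's fetch handler you would turn the result into Response.redirect(location, 302).

```typescript
// Outcome of resolving a request path to a build artifact.
interface RedirectResult {
  status: 302 | 404;
  location?: string;
}

// Map a request path such as "/pr-123" to the artifact it should redirect to.
function resolveArtifact(pathname: string, targets: Record<string, string>): RedirectResult {
  const key = pathname.replace(/^\/+|\/+$/g, ""); // strip leading/trailing slashes
  if (key !== "" && Object.prototype.hasOwnProperty.call(targets, key)) {
    return { status: 302, location: targets[key] };
  }
  return { status: 404 };
}

// Hypothetical mapping; swapping a target here never touches the blueprint.
const artifactTargets: Record<string, string> = {
  "pr-123": "https://example.com/builds/my-plugin-pr-123.zip",
};

console.log(resolveArtifact("/pr-123", artifactTargets).location); // https://example.com/builds/my-plugin-pr-123.zip
console.log(resolveArtifact("/unknown", artifactTargets).status); // 404
```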

Customer-facing repros

The blueprint pattern works in reverse too. Customers can send you their environment as a blueprint. Make a “Report a bug” form that asks for the WordPress version, plugin versions, and a description, then auto-generates a blueprint URL the customer can verify before sending. You’ve effectively turned the customer into a reproducer of their own bug.
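Here is a rough sketch of the generation half of that form, assuming the customer's plugins are on WordPress.org so they can be referenced by slug. The BugReport shape and blueprintFromReport are hypothetical. One limitation to be aware of: as written it installs whatever version the directory currently serves, so pinning an exact plugin version would need a zip URL instead.

```typescript
// Fields collected by the hypothetical "Report a bug" form.
interface BugReport {
  wpVersion: string;
  phpVersion: string;
  pluginSlugs: string[]; // WordPress.org plugin slugs
}

type Step = Record<string, unknown>;

interface Blueprint {
  $schema: string;
  preferredVersions: { wp: string; php: string };
  steps: Step[];
}

// Turn a bug report into a Playground blueprint object.
function blueprintFromReport(report: BugReport): Blueprint {
  const login: Step = { step: "login", username: "admin", password: "password" };
  const installs: Step[] = report.pluginSlugs.map((slug) => ({
    step: "installPlugin",
    pluginData: { resource: "wordpress.org/plugins", slug },
    options: { activate: true },
  }));
  return {
    $schema: "https://playground.wordpress.net/blueprint-schema.json",
    preferredVersions: { wp: report.wpVersion, php: report.phpVersion },
    steps: [login].concat(installs),
  };
}

// Hypothetical report, for illustration only.
const reportBlueprint = blueprintFromReport({
  wpVersion: "6.8",
  phpVersion: "8.2",
  pluginSlugs: ["my-plugin"],
});
console.log(JSON.stringify(reportBlueprint, null, 2));
```

Hosting the resulting JSON somewhere Playground can fetch it, and handing the customer that link, is the remaining step.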

8. CI integration

The local environment definition that runs on your laptop should run identically in GitHub Actions. With wp-env or DDEV, this is essentially free.

wp-env in GitHub Actions

A workflow that runs PHPUnit on every push:

name: tests
on: [push, pull_request]
jobs:
  phpunit:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        wp: ["6.7", "6.8", "master"]   # master is the development branch of WordPress/WordPress
        php: ["8.1", "8.2", "8.3"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: "20" }
      - name: Configure wp-env for matrix
        run: |
          jq --arg wp "${{ matrix.wp }}" --arg php "${{ matrix.php }}" \
             '.core = "WordPress/WordPress#\($wp)" | .phpVersion = $php' \
             .wp-env.json > .wp-env.tmp && mv .wp-env.tmp .wp-env.json
      - name: Start wp-env
        run: npx @wordpress/env@latest start
      - name: Run PHPUnit
        run: npx @wordpress/env@latest run tests-cli ./vendor/bin/phpunit

That’s it. The matrix runs your tests against every WP × PHP combination in parallel. The .wp-env.json you use locally is the same one CI uses; the only difference is matrix overrides.
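It helps to see how quickly the matrix grows, since every added version multiplies the job count. A small sketch of the expansion GitHub Actions performs for a two-axis matrix; expandMatrix is a hypothetical helper and the version lists are illustrative.

```typescript
// One CI job: a WordPress ref paired with a PHP version.
interface MatrixJob {
  wp: string;
  php: string;
}

// Cartesian product of the two axes, in the order the axes are listed.
function expandMatrix(wpRefs: string[], phpVersions: string[]): MatrixJob[] {
  const jobs: MatrixJob[] = [];
  for (const wp of wpRefs) {
    for (const php of phpVersions) {
      jobs.push({ wp, php });
    }
  }
  return jobs;
}

const matrixJobs = expandMatrix(["6.7", "6.8"], ["8.1", "8.2", "8.3"]);
console.log(matrixJobs.length); // 6
console.log(matrixJobs[0].wp + " / " + matrixJobs[0].php); // 6.7 / 8.1
```

Two WordPress refs and three PHP versions already mean six environment startups per push; a third ref takes it to nine, which is worth knowing before you add a fourth.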

Playwright in GitHub Actions

For cross-browser UI tests, add a separate job:

playwright:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: "20" }
    - run: npm ci && npx playwright install --with-deps
    - run: npx @wordpress/env@latest start
    - run: bash wp-env/bootstrap.sh   # the WP-CLI setup script from section 5
    - run: npx playwright test
    - uses: actions/upload-artifact@v4
      if: failure()
      with: { name: playwright-report, path: playwright-report/ }

The Playwright report uploaded on failure shows what went wrong, including whatever screenshots, traces, or videos your Playwright config is set to capture. Combined with blueprint-based bug reports, you have a complete loop: reproduction in CI → identical reproduction locally → fix → verify in CI.

Linking blueprints in PR comments

When a CI build produces a build artifact (a built plugin zip), a small workflow step can generate a blueprint URL that loads that exact build into Playground, and post it as a PR comment. Reviewers click, see the scenario, sign off. I’ve seen this pattern transform review cycles.
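The generation step is small. A sketch, assuming the blueprint JSON for the build has already been uploaded somewhere publicly fetchable: Playground accepts a blueprint-url query parameter, and the rest is string formatting. playgroundUrl, prCommentBody, and the example.com URL are hypothetical.

```typescript
// Build a Playground link that loads the blueprint hosted at `blueprintUrl`.
function playgroundUrl(blueprintUrl: string): string {
  return `https://playground.wordpress.net/?blueprint-url=${encodeURIComponent(blueprintUrl)}`;
}

// Markdown body for the PR comment a workflow step would post.
function prCommentBody(prNumber: number, blueprintUrl: string): string {
  return [
    `Playground preview for PR #${prNumber}`,
    "",
    `[Open this build in WordPress Playground](${playgroundUrl(blueprintUrl)})`,
  ].join("\n");
}

// Hypothetical location of the generated blueprint.
const exampleBlueprintUrl = "https://example.com/qa/blueprints/pr-123.json";

console.log(playgroundUrl(exampleBlueprintUrl));
// https://playground.wordpress.net/?blueprint-url=https%3A%2F%2Fexample.com%2Fqa%2Fblueprints%2Fpr-123.json
console.log(prCommentBody(123, exampleBlueprintUrl));
```

Posting the body is then a single API call from the workflow, using whichever comment action or CLI your team already relies on.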

9. The AI-first stack at scale

Here is what a day in the life of an AI agent on the team looks like, run against the stack above.

A customer reports a bug to support: “Imported feed items show the wrong publish date — 5 hours earlier than the actual date — on my site running UTC-5.” Support tags the ticket with a draft GitHub issue. The AI agent assigned to triage reads the issue.

The agent reproduces the bug. It runs npm run wp:up against the plugin repo, which gives it a fresh WordPress install with the plugin active and timezone pre-set to America/New_York. It uses WP-CLI to import the customer’s reported feed and triggers a fetch. Then it queries the database for the imported items’ post_date_gmt and post_date, compares them to what the source feed actually contained, and confirms the discrepancy.

The agent identifies the root cause by reading the relevant code. The conversion from feed pubDate to WordPress post_date is happening twice — once in the import handler and once in the display layer. The fix is a one-line change in the import handler.
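The arithmetic of that root cause is worth spelling out, because it explains why the error is exactly five hours. This is an illustrative sketch with a made-up timestamp and a fixed UTC-5 offset, not the plugin's actual code; shiftByOffset is a hypothetical stand-in for the conversion step.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Shift a timestamp by a whole-hour offset (the "convert to site time" step).
function shiftByOffset(timestampMs: number, offsetHours: number): number {
  return timestampMs + offsetHours * HOUR_MS;
}

// Format as HH:MM; toISOString is used only as a convenient clock-face formatter.
function clock(timestampMs: number): string {
  return new Date(timestampMs).toISOString().slice(11, 16);
}

const feedPubDateUtc = Date.UTC(2026, 0, 15, 17, 0, 0); // hypothetical item, 17:00 UTC
const siteOffsetHours = -5; // the customer's UTC-5 site

const convertedOnce = shiftByOffset(feedPubDateUtc, siteOffsetHours); // import handler
const convertedTwice = shiftByOffset(convertedOnce, siteOffsetHours); // display layer repeats it

console.log(clock(convertedOnce)); // 12:00
console.log(clock(convertedTwice)); // 07:00
console.log((convertedOnce - convertedTwice) / HOUR_MS); // 5
```

A single conversion gives the correct local time; the second pass subtracts the offset again, which is the customer's "5 hours earlier" to the minute. On a UTC site both conversions are no-ops, which is why a non-UTC timezone in the bootstrap script is what makes the bug visible locally.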

The agent writes a Playground blueprint at qa/blueprints/issue-NNN-timezone.json that stages the bug from a clean install. It runs the blueprint locally, captures screenshots showing the wrong date, attaches them to the PR. It applies the fix on a new branch, re-runs the blueprint with the fix applied (via installPlugin pointing at the fresh build), confirms the date is now correct. It writes a regression test in tests/integration/test-timezone-handling.php that fails without the fix and passes with it. It pushes the branch, opens the PR with:

A link to the Playground blueprint that reproduces the bug
A link to a second blueprint that demonstrates the fix
Screenshots of before and after
The regression test
A QA checklist for human reviewers

The PR triggers CI. wp-env runs PHPUnit across the matrix; Playwright runs the cross-browser checks. Everything passes. Gaby on the QA team opens the blueprint, sees the same bug the agent saw, opens the fix blueprint, sees it’s gone, ticks off the checklist, approves.

The PR merges. The customer gets an email pointing to a third blueprint that demonstrates the fix on their reported scenario. They reply: “Confirmed, thank you.”

End to end, no clicks from the developer side. The whole loop runs on infrastructure that costs nothing beyond a Docker daemon and a few hundred lines of YAML.

This is what an AI-first local WordPress dev stack makes possible. None of it is hypothetical. I run this every day on a real production plugin and I’m building the platform that makes it available to other teams at AgentVania.

10. Common pitfalls

A grab-bag of gotchas, hard-won.

Docker file permission weirdness on macOS. The user inside a wp-env container is www-data (UID 33), but files mounted from your Mac are owned by your user. Plugins that create files at runtime (caches, logs) sometimes write them as www-data, and your editor then can’t modify them. Fix: set the setgid bit on the directory (chmod g+s) so new files inherit its group, or use chmod -R 777 in dev environments only (never in production).

SQLite ≠ MySQL. Studio and browser Playground use SQLite by default. The new SQLite driver handles most MySQL syntax via AST translation, but advanced queries (window functions, certain JOIN patterns, MySQL-specific functions) can behave differently. If you’re testing a query that relies on MySQL-specific semantics, run it against wp-env (real MySQL) before trusting the SQLite result.

wp-env shell quoting. npx wp-env run cli sh -c "command1 && command2" mangles compound commands because of how the args get parsed. For multi-step shell work, drop down to Docker directly: docker exec -w /var/www/html wordpress-1 bash -c "..." (substitute the container name that docker ps reports for your project).

Plugin licensing in Playground. Most premium plugins use a licensing layer (Freemius, EDD, custom) that needs to phone home. Playground can’t do that reliably (network blocked, ephemeral domain, no persistent license storage). For dev environments, either (a) use the plugin’s built-in dev-license mechanism if it has one, (b) inject a mu-plugin that short-circuits the license check, or (c) accept that some premium features won’t be testable inside Playground and use wp-env for those.

Onboarding wizards. Many plugins show an onboarding wizard on a fresh install. In a scripted/Playground environment, the wizard masks the plugin’s actual UI. Pre-set whatever option the plugin uses to mark onboarding complete (myplugin_version, acme_setup_complete, etc.) in your blueprint or bootstrap script.
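
One way to do this in a blueprint is to add a setSiteOptions step. The sketch below rests on several assumptions: skipOnboarding is a hypothetical helper, acme_setup_complete is the placeholder option name from above, and the value the plugin expects (here "1") has to be checked in its source.

```typescript
interface BlueprintStep {
  step: string;
  [key: string]: unknown;
}

interface Blueprint {
  landingPage?: string;
  steps: BlueprintStep[];
}

// Returns a copy of the blueprint with an extra step that marks onboarding as done.
// The option name and value are plugin-specific; the ones used below are placeholders.
function skipOnboarding(
  blueprint: Blueprint,
  optionName: string,
  value: string = "1"
): Blueprint {
  const markComplete: BlueprintStep = {
    step: "setSiteOptions",
    options: { [optionName]: value },
  };
  // Appended last so the option is written after the plugin is installed and
  // activated, in case activation resets it.
  return { ...blueprint, steps: [...blueprint.steps, markComplete] };
}

// Usage: hello-dolly stands in for the plugin under test.
const base: Blueprint = {
  landingPage: "/wp-admin/",
  steps: [
    {
      step: "installPlugin",
      pluginData: { resource: "wordpress.org/plugins", slug: "hello-dolly" },
    },
  ],
};

const ready = skipOnboarding(base, "acme_setup_complete");
console.log(JSON.stringify(ready, null, 2));
```

Keeping this as a function rather than hand-editing each blueprint means every reproduction in the repo skips the wizard the same way, and the original blueprint object is left untouched.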

OrbStack first-launch permissions. On fresh Mac installs, OrbStack needs system permissions on first launch (network extension, virtualisation entitlements). Open the app once from the Applications folder, click through the prompts, then your scripts will work.

Localhost vs host.docker.internal. From inside a wp-env container, your Mac’s localhost is not the container’s localhost. To reach a service running on your Mac (e.g., a local SMTP server), use host.docker.internal instead.

The Mac/Windows/Linux split. wp-env works on all three but startup time and IO performance vary significantly. Linux is fastest, OrbStack-on-Mac is good, Docker-Desktop-on-Mac is slowest, Windows-WSL2 is in between. If you’re hiring, this matters.

What to install today

If you’re starting fresh and want the AI-first stack:

OrbStack (brew install --cask orbstack)
Node 20+ (you probably have this)
@wordpress/env (npm install -g @wordpress/env)
@wp-playground/cli (no global install needed; use npx per invocation)
Playwright (npm install -D @playwright/test per repo)

Plus pick one of:

DDEV if your work includes full WordPress sites: brew install ddev/ddev/ddev
Local WP if you have client sites you want a friendly GUI for: download from localwp.com

That’s the toolbox. Most of the cost is in the discipline of using it, not in the install.

Where this is going

The local dev environment question is the first place the AI shift shows up in WordPress, but it’s not the last. WP-CLI is positioning itself as the agent-ready foundation for WordPress, with MCP support and the new Abilities API turning standard WP-CLI commands into something LLMs can call reliably. Hosting providers are shipping MCP servers. The whole WordPress ecosystem is shifting from “GUIs for humans” to “interfaces for both.”

The teams that picked CLI-first dev environments two years ago are walking into the AI era already set up. The teams that picked GUI-first tools are going to have to migrate. The good news is that the tools to do this well already exist, free and open-source, and have for years. You just have to choose them deliberately.

I’m shipping production WordPress code every day with AI agents on the team. At AgentVania I’m building the platform that makes that work for any team. If you’re thinking about how to bring AI into your WordPress development workflow, this stack is the starting line.

About Jean Galea
