Quick Start

Install AppClaw globally and start automating in seconds.

Terminal
# Install
npm install -g appclaw

# Run with a natural language goal
appclaw "Open Settings and turn on Wi-Fi"

# Run a YAML flow
appclaw --flow my-test.yaml

# Interactive playground
appclaw --playground

CLI Options

All the flags you can pass to appclaw.

Platform & Device

Flag Description
--platform <os> Target platform: android or ios
--device-type <type> iOS only: simulator or real
--device <name> Device name (partial match, e.g. "iPhone 17 Pro")
--udid <udid> Device UDID (skips the device picker)

Execution

Flag Description
--flow <file> Run a declarative YAML flow file
--env <name> Environment for variable/secret resolution
--playground Launch the interactive REPL for building flows
--record Record a goal execution for later replay
--replay <file> Replay a previously recorded session
--plan Decompose a complex goal into sub-goals
--json JSON output mode (for IDE extensions)

Explorer (Test Generation)

Flag Description
--explore <prd> Generate test flows from a PRD document
--num-flows <N> Number of flows to generate (default: 5)
--no-crawl Skip device crawling, use PRD only
--output-dir <dir> Output directory (default: generated-flows)
--max-screens <N> Max screens to crawl (default: 10)
--max-depth <N> Max navigation depth (default: 3)

Execution Modes

AppClaw has three distinct ways to automate mobile apps.

Agent Mode

Give AppClaw a goal in plain English. The AI agent takes a screenshot, reasons about what it sees, and decides what to tap, type, or swipe — step by step until the goal is complete.

Agent Mode
appclaw "Search for 'Appium 3.0' on YouTube and find the TestMu AI video"

YAML Flows

Define repeatable, version-controlled test flows in YAML. Each step is a natural language instruction — no element selectors, no brittle locators.

YAML Flow
appclaw --flow tests/youtube-search.yaml --env dev

Playground

An interactive REPL where you type one instruction at a time and see it execute immediately. Great for exploring an app and building flows interactively.

Playground
appclaw --playground --platform ios --device-type simulator

Designing YAML Flows

YAML flows are the heart of AppClaw's repeatable automation. Write your test steps in plain English — AppClaw figures out how to execute them on the device. No XPath, no accessibility IDs, no brittle selectors.

Key Idea

Each step is a natural language instruction like tap Login or wait for the home screen to be visible. AppClaw uses AI to find the right elements on screen.

Flat Format

The simplest YAML structure — a metadata header separated by --- from a flat list of steps.

settings-wifi.yaml
name: Turn on Wi-Fi
platform: android
---
- open Settings app
- tap Connections
- wait 1s
- tap Wi-Fi
- verify Wi-Fi is visible
- done
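
As a rough illustration of this layout, the sketch below splits a flat flow into its metadata header and its step list. The names parseFlatFlow and ParsedFlow are hypothetical and not part of AppClaw, and a real implementation would presumably use a full YAML parser rather than these regular expressions.

```typescript
// Hypothetical helper: illustrates the "header, ---, steps" layout only.
interface ParsedFlow {
  meta: Record<string, string>;
  steps: string[];
}

function parseFlatFlow(source: string): ParsedFlow {
  const lines = source.split(/\r?\n/);
  const divider = lines.findIndex((line) => line.trim() === "---");

  // Without a divider, treat the whole file as a step list.
  const headerLines = divider === -1 ? [] : lines.slice(0, divider);
  const stepLines = divider === -1 ? lines : lines.slice(divider + 1);

  // Header: simple "key: value" pairs such as name, platform, appId.
  const meta: Record<string, string> = {};
  for (const line of headerLines) {
    const match = /^([A-Za-z]\w*):\s*(.*)$/.exec(line.trim());
    if (match) meta[match[1]] = match[2];
  }

  // Steps: every "- ..." list item, in order.
  const steps = stepLines
    .map((line) => line.trim())
    .filter((line) => line.startsWith("- "))
    .map((line) => line.slice(2).trim());

  return { meta, steps };
}

const wifiFlow = parseFlatFlow(
  "name: Turn on Wi-Fi\nplatform: android\n---\n- open Settings app\n- tap Connections\n- done",
);
console.log(wifiFlow.meta, wifiFlow.steps);
```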

Metadata Fields

Field Description
name Display name for the flow
description Optional description of what the flow does
platform android or ios — fallback if no --platform CLI flag
appId App bundle/package ID for launchApp steps
env Environment name — resolves variables from .appclaw/env/<name>.yaml

Phased Format

For structured tests, organize your steps into three phases: setup, steps, and assertions. This gives clearer reporting and separates initialization from the actual test logic.

youtube-search.yaml
name: YouTube Search
description: Searches YouTube and verifies video results
platform: android
env: dev
---
setup:
  - open ${variables.app_name} app
  - wait until search icon is visible

steps:
  - click on search icon
  - type '${secrets.search_query}'
  - wait 3s
  - click on the first result from the list
  - wait for the search results to be visible
  - scroll down

assertions:
  - verify ${variables.expected_channel} is visible

Phases Explained

Phase Purpose
setup Initialization — launch the app, navigate to starting screen, dismiss popups. Failures here skip the test.
steps The main test actions — the interactions you're actually testing.
assertions Verification checks — confirm the expected outcome. You can also mix in actions here if needed.

Variables & Secrets

Keep your flows flexible and secure with variable interpolation.

Variables ${variables.X}

Loaded from environment files. Values appear in logs.

Secrets ${secrets.X}

Resolved from shell environment variables at runtime. Always shown as *** in logs.
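
The difference between the two scopes can be sketched as follows. This is an illustration only: interpolate is a hypothetical function, and it assumes a secret named X is read from a shell variable with the same name, which the text above does not specify.

```typescript
type Values = Record<string, string | undefined>;

interface Interpolated {
  resolved: string; // what would be executed on the device
  logged: string;   // what would be safe to print
}

// Hypothetical sketch: variables come from an env file, secrets from the shell.
function interpolate(step: string, variables: Values, shellEnv: Values): Interpolated {
  const pattern = /\$\{(variables|secrets)\.([A-Za-z_]\w*)\}/g;

  const lookup = (scope: string, key: string): string => {
    const value = scope === "variables" ? variables[key] : shellEnv[key];
    if (value === undefined) throw new Error("Unresolved " + scope + "." + key);
    return value;
  };

  const resolved = step.replace(pattern, (_match: string, scope: string, key: string) =>
    lookup(scope, key),
  );
  // Secrets are masked in the logged form; variables are shown as-is.
  const logged = step.replace(pattern, (_match: string, scope: string, key: string) =>
    scope === "secrets" ? "***" : lookup(scope, key),
  );

  return { resolved, logged };
}

const example = interpolate(
  "type '${secrets.search_query}' in ${variables.app_name}",
  { app_name: "youtube" },
  { search_query: "appium 3.0" },
);
console.log(example.logged);
```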

Environment File

Create .appclaw/env/<name>.yaml in your project root:

.appclaw/env/dev.yaml
variables:
  app_name: youtube
  expected_channel: TestMu AI
  timeout: 30
  locale: en-US

Then reference it in your YAML header with env: dev, or pass --env dev on the CLI.

Inline Variables

For self-contained flows, embed variables directly in the YAML header:

Inline env block
name: Self-contained flow
env:
  variables:
    app_name: youtube
    search_term: appium 3.0
---
- open ${variables.app_name} app
- type ${variables.search_term}

Resolution Order

--env CLI flag wins over the YAML env: field, which wins over inline env: blocks. Secrets always come from shell environment variables.
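
A minimal sketch of that precedence, under stated assumptions: pickEnvSource and its types are hypothetical, and the sketch treats the three sources as mutually exclusive, although the text does not say whether inline variables are merged with a named environment.

```typescript
// Hypothetical types: the YAML env field is either an environment name
// (e.g. "dev") or an inline block with a variables map.
type YamlEnv = string | { variables: Record<string, string> };

type EnvSource =
  | { kind: "file"; path: string }
  | { kind: "inline"; variables: Record<string, string> }
  | { kind: "none" };

function pickEnvSource(cliEnv: string | undefined, yamlEnv: YamlEnv | undefined): EnvSource {
  const envFile = (name: string): EnvSource => ({
    kind: "file",
    path: ".appclaw/env/" + name + ".yaml",
  });

  if (cliEnv) return envFile(cliEnv);                       // 1. --env CLI flag
  if (typeof yamlEnv === "string") return envFile(yamlEnv); // 2. env: <name> in the header
  if (yamlEnv) return { kind: "inline", variables: yamlEnv.variables }; // 3. inline block
  return { kind: "none" };
}

console.log(pickEnvSource("staging", "dev"));
```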

Tap / Click

Tap on an element by describing its label. AppClaw matches it against visible text and elements on screen.

Natural language
- tap Login
- click on the search icon
- press Submit
- select the first item
- choose English
- pick the blue option
- navigate to Settings
- toggle Dark Mode
- enable Notifications
- close the popup
- dismiss the dialog

All of these are equivalent — they find the element and tap it. Use whichever reads most naturally.

Structured YAML
- tap: "Login Button"

Type Text

Type text into the currently focused field, or specify a target field.

Natural language
# Type into focused field
- type "hello world"
- enter text "user@example.com"

# Type into a specific field
- type "john@example.com" in email field
- enter "password123" into password field

# Search (types the text)
- search for "Appium 3.0"
- look for "restaurants nearby"

Structured YAML
- type: "hello world"

Wait / Pause

Pause execution for a fixed duration.

Natural language
- wait 3s
- wait 1.5 seconds
- sleep 500ms
- pause 2 sec
- wait               # defaults to 2 seconds
- wait a moment      # defaults to 2 seconds

Structured YAML
- wait: 3          # seconds
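
To make the accepted duration forms concrete, here is a small sketch of how such phrases could be turned into milliseconds. parseWaitMs is a hypothetical function and the regular expression is an assumption; only the listed examples and the 2-second default come from the text above.

```typescript
const DEFAULT_WAIT_MS = 2000; // a bare "wait" defaults to 2 seconds

// Hypothetical parser for phrases such as "wait 3s" or "sleep 500ms".
function parseWaitMs(step: string): number {
  const match = /(\d+(?:\.\d+)?)\s*(ms|milliseconds?|s|secs?|seconds?)\b/i.exec(step);
  if (!match) return DEFAULT_WAIT_MS; // e.g. "wait" or "wait a moment"

  const amount = parseFloat(match[1]);
  const unit = match[2].toLowerCase();
  // "ms" and "millisecond(s)" are already milliseconds; everything else is seconds.
  return unit.startsWith("m") ? amount : amount * 1000;
}

console.log(parseWaitMs("wait 1.5 seconds")); // 1500
console.log(parseWaitMs("sleep 500ms"));      // 500
console.log(parseWaitMs("wait a moment"));    // 2000
```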

Wait Until

Wait dynamically until a condition is met. Polls the screen every 500ms up to a timeout (default 10s). Uses AI vision to understand the screen — you can describe what you expect to see in plain English.

Wait for something to appear
- wait until search icon is visible
- wait for the search results to be visible
- wait for the home screen to be loaded
- wait until "Welcome back" appears
- wait 15s until login button is visible  # custom timeout

Wait for something to disappear
- wait until loading spinner is gone
- wait for the popup to be hidden
- wait until progress bar disappeared

Wait for screen to stabilize
- wait until screen is loaded
- wait until screen is stable
- wait 5s until screen is ready

Structured YAML
# With custom timeout
- waitUntil: "Login button"
  timeout: 15

# Wait for element to disappear
- waitUntilGone: "Loading spinner"
  timeout: 20

# Screen loaded (DOM stability check)
- waitUntil: "screen loaded"

Smart Vision

When you write something descriptive like wait for the search results to be visible, AppClaw uses AI vision to understand the screen holistically — it checks whether results are actually shown, not just whether the literal words "search results" appear. You can describe what you expect to see naturally.
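
The polling behaviour described in this section (a check every 500ms, up to a default timeout of 10s) could be sketched roughly as below. waitUntil here is a hypothetical standalone function, not the AppClaw step itself; the check callback stands in for whatever DOM or vision test is used, and the injectable sleep and now parameters exist only to make the sketch testable.

```typescript
interface WaitOptions {
  timeoutMs?: number;  // default 10s, per the text above
  intervalMs?: number; // default 500ms, per the text above
  sleep?: (ms: number) => Promise<void>;
  now?: () => number;
}

// Hypothetical polling loop: resolves true as soon as check() passes,
// or false once another poll would exceed the timeout.
async function waitUntil(
  check: () => boolean | Promise<boolean>,
  options: WaitOptions = {},
): Promise<boolean> {
  const timeoutMs = options.timeoutMs ?? 10000;
  const intervalMs = options.intervalMs ?? 500;
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = options.now ?? (() => Date.now());

  const deadline = now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    if (now() + intervalMs > deadline) return false;
    await sleep(intervalMs);
  }
}

// Usage: a condition that is already true resolves on the first poll.
waitUntil(() => true).then((found) => console.log("condition met:", found));
```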

Scroll / Swipe

Scroll or swipe in any direction, optionally repeating multiple times or scrolling until an element is found.

Basic scroll / swipe
- scroll down
- scroll up 3 times
- swipe left
- swipe right 2 times

Scroll until found
# Scroll until an element appears
- scroll down until "Terms & Conditions" is visible
- scroll down 5 times to find "Accept"
- scroll down to see "Load More"

Structured YAML
- scrollAssert: "Terms & Conditions"
  direction: down
  maxScrolls: 5

Drag / Slider

Drag one element to another — sliders, carousels, reorderable lists, and any drag-and-drop interaction. Requires vision mode (AGENT_MODE=vision).

Natural language
- drag the green circle slider to the +100 mark
- slide the price handle to +80
- move the volume knob to maximum

Structured YAML (shorthand)
# Shorthand form: "<source> to <target>"
- drag: "green circle slider to +100 mark"

Structured YAML (explicit from/to)
- drag:
    from: green circle slider
    to:   +100 mark

Vision required

Drag uses AI vision to locate both the source and target by visual description. Set AGENT_MODE=vision and VISION_LOCATE_PROVIDER=stark with a valid LLM_API_KEY.

Assert / Verify

Verify that something is visible on screen. Works with both literal text and visual/semantic descriptions via AI vision.

Natural language
- verify "Welcome back" is visible
- assert Dashboard is visible
- check that the login button is on the screen
- verify TestMu AI is visible

Structured YAML
- assert: "Welcome back"
- verify: "Dashboard"    # alias for assert
- check:  "Login button"  # alias for assert

Full Command Reference

Every supported step kind at a glance.

Kind          Parameters                   Description
openApp       query                        Open an app by name
launchApp     -                            Launch app defined in appId metadata
tap           label                        Tap element by visible text/label
type          text, target?                Type text, optionally into a named field
enter         -                            Press Enter / Return key
back          -                            Press the Back button
home          -                            Press the Home button
wait          seconds                      Pause for a fixed duration
waitUntil     condition, text?, timeout    Poll until visible/gone/screenLoaded
swipe         direction, repeat?           Swipe up/down/left/right
drag          from, to                     Drag from one element to another (vision mode)
assert        text                         Verify text or description is visible
scrollAssert  text, direction, maxScrolls  Scroll until text found
getInfo       query                        Ask the AI a question about the screen
done          message?                     Signal flow completion

Vision Modes

Control how AppClaw locates elements on screen.

Agent Mode AGENT_MODE

Value Behavior
dom Default. Uses the app's DOM/accessibility tree to find elements.
vision Uses AI vision (screenshots + LLM) as the primary strategy for all interactions.

Vision Mode VISION_MODE

Value Behavior
fallback Default. Try DOM first, fall back to vision if no match found.
always Skip DOM entirely, use vision for every interaction.
never DOM only. No vision fallback.
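
The three VISION_MODE values amount to a small strategy switch, sketched below. The locate function and the Locator type are hypothetical, and the sketch covers VISION_MODE only; how it interacts with AGENT_MODE is not described here and is left out.

```typescript
type VisionMode = "fallback" | "always" | "never";

// A locator returns some element handle, or null when nothing matches.
type Locator = (description: string) => string | null;

// Hypothetical sketch of the lookup order implied by each mode.
function locate(
  description: string,
  mode: VisionMode,
  viaDom: Locator,
  viaVision: Locator,
): string | null {
  if (mode === "always") return viaVision(description); // skip DOM entirely

  const domHit = viaDom(description);
  if (domHit !== null || mode === "never") return domHit; // DOM only, or DOM matched

  return viaVision(description); // "fallback": DOM missed, try vision
}

const domMiss: Locator = () => null;
const visionHit: Locator = (description) => "vision:" + description;
console.log(locate("Login button", "fallback", domMiss, visionHit));
```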

Environment Variables

All environment variables recognized by AppClaw. These are especially useful for CI/CD pipelines.

LLM Configuration

Variable Description
LLM_PROVIDER LLM provider: anthropic, openai, gemini, groq, ollama
LLM_API_KEY API key for the chosen provider
LLM_MODEL Specific model name to use
LLM_THINKING Extended thinking: on or off (default: on)
LLM_THINKING_BUDGET Max thinking tokens: 1–10000 (default: 128)
LLM_SCREENSHOT_MAX_EDGE_PX Downscale screenshots to this max edge (0 = disabled)

Device & Platform

Variable Description
PLATFORM Same as --platform flag
DEVICE_TYPE Same as --device-type flag
DEVICE_UDID Same as --udid flag
DEVICE_NAME Same as --device flag

Vision

Variable Description
VISION_MODE always, fallback, or never
AGENT_MODE dom or vision
GEMINI_API_KEY Gemini API key for Stark vision — only needed when LLM_PROVIDER is not gemini and AGENT_MODE=vision. If provider is already Gemini, LLM_API_KEY is reused automatically.

Execution Tuning

Variable Description
MAX_STEPS Max steps per goal (default: 30)
STEP_DELAY Delay between steps in ms (default: 500)
MAX_ELEMENTS Max DOM elements to parse (default: 40)
MAX_HISTORY_STEPS Max action history retained (default: 10)

MCP Connection

Variable Description
MCP_TRANSPORT stdio or sse (default: stdio)
MCP_HOST MCP server host (default: localhost)
MCP_PORT MCP server port (default: 8080)

LambdaTest Cloud

Run AppClaw tests on real iOS and Android devices in the cloud — no local device or emulator required. AppClaw integrates with LambdaTest's real device cloud via its Appium-compatible hub.

Setup

Add your LambdaTest credentials and target device to .env:

.env
# Enable LambdaTest cloud
CLOUD_PROVIDER=lambdatest

# LambdaTest credentials (from app.lambdatest.com → Profile → Access Key)
LAMBDATEST_USERNAME=your_username
LAMBDATEST_ACCESS_KEY=your_access_key

# Target device
LAMBDATEST_DEVICE_NAME=iPhone 14
LAMBDATEST_OS_VERSION=16
PLATFORM=ios

# Your app (upload via LambdaTest portal, copy the lt:// ID)
LAMBDATEST_APP=lt://APP10xxxxxxxxxxxxxxxx

# LLM for AI-powered automation
LLM_PROVIDER=gemini
LLM_API_KEY=your_gemini_api_key

CLI Usage

Once .env is configured, run AppClaw exactly as you would locally:

Terminal
# Run a natural-language goal on a cloud device
appclaw "Open the app and navigate to the checkout screen"

# Run a YAML flow on a cloud device
appclaw --flow flows/checkout.yaml

SDK Usage

No SDK changes needed — when CLOUD_PROVIDER=lambdatest is set in .env, the SDK automatically routes the session through LambdaTest:

TypeScript
import { AppClaw } from 'appclaw';

// CLOUD_PROVIDER=lambdatest is read from .env automatically
const app = new AppClaw({
  provider:    'gemini',
  apiKey:      process.env.LLM_API_KEY,
  reportName:  'Checkout — LambdaTest',
});

await app.run('open the app');
await app.run('tap Add to Cart');
await app.run('tap Checkout');

await app.teardown();  // report saved to .appclaw/runs/

Environment Variables

Variable                 Required  Description
CLOUD_PROVIDER           Yes       Set to lambdatest to enable cloud execution
LAMBDATEST_USERNAME      Yes       Your LambdaTest account username
LAMBDATEST_ACCESS_KEY    Yes       Your LambdaTest access key (from Profile → Access Key)
LAMBDATEST_DEVICE_NAME   Yes       Cloud device to use, e.g. iPhone 14, Galaxy S23
LAMBDATEST_OS_VERSION    Yes       OS version, e.g. 16 (iOS) or 13 (Android)
LAMBDATEST_APP           No        App ID from the LambdaTest portal (format: lt://APP…)
LAMBDATEST_BUILD_NAME    No        Build label shown in the LambdaTest dashboard
LAMBDATEST_PROJECT_NAME  No        Project label shown in the LambdaTest dashboard
LAMBDATEST_VIDEO         No        Record session video. Default: true
LAMBDATEST_NETWORK       No        Capture network logs. Default: false

No code changes required

Switching between local and cloud execution is purely config — set CLOUD_PROVIDER=lambdatest in your CI environment and remove it for local runs. Your YAML flows and SDK tests stay identical.
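
Because the switch is purely configuration, a pre-flight check for the required variables can catch a misconfigured CI job early. The sketch below is one possible way to do that; missingLambdaTestVars is a hypothetical helper, not part of AppClaw, and the list of names is taken from the Required column above.

```typescript
const REQUIRED_LAMBDATEST_VARS = [
  "LAMBDATEST_USERNAME",
  "LAMBDATEST_ACCESS_KEY",
  "LAMBDATEST_DEVICE_NAME",
  "LAMBDATEST_OS_VERSION",
];

// Hypothetical pre-flight check: returns the names that are unset or empty.
// Local runs (CLOUD_PROVIDER not set to lambdatest) need none of them.
function missingLambdaTestVars(env: Record<string, string | undefined>): string[] {
  if (env["CLOUD_PROVIDER"] !== "lambdatest") return [];
  return REQUIRED_LAMBDATEST_VARS.filter((name) => !env[name]);
}

console.log(
  missingLambdaTestVars({ CLOUD_PROVIDER: "lambdatest", LAMBDATEST_USERNAME: "your_username" }),
);
```

In a real script the argument would typically be process.env, and a non-empty result would be a reason to exit before starting a session.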

Node.js / TypeScript SDK

AppClaw ships a first-class programmatic API so you can drive mobile automation directly from Node.js or TypeScript — no CLI required. The SDK is the natural fit for QA automation inside test runners (Vitest, Jest, Mocha), CI pipelines, and any script that needs to control a device programmatically.

When to use the SDK vs the CLI

Use the CLI for one-off tasks and interactive exploration. Use the SDK when you want to run flows inside a test suite, assert on results, share a device connection across multiple flows, or integrate AppClaw into a larger automation pipeline.

Architecture

The SDK exposes a single AppClaw class that manages the full lifecycle:

  • Lazy MCP connect — the Appium connection is opened on the first runFlow() or runGoal() call, not on construction.
  • Connection reuse — subsequent calls share the same underlying connection, so you pay the startup cost once per test suite.
  • Explicit teardown — call teardown() in your afterAll hook to close the connection cleanly.
  • Silent by default — spinners and terminal colours are suppressed automatically, keeping CI logs clean.
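
The first three points describe a common lifecycle pattern. The sketch below shows that pattern in isolation, as an illustration rather than AppClaw's actual implementation; LazyConnection and its open and close callbacks are hypothetical names.

```typescript
// Hypothetical sketch of lazy connect, connection reuse and explicit teardown.
class LazyConnection<T> {
  private pending: Promise<T> | null = null;

  constructor(
    private readonly open: () => Promise<T>,
    private readonly close: (connection: T) => Promise<void>,
  ) {}

  // The first call opens the connection; later calls reuse the same promise.
  get(): Promise<T> {
    if (this.pending === null) this.pending = this.open();
    return this.pending;
  }

  // Closes the connection if one was ever opened; otherwise does nothing.
  async teardown(): Promise<void> {
    if (this.pending === null) return;
    const connection = await this.pending;
    this.pending = null;
    await this.close(connection);
  }
}

// Usage with a stand-in connection: nothing is opened until get() is called.
const session = new LazyConnection<string>(
  async () => "device-session",
  async () => undefined,
);
session
  .get()
  .then((name) => console.log("connected:", name))
  .then(() => session.teardown());
```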

Installation

AppClaw is a single package — the SDK is built in, nothing extra to install.

Terminal
npm install appclaw

Create a .env file in your project root (or pass options directly to the constructor — see Options Reference).

.env
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-...
PLATFORM=android

runFlow()

Parse and execute a YAML flow file against a connected device. Returns a FlowResult you can assert on.

TypeScript
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'anthropic',
  apiKey:   process.env.ANTHROPIC_API_KEY,
  platform: 'android',
});

const result = await app.runFlow('./flows/checkout.yaml');

console.log(result.success);    // true
console.log(result.stepsUsed);  // 6
console.log(result.stepsTotal); // 6

await app.teardown();

FlowResult shape

Field        Type     Description
success      boolean  Whether all steps completed successfully
stepsUsed    number   Steps executed before completion or failure
stepsTotal   number   Total steps in the flow (including unexecuted)
failedStep   number?  1-based index of the step that failed
failedPhase  string?  setup | test | assertion
error        string?  Human-readable failure reason
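
For readers who want to see the shape as a type, the sketch below mirrors the table in a local interface and adds a small formatting helper. FlowResultLike and summarizeFlow are hypothetical names used so the example is self-contained; in a real project the FlowResult type would be imported from 'appclaw' instead.

```typescript
// Local mirror of the table above (assumed from the documented fields).
interface FlowResultLike {
  success: boolean;
  stepsUsed: number;
  stepsTotal: number;
  failedStep?: number; // 1-based
  failedPhase?: string; // setup | test | assertion
  error?: string;
}

// Hypothetical helper: one-line summary suitable for test output.
function summarizeFlow(result: FlowResultLike): string {
  if (result.success) {
    return "passed (" + result.stepsUsed + "/" + result.stepsTotal + " steps)";
  }
  const phase = result.failedPhase ?? "unknown";
  const step = result.failedStep !== undefined ? String(result.failedStep) : "?";
  const reason = result.error ?? "no error message";
  return "failed in " + phase + " phase at step " + step + ": " + reason;
}

console.log(
  summarizeFlow({
    success: false,
    stepsUsed: 3,
    stepsTotal: 6,
    failedStep: 3,
    failedPhase: "test",
    error: "Login button not found",
  }),
);
```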

runGoal()

Execute a plain-English goal using the agent loop — same as passing a goal string to the CLI. Returns an AgentResult.

TypeScript
import { AppClaw } from 'appclaw';

const app = new AppClaw({ provider: 'anthropic', apiKey: process.env.ANTHROPIC_API_KEY });

const result = await app.runGoal(
  'Log in with email qa@company.com and password Test1234'
);

console.log(result.success);   // true
console.log(result.stepsUsed); // 4
console.log(result.reason);    // "Logged in successfully"

await app.teardown();

Flow vs Goal

Use runFlow() for repeatable QA scenarios — structured, deterministic, zero LLM cost. Use runGoal() for exploratory tasks or when you need the agent to adapt to dynamic screen states.

run()

Execute a single natural-language instruction directly on the device — the programmatic equivalent of typing a command in the playground REPL. Each call is one atomic action: parse the instruction, execute it, return the result.

TypeScript
import { AppClaw } from 'appclaw';

const app = new AppClaw({ provider: 'gemini', apiKey: process.env.GEMINI_API_KEY, platform: 'android' });

await app.run('open YouTube app');       // regex match — no LLM call
await app.run('tap Search');             // regex match — no LLM call
await app.run('type Appium 3.0');        // regex match — no LLM call
await app.run('tap the search button');  // LLM fallback → tap
await app.run('wait 2 seconds');         // regex match — no LLM call
await app.run('scroll down');            // regex match — no LLM call

await app.teardown();  // report written to .appclaw/runs/

How instructions are resolved

  1. Regex match — common patterns (open X, tap X, type X, wait N seconds, scroll down, …) are resolved instantly with zero LLM cost.
  2. LLM fallback — anything that doesn't match a regex is sent to the configured LLM, which classifies it into a structured action (tap, type, swipe, etc.).
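
The two-stage resolution above can be sketched as a pattern table with a fallback. Everything in this example is an assumption made for illustration: the three regular expressions, the Action shape and the resolveInstruction name are hypothetical, and the classify callback is a synchronous stand-in for what would really be an asynchronous LLM call.

```typescript
interface Action {
  kind: string;
  args: Record<string, string | number>;
}

type Classifier = (instruction: string) => Action; // stands in for the LLM

// Hypothetical patterns; the real set of regexes is not documented here.
const PATTERNS: Array<{ regex: RegExp; build: (match: RegExpExecArray) => Action }> = [
  { regex: /^open (.+?) app$/i, build: (m) => ({ kind: "openApp", args: { query: m[1] } }) },
  { regex: /^type (.+)$/i, build: (m) => ({ kind: "type", args: { text: m[1] } }) },
  {
    regex: /^wait (\d+(?:\.\d+)?) seconds?$/i,
    build: (m) => ({ kind: "wait", args: { seconds: parseFloat(m[1]) } }),
  },
];

function resolveInstruction(
  instruction: string,
  classify: Classifier,
): { action: Action; usedLlm: boolean } {
  const text = instruction.trim();
  // Stage 1: cheap regex match, no LLM involved.
  for (const { regex, build } of PATTERNS) {
    const match = regex.exec(text);
    if (match) return { action: build(match), usedLlm: false };
  }
  // Stage 2: hand anything unrecognised to the classifier.
  return { action: classify(text), usedLlm: true };
}

const stubClassifier: Classifier = (text) => ({ kind: "tap", args: { label: text } });
console.log(resolveInstruction("wait 2 seconds", stubClassifier));
console.log(resolveInstruction("hit the big red button", stubClassifier));
```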

RunResult shape

Field    Type     Description
success  boolean  Whether the action completed successfully
action   string   Resolved step kind: tap | type | openApp | wait | swipe | …
message  string   Human-readable description of what happened

run() vs runGoal() vs runFlow()

Use run() when you want full control — one deterministic step at a time, easy to integrate with any test framework. Use runGoal() when you want the agent to figure out the steps itself. Use runFlow() for declarative YAML test cases you want to version-control.

Reports

Reports are enabled by default when using the SDK. After teardown() is called, AppClaw writes an HTML report to .appclaw/runs/ — one screenshot per step, plus a full execution summary. No extra configuration needed.

TypeScript
const app = new AppClaw({
  provider:   'gemini',
  apiKey:     process.env.GEMINI_API_KEY,
  platform:   'android',
  reportName: 'YouTube Search',  // shown in the report viewer
});

await app.run('open YouTube app');
await app.run('tap Search');
await app.run('type Appium 3.0');
await app.run('tap the search button');

await app.teardown();
// ↑ writes report to .appclaw/runs/<runId>/

Screen recording

Pass video: true to record the screen for the entire run and embed the video in the report. Recording starts automatically on the first run() call and stops in teardown().

TypeScript
const app = new AppClaw({
  provider:   'gemini',
  apiKey:     process.env.GEMINI_API_KEY,
  platform:   'android',
  reportName: 'YouTube Search',
  video:      true,               // record screen for the whole run
});

await app.run('open YouTube app');
await app.run('tap Search');
await app.run('type Appium 3.0');
await app.run('tap the search button');

await app.teardown();
// ↑ report includes recording.mp4 under the Recording tab

Parallel-safe

Each AppClaw instance records its own session independently — parallel tests do not interfere. Port allocation (MJPEG, system port) is also handled automatically per instance.

Viewing the report

Run the built-in report server after your tests complete:

Shell
npx appclaw --report

This starts a local server and opens the report in your browser. Every run is listed with its steps, screenshots, pass/fail status, and timing.

Report file layout

File tree
.appclaw/
  runs/
    runs.json              # global run index
    <runId>/
      manifest.json        # full run data (steps, timing, success)
      steps/
        step-000.png       # screenshot after step 1
        step-001.png       # screenshot after step 2
        step-002.png

Disabling reports

Set report: false to skip report generation (e.g. in performance-sensitive CI pipelines):

TypeScript
const app = new AppClaw({
  provider: 'gemini',
  apiKey:   process.env.GEMINI_API_KEY,
  report:   false,   // disable report generation
});

Using with Vitest / Jest

Create one AppClaw instance per test file and tear it down in afterAll. No beforeAll hook is needed, because the connection opens lazily on the first runFlow() or runGoal() call. Individual tests call runFlow() or runGoal() and assert on the result.

tests/checkout.test.ts
import { describe, it, expect, afterAll } from 'vitest';
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'anthropic',
  apiKey:   process.env.ANTHROPIC_API_KEY,
  platform: 'android',
  maxSteps: 20,
});

afterAll(() => app.teardown());

describe('Checkout flow', () => {
  it('completes purchase as a logged-in user', async () => {
    const result = await app.runFlow('./flows/checkout.yaml');
    expect(result.success).toBe(true);
  });

  it('handles empty cart gracefully', async () => {
    const result = await app.runFlow('./flows/checkout-empty-cart.yaml');
    expect(result.success).toBe(true);
  });

  it('completes in under 15 steps', async () => {
    const result = await app.runFlow('./flows/checkout.yaml');
    expect(result.stepsUsed).toBeLessThan(15);
  });
});

Phased flows & assertion results

For flows that use setup / steps / assertions sections, the failedPhase field tells you exactly where execution broke down:

TypeScript
const result = await app.runFlow('./flows/login-phased.yaml');

if (!result.success) {
  // failedPhase: 'setup' | 'test' | 'assertion'
  console.error(`Failed in ${result.failedPhase} phase`);
  console.error(`Step ${result.failedStep}: ${result.error}`);
}

CI Scripts

For CI pipelines that don't use a test framework, run flows sequentially and exit non-zero on failure. The SDK's silent: true default keeps logs clean.

scripts/smoke-test.ts
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'gemini',
  apiKey:   process.env.GEMINI_API_KEY,
  platform: 'android',
  silent:   true,  // no spinners in CI
});

const flows = [
  './flows/login.yaml',
  './flows/checkout.yaml',
  './flows/search.yaml',
];

for (const flow of flows) {
  const result = await app.runFlow(flow);

  if (!result.success) {
    console.error(`FAILED: ${flow} ${result.error}`);
    await app.teardown();
    process.exit(1);
  }

  console.log(`PASSED: ${flow} (${result.stepsUsed} steps)`);
}

await app.teardown();
console.log('All flows passed.');

Run it with tsx (no compilation step needed):

Terminal
npx tsx scripts/smoke-test.ts

Options Reference

All fields passed to new AppClaw(options). Every field is optional — unset fields fall back to .env values or built-in defaults, matching CLI behaviour exactly.

Option        Type     Default            Description
provider      string   'gemini'           'anthropic' | 'openai' | 'gemini' | 'groq' | 'ollama'
apiKey        string   -                  API key for the chosen LLM provider
model         string   Provider default   Model ID override (e.g. 'claude-opus-4-6')
platform      string   -                  'android' | 'ios'
agentMode     string   'dom'              'dom' uses accessibility tree; 'vision' uses AI vision
maxSteps      number   30                 Maximum agent steps before giving up (applies to runGoal)
stepDelay     number   500                Delay between steps in milliseconds
silent        boolean  true               Suppress spinners and terminal colour output. Set false to debug locally.
report        boolean  true               Auto-generate an HTML report to .appclaw/runs/ on teardown(). Set false to disable.
reportName    string   'AppClaw SDK Run'  Name shown in the report viewer.
video         boolean  false              Record the screen for the entire run and embed the video under the Recording tab in the report. Recording starts on the first run() call and stops automatically in teardown(). Requires Appium screen recording support.
mcpTransport  string   'stdio'            'stdio' (local appium-mcp) | 'sse' (remote server)
mcpHost       string   'localhost'        appium-mcp host when transport is 'sse'
mcpPort       number   8080               appium-mcp port when transport is 'sse'

TypeScript types

All public types are exported from the top-level 'appclaw' import:

TypeScript
import {
  AppClaw,
  type AppClawOptions,   // constructor options
  type FlowResult,       // returned by runFlow()
  type RunResult,        // returned by run()
  type AgentResult,      // returned by runGoal()
  type RunYamlFlowOptions // second arg to runFlow()
} from 'appclaw';

GitHub Actions

Run AppClaw mobile UI automation flows and AI-driven goals directly in GitHub Actions — Android emulator or iOS simulator included, zero boilerplate.

Available on the GitHub Marketplace as AppClaw Mobile Tests.

workflow.yml
uses: AppiumTestDistribution/AppClaw@v1
with:
  flow: flows/login.yaml
  platform: android
  api-key: ${{ secrets.LLM_API_KEY }}

Quick Start

Android — run a YAML flow

android-flow.yml
name: Mobile Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/login.yaml
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}

Android — natural language goal

android-goal.yml
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    goal: 'Open YouTube, search for Appium 3.0, verify the first result is visible'
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}

iOS — run a YAML flow

ios-flow.yml
jobs:
  test:
    runs-on: macos-14  # iOS requires macOS (Apple Silicon)
    steps:
      - uses: actions/checkout@v4

      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/ios-login.yaml
          platform: ios
          api-key: ${{ secrets.LLM_API_KEY }}

Inputs

All inputs are passed via the with: block in your workflow.

Input                   Required  Default           Description
flow                    one of*   -                 Path to a YAML flow file relative to repo root
goal                    one of*   -                 Natural language goal executed by the LLM agent
platform                no        android           Target platform: android or ios
provider                no        gemini            LLM provider: gemini, anthropic, openai, groq
api-key                 yes       -                 LLM API key, stored as LLM_API_KEY
model                   no        provider default  LLM model ID to pin (e.g. gemini-2.0-flash)
agent-mode              no        dom               dom (element locators) or vision (screenshot AI)
max-steps               no        30                Maximum agent steps before the run fails
step-delay              no        500               Milliseconds between steps
android-api-level       no        33                Android emulator API level (33 = Android 13)
android-profile         no        pixel_6           Android AVD hardware profile
android-target          no        default           Emulator target: default or google_apis
ios-device-type         no        simulator         iOS device type: simulator or real
ios-simulator-name      no        iPhone 16         iOS simulator model to boot (e.g. iPhone 15, iPad Air)
ios-simulator-os        no        latest            iOS version filter for simulator selection (e.g. 18.4)
mcp-debug               no        false             Enable MCP debug logging (MCP_DEBUG=1). Useful for diagnosing CI timeouts.
cloud-provider          no        local             Cloud provider: lambdatest. Leave empty for local.
lambdatest-username     no**      -                 LambdaTest account username
lambdatest-access-key   no**      -                 LambdaTest access key
lambdatest-device-name  no**      -                 Cloud device name (e.g. Pixel 7)
lambdatest-os-version   no**      -                 Cloud OS version (e.g. 13, 16)
lambdatest-app          no        -                 LambdaTest app ID (lt://APP...)
report                  no        true              Upload HTML report as workflow artifact
report-name             no        appclaw-report    Name of the uploaded artifact
appclaw-version         no        latest            npm package version to pin

* Provide either flow or goal, not both.

** Required when cloud-provider: lambdatest.

Secrets Setup

Go to your repo → Settings → Secrets and variables → Actions → New repository secret:

Secret name Description
LLM_API_KEY Your API key — works for any provider (Gemini, Anthropic, OpenAI, Groq)
LT_USERNAME LambdaTest username (only if using cloud devices)
LT_ACCESS_KEY LambdaTest access key (only if using cloud devices)
LT_APP_ID LambdaTest app ID (only if using cloud devices)

Examples

Parallel matrix — run multiple flows concurrently

matrix-parallel.yml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        flow:
          - flows/login.yaml
          - flows/search.yaml
          - flows/checkout.yaml
    steps:
      - uses: actions/checkout@v4

      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: ${{ matrix.flow }}
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}
          report-name: report-${{ strategy.job-index }}

LambdaTest cloud devices

Run iOS tests on Ubuntu — no macOS runner needed.

lambdatest.yml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/ios-login.yaml
          platform: ios
          api-key: ${{ secrets.LLM_API_KEY }}
          cloud-provider: lambdatest
          lambdatest-username: ${{ secrets.LT_USERNAME }}
          lambdatest-access-key: ${{ secrets.LT_ACCESS_KEY }}
          lambdatest-device-name: 'iPhone 14'
          lambdatest-os-version: '16'
          lambdatest-app: ${{ secrets.LT_APP_ID }}

Vision mode (screenshot-based AI)

vision-mode.yml
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    flow: flows/onboarding.yaml
    platform: android
    agent-mode: vision
    api-key: ${{ secrets.LLM_API_KEY }}

Pin model for cost control

pin-model.yml
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    flow: flows/smoke.yaml
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}
    model: 'gemini-2.0-flash'  # cheaper/faster than pro

Nightly regression on a schedule

nightly.yml
on:
  schedule:
    - cron: '0 2 * * *'  # 2 AM UTC every night

jobs:
  nightly:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/full-regression.yaml
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}
          report-name: nightly-report-${{ github.run_id }}

Reports

When report: true (default), an HTML report is uploaded as a workflow artifact after each run. Download it from the Actions run summary → Artifacts. The report includes:

  • Step-by-step screenshots with tap overlays
  • Pass/fail status per step
  • Execution timeline
  • Screen recording (if video: true is set in your flow)

Use report path in a downstream step

report-path.yml
- uses: AppiumTestDistribution/AppClaw@v1
  id: appclaw
  with:
    flow: flows/login.yaml
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}

- name: Print report location
  run: echo "Report at ${{ steps.appclaw.outputs.report-path }}"

Runner Requirements

Platform  Runner         Notes
android   ubuntu-latest  Free tier. KVM-enabled. Emulator boots in ~4-6 min.
ios       macos-14       Apple Silicon. macOS minutes cost ~10x Linux.

iOS tip: For faster iOS CI, use LambdaTest cloud devices on ubuntu-latest instead of a macOS runner.

App Guides

What are App Guides?

App Guides (AppGuides) are per-app knowledge snippets injected directly into the agent's context window at the start of every automation run. They encode navigation patterns, gesture shortcuts, and common action paths for a specific app — so the agent never needs to rediscover them by trial and error.

Context Engineering for Mobile

AppGuides are AppClaw's implementation of context engineering — the practice of giving the LLM exactly the right knowledge to act correctly, rather than relying on the model's general training alone. For mobile automation, this means app-specific navigation knowledge baked into the system prompt before the agent ever looks at the screen.

How it works

When AppClaw starts a run against a known app, it automatically loads the matching guide and prepends it to the agent's system prompt with an APP_GUIDE (AppName): prefix. The LLM sees this contextual knowledge before it takes any action — making the first step decisive rather than exploratory.

System prompt injection (simplified)
APP_GUIDE (WhatsApp):

## WhatsApp Navigation
- Bottom tabs: Chats | Updates | Communities | Calls
- New chat: floating pencil/message icon (bottom-right)
- Search: magnifying-glass icon at the top of Chats

## Messaging
- Open a chat → type in the message bar at the bottom → send via arrow icon
- Attach media: paperclip icon next to message bar
- Voice note: long-press the microphone icon
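
The injection step can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not AppClaw's actual implementation: the function name `injectAppGuide` and the blank-line separators are hypothetical, and only the `APP_GUIDE (AppName):` prefix comes from the description above.

```typescript
// Hypothetical helper: prepend an app guide to a base system prompt.
// Only the "APP_GUIDE (AppName):" prefix is taken from the docs; the
// function name and the blank-line separators are assumptions.
function injectAppGuide(
  basePrompt: string,
  appName: string,
  guide: string | null,
): string {
  // No guide available: the agent runs with the unmodified prompt.
  if (guide === null || guide.trim() === "") {
    return basePrompt;
  }
  const header = `APP_GUIDE (${appName}):`;
  // The guide goes first so the model reads it before anything else.
  return `${header}\n\n${guide.trim()}\n\n${basePrompt}`;
}

const systemPrompt = injectAppGuide(
  "You are a mobile automation agent.",
  "WhatsApp",
  "## WhatsApp Navigation\n- Bottom tabs: Chats | Updates | Communities | Calls",
);
console.log(systemPrompt);
```

Returning the base prompt unchanged when no guide exists mirrors the "No guide" case, where the agent relies only on what it sees on screen.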

Resolution order

  1. Custom guide: .appclaw/guides/<appId>.md (highest priority, overrides built-ins)
  2. Built-in guide: bundled guides for common apps (10 app IDs across Android and iOS)
  3. No guide: the agent explores the app from scratch using only what it sees on screen
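
The three-tier lookup above can be sketched in TypeScript. This is an illustration only: the names `resolveGuide`, `ResolvedGuide`, and `GuideSource` are hypothetical, and the two plain objects stand in for files on disk and the bundled guides so the example stays self-contained.

```typescript
// Sketch of the resolution order. All names here are hypothetical;
// the plain objects stand in for the file system and the bundled guides.
type GuideSource = "custom" | "built-in" | "none";

interface ResolvedGuide {
  source: GuideSource;
  content: string | null;
}

function has(table: Record<string, string>, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

function resolveGuide(
  appId: string,
  customFiles: Record<string, string>, // keyed by relative file path
  builtInGuides: Record<string, string>, // keyed by app ID
): ResolvedGuide {
  // 1. A custom guide wins and replaces any built-in entirely.
  const customPath = `.appclaw/guides/${appId}.md`;
  if (has(customFiles, customPath)) {
    return { source: "custom", content: customFiles[customPath] };
  }
  // 2. Otherwise fall back to a bundled guide for this app ID.
  if (has(builtInGuides, appId)) {
    return { source: "built-in", content: builtInGuides[appId] };
  }
  // 3. No guide: the agent explores from what it sees on screen.
  return { source: "none", content: null };
}

const customFiles: Record<string, string> = {
  ".appclaw/guides/com.whatsapp.md": "## My WhatsApp notes",
};
const builtInGuides: Record<string, string> = {
  "com.whatsapp": "## WhatsApp Navigation",
  "com.android.chrome": "## Chrome Navigation",
};

console.log(resolveGuide("com.whatsapp", customFiles, builtInGuides).source); // custom
console.log(resolveGuide("com.android.chrome", customFiles, builtInGuides).source); // built-in
console.log(resolveGuide("com.myapp.android", customFiles, builtInGuides).source); // none
```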

Built-in Guides

AppClaw ships with guides for the most commonly automated apps on both Android and iOS. These activate automatically when AppClaw detects the matching package name or bundle ID.

App | Platform | App ID / Bundle ID
Gmail | Android | com.google.android.gm
Gmail | iOS | com.google.gmail
YouTube | Android | com.google.android.youtube
YouTube | iOS | com.google.ios.youtube
WhatsApp | Android | com.whatsapp
WhatsApp | iOS | net.whatsapp.WhatsApp
Chrome | Android | com.android.chrome
Chrome | iOS | com.google.chrome
Settings | Android | com.android.settings
Settings | iOS | com.apple.Preferences
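
As a rough sketch, matching a detected app ID against this table could look like the TypeScript below. The IDs are copied from the table; the names `BUILT_IN_GUIDES` and `findBuiltInGuide`, and the choice of an exact, case-sensitive match, are assumptions rather than AppClaw's documented behavior.

```typescript
type Platform = "android" | "ios";

// IDs copied from the table above. The lookup structure and the
// function name are hypothetical, for illustration only.
const BUILT_IN_GUIDES: Record<Platform, Record<string, string>> = {
  android: {
    "com.google.android.gm": "Gmail",
    "com.google.android.youtube": "YouTube",
    "com.whatsapp": "WhatsApp",
    "com.android.chrome": "Chrome",
    "com.android.settings": "Settings",
  },
  ios: {
    "com.google.gmail": "Gmail",
    "com.google.ios.youtube": "YouTube",
    "net.whatsapp.WhatsApp": "WhatsApp",
    "com.google.chrome": "Chrome",
    "com.apple.Preferences": "Settings",
  },
};

// Returns the app name whose built-in guide applies, or null if none does.
function findBuiltInGuide(platform: Platform, appId: string): string | null {
  const guidesForPlatform = BUILT_IN_GUIDES[platform];
  // Assumption: exact, case-sensitive match on the package name or bundle ID.
  return Object.prototype.hasOwnProperty.call(guidesForPlatform, appId)
    ? guidesForPlatform[appId]
    : null;
}

console.log(findBuiltInGuide("android", "com.whatsapp")); // WhatsApp
console.log(findBuiltInGuide("ios", "com.whatsapp")); // null
```

Keying by platform first keeps an Android package name from accidentally matching an iOS bundle ID.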

Example: WhatsApp Guide

APP_GUIDE (WhatsApp)
## WhatsApp Navigation
- Bottom tabs: Chats | Updates | Communities | Calls
- New chat: floating pencil/message icon (bottom-right)
- Search: magnifying-glass icon at the top of Chats

## Messaging
- Open a chat → type in the message bar at the bottom → send via arrow icon
- Attach media: paperclip icon next to message bar
- Voice note: long-press the microphone icon
- Emoji/stickers: smiley face icon on the left of message bar

## Common Actions
- Star a message: long-press message → star icon
- Forward: long-press message → forward arrow
- Delete: long-press message → trash icon
- Group info: tap the group name at the top of the chat

Example: YouTube Guide

APP_GUIDE (YouTube)
## YouTube Navigation
- Bottom nav: Home | Shorts | + (upload) | Subscriptions | Library
- Search: magnifying-glass icon (top-right)
- Tap a video thumbnail to play; double-tap left/right to seek ±10 s

## Searching
- Tap the search icon → type query → press Enter or tap search icon again
- Filter results: tap "Filters" after searching

## Playback
- Full screen: rotate device or tap the expand icon (bottom-right of player)
- Quality: tap ⋮ inside player → Quality
- Captions: tap CC icon inside player

Custom Guides

Add a guide for any app — or override a built-in — by dropping a Markdown file at .appclaw/guides/<appId>.md in your project directory. Custom guides always take priority over built-ins.

Custom guides always win

If a custom guide exists for an app ID, it replaces the built-in entirely. To extend a built-in guide, copy its contents into your custom file and add your own sections.

Creating a custom guide

  1. Find your app's package name (Android) or bundle ID (iOS). You can get this from the appId field in your YAML flow, or by inspecting the device.
  2. Create the directory .appclaw/guides/ in your project root.
  3. Write a Markdown file named <appId>.md.
.appclaw/guides/com.myapp.android.md
## Main Navigation
- Bottom tabs: Home | Search | Orders | Profile
- Hamburger menu (top-left) → categories and account settings

## Checkout Flow
- Cart icon is always in the top-right corner
- Tap "Proceed to Checkout" → select address → choose payment → Place Order
- Apply coupon: tap "Have a coupon?" on the order summary screen

## Product Search
- Tap the search bar at the top; supports filters: Brand | Price | Rating
- Long-press any product thumbnail to preview without navigating away

Once in place, AppClaw picks it up automatically — no code changes or restarts needed.

Tips for writing good guides

Do | Why
Describe where things are, not what they say | UI labels change; positions are stable
List gestures explicitly ("swipe right to archive") | The agent can't infer non-obvious gestures from a screenshot
Use bullet points over prose | Every token counts; bullets are faster for the model to parse
Document multi-step paths (Settings → Account → Privacy) | Saves the agent multiple round-trips for deeply nested flows
Keep it short (under 500 tokens) | Guides are injected on every step; brevity reduces cost
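
To get a feel for the 500-token guideline, a guide's size can be estimated before committing it. The TypeScript sketch below uses the common rule of thumb of roughly four characters per token for English text; that ratio is an approximation, not the tokenizer of any specific model, and all names are hypothetical.

```typescript
// Rough heuristic: ~4 characters per token for English text. This is an
// approximation, not the tokenizer of any particular model.
const CHARS_PER_TOKEN = 4;
const GUIDE_TOKEN_BUDGET = 500; // from the "under 500 tokens" tip

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Checks a guide against the budget ("under" means strictly less than).
function checkGuideBudget(
  guide: string,
  budget: number = GUIDE_TOKEN_BUDGET,
): { tokens: number; withinBudget: boolean } {
  const tokens = estimateTokens(guide);
  return { tokens, withinBudget: tokens < budget };
}

// Because the guide is injected on every step, its cost scales with run length.
function guideTokensPerRun(guide: string, steps: number): number {
  return estimateTokens(guide) * steps;
}

const sampleGuide =
  "## Main Navigation\n- Bottom tabs: Home | Search | Orders | Profile";
const budgetCheck = checkGuideBudget(sampleGuide);
console.log(budgetCheck.tokens, budgetCheck.withinBudget);
console.log(guideTokensPerRun(sampleGuide, 20));
```

The per-run figure is the more useful one: a guide that looks small on its own is multiplied by every step the agent takes.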