Quick Start
Install AppClaw globally and start automating in seconds.
# Install
npm install -g appclaw

# Run with a natural language goal
appclaw "Open Settings and turn on Wi-Fi"

# Run a YAML flow
appclaw --flow my-test.yaml

# Interactive playground
appclaw --playground
CLI Options
All the flags you can pass to appclaw.
Platform & Device
| Flag | Description |
|---|---|
| --platform <os> | Target platform: android or ios |
| --device-type <type> | iOS only: simulator or real |
| --device <name> | Device name (partial match, e.g. "iPhone 17 Pro") |
| --udid <udid> | Device UDID (skips the device picker) |
Execution
| Flag | Description |
|---|---|
| --flow <file> | Run a declarative YAML flow file |
| --env <name> | Environment for variable/secret resolution |
| --playground | Launch the interactive REPL for building flows |
| --record | Record a goal execution for later replay |
| --replay <file> | Replay a previously recorded session |
| --plan | Decompose a complex goal into sub-goals |
| --json | JSON output mode (for IDE extensions) |
Explorer (Test Generation)
| Flag | Description |
|---|---|
| --explore <prd> | Generate test flows from a PRD document |
| --num-flows <N> | Number of flows to generate (default: 5) |
| --no-crawl | Skip device crawling, use PRD only |
| --output-dir <dir> | Output directory (default: generated-flows) |
| --max-screens <N> | Max screens to crawl (default: 10) |
| --max-depth <N> | Max navigation depth (default: 3) |
Execution Modes
AppClaw has three distinct ways to automate mobile apps.
Agent Mode
Give AppClaw a goal in plain English. The AI agent takes a screenshot, reasons about what it sees, and decides what to tap, type, or swipe — step by step until the goal is complete.
appclaw "Search for 'Appium 3.0' on YouTube and find the TestMu AI video"
YAML Flows
Define repeatable, version-controlled test flows in YAML. Each step is a natural language instruction — no element selectors, no brittle locators.
appclaw --flow tests/youtube-search.yaml --env dev
Playground
An interactive REPL where you type one instruction at a time and see it execute immediately. Great for exploring an app and building flows interactively.
appclaw --playground --platform ios --device-type simulator
Designing YAML Flows
YAML flows are the heart of AppClaw's repeatable automation. Write your test steps in plain English — AppClaw figures out how to execute them on the device. No XPath, no accessibility IDs, no brittle selectors.
Each step is a natural language instruction such as "tap Login" or "wait for the home screen to be visible". AppClaw uses AI to find the right elements on screen.
Flat Format
The simplest YAML structure — a metadata header separated by
--- from a flat list of steps.
name: Turn on Wi-Fi
platform: android
---
- open Settings app
- tap Connections
- wait 1s
- tap Wi-Fi
- verify Wi-Fi is visible
- done
Metadata Fields
| Field | Description |
|---|---|
| name | Display name for the flow |
| description | Optional description of what the flow does |
| platform | android or ios — fallback if no --platform CLI flag |
| appId | App bundle/package ID for launchApp steps |
| env | Environment name — resolves variables from .appclaw/env/<name>.yaml |
Phased Format
For structured tests, organize your steps into three phases: setup, steps, and assertions. This gives clearer reporting and separates initialization from the actual test logic.
name: YouTube Search
description: Searches YouTube and verifies video results
platform: android
env: dev
---
setup:
  - open ${variables.app_name} app
  - wait until search icon is visible
steps:
  - click on search icon
  - type '${secrets.search_query}'
  - wait 3s
  - click on the first result from the list
  - wait for the search results to be visible
  - scroll down
assertions:
  - verify ${variables.expected_channel} is visible
Phases Explained
| Phase | Purpose |
|---|---|
| setup | Initialization — launch the app, navigate to starting screen, dismiss popups. Failures here skip the test. |
| steps | The main test actions — the interactions you're actually testing. |
| assertions | Verification checks — confirm the expected outcome. You can also mix in actions here if needed. |
Variables & Secrets
Keep your flows flexible and secure with variable interpolation.
Variables ${variables.X}
Loaded from environment files. Values appear in logs.
Secrets ${secrets.X}
Resolved from shell environment variables at runtime. Always shown as
*** in logs.
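The masking rule can be sketched in a few lines of TypeScript. This is an illustration only, not AppClaw's actual resolver: the `interpolate` helper, its return shape, and the decision to throw on an unknown key are all assumptions made for the example. Secrets are passed in as a plain object here; per the text above they would come from shell environment variables.

```typescript
// Illustrative sketch only; AppClaw's real resolver is not shown in this document.
type Scope = Record<string, string>;

interface Interpolated {
  value: string;    // text actually sent to the device
  logValue: string; // same text with secret values masked as ***
}

const PLACEHOLDER = /\$\{(variables|secrets)\.([A-Za-z0-9_]+)\}/g;

function interpolate(step: string, variables: Scope, secrets: Scope): Interpolated {
  const render = (maskSecrets: boolean): string =>
    step.replace(PLACEHOLDER, (_match: string, scope: string, key: string): string => {
      const source = scope === "variables" ? variables : secrets;
      if (!Object.prototype.hasOwnProperty.call(source, key)) {
        // Assumption: an unresolved placeholder is treated as an error.
        throw new Error(`Unresolved placeholder: ${scope}.${key}`);
      }
      const resolved = String(source[key]);
      return scope === "secrets" && maskSecrets ? "***" : resolved;
    });
  return { value: render(false), logValue: render(true) };
}
```

Called with a step that mixes both placeholder kinds, `value` carries the real secret while `logValue` shows `***` in its place, and variable values appear unmasked in both.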
Environment File
Create .appclaw/env/<name>.yaml in your project root:
variables:
  app_name: youtube
  expected_channel: TestMu AI
  timeout: 30
  locale: en-US
Then reference it in your YAML header with env: dev, or pass
--env dev on the CLI.
Inline Variables
For self-contained flows, embed variables directly in the YAML header:
name: Self-contained flow
env:
  variables:
    app_name: youtube
    search_term: appium 3.0
---
- open ${variables.app_name} app
- type ${variables.search_term}
Precedence: the --env CLI flag wins over the YAML env: field, which wins over inline env: blocks. Secrets always come from shell environment variables.
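The precedence order reads naturally as a short decision function. The sketch below is hypothetical: the `EnvSources` and `Resolution` types and the `resolveEnv` name are invented for illustration, and only the ordering and the `.appclaw/env/<name>.yaml` path come from the text above.

```typescript
// Hypothetical types for illustration; not part of the AppClaw API.
interface EnvSources {
  cliEnv?: string;                          // value of --env, if passed
  headerEnv?: string;                       // `env: dev` in the YAML header
  inlineVariables?: Record<string, string>; // inline `env: variables:` block
}

type Resolution =
  | { kind: "file"; path: string }
  | { kind: "inline"; variables: Record<string, string> }
  | { kind: "none" };

function resolveEnv(sources: EnvSources): Resolution {
  // CLI flag first, then the named environment in the YAML header.
  const name = sources.cliEnv ?? sources.headerEnv;
  if (name !== undefined) {
    return { kind: "file", path: `.appclaw/env/${name}.yaml` };
  }
  // Inline variables are used only when no named environment applies.
  if (sources.inlineVariables !== undefined) {
    return { kind: "inline", variables: sources.inlineVariables };
  }
  return { kind: "none" };
}
```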
Tap / Click
Tap on an element by describing its label. AppClaw matches it against visible text and elements on screen.
- tap Login
- click on the search icon
- press Submit
- select the first item
- choose English
- pick the blue option
- navigate to Settings
- toggle Dark Mode
- enable Notifications
- close the popup
- dismiss the dialog
All of these are equivalent — they find the element and tap it. Use whichever reads most naturally.
- tap: "Login Button"
Type Text
Type text into the currently focused field, or specify a target field.
# Type into focused field
- type "hello world"
- enter text "user@example.com"

# Type into a specific field
- type "john@example.com" in email field
- enter "password123" into password field

# Search (types the text)
- search for "Appium 3.0"
- look for "restaurants nearby"
- type: "hello world"
Wait / Pause
Pause execution for a fixed duration.
- wait 3s
- wait 1.5 seconds
- sleep 500ms
- pause 2 sec
- wait            # defaults to 2 seconds
- wait a moment   # defaults to 2 seconds
- wait: 3 # seconds
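To make the accepted duration forms concrete, here is a minimal parsing sketch. It is not AppClaw's parser: the `parseWaitMs` name and the exact regular expressions are assumptions, chosen only to cover the examples listed above. Returning `null` for anything else (including "wait until ..." steps) is also a choice made for this sketch.

```typescript
// Sketch of a fixed-duration wait parser; the real grammar may accept more forms.
const DEFAULT_WAIT_MS = 2000; // bare "wait" and "wait a moment" default to 2 seconds

function parseWaitMs(step: string): number | null {
  const verb = /^(?:wait|sleep|pause)\b\s*(.*)$/i.exec(step.trim());
  if (verb === null) return null; // not a wait step at all
  const rest = (verb[1] ?? "").trim().toLowerCase();
  if (rest === "" || rest === "a moment") return DEFAULT_WAIT_MS;

  const duration = /^(\d+(?:\.\d+)?)\s*(ms|milliseconds?|s|sec|secs|seconds?)$/.exec(rest);
  if (duration === null) return null; // e.g. "wait until ..." is a different step kind

  const amount = Number(duration[1]);
  const unit = duration[2] ?? "s";
  return unit.startsWith("m") ? amount : amount * 1000;
}
```

Under these assumptions, `parseWaitMs("wait 1.5 seconds")` yields 1500 and `parseWaitMs("sleep 500ms")` yields 500.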
Wait Until
Wait dynamically until a condition is met. Polls the screen every 500ms up to a timeout (default 10s). Uses AI vision to understand the screen — you can describe what you expect to see in plain English.
- wait until search icon is visible
- wait for the search results to be visible
- wait for the home screen to be loaded
- wait until "Welcome back" appears
- wait 15s until login button is visible   # custom timeout

- wait until loading spinner is gone
- wait for the popup to be hidden
- wait until progress bar disappeared

- wait until screen is loaded
- wait until screen is stable
- wait 5s until screen is ready

# With custom timeout
- waitUntil: "Login button"
  timeout: 15

# Wait for element to disappear
- waitUntilGone: "Loading spinner"
  timeout: 20

# Screen loaded (DOM stability check)
- waitUntil: "screen loaded"
When you write something descriptive like
wait for the search results to be visible, AppClaw uses AI vision to
understand the screen holistically — it checks whether results are actually
shown, not just whether the literal words "search results" appear. You can describe
what you expect to see naturally.
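The polling behaviour described in this section (check every 500ms, give up after a 10s default timeout) can be sketched as a generic loop. This is an assumed shape, not AppClaw's implementation: `waitUntil`, `PollOptions`, and the injectable `sleep` are names invented for the example, and `check` stands in for whatever screen inspection (DOM or AI vision) the tool performs.

```typescript
// Generic poll-until-true loop; `check` is a stand-in for the real screen inspection.
type ScreenCheck = () => Promise<boolean>;

interface PollOptions {
  intervalMs?: number;                   // 500 per the text above
  timeoutMs?: number;                    // 10000 per the text above
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not really wait
}

async function waitUntil(check: ScreenCheck, options: PollOptions = {}): Promise<boolean> {
  const intervalMs = options.intervalMs ?? 500;
  const timeoutMs = options.timeoutMs ?? 10000;
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let waitedMs = 0; // counts sleep time only, not time spent inside check()
  while (true) {
    if (await check()) return true;
    if (waitedMs >= timeoutMs) return false;
    await sleep(intervalMs);
    waitedMs += intervalMs;
  }
}
```

A "wait until gone" variant would pass a `check` that returns true once the element is no longer found.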
Scroll / Swipe
Scroll or swipe in any direction, optionally repeating multiple times or scrolling until an element is found.
- scroll down
- scroll up 3 times
- swipe left
- swipe right 2 times

# Scroll until an element appears
- scroll down until "Terms & Conditions" is visible
- scroll down 5 times to find "Accept"
- scroll down to see "Load More"

- scrollAssert: "Terms & Conditions"
  direction: down
  maxScrolls: 5
Drag / Slider
Drag one element to another — sliders, carousels, reorderable lists, and any
drag-and-drop interaction. Requires vision mode (AGENT_MODE=vision).
- drag the green circle slider to the +100 mark
- slide the price handle to +80
- move the volume knob to maximum

# "drag: from to to"
- drag: "green circle slider to +100 mark"

- drag:
    from: green circle slider
    to: +100 mark
Drag uses AI vision to locate both the source and target by visual description. Set
AGENT_MODE=vision and VISION_LOCATE_PROVIDER=stark with a
valid LLM_API_KEY.
Assert / Verify
Verify that something is visible on screen. Works with both literal text and visual/semantic descriptions via AI vision.
- verify "Welcome back" is visible
- assert Dashboard is visible
- check that the login button is on the screen
- verify TestMu AI is visible

- assert: "Welcome back"
- verify: "Dashboard"      # alias for assert
- check: "Login button"    # alias for assert
Full Command Reference
Every supported step kind at a glance.
| Kind | Parameters | Description |
|---|---|---|
| openApp | query | Open an app by name |
| launchApp | — | Launch app defined in appId metadata |
| tap | label | Tap element by visible text/label |
| type | text, target? | Type text, optionally into a named field |
| enter | — | Press Enter / Return key |
| back | — | Press the Back button |
| home | — | Press the Home button |
| wait | seconds | Pause for a fixed duration |
| waitUntil | condition, text?, timeout | Poll until visible/gone/screenLoaded |
| swipe | direction, repeat? | Swipe up/down/left/right |
| drag | from, to | Drag from one element to another (vision mode) |
| assert | text | Verify text or description is visible |
| scrollAssert | text, direction, maxScrolls | Scroll until text found |
| getInfo | query | Ask the AI a question about the screen |
| done | message? | Signal flow completion |
Vision Modes
Control how AppClaw locates elements on screen.
Agent Mode AGENT_MODE
| Value | Behavior |
|---|---|
| dom | Default. Uses the app's DOM/accessibility tree to find elements. |
| vision | Uses AI vision (screenshots + LLM) as the primary strategy for all interactions. |
Vision Mode VISION_MODE
| Value | Behavior |
|---|---|
| fallback | Default. Try DOM first, fall back to vision if no match found. |
| always | Skip DOM entirely, use vision for every interaction. |
| never | DOM only. No vision fallback. |
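The three VISION_MODE values amount to a small strategy switch. The sketch below models only that table: the `locate` function and the `Locator` type are hypothetical, and how AGENT_MODE interacts with VISION_MODE is not specified here, so it is left out.

```typescript
// Models the VISION_MODE table only; the locator functions are stand-ins.
type VisionMode = "fallback" | "always" | "never";
type Locator = (label: string) => string | null; // returns an element id, or null if not found

function locate(label: string, mode: VisionMode, dom: Locator, vision: Locator): string | null {
  if (mode === "always") return vision(label); // skip DOM entirely
  const viaDom = dom(label);
  if (viaDom !== null || mode === "never") return viaDom; // DOM hit, or DOM-only mode
  return vision(label); // "fallback": DOM missed, so try vision
}
```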
Environment Variables
All environment variables recognized by AppClaw. These are especially useful for CI/CD pipelines.
LLM Configuration
| Variable | Description |
|---|---|
| LLM_PROVIDER | LLM provider: anthropic, openai, gemini, groq, ollama |
| LLM_API_KEY | API key for the chosen provider |
| LLM_MODEL | Specific model name to use |
| LLM_THINKING | Extended thinking: on or off (default: on) |
| LLM_THINKING_BUDGET | Max thinking tokens: 1–10000 (default: 128) |
| LLM_SCREENSHOT_MAX_EDGE_PX | Downscale screenshots to this max edge (0 = disabled) |
Device & Platform
| Variable | Description |
|---|---|
| PLATFORM | Same as --platform flag |
| DEVICE_TYPE | Same as --device-type flag |
| DEVICE_UDID | Same as --udid flag |
| DEVICE_NAME | Same as --device flag |
Vision
| Variable | Description |
|---|---|
| VISION_MODE | always, fallback, or never |
| AGENT_MODE | dom or vision |
| GEMINI_API_KEY | Gemini API key for Stark vision — only needed when LLM_PROVIDER is not gemini and AGENT_MODE=vision. If provider is already Gemini, LLM_API_KEY is reused automatically. |
Execution Tuning
| Variable | Description |
|---|---|
| MAX_STEPS | Max steps per goal (default: 30) |
| STEP_DELAY | Delay between steps in ms (default: 500) |
| MAX_ELEMENTS | Max DOM elements to parse (default: 40) |
| MAX_HISTORY_STEPS | Max action history retained (default: 10) |
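For readers wiring these tuning variables into their own scripts, a small loader that applies the documented defaults might look like the following. The `readIntEnv` and `readTuning` helpers are hypothetical, and rejecting non-integer values is an assumption; this document does not say how AppClaw itself treats invalid input.

```typescript
// Hypothetical loader; only the variable names and defaults come from the table above.
type Env = Record<string, string | undefined>;

function readIntEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    // Assumption: fail fast rather than silently falling back.
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

function readTuning(env: Env) {
  return {
    maxSteps: readIntEnv(env, "MAX_STEPS", 30),
    stepDelayMs: readIntEnv(env, "STEP_DELAY", 500),
    maxElements: readIntEnv(env, "MAX_ELEMENTS", 40),
    maxHistorySteps: readIntEnv(env, "MAX_HISTORY_STEPS", 10),
  };
}
```

In a Node.js script this would typically be called as `readTuning(process.env)`.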
MCP Connection
| Variable | Description |
|---|---|
| MCP_TRANSPORT | stdio or sse (default: stdio) |
| MCP_HOST | MCP server host (default: localhost) |
| MCP_PORT | MCP server port (default: 8080) |
LambdaTest Cloud
Run AppClaw tests on real iOS and Android devices in the cloud — no local device or emulator required. AppClaw integrates with LambdaTest's real device cloud via its Appium-compatible hub.
Setup
Add your LambdaTest credentials and target device to .env:
# Enable LambdaTest cloud
CLOUD_PROVIDER=lambdatest

# LambdaTest credentials (from app.lambdatest.com → Profile → Access Key)
LAMBDATEST_USERNAME=your_username
LAMBDATEST_ACCESS_KEY=your_access_key

# Target device
LAMBDATEST_DEVICE_NAME=iPhone 14
LAMBDATEST_OS_VERSION=16
PLATFORM=ios

# Your app (upload via LambdaTest portal, copy the lt:// ID)
LAMBDATEST_APP=lt://APP10xxxxxxxxxxxxxxxx

# LLM for AI-powered automation
LLM_PROVIDER=gemini
LLM_API_KEY=your_gemini_api_key
CLI Usage
Once .env is configured, run AppClaw exactly as you would locally:
# Run a natural-language goal on a cloud device
appclaw "Open the app and navigate to the checkout screen"

# Run a YAML flow on a cloud device
appclaw --flow flows/checkout.yaml
SDK Usage
No SDK changes needed — when CLOUD_PROVIDER=lambdatest is set in
.env, the SDK automatically routes the session through LambdaTest:
import { AppClaw } from 'appclaw';

// CLOUD_PROVIDER=lambdatest is read from .env automatically
const app = new AppClaw({
  provider: 'gemini',
  apiKey: process.env.LLM_API_KEY,
  reportName: 'Checkout — LambdaTest',
});

await app.run('open the app');
await app.run('tap Add to Cart');
await app.run('tap Checkout');
await app.teardown(); // report saved to .appclaw/runs/
Environment Variables
| Variable | Required | Description |
|---|---|---|
| CLOUD_PROVIDER | Yes | Set to lambdatest to enable cloud execution |
| LAMBDATEST_USERNAME | Yes | Your LambdaTest account username |
| LAMBDATEST_ACCESS_KEY | Yes | Your LambdaTest access key (from Profile → Access Key) |
| LAMBDATEST_DEVICE_NAME | Yes | Cloud device to use, e.g. iPhone 14, Galaxy S23 |
| LAMBDATEST_OS_VERSION | Yes | OS version, e.g. 16 (iOS) or 13 (Android) |
| LAMBDATEST_APP | No | App ID from the LambdaTest portal (format: lt://APP…) |
| LAMBDATEST_BUILD_NAME | No | Build label shown in the LambdaTest dashboard |
| LAMBDATEST_PROJECT_NAME | No | Project label shown in the LambdaTest dashboard |
| LAMBDATEST_VIDEO | No | Record session video. Default: true |
| LAMBDATEST_NETWORK | No | Capture network logs. Default: false |
Switching between local and cloud execution is purely config — set
CLOUD_PROVIDER=lambdatest in your CI environment and remove it for local
runs. Your YAML flows and SDK tests stay identical.
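Because the switch is driven entirely by environment variables, a CI script can check up front that the variables marked as required in the table above are present. The helper below is a sketch written for this purpose; `missingCloudVars` is a made-up name and AppClaw may well perform its own validation.

```typescript
// Pre-flight check for a CI script; not part of AppClaw.
const REQUIRED_LAMBDATEST_VARS = [
  "LAMBDATEST_USERNAME",
  "LAMBDATEST_ACCESS_KEY",
  "LAMBDATEST_DEVICE_NAME",
  "LAMBDATEST_OS_VERSION",
] as const;

function missingCloudVars(env: Record<string, string | undefined>): string[] {
  // Local runs (CLOUD_PROVIDER unset) need none of these.
  if (env["CLOUD_PROVIDER"] !== "lambdatest") return [];
  return REQUIRED_LAMBDATEST_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

A script could call `missingCloudVars(process.env)` and exit early with a readable message when the returned list is non-empty.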
Node.js / TypeScript SDK
AppClaw ships a first-class programmatic API so you can drive mobile automation directly from Node.js or TypeScript — no CLI required. The SDK is the natural fit for QA automation inside test runners (Vitest, Jest, Mocha), CI pipelines, and any script that needs to control a device programmatically.
Use the CLI for one-off tasks and interactive exploration. Use the SDK when you want to run flows inside a test suite, assert on results, share a device connection across multiple flows, or integrate AppClaw into a larger automation pipeline.
Architecture
The SDK exposes a single AppClaw class that manages the full lifecycle:
- Lazy MCP connect — the Appium connection is opened on the first runFlow() or runGoal() call, not on construction.
- Connection reuse — subsequent calls share the same underlying connection, so you pay the startup cost once per test suite.
- Explicit teardown — call teardown() in your afterAll hook to close the connection cleanly.
- Silent by default — spinners and terminal colours are suppressed automatically, keeping CI logs clean.
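The lazy-connect, reuse, and teardown lifecycle in the list above is a common pattern, and one way to picture it is a memoized promise. The `LazyConnection` class here is a generic sketch under that assumption; it says nothing about how the AppClaw class is implemented internally.

```typescript
// Generic lazy-connection pattern; `open` and `close` are supplied by the caller.
class LazyConnection<T> {
  private pending: Promise<T> | null = null;
  private readonly open: () => Promise<T>;
  private readonly close: (connection: T) => Promise<void>;

  constructor(open: () => Promise<T>, close: (connection: T) => Promise<void>) {
    this.open = open;
    this.close = close;
  }

  // First call opens the connection; later calls reuse the same promise.
  get(): Promise<T> {
    if (this.pending === null) {
      this.pending = this.open();
    }
    return this.pending;
  }

  // Closes the connection if one was ever opened; safe to call more than once.
  async teardown(): Promise<void> {
    if (this.pending === null) return;
    const connection = await this.pending;
    this.pending = null;
    await this.close(connection);
  }
}
```

Storing the promise rather than the resolved value means two calls that race each other still trigger only one `open()`.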
Installation
AppClaw is a single package — the SDK is built in, nothing extra to install.
npm install appclaw
Create a .env file in your project root (or pass options directly to the
constructor — see Options Reference).
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-...
PLATFORM=android
runFlow()
Parse and execute a YAML flow file against a connected device. Returns a
FlowResult you can assert on.
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'anthropic',
  apiKey: process.env.ANTHROPIC_API_KEY,
  platform: 'android',
});

const result = await app.runFlow('./flows/checkout.yaml');

console.log(result.success);    // true
console.log(result.stepsUsed);  // 6
console.log(result.stepsTotal); // 6

await app.teardown();
FlowResult shape
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether all steps completed successfully |
| stepsUsed | number | Steps executed before completion or failure |
| stepsTotal | number | Total steps in the flow (including unexecuted) |
| failedStep | number? | 1-based index of the step that failed |
| failedPhase | string? | One of setup, test, or assertion |
| error | string? | Human-readable failure reason |
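As an example of consuming this shape, the sketch below turns a result into a one-line summary for logs. The `FlowResult` interface here is a local copy transcribed from the table so the snippet stands alone (the real type is exported from 'appclaw'), and `summarize` is a hypothetical helper.

```typescript
// Local mirror of the table above, so this snippet runs without the package installed.
interface FlowResult {
  success: boolean;
  stepsUsed: number;
  stepsTotal: number;
  failedStep?: number;  // 1-based
  failedPhase?: string; // setup, test, or assertion
  error?: string;
}

function summarize(result: FlowResult): string {
  if (result.success) {
    return `passed (${result.stepsUsed}/${result.stepsTotal} steps)`;
  }
  const where = result.failedStep !== undefined ? `step ${result.failedStep}` : "unknown step";
  const phase = result.failedPhase ?? "unknown";
  return `failed at ${where} in ${phase} phase: ${result.error ?? "no error message"}`;
}
```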
runGoal()
Execute a plain-English goal using the agent loop — same as passing a goal string to the
CLI. Returns an AgentResult.
import { AppClaw } from 'appclaw';

const app = new AppClaw({ provider: 'anthropic', apiKey: process.env.ANTHROPIC_API_KEY });

const result = await app.runGoal(
  'Log in with email qa@company.com and password Test1234'
);

console.log(result.success);   // true
console.log(result.stepsUsed); // 4
console.log(result.reason);    // "Logged in successfully"

await app.teardown();
Use runFlow() for repeatable QA scenarios — structured, deterministic, zero LLM cost. Use runGoal() for exploratory tasks or when you need the agent to adapt to dynamic screen states.
run()
Execute a single natural-language instruction directly on the device — the programmatic equivalent of typing a command in the playground REPL. Each call is one atomic action: parse the instruction, execute it, return the result.
import { AppClaw } from 'appclaw';

const app = new AppClaw({ provider: 'gemini', apiKey: process.env.GEMINI_API_KEY, platform: 'android' });

await app.run('open YouTube app');      // regex match — no LLM call
await app.run('tap Search');            // regex match — no LLM call
await app.run('type Appium 3.0');       // regex match — no LLM call
await app.run('tap the search button'); // LLM fallback → tap
await app.run('wait 2 seconds');        // regex match — no LLM call
await app.run('scroll down');           // regex match — no LLM call

await app.teardown(); // report written to .appclaw/runs/
How instructions are resolved
- Regex match — common patterns (open X, tap X, type X, wait N seconds, scroll down, …) are resolved instantly with zero LLM cost.
- LLM fallback — anything that doesn't match a regex is sent to the configured LLM, which classifies it into a structured action (tap, type, swipe, etc.).
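The two-stage resolution above can be illustrated with a pattern table that either produces a structured action or returns `null` to signal "hand this to the LLM". Everything in this sketch is assumed: the `Action` type, the `matchInstruction` name, and the regexes themselves. In particular, the tap pattern rejects labels that start with "the" purely to mirror the earlier example in which 'tap the search button' goes to the LLM; the real matching rule is not documented here.

```typescript
// Illustrative patterns only; AppClaw's actual regexes are not documented here.
type Action =
  | { kind: "openApp"; query: string }
  | { kind: "tap"; label: string }
  | { kind: "type"; text: string }
  | { kind: "wait"; seconds: number }
  | { kind: "swipe"; direction: string };

const PATTERNS: Array<[RegExp, (m: RegExpExecArray) => Action]> = [
  [/^open (.+?)(?: app)?$/i, (m) => ({ kind: "openApp", query: m[1] ?? "" })],
  [/^tap (?!the\b)(.+)$/i, (m) => ({ kind: "tap", label: m[1] ?? "" })],
  [/^type (.+)$/i, (m) => ({ kind: "type", text: m[1] ?? "" })],
  [/^wait (\d+(?:\.\d+)?) seconds?$/i, (m) => ({ kind: "wait", seconds: Number(m[1]) })],
  [/^(?:scroll|swipe) (up|down|left|right)$/i, (m) => ({ kind: "swipe", direction: (m[1] ?? "").toLowerCase() })],
];

// Returns a structured action, or null when the instruction needs the LLM fallback.
function matchInstruction(instruction: string): Action | null {
  const text = instruction.trim();
  for (const [pattern, build] of PATTERNS) {
    const match = pattern.exec(text);
    if (match !== null) return build(match);
  }
  return null;
}
```

A caller would try `matchInstruction` first and only invoke the (asynchronous, billable) LLM classifier when it returns `null`.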
RunResult shape
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the action completed successfully |
| action | string | Resolved step kind: tap, type, openApp, wait, swipe, … |
| message | string | Human-readable description of what happened |
Use run() when you want full control — one deterministic step at a time, easy to integrate with any test framework. Use runGoal() when you want the agent to figure out the steps itself. Use runFlow() for declarative YAML test cases you want to version-control.
Reports
Reports are enabled by default when using the SDK. After
teardown() is called, AppClaw writes an HTML report to
.appclaw/runs/ — one screenshot per step, plus a full execution summary. No
extra configuration needed.
const app = new AppClaw({
  provider: 'gemini',
  apiKey: process.env.GEMINI_API_KEY,
  platform: 'android',
  reportName: 'YouTube Search', // shown in the report viewer
});

await app.run('open YouTube app');
await app.run('tap Search');
await app.run('type Appium 3.0');
await app.run('tap the search button');

await app.teardown();
// ↑ writes report to .appclaw/runs/<runId>/
Screen recording
Pass video: true to record the screen for the entire run and embed the
video in the report. Recording starts automatically on the first run() call
and stops in teardown().
const app = new AppClaw({
  provider: 'gemini',
  apiKey: process.env.GEMINI_API_KEY,
  platform: 'android',
  reportName: 'YouTube Search',
  video: true, // record screen for the whole run
});

await app.run('open YouTube app');
await app.run('tap Search');
await app.run('type Appium 3.0');
await app.run('tap the search button');

await app.teardown();
// ↑ report includes recording.mp4 under the Recording tab
Each AppClaw instance records its own session independently — parallel
tests do not interfere. Port allocation (MJPEG, system port) is also handled
automatically per instance.
Viewing the report
Run the built-in report server after your tests complete:
npx appclaw --report
This starts a local server and opens the report in your browser. Every run is listed with its steps, screenshots, pass/fail status, and timing.
Report file layout
.appclaw/
  runs/
    runs.json              # global run index
    <runId>/
      manifest.json        # full run data (steps, timing, success)
      steps/
        step-000.png       # screenshot after step 1
        step-001.png       # screenshot after step 2
        step-002.png
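Note the off-by-one in the layout: screenshot files are numbered from zero, while steps are counted from one. A script that post-processes reports could map between the two as sketched below. The `stepScreenshotPath` helper is hypothetical, and the three-digit zero padding is inferred from the file names shown.

```typescript
// Maps a 1-based step number to the screenshot path shown in the layout above.
function stepScreenshotPath(runId: string, stepNumber: number): string {
  if (!Number.isInteger(stepNumber) || stepNumber < 1) {
    throw new RangeError(`stepNumber must be a positive integer, got ${stepNumber}`);
  }
  const index = String(stepNumber - 1).padStart(3, "0"); // step 1 -> "000"
  return `.appclaw/runs/${runId}/steps/step-${index}.png`;
}
```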
Disabling reports
Set report: false to skip report generation (e.g. in performance-sensitive
CI pipelines):
const app = new AppClaw({
  provider: 'gemini',
  apiKey: process.env.GEMINI_API_KEY,
  report: false, // disable report generation
});
Using with Vitest / Jest
Create one AppClaw instance per test file and tear it down in afterAll. No beforeAll hook is needed, because the connection opens lazily on the first runFlow() or runGoal() call. Individual tests call
runFlow() or runGoal() and assert on the result.
import { describe, it, expect, afterAll } from 'vitest';
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'anthropic',
  apiKey: process.env.ANTHROPIC_API_KEY,
  platform: 'android',
  maxSteps: 20,
});

afterAll(() => app.teardown());

describe('Checkout flow', () => {
  it('completes purchase as a logged-in user', async () => {
    const result = await app.runFlow('./flows/checkout.yaml');
    expect(result.success).toBe(true);
  });

  it('handles empty cart gracefully', async () => {
    const result = await app.runFlow('./flows/checkout-empty-cart.yaml');
    expect(result.success).toBe(true);
  });

  it('completes in under 15 steps', async () => {
    const result = await app.runFlow('./flows/checkout.yaml');
    expect(result.stepsUsed).toBeLessThan(15);
  });
});
Phased flows & assertion results
For flows that use setup / steps /
assertions sections, the failedPhase field tells you exactly
where execution broke down:
const result = await app.runFlow('./flows/login-phased.yaml');

if (!result.success) {
  // failedPhase: 'setup' | 'test' | 'assertion'
  console.error(`Failed in ${result.failedPhase} phase`);
  console.error(`Step ${result.failedStep}: ${result.error}`);
}
CI Scripts
For CI pipelines that don't use a test framework, run flows sequentially and exit
non-zero on failure. The SDK's silent: true default keeps logs clean.
import { AppClaw } from 'appclaw';

const app = new AppClaw({
  provider: 'gemini', // must be one of the provider values listed in the Options Reference
  apiKey: process.env.GEMINI_API_KEY,
  platform: 'android',
  silent: true, // no spinners in CI
});

const flows = [
  './flows/login.yaml',
  './flows/checkout.yaml',
  './flows/search.yaml',
];

for (const flow of flows) {
  const result = await app.runFlow(flow);
  if (!result.success) {
    console.error(`FAILED: ${flow} — ${result.error}`);
    await app.teardown();
    process.exit(1);
  }
  console.log(`PASSED: ${flow} (${result.stepsUsed} steps)`);
}

await app.teardown();
console.log('All flows passed.');
Run it with tsx (no compilation step needed):
npx tsx scripts/smoke-test.ts
Options Reference
All fields passed to new AppClaw(options). Every field is optional — unset
fields fall back to .env values or built-in defaults, matching CLI
behaviour exactly.
| Option | Type | Default | Description |
|---|---|---|---|
| provider | string | 'gemini' | 'anthropic', 'openai', 'gemini', 'groq', or 'ollama' |
| apiKey | string | — | API key for the chosen LLM provider |
| model | string | Provider default | Model ID override (e.g. 'claude-opus-4-6') |
| platform | string | — | 'android' or 'ios' |
| agentMode | string | 'dom' | 'dom' uses accessibility tree; 'vision' uses AI vision |
| maxSteps | number | 30 | Maximum agent steps before giving up (applies to runGoal) |
| stepDelay | number | 500 | Delay between steps in milliseconds |
| silent | boolean | true | Suppress spinners and terminal colour output. Set false to debug locally. |
| report | boolean | true | Auto-generate an HTML report to .appclaw/runs/ on teardown(). Set false to disable. |
| reportName | string | 'AppClaw SDK Run' | Name shown in the report viewer. |
| video | boolean | false | Record the screen for the entire run and embed the video under the Recording tab in the report. Recording starts on the first run() call and stops automatically in teardown(). Requires Appium screen recording support. |
| mcpTransport | string | 'stdio' | 'stdio' (local appium-mcp) or 'sse' (remote server) |
| mcpHost | string | 'localhost' | appium-mcp host when transport is 'sse' |
| mcpPort | number | 8080 | appium-mcp port when transport is 'sse' |
TypeScript types
All public types are exported from the top-level 'appclaw' import:
import {
  AppClaw,
  type AppClawOptions,     // constructor options
  type FlowResult,         // returned by runFlow()
  type RunResult,          // returned by run()
  type AgentResult,        // returned by runGoal()
  type RunYamlFlowOptions  // second arg to runFlow()
} from 'appclaw';
GitHub Actions
Run AppClaw mobile UI automation flows and AI-driven goals directly in GitHub Actions — Android emulator or iOS simulator included, zero boilerplate.
Available on the GitHub Marketplace as AppClaw Mobile Tests.
uses: AppiumTestDistribution/AppClaw@v1
with:
  flow: flows/login.yaml
  platform: android
  api-key: ${{ secrets.LLM_API_KEY }}
Quick Start
Android — run a YAML flow
name: Mobile Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/login.yaml
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}
Android — natural language goal
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    goal: 'Open YouTube, search for Appium 3.0, verify the first result is visible'
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}
iOS — run a YAML flow
jobs:
  test:
    runs-on: macos-14   # iOS requires macOS (Apple Silicon)
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/ios-login.yaml
          platform: ios
          api-key: ${{ secrets.LLM_API_KEY }}
Inputs
All inputs are passed via the with: block in your workflow.
| Input | Required | Default | Description |
|---|---|---|---|
| flow | one of* | — | Path to a YAML flow file relative to repo root |
| goal | one of* | — | Natural language goal executed by the LLM agent |
| platform | no | android | Target platform: android or ios |
| provider | no | gemini | LLM provider: gemini, anthropic, openai, groq |
| api-key | yes | — | LLM API key — stored as LLM_API_KEY |
| model | no | provider default | LLM model ID to pin (e.g. gemini-2.0-flash) |
| agent-mode | no | dom | dom (element locators) or vision (screenshot AI) |
| max-steps | no | 30 | Maximum agent steps before the run fails |
| step-delay | no | 500 | Milliseconds between steps |
| android-api-level | no | 33 | Android emulator API level (33 = Android 13) |
| android-profile | no | pixel_6 | Android AVD hardware profile |
| android-target | no | default | Emulator target: default or google_apis |
| ios-device-type | no | simulator | iOS device type: simulator or real |
| ios-simulator-name | no | iPhone 16 | iOS simulator model to boot (e.g. iPhone 15, iPad Air) |
| ios-simulator-os | no | latest | iOS version filter for simulator selection (e.g. 18.4) |
| mcp-debug | no | false | Enable MCP debug logging (MCP_DEBUG=1). Useful for diagnosing CI timeouts. |
| cloud-provider | no | local | Cloud provider: lambdatest. Leave empty for local. |
| lambdatest-username | no** | — | LambdaTest account username |
| lambdatest-access-key | no** | — | LambdaTest access key |
| lambdatest-device-name | no** | — | Cloud device name (e.g. Pixel 7) |
| lambdatest-os-version | no** | — | Cloud OS version (e.g. 13, 16) |
| lambdatest-app | no | — | LambdaTest app ID (lt://APP...) |
| report | no | true | Upload HTML report as workflow artifact |
| report-name | no | appclaw-report | Name of the uploaded artifact |
| appclaw-version | no | latest | npm package version to pin |
* Provide either flow or goal, not both.
** Required when cloud-provider: lambdatest.
Secrets Setup
Go to your repo → Settings → Secrets and variables → Actions → New repository secret:
| Secret name | Description |
|---|---|
| LLM_API_KEY | Your API key — works for any provider (Gemini, Anthropic, OpenAI, Groq) |
| LT_USERNAME | LambdaTest username (only if using cloud devices) |
| LT_ACCESS_KEY | LambdaTest access key (only if using cloud devices) |
| LT_APP_ID | LambdaTest app ID (only if using cloud devices) |
Examples
Parallel matrix — run multiple flows concurrently
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        flow:
          - flows/login.yaml
          - flows/search.yaml
          - flows/checkout.yaml
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: ${{ matrix.flow }}
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}
          report-name: report-${{ strategy.job-index }}
LambdaTest cloud devices
Run iOS tests on Ubuntu — no macOS runner needed.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/ios-login.yaml
          platform: ios
          api-key: ${{ secrets.LLM_API_KEY }}
          cloud-provider: lambdatest
          lambdatest-username: ${{ secrets.LT_USERNAME }}
          lambdatest-access-key: ${{ secrets.LT_ACCESS_KEY }}
          lambdatest-device-name: 'iPhone 14'
          lambdatest-os-version: '16'
          lambdatest-app: ${{ secrets.LT_APP_ID }}
Vision mode (screenshot-based AI)
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    flow: flows/onboarding.yaml
    platform: android
    agent-mode: vision
    api-key: ${{ secrets.LLM_API_KEY }}
Pin model for cost control
- uses: AppiumTestDistribution/AppClaw@v1
  with:
    flow: flows/smoke.yaml
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}
    model: 'gemini-2.0-flash'   # cheaper/faster than pro
Nightly regression on a schedule
on:
  schedule:
    - cron: '0 2 * * *'   # 2 AM UTC every night

jobs:
  nightly:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: AppiumTestDistribution/AppClaw@v1
        with:
          flow: flows/full-regression.yaml
          platform: android
          api-key: ${{ secrets.LLM_API_KEY }}
          report-name: nightly-report-${{ github.run_id }}
Reports
When report: true (default), an HTML report is uploaded as a workflow
artifact after each run. Download it from the
Actions run summary → Artifacts. The report includes:
- Step-by-step screenshots with tap overlays
- Pass/fail status per step
- Execution timeline
- Screen recording (if video: true is set in your flow)
Use report path in a downstream step
- uses: AppiumTestDistribution/AppClaw@v1
  id: appclaw
  with:
    flow: flows/login.yaml
    platform: android
    api-key: ${{ secrets.LLM_API_KEY }}

- name: Print report location
  run: echo "Report at ${{ steps.appclaw.outputs.report-path }}"
Runner Requirements
| Platform | Runner | Notes |
|---|---|---|
| android | ubuntu-latest | Free tier. KVM-enabled. Emulator boots in ~4-6 min. |
| ios | macos-14 | Apple Silicon. macOS minutes cost ~10x Linux. |
iOS tip: For faster iOS CI, use LambdaTest cloud devices on ubuntu-latest instead of a macOS runner.
What are App Guides?
App Guides (AppGuides) are per-app knowledge snippets injected directly into the agent's context window at the start of every automation run. They encode navigation patterns, gesture shortcuts, and common action paths for a specific app — so the agent never needs to rediscover them by trial and error.
AppGuides are AppClaw's implementation of context engineering — the practice of giving the LLM exactly the right knowledge to act correctly, rather than relying on the model's general training alone. For mobile automation, this means app-specific navigation knowledge baked into the system prompt before the agent ever looks at the screen.
How it works
When AppClaw starts a run against a known app, it automatically loads the matching guide
and prepends it to the agent's system prompt with an
APP_GUIDE (AppName): prefix. The LLM sees this contextual knowledge before
it takes any action — making the first step decisive rather than exploratory.
APP_GUIDE (WhatsApp):
## WhatsApp Navigation
- Bottom tabs: Chats | Updates | Communities | Calls
- New chat: floating pencil/message icon (bottom-right)
- Search: magnifying-glass icon at the top of Chats

## Messaging
- Open a chat → type in the message bar at the bottom → send via arrow icon
- Attach media: paperclip icon next to message bar
- Voice note: long-press the microphone icon
Resolution order
- Custom guide: .appclaw/guides/<appId>.md (highest priority, overrides built-ins)
- Built-in guide: bundled guides for 10 common apps
- No guide: agent explores the app from scratch using only what it sees on screen
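The lookup order above amounts to a simple fallback chain. The sketch below models it in Python under stated assumptions: `resolve_guide` and the `BUILTIN_GUIDES` dictionary are hypothetical names, and how AppClaw stores its bundled guides internally is not documented here. Only the `.appclaw/guides/<appId>.md` path and the priority order are taken from the text.

```python
from pathlib import Path
from typing import Dict, Optional

# Hypothetical stand-in for the bundled guides, keyed by package name / bundle ID.
BUILTIN_GUIDES: Dict[str, str] = {
    "com.whatsapp": "## WhatsApp Navigation\n- Bottom tabs: Chats | Updates | Communities | Calls",
}


def resolve_guide(
    app_id: str,
    project_root: Path,
    builtins: Dict[str, str] = BUILTIN_GUIDES,
) -> Optional[str]:
    """Return the guide text for app_id, or None when no guide applies."""
    custom = project_root / ".appclaw" / "guides" / f"{app_id}.md"
    if custom.is_file():
        # 1. A custom guide wins and replaces the built-in entirely.
        return custom.read_text(encoding="utf-8")
    # 2. Otherwise use a built-in guide if one exists.
    # 3. None means the agent explores the app from scratch.
    return builtins.get(app_id)
```

Note that the custom guide is returned as-is rather than merged with the built-in, which matches the "replaces the built-in entirely" behavior described under Custom Guides.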
Built-in Guides
AppClaw ships with guides for the most commonly automated apps on both Android and iOS. These activate automatically when AppClaw detects the matching package name or bundle ID.
| App | Platform | App ID / Bundle ID |
|---|---|---|
| Gmail | Android | com.google.android.gm |
| Gmail | iOS | com.google.gmail |
| YouTube | Android | com.google.android.youtube |
| YouTube | iOS | com.google.ios.youtube |
| WhatsApp | Android | com.whatsapp |
| WhatsApp | iOS | net.whatsapp.WhatsApp |
| Chrome | Android | com.android.chrome |
| Chrome | iOS | com.google.chrome |
| Settings | Android | com.android.settings |
| Settings | iOS | com.apple.Preferences |
Example: WhatsApp Guide
## WhatsApp Navigation
- Bottom tabs: Chats | Updates | Communities | Calls
- New chat: floating pencil/message icon (bottom-right)
- Search: magnifying-glass icon at the top of Chats

## Messaging
- Open a chat → type in the message bar at the bottom → send via arrow icon
- Attach media: paperclip icon next to message bar
- Voice note: long-press the microphone icon
- Emoji/stickers: smiley face icon on the left of message bar

## Common Actions
- Star a message: long-press message → star icon
- Forward: long-press message → forward arrow
- Delete: long-press message → trash icon
- Group info: tap the group name at the top of the chat
Example: YouTube Guide
## YouTube Navigation
- Bottom nav: Home | Shorts | + (upload) | Subscriptions | Library
- Search: magnifying-glass icon (top-right)
- Tap a video thumbnail to play; double-tap left/right to seek ±10 s

## Searching
- Tap the search icon → type query → press Enter or tap search icon again
- Filter results: tap "Filters" after searching

## Playback
- Full screen: rotate device or tap the expand icon (bottom-right of player)
- Quality: tap ⋮ inside player → Quality
- Captions: tap CC icon inside player
Custom Guides
Add a guide for any app, or override a built-in, by dropping a Markdown file at .appclaw/guides/<appId>.md in your project directory. Custom guides always take priority over built-ins.
If a custom guide exists for an app ID, it replaces the built-in entirely. To extend a built-in guide, copy its contents into your custom file and add your own sections.
Creating a custom guide
- Find your app's package name (Android) or bundle ID (iOS). You can get this from the appId field in your YAML flow, or by inspecting the device.
- Create the directory .appclaw/guides/ in your project root.
- Write a Markdown file named <appId>.md.
## Main Navigation
- Bottom tabs: Home | Search | Orders | Profile
- Hamburger menu (top-left) → categories and account settings

## Checkout Flow
- Cart icon is always in the top-right corner
- Tap "Proceed to Checkout" → select address → choose payment → Place Order
- Apply coupon: tap "Have a coupon?" on the order summary screen

## Product Search
- Tap the search bar at the top; supports filters: Brand | Price | Rating
- Long-press any product thumbnail to preview without navigating away
Once in place, AppClaw picks it up automatically — no code changes or restarts needed.
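If you maintain guides for several apps, the directory and file steps above can be scripted. The following is a minimal sketch, assuming a hypothetical helper named `write_custom_guide` and a made-up app ID (`com.example.shop`). It is not part of AppClaw; it only creates the file at the documented path.

```python
from pathlib import Path


def write_custom_guide(project_root: Path, app_id: str, markdown: str) -> Path:
    """Create .appclaw/guides/<app_id>.md under project_root and return its path.

    Hypothetical helper; AppClaw itself only requires that the file exists.
    """
    guides_dir = project_root / ".appclaw" / "guides"
    # Create the guides directory (and .appclaw) if they do not exist yet.
    guides_dir.mkdir(parents=True, exist_ok=True)
    guide_file = guides_dir / f"{app_id}.md"
    # Normalize surrounding whitespace and end the file with one newline.
    guide_file.write_text(markdown.strip() + "\n", encoding="utf-8")
    return guide_file
```

For example, `write_custom_guide(Path.cwd(), "com.example.shop", "## Main Navigation\n- Bottom tabs: Home | Search | Orders | Profile")` would write the guide into the current project, overwriting any existing file with the same name.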
Tips for writing good guides
| Do | Why |
|---|---|
| Describe where things are, not what they say | UI labels change; positions are stable |
| List gestures explicitly ("swipe right to archive") | The agent can't infer non-obvious gestures from a screenshot |
| Use bullet points over prose | Every token counts — bullets are faster for the model to parse |
| Document multi-step paths (Settings → Account → Privacy) | Saves the agent multiple round-trips for deeply nested flows |
| Keep it short (under 500 tokens) | Guides are injected on every step — brevity reduces cost |
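Two of the tips above (prefer bullets, stay under 500 tokens) are easy to check mechanically before committing a guide. The sketch below is one possible linter, not an AppClaw feature. The names `estimate_tokens` and `lint_guide` are hypothetical, and the token count uses the rough "about four characters per token" rule of thumb for English text, which is an approximation rather than a real tokenizer.

```python
from typing import List

TOKEN_BUDGET = 500  # the "under 500 tokens" guideline from the table above


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (a heuristic, not exact)."""
    return (len(text) + 3) // 4


def lint_guide(text: str, budget: int = TOKEN_BUDGET) -> List[str]:
    """Return human-readable warnings for a guide; an empty list means it passes."""
    warnings: List[str] = []
    tokens = estimate_tokens(text)
    if tokens > budget:
        warnings.append(f"about {tokens} tokens, over the {budget}-token budget")
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Anything that is neither a heading nor a bullet is treated as prose.
        if stripped and not stripped.startswith(("-", "#")):
            warnings.append(f"line {number}: prose line, consider a bullet")
    return warnings
```

Running `lint_guide` on the contents of each file in `.appclaw/guides/` gives a quick signal when a guide has grown too long or drifted into paragraphs. For an exact count, swap `estimate_tokens` for the tokenizer of whichever model you run the agent with.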