Tell your phone
what to do.

Built for QA, mobile developers, and automation teams who need real device control, not brittle scripts. AppClaw is an on-device AI agent: describe a goal in plain English — it reads the screen, plans the next action, and runs it on Android and iOS.


Works with Claude, GPT-4, Gemini, Groq, and Ollama

AppClaw demo — AI automating a mobile app in real time
appclaw
$ appclaw "Send a WhatsApp message to Mom saying good morning"

1/30 launch → "com.whatsapp" (Open WhatsApp)
2/30 find_and_tap → "Mom" (Find Mom in chat list)
3/30 type → "message_input" text="good morning"
4/30 submit_message (Find and tap Send button)
5/30 done (Message sent successfully)

✓ Goal completed in 5 steps

── Token Usage ──────────────
Model: gemini-2.0-flash
Total: 15,425 tokens
Est. cost: $0.001874

Works with any LLM provider

Anthropic · OpenAI · Google · Groq · Ollama
The Loop

Perceive. Reason. Act.

Every step follows the same agentic loop until the goal is done or max steps are reached.

1

Perceive

Reads the live screen via appium-mcp's page source. Parses native XML into a structured list of UI elements.

2

Reason

Sends the goal + current screen state to an LLM. Gets back a JSON action decision — what to tap, type, or swipe next.

3

Act

Executes the chosen action via appium-mcp tools — click, set_value, scroll, swipe, launch, and 26 more.

4

Repeat

Loops back to step 1 with the new screen state. Continues until the goal is achieved or the step limit is reached.
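The four steps above can be sketched as a bounded loop. This is an illustration, not AppClaw's actual source: the `Action` shape, the `Agent` interface, and the default of 30 steps (inferred from the "1/30" counter in the demo) are all assumptions.

```typescript
// Hypothetical action schema; AppClaw's real one may differ.
type Action =
  | { kind: "tap"; target: string }
  | { kind: "type"; target: string; text: string }
  | { kind: "done"; summary: string };

// The three capabilities the loop needs. In the real agent, perceive()
// reads the page source, reason() calls the LLM, act() calls an MCP tool.
interface Agent {
  perceive(): string;
  reason(goal: string, screen: string): Action;
  act(action: Action): void;
}

// Perceive → Reason → Act, repeated until done or maxSteps is hit.
function run(agent: Agent, goal: string, maxSteps = 30): Action[] {
  const trace: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const screen = agent.perceive();           // 1. Perceive
    const action = agent.reason(goal, screen); // 2. Reason
    trace.push(action);
    if (action.kind === "done") return trace;  // goal achieved
    agent.act(action);                         // 3. Act, then 4. Repeat
  }
  return trace; // step limit reached
}

// Toy agent that taps once and finishes, just to exercise the loop.
const demoSteps: Action[] = [
  { kind: "tap", target: "Mom" },
  { kind: "done", summary: "message sent" },
];
const demo: Agent = {
  perceive: () => "<hierarchy/>",
  reason: () => demoSteps.shift() ?? { kind: "done", summary: "nothing left" },
  act: () => {},
};
const trace = run(demo, "Send good morning to Mom");
```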

Architecture

Pure agentic brain,
zero device logic

AppClaw is the decision-maker. appium-mcp handles the device. Clean separation via MCP protocol.

Powered by the Model Context Protocol

AppClaw consumes appium-mcp's 32 tools over stdio or SSE. It never touches the device directly — it only decides which tool to call next.
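That separation can be pictured as a pure mapping from agent decisions to tool invocations. The tool names below (click, set_value, launch) appear on this page; the argument shapes are illustrative guesses, not appium-mcp's real schema.

```typescript
// A tool call as sent over MCP: a tool name plus arguments.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Hypothetical agent decisions (same idea as the agentic-loop actions).
type Decision =
  | { kind: "tap"; elementId: string }
  | { kind: "type"; elementId: string; text: string }
  | { kind: "launch"; app: string };

// The brain never touches the device: it only chooses which tool to call.
function toToolCall(d: Decision): ToolCall {
  switch (d.kind) {
    case "tap":
      return { tool: "click", args: { elementId: d.elementId } };
    case "type":
      return { tool: "set_value", args: { elementId: d.elementId, value: d.text } };
    case "launch":
      return { tool: "launch", args: { appId: d.app } };
  }
}

const call = toToolCall({ kind: "launch", app: "com.whatsapp" });
```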

32 MCP Tools

Tap, type, swipe, screenshot, launch, install, and more

Multi-Provider LLMs

Swap between Claude, GPT-4, Gemini, Groq, or local Ollama

Goal Decomposition

Complex multi-app goals are split into sequential sub-goals

Cloud or Local Devices

USB, emulator, simulator, or remote device farms via SSE

AppClaw (TypeScript): Agentic Loop · Perception · Skills · LLM Layer · Recovery · Recorder
  ↓ MCP Protocol
appium-mcp (32 tools): tap · type · swipe · screenshot · launch · ...
  ↓ Appium (UiAutomator2 / XCUITest)
Android / iOS Device
Power Features

Beyond simple automation

Built-in intelligence for the hardest parts of mobile automation.

Smart Actions

16 Agent Actions with Built-in Skills

From simple taps to multi-step compound operations. Smart typing detects non-editable wrappers and finds the real input. submit_message works across WhatsApp, Telegram, Slack, and more.

tap type swipe find_and_tap read_screen ask_user
// Smart Type: auto-detects real input
smart_type "search_field"
  Target is non-editable wrapper
  Clicking to navigate...
  Re-reading page source...
  Found EditText (id: input_23)
  ✓ Typed into real input field
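The wrapper-detection idea behind smart typing can be sketched as a small resolver: if the chosen target isn't editable, find a real input field instead. The element shape and the `editable` flag are assumptions; the real skill also clicks through and re-reads the page source, which this sketch omits.

```typescript
// Hypothetical parsed-element shape.
interface El {
  id: string;
  cls: string;
  editable: boolean;
}

// If the target is a non-editable wrapper, fall back to the first
// editable element on screen (e.g. an Android EditText).
function resolveInput(target: El, screen: El[]): El | undefined {
  if (target.editable) return target; // already a real input
  return screen.find((e) => e.editable);
}

const screen: El[] = [
  { id: "search_field", cls: "android.view.ViewGroup", editable: false }, // wrapper
  { id: "input_23", cls: "android.widget.EditText", editable: true },     // real input
];
const input = resolveInput(screen[0], screen);
```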
Recording

Record & Adaptive Replay

Record any goal execution and replay it without LLM costs. The replayer doesn't blindly repeat coordinates — it reads the current screen, matches elements, and adapts to layout changes.

--record --replay Adaptive
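The adaptive part can be sketched as attribute-based matching: the replayer looks a recorded step up by stable attributes (resource id, then visible text) rather than replaying raw coordinates, so a layout shift doesn't break it. Field names here are illustrative, not the recording format.

```typescript
// What a recorded step might remember about its target.
interface Recorded {
  resourceId?: string;
  text?: string;
}

// An element on the current screen, with its (possibly moved) position.
interface ScreenEl {
  resourceId: string;
  text: string;
  x: number;
  y: number;
}

// Prefer a resource-id match, fall back to a text match; never trust
// the recorded coordinates directly.
function matchElement(step: Recorded, screen: ScreenEl[]): ScreenEl | undefined {
  return (
    screen.find((e) => step.resourceId !== undefined && e.resourceId === step.resourceId) ??
    screen.find((e) => step.text !== undefined && e.text === step.text)
  );
}

// The button moved since recording, but the id still matches.
const current: ScreenEl[] = [
  { resourceId: "send_button", text: "Send", x: 900, y: 1700 },
];
const hit = matchElement({ resourceId: "send_button" }, current);
```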
Planning

Goal Decomposition

Complex multi-app tasks are automatically broken into sequential sub-goals. "Copy the weather and send it on Slack" becomes 4 focused steps, each tracked and executed independently.

--plan Multi-app Sub-goals
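Sub-goal tracking might look like the sketch below. In AppClaw the decomposition itself comes from the LLM; here a plan for the weather-to-Slack example is hard-coded and the executor is a stub, purely to show sequential, independently tracked steps.

```typescript
// One decomposed sub-goal with its execution status.
interface SubGoal {
  description: string;
  status: "pending" | "done" | "failed";
}

// Run sub-goals in order, stopping at the first failure.
function runPlan(plan: SubGoal[], execute: (g: SubGoal) => boolean): boolean {
  for (const goal of plan) {
    goal.status = execute(goal) ? "done" : "failed";
    if (goal.status === "failed") return false;
  }
  return true;
}

// "Copy the weather and send it on Slack" as four focused steps.
const plan: SubGoal[] = [
  { description: "Open the weather app", status: "pending" },
  { description: "Read today's forecast", status: "pending" },
  { description: "Open Slack", status: "pending" },
  { description: "Send the forecast as a message", status: "pending" },
];
const ok = runPlan(plan, () => true); // toy executor: every step succeeds
```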
32 MCP tools consumed
16 agent actions
5 LLM providers
2 platforms (iOS + Android)
Quickstart

Up and running
in 60 seconds

Install, add your LLM credentials, connect an Android or iOS device, and run. Four steps to your first AI-driven goal on real hardware or simulators.

1

Install

Requires Node.js 18+. Install AppClaw globally from npm.

2

Configure your LLM

Create a .env file and set your provider and API key. Works with Anthropic, OpenAI, Google, Groq, or local Ollama. AppClaw is free to run; you bring your own key and pay your LLM provider per their pricing (or use a local model).

3

Connect a device

Plug in an Android phone or iPhone, or use an Android emulator / iOS Simulator. Verify Android with adb devices; for iOS, use your usual Xcode / Appium pairing. Then start the Appium server.

4

Run a goal

Pass any goal in plain English, run a declarative YAML flow, or use interactive mode. AppClaw connects to your device and executes autonomously.

# Install AppClaw
$ npm install -g appclaw

# Create .env with your LLM provider
$ echo "LLM_PROVIDER=gemini" > .env
$ echo "LLM_API_KEY=your-key" >> .env

# Verify device is connected
$ adb devices

# Run a goal in plain English
$ appclaw "Open Settings and turn on WiFi"

# Or run a declarative YAML flow (no LLM needed)
$ appclaw --flow examples/flows/google-search.yaml

# Record, replay, or decompose complex goals
$ appclaw --record "Send hello on WhatsApp"
$ appclaw --replay recordings/rec-*.json
$ appclaw --plan "Copy weather and send on Slack"

Ready to automate
your mobile apps?

Open source under Apache 2.0 — no license fee for AppClaw. You supply your own LLM API key (provider charges apply) or run a local model. Automate Android and iOS apps with AI today.

Install & quickstart · Star on GitHub

Apache 2.0 · BYO LLM key · Community driven
