AppClaw is an AI agent that automates any mobile app. Describe a goal in plain English — it sees the screen, reasons about what to do, and executes actions across Android and iOS.
Works with any LLM provider: Claude, GPT-4, Gemini, Groq, and Ollama supported out of the box
An agentic AI loop that perceives, reasons, and acts on any mobile app — no selectors, no scripting.
Parses native UI hierarchy (XML) into structured elements. Understands buttons, inputs, toggles, and text — no brittle selectors needed.
Describe what you want in plain English. The AI figures out the sequence of taps, types, scrolls, and swipes to get it done.
Stuck detection, checkpoint rollback, and alternative path suggestions. Adapts when UI changes or actions fail.
Android via UiAutomator2 and iOS via XCUITest. Same agent, same goals — different platforms handled transparently.
Record goal executions and replay without LLM costs. Adaptive replayer handles layout changes across runs.
Pauses for OTP codes, CAPTCHAs, or ambiguous choices. Asks the user and resumes seamlessly.
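The "structured elements" idea above can be sketched in a few lines. This is a minimal illustration, not AppClaw's actual parser: it flattens an Android UiAutomator XML dump into a list of elements the agent can act on, keeping only clickable, editable, or text-bearing nodes.

```typescript
// Illustrative sketch: flatten a UiAutomator XML dump into structured elements.
// The element shape and filtering rules here are assumptions, not AppClaw internals.
interface UiElement {
  tag: string;        // e.g. android.widget.Button
  text: string;
  resourceId: string;
  clickable: boolean;
  editable: boolean;  // EditText-like nodes accept typed input
}

function parseHierarchy(xml: string): UiElement[] {
  const elements: UiElement[] = [];
  // Each node in a UiAutomator dump is a <node .../> carrying its attributes.
  const nodeRe = /<node\b([^>]*)\/?>/g;
  const attr = (attrs: string, name: string): string => {
    const m = attrs.match(new RegExp(`${name}="([^"]*)"`));
    return m ? m[1] : "";
  };
  for (const m of xml.matchAll(nodeRe)) {
    const attrs = m[1];
    const cls = attr(attrs, "class");
    elements.push({
      tag: cls,
      text: attr(attrs, "text"),
      resourceId: attr(attrs, "resource-id"),
      clickable: attr(attrs, "clickable") === "true",
      editable: cls.includes("EditText"),
    });
  }
  // Keep only nodes the agent can meaningfully read or act on.
  return elements.filter(e => e.clickable || e.editable || e.text !== "");
}
```

A flat, typed list like this is what makes "no brittle selectors" possible: the LLM reasons over element text and roles instead of hand-written XPath.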
Every step follows the same agentic loop until the goal is done or max steps are reached.
Reads the live screen via appium-mcp's page source. Parses native XML into a structured list of UI elements.
Sends the goal + current screen state to an LLM. Gets back a JSON action decision — what to tap, type, or swipe next.
Executes the chosen action via appium-mcp tools — click, set_value, scroll, swipe, launch, and 27 more.
Loops back to step 1 with the new screen state. Continues until the goal is achieved or max steps reached.
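The four steps above can be sketched as a single loop. All names and shapes here are illustrative (not AppClaw's real API); each dependency is injected so the loop itself stays testable.

```typescript
// Sketch of the perceive -> reason -> act loop under assumed interfaces.
interface Action {
  tool: string;                     // which appium-mcp tool to call
  args: Record<string, unknown>;
  done: boolean;                    // LLM signals the goal is achieved
}

interface AgentDeps {
  perceive(): string;                           // 1. read the live screen state
  reason(goal: string, screen: string): Action; // 2. ask the LLM for a JSON decision
  act(action: Action): void;                    // 3. execute via an appium-mcp tool
}

function runGoal(goal: string, deps: AgentDeps, maxSteps = 25): number {
  let steps = 0;
  while (steps < maxSteps) {
    const screen = deps.perceive();           // 1. observe
    const action = deps.reason(goal, screen); // 2. decide
    steps++;
    if (action.done) break;                   // goal achieved
    deps.act(action);                         // 3. execute, then loop back
  }
  return steps;
}
```

The `maxSteps` guard is what "until the goal is achieved or max steps reached" means in practice: a runaway goal terminates instead of burning tokens forever.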
AppClaw is the decision-maker. appium-mcp handles the device. Clean separation via MCP protocol.
AppClaw consumes appium-mcp's 32 tools over stdio or SSE. It never touches the device directly — it only decides which tool to call next.
Tap, type, swipe, screenshot, launch, install, and more
Swap between Claude, GPT-4, Gemini, Groq, or local Ollama
Complex multi-app goals are split into sequential sub-goals
USB, emulator, simulator, or remote device farms via SSE
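Because AppClaw only decides which tool to call, the LLM's raw output has to be validated before dispatch. A minimal sketch of that boundary, assuming a JSON decision shape and naming only a subset of the tools (the full set of 32 lives in appium-mcp):

```typescript
// Illustrative guard: check the LLM's JSON decision against known tool names
// before handing it to appium-mcp. Tool list shown is a partial, assumed subset.
const KNOWN_TOOLS = new Set(["click", "set_value", "scroll", "swipe", "launch"]);

interface Decision {
  tool: string;
  args: Record<string, unknown>;
}

function parseDecision(raw: string): Decision {
  const d = JSON.parse(raw) as Decision;
  if (typeof d.tool !== "string" || !KNOWN_TOOLS.has(d.tool)) {
    throw new Error(`LLM chose an unknown tool: ${d.tool}`);
  }
  return d;
}
```

Keeping this check on the AppClaw side preserves the clean separation: malformed or hallucinated tool calls never reach the device layer.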
Built-in intelligence for the hardest parts of mobile automation.
From simple taps to multi-step compound operations. Smart typing detects non-editable wrappers and finds the real input. submit_message works across WhatsApp, Telegram, Slack, and more.
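The "detects non-editable wrappers and finds the real input" behavior boils down to a tree search. A sketch under an assumed element shape (not AppClaw's internal model):

```typescript
// Illustrative: when the tapped element is a non-editable wrapper,
// recurse into its children to find the first editable descendant.
interface UiNode {
  editable: boolean;
  children: UiNode[];
}

function findEditable(node: UiNode): UiNode | null {
  if (node.editable) return node;      // this node accepts typed input
  for (const child of node.children) {
    const hit = findEditable(child);   // depth-first into the wrapper
    if (hit) return hit;
  }
  return null;                         // no real input under this node
}
```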
Record any goal execution and replay it without LLM costs. The replayer doesn't blindly repeat coordinates — it reads the current screen, matches elements, and adapts to layout changes.
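One way to picture "matches elements and adapts" (the actual matching strategy is AppClaw's; this heuristic is an assumption): re-find the recorded element on the current screen by stable identity first, then by visible text, instead of replaying coordinates.

```typescript
// Illustrative re-matching heuristic: resource-id first, visible text second.
interface El {
  resourceId: string;
  text: string;
}

function rematch(recorded: El, current: El[]): El | null {
  // Prefer the stable identifier if the current screen still has it.
  const byId = current.find(e => e.resourceId && e.resourceId === recorded.resourceId);
  if (byId) return byId;
  // Fall back to visible text, which survives most layout changes.
  const byText = current.find(e => e.text && e.text === recorded.text);
  return byText ?? null;
}
```

If neither match succeeds, a replayer can escalate back to the LLM for that one step, keeping the rest of the run free.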
Complex multi-app tasks are automatically broken into sequential sub-goals. "Copy the weather and send it on Slack" becomes 4 focused steps, each tracked and executed independently.
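As data, a decomposed plan might look like the sketch below. The four sub-goals shown are illustrative only; in AppClaw the split comes from the LLM planner, not a hard-coded list.

```typescript
// Illustrative plan structure for tracking sequential sub-goals.
interface SubGoal {
  id: number;
  goal: string;
  status: "pending" | "done";
}

function toPlan(goals: string[]): SubGoal[] {
  // Each sub-goal starts pending and is tracked independently.
  return goals.map((goal, i) => ({ id: i + 1, goal, status: "pending" as const }));
}

// Hypothetical decomposition of "Copy the weather and send it on Slack":
const plan = toPlan([
  "Open the weather app",
  "Copy today's forecast",
  "Open Slack",
  "Paste and send the forecast",
]);
```

Tracking sub-goals as explicit records is what lets each one be checkpointed, retried, or rolled back on its own.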
Install, configure, connect a device, and run. Four steps to your first AI-automated mobile goal.
Requires Node.js 18+. Clone the repo, install dependencies, and set up Appium with the driver for your platform.
Create a .env file and set your provider and API key. Works with Anthropic, OpenAI, Google, Groq, or local Ollama.
Plug in an Android device (or start an emulator/iOS simulator). Verify with adb devices and start the Appium server.
Pass any goal in plain English, run a declarative YAML flow, or use interactive mode. AppClaw connects to your device and executes autonomously.
Open source. MIT licensed. Free forever. Start automating any Android or iOS app with AI today.
git clone https://github.com/AppiumTestDistribution/appclaw
MIT License · Free forever · Community driven