- Autonomous Task Loop: Agent! now reasons, executes, and self-corrects until the task is complete.
- Agentic Coding: Advanced code editing with Time Machine-style backups for every file change.
- Native Xcode Tools: Faster, project-aware builds and runs without external MCP configuration.
- Privileged Root Access: Secure, user-approved daemon for executing any system command.
- Desktop Automation: Full control of any macOS app via AXorcist (Accessibility API).
- Expanded AI Support: Stabilized tool calling for Mistral and Google Gemini models.
- Unified Provider Registry: Centralized model and URL management via `LLMRegistry`.
- Ollama Pre-warming: Eliminates cold-start delays by pre-loading models on launch.
- Enhanced Logging & Diagnostics: Improved daemon status checks and error reporting in the activity log.
- Multi-tab LLM Configuration: Per-tab provider/model settings for flexible multi-agent workflows.
A native macOS AI agent that controls your apps, writes code, automates workflows, and runs tasks from your iPhone via iMessage. All powered by the AI provider of your choice.
- Download Agent! and drag to Applications
- Open Agent! -- it sets up everything automatically
- Pick your AI -- Settings → choose a provider → enter API key
- Clone the repository: `git clone https://github.com/toddbruss/Agent.git`, then `cd Agent`
- Open `Agent.xcodeproj` in Xcode.
- Build and Run the `Agent` target.
- Approve the Helper Tool: When prompted, authorize the privileged daemon to allow root-level command execution.
- Configure your AI Provider: Go to Settings and enter your API key or select a local provider like Ollama.
💡 No API key? Use Ollama with GLM-5 -- completely free, runs offline, no account needed. Requires 32GB+ RAM.
"Play my Workout playlist in Music" "Build the Xcode project and fix any errors" "Take a photo with Photo Booth" "Send an iMessage to Mom saying I'll be home at 6" "Open Safari and search for flights to Tokyo" "Refactor this class into smaller files" "What calendar events do I have today?"
Just type what you want. Agent! figures out how and makes it happen.
Built-in autonomous task loop that reasons, executes, and self-corrects. Agent! doesn't just run code; it observes the results, debugs errors, and iterates until the task is complete.
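The loop described above can be pictured with a small Python sketch. It is illustrative only: Agent! itself is a native macOS app and its internals are not documented here, so `run_task_loop`, `StepResult`, and the toy `propose`/`execute` callbacks are hypothetical names standing in for the model and the tool layer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepResult:
    ok: bool      # did the action succeed?
    output: str   # what was observed (stdout, error text, ...)


def run_task_loop(
    propose: Callable[[str, List[StepResult]], Optional[str]],
    execute: Callable[[str], StepResult],
    task: str,
    max_iterations: int = 10,
) -> List[StepResult]:
    """Reason, execute, observe; stop when `propose` returns None."""
    history: List[StepResult] = []
    for _ in range(max_iterations):
        action = propose(task, history)  # reason over everything seen so far
        if action is None:               # the model judges the task complete
            break
        # Failed results stay in history so the next proposal can correct them.
        history.append(execute(action))
    return history


# Toy stand-ins for the model and the tools (hypothetical behavior).
def toy_propose(task: str, history: List[StepResult]) -> Optional[str]:
    if not history:
        return "build"
    if not history[-1].ok:
        return "fix-and-build"
    return None


def toy_execute(action: str) -> StepResult:
    if action == "build":
        return StepResult(False, "error: missing semicolon")
    return StepResult(True, "build succeeded")


history = run_task_loop(toy_propose, toy_execute, "Build the project")
print([result.output for result in history])
```

The `max_iterations` cap is an assumption added for safety in the sketch; it keeps a loop that never converges from running forever.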
Full coding environment built in. Reads codebases, edits files with precision, runs shell commands, builds Xcode projects, manages git, and auto-enables coding mode to focus the AI on development tools. Replaces Claude Code, Cursor, and Cline -- no terminal, no IDE plugins, no monthly fee. Features Time Machine-style backups for every file change, letting you revert any edit instantly.
Automatically detects and uses available tools (Xcode, Playwright, Shell, etc.) based on your prompt. No manual configuration required for core tools.
Securely runs root-level commands via a dedicated macOS Launch Daemon. The user approves the daemon once, then the agent can execute commands autonomously via XPC.
Control any Mac app through the Accessibility API. Click buttons, type into fields, navigate menus, scroll, drag -- all programmatically. Powered by AXorcist for reliable, fuzzy-matched element finding.
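To give a feel for what fuzzy-matched element finding means, here is a minimal Python sketch. It does not show how AXorcist works; it substitutes the standard library's `difflib` for whatever matching AXorcist really uses, and `find_element` and the label list are hypothetical.

```python
import difflib
from typing import List, Optional


def find_element(query: str, labels: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the on-screen label that best matches `query`, or None."""
    # Compare case-insensitively, but return the label as it appears on screen.
    by_lower = {label.lower(): label for label in labels}
    matches = difflib.get_close_matches(query.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None


labels = ["Play", "Pause", "Next Track", "Previous Track", "Shuffle"]
print(find_element("next trak", labels))  # a typo still resolves to "Next Track"
print(find_element("volume", labels))     # nothing close enough, so None
```

The `cutoff` value is an arbitrary choice for the sketch. A lower cutoff tolerates more typos but raises the risk of clicking the wrong control.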
| Provider | Cost | Best For |
|---|---|---|
| Z.ai/GLM-5.1 | Paid | Recommended starting point |
| Claude (Anthropic) | Paid | Complex tasks |
| ChatGPT (OpenAI) | Paid | General purpose |
| Google Gemini | Paid/Free | High performance, long context |
| Apple Intelligence | Free | On-device assistant |
| DeepSeek | Paid | Budget cloud AI |
| Grok-2 (xAI) | Paid | Real-time info |
| Local Ollama | Free | Full privacy, offline |
| LM Studio | Free | Easy local setup |
| Hugging Face | Varies | Open-source models |
| vLLM | Free | Local or Cloud |
| Mistral | AI Studio | High-performance open models |
| Mistral Vibe | Le Chat | High-performance open models |
Click the microphone and speak. Agent! transcribes in real time and executes your request.
Text your Mac from your iPhone:
Agent! What song is playing?
Agent! Check my email
Agent! Next Song
Your Mac runs the task and texts back the result. Only approved contacts can send commands.
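The gating described above (an "Agent!" prefix plus an approved-contacts check) might look roughly like the following Python sketch. The function name, the phone numbers, and the exact matching rules are assumptions for illustration, not the app's actual logic.

```python
from typing import Optional, Set

HOTWORD = "Agent!"  # prefix used in the example messages above


def extract_command(sender: str, text: str, approved: Set[str]) -> Optional[str]:
    """Return the command to run, or None if the message should be ignored."""
    if sender not in approved:            # only approved contacts may send commands
        return None
    stripped = text.strip()
    if not stripped.startswith(HOTWORD):  # ordinary messages are left alone
        return None
    command = stripped[len(HOTWORD):].strip()
    return command or None                # a bare "Agent!" carries no task


approved = {"+15550100"}  # hypothetical approved contact
print(extract_command("+15550100", "Agent! What song is playing?", approved))
print(extract_command("+15550199", "Agent! Check my email", approved))
```

Checking the sender before looking at the message body means an unapproved contact can never trigger a task, whatever the message says.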
Drives Safari hands-free -- search Google, click links, fill forms, read pages, extract information.
For complex tasks, Agent! creates a step-by-step plan, works through each step, and checks them off in real time.
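A plan with per-step status can be modeled in a few lines. The Python sketch below is only an illustration of the idea; the `Plan` and `Step` classes and the checkbox rendering are hypothetical and are not the format Agent! stores or displays.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    title: str
    done: bool = False


@dataclass
class Plan:
    goal: str
    steps: List[Step] = field(default_factory=list)

    def complete(self, index: int) -> None:
        """Check off the step at a zero-based index."""
        self.steps[index].done = True

    @property
    def finished(self) -> bool:
        return all(step.done for step in self.steps)

    def render(self) -> str:
        lines = [f"Plan: {self.goal}"]
        for number, step in enumerate(self.steps, start=1):
            mark = "x" if step.done else " "
            lines.append(f"[{mark}] {number}. {step.title}")
        return "\n".join(lines)


plan = Plan(
    "Refactor this class into smaller files",
    [Step("Read the class"), Step("Split into files"), Step("Build and verify")],
)
plan.complete(0)
print(plan.render())
```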
Work on multiple tasks simultaneously. Each tab has its own project folder and conversation history.
Take screenshots or paste images. Vision-capable AI models analyze what they see -- describe content, read text, spot UI issues.
Agent! includes built-in Safari web automation via JavaScript and AppleScript. Search Google, click links, fill forms, read page content, and execute JavaScript -- all hands-free.
To enable: Open Safari → Settings → Advanced → check "Show features for web developers". Then go to Developer menu → check "Allow JavaScript from Apple Events".
Full cross-browser automation via Microsoft Playwright MCP. Click, type, screenshot, and navigate any website in Chrome, Firefox, or WebKit -- all controlled by the AI.
Setup (one-time):
# 1. Install Node.js (if not already installed)
brew install node
# 2. Install Playwright MCP server globally
npm install -g @playwright/mcp@latest
# 3. Install browser binaries (pick one or all)
npx playwright install chromium # Chrome (~165MB)
npx playwright install firefox # Firefox (~97MB)
npx playwright install webkit # Safari/WebKit (~75MB)
npx playwright install # All browsers
Configure in Agent!:
Go to Settings → MCP Servers → Add Server, paste this JSON:
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": ["@playwright/mcp"],
"transport": "stdio"
}
}
}
Note: If `npx` is not found, use the full path: run `which npx` in Terminal and replace `"npx"` with the result (e.g. `"/opt/homebrew/bin/npx"`).
Toggle ON and Playwright tools appear automatically. The AI can now control browsers directly.
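If you would rather generate the configuration than edit it by hand, the following Python sketch resolves the absolute path of `npx` and prints the same JSON structure shown above. The helper name `playwright_mcp_config` is hypothetical; `shutil.which` is the standard-library equivalent of running `which npx`.

```python
import json
import shutil


def playwright_mcp_config(command: str = "npx") -> dict:
    """Build the MCP server entry shown above, preferring an absolute path."""
    # Fall back to the bare command name if it is not found on PATH.
    resolved = shutil.which(command) or command
    return {
        "mcpServers": {
            "playwright": {
                "command": resolved,
                "args": ["@playwright/mcp"],
                "transport": "stdio",
            }
        }
    }


print(json.dumps(playwright_mcp_config(), indent=2))
```

Paste the printed JSON into Settings → MCP Servers → Add Server as described above.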
These tools manage project settings and coding mode:
| Tool | What It Does |
|---|---|
| project_folder | Get or change the working directory for this tab — use set, home, documents, library, or none |
| coding_mode | Toggle coding mode on/off — when ON, only Core+Workflow+Coding+UserAgent tools are available for faster responses |
| plan_mode | Create, update, read, list, or delete step-by-step plans with status tracking — ideal for complex tasks |
| memory | Read/write persistent user preferences — use append to remember things across sessions |
💡 Pro Tip: Use `coding_mode(true)` when working on code. It removes unnecessary tools and speeds up responses.
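Conceptually, coding mode is a filter over the tool list. The Python sketch below illustrates that idea under loud assumptions: the four group names come from the table above, but the tool-to-group mapping and the "Automation" group are invented for the example and may not match how Agent! categorizes its tools.

```python
from typing import Dict, List

# Groups that remain available in coding mode, per the table above.
CODING_MODE_GROUPS = {"Core", "Workflow", "Coding", "UserAgent"}

# Hypothetical tool -> group mapping, for illustration only.
TOOL_GROUPS: Dict[str, str] = {
    "task_complete": "Core",
    "plan_mode": "Workflow",
    "file_manager": "Coding",
    "execute_agent_command": "UserAgent",
    "accessibility": "Automation",
    "web": "Automation",
}


def available_tools(coding_mode: bool) -> List[str]:
    """Return the tool names the model may call in the current mode."""
    if not coding_mode:
        return sorted(TOOL_GROUPS)
    return sorted(
        name for name, group in TOOL_GROUPS.items() if group in CODING_MODE_GROUPS
    )


print(available_tools(coding_mode=True))
```

Fewer tools means a shorter tool description in each request, which is a plausible reason responses get faster.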
These tools interact with macOS UI and web pages:
| Tool | What It Does |
|---|---|
| accessibility | Control any app — click buttons, type text, read elements, manage windows, navigate menus, capture screenshots |
| web | Automate Safari — open URLs, click elements, type text, execute JS, search, navigate tabs |
| mcp_playwright_browser_* | Advanced browser automation via Playwright — snapshot, click, hover, drag, fill forms, upload files, etc. |
| web_search | Search the web for current information — returns relevant page titles, URLs, and content snippets |
💡 Pro Tip: Use `accessibility` for macOS UI automation. It is faster and more reliable than screenshots.
These tools manage Swift and AppleScript automation scripts:
| Tool | What It Does |
|---|---|
| agent | Create, read, update, run, delete, or combine Swift automation scripts with TCC permissions |
| applescript_tool | Execute, save, delete, or list AppleScript scripts — use lookup_sdef to inspect app dictionaries |
| javascript_tool | Run JXA (JavaScript for Automation) scripts — ideal for lightweight automation tasks |
| batch_tools | Run multiple tool calls in one batch with progress tracking — no round-trips, ideal for complex workflows |
💡 Pro Tip: Use `agent` for Swift scripts that need TCC permissions. It is the most powerful scripting tool.
These tools execute shell commands and system-level operations:
| Tool | What It Does |
|---|---|
| execute_agent_command | Run shell commands as current user — use for git, ls, grep, find, homebrew, scripts |
| execute_daemon_command | Run shell commands as ROOT via Launch Daemon — no sudo needed, use for system logs, disk ops, network debug |
| run_shell_script | Execute shell scripts with automatic fallback to in-process when User Agent is off |
| batch_commands | Run multiple shell commands in one call — no round-trips, ideal for setup scripts |
💡 Pro Tip: Use `execute_daemon_command` instead of `sudo`. It is safer and doesn't require password prompts.
These tools handle file operations and version control:
| Tool | What It Does |
|---|---|
| file_manager | Read/write/edit/list/search files — use diff_apply for code changes, edit for single-line fixes |
| git | Git operations: status, diff, log, commit, branch — always use this instead of shell git commands |
| xcode | Build/run Xcode projects, analyze/snippet Swift code, add/remove files, grant permissions |
💡 Pro Tip: Always use `file_manager` for file operations. It is safer and more reliable than shell commands.
These tools help manage the agent's workflow and state:
| Tool | What It Does |
|---|---|
| plan_mode | Create, update, read, list, or delete step-by-step plans with status tracking |
| memory | Read/write persistent user preferences — store notes, settings, or context across sessions |
| coding_mode | Toggle coding mode on/off to restrict available tools for focused development |
| project_folder | Get or change the working directory for this tab — set to home, documents, library, or custom path |
💡 Pro Tip: Use `plan_mode` to break complex tasks into manageable steps and track progress.
These are the foundational tools that every agent needs:
| Tool | What It Does |
|---|---|
| task_complete | Signal when a task is finished — always call this at the end of any task |
| list_tools | List all available tools and their descriptions |
| web_search | Search the web for current information or facts you're unsure about |
💡 Pro Tip: Always call `task_complete` at the end of every task to signal completion and avoid hanging.
These tools enable system automation and UI interaction:
| Tool | What It Does |
|---|---|
| applescript_tool | Execute AppleScript, list scripts, save/delete, or lookup SDEFs for apps |
| accessibility | Control any macOS app via AX API — click buttons, type text, read elements, manage windows |
| javascript_tool | Run JXA (JavaScript for Automation) scripts, list/save/delete scripts |
| lookup_sdef | Inspect AppleScript dictionary definitions for any app (e.g., Music, Safari) |
💡 Pro Tip: Use `accessibility` to automate UI interactions across all macOS apps. It is the most powerful tool for GUI automation.
These tools provide code editing, file management, and Xcode integration capabilities:
| Tool | What It Does |
|---|---|
| read_file | Read the contents of any file in the project |
| write_file | Write content to a file (creates if doesn't exist) |
| edit_file | Replace exact string matches in a file |
| create_diff | Preview changes before applying them to a file |
| apply_diff | Apply previously previewed changes to a file |
| diff_and_apply | Create and apply changes to a file in one step |
| undo_edit | Revert the last edit made to a file |
| list_files | List files in a directory with optional pattern matching |
| search_files | Search for files containing specific text patterns |
| read_dir | Get detailed information about files in a directory |
| file_manager | Comprehensive file operations including read, write, edit, list, search |
| xcode | Build, run, analyze, and manage Xcode projects |
| project_folder | Set or get the current project directory |
| mode | Toggle coding mode on/off for optimized tool selection |
💡 Pro Tip: Use `create_diff` to preview changes before applying them with `apply_diff`, so accidental edits are caught before they land.
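The preview, apply, and undo workflow behind these tools can be sketched in Python with the standard library's `difflib`. This is an assumption-laden illustration, not the app's implementation: `EditableFile` is a hypothetical in-memory stand-in for a file, and requiring exactly one match per edit is a design choice made for the sketch.

```python
import difflib
from typing import List


def create_diff(old: str, new: str, path: str = "file") -> str:
    """Return a unified diff so a change can be reviewed before applying it."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))


class EditableFile:
    """In-memory stand-in for a file that keeps one backup per edit."""

    def __init__(self, text: str) -> None:
        self.text = text
        self._backups: List[str] = []

    def edit(self, old: str, new: str) -> None:
        # Assumption: refuse ambiguous edits rather than guess which match is meant.
        if self.text.count(old) != 1:
            raise ValueError("expected exactly one match for the search string")
        self._backups.append(self.text)  # snapshot before changing anything
        self.text = self.text.replace(old, new)

    def undo(self) -> None:
        if self._backups:
            self.text = self._backups.pop()  # restore the most recent snapshot


original = "let x = 1\nlet y = 2\n"
f = EditableFile(original)
print(create_diff(f.text, f.text.replace("x = 1", "x = 10"), "main.swift"))
f.edit("x = 1", "x = 10")
f.undo()  # back to the original text
```

Keeping a snapshot per edit is what makes an instant revert possible, at the cost of storing a copy for every change.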
- Your data stays on your Mac. Files, screen contents, and personal data are never uploaded.
- Cloud AI only sees your prompt text. Use local AI to stay 100% offline.
- You're in control. Agent! shows everything it does and logs every action.
- Built on Apple's security model. macOS permissions protect your system.
| Shortcut | Action |
|---|---|
| Enter | Run task |
| ⌘ R | Run current task |
| ⌘ . | Stop task |
| Escape | Cancel active task |
| ⌘ D | Toggle LLM output panel |
| ⌘ T | New tab |
| ⌘ W | Close tab |
| ⌘ 1-9 | Switch to tab |
| ⌘ [ / ⌘ ] | Previous / next tab |
| ⌘ F | Search activity log |
| ⌘ L | Clear conversation |
| ⌘ H | Task history |
| ⌘ , | Settings |
| ⌘ V | Paste image |
| ↑ / ↓ | Prompt history |
Do I need to know how to code? No. Just type what you want in plain English.
Is it safe? Yes. Agent! uses standard macOS automation, logs every action in the activity log, and runs only with permissions you approve.
How much does it cost? Agent! is free (MIT License). Cloud AI providers charge for API usage. Local models are free.
What Mac do I need? macOS 26+. Apple Silicon recommended. 32GB+ RAM for local models.
How is this different from Siri? Siri answers questions. Agent! performs actions -- controls apps, manages files, builds code, automates workflows.
- Technical Architecture -- Tools, scripting, developer details
- Comparisons -- vs Claude Code, Cursor, Cline, OpenClaw
- Security Model -- XPC architecture, privilege separation
- FAQ -- Common questions
Agent! includes native Xcode integration that works without any MCP server setup. These built-in tools are often faster and more reliable than the MCP alternative since they run directly inside the app.
| Tool | What It Does |
|---|---|
| xcode build | Build the current Xcode project, capture errors and warnings. Errors in the activity log are clickable and open directly in Xcode. |
| xcode run | Build and run the app |
| xcode list_projects | Discover open Xcode workspaces and projects |
| xcode select_project | Switch the active project |
| xcode grant_permission | Grant file access to the Xcode project folder |
The AI automatically uses these when you ask it to build, fix errors, or work with Xcode projects. No configuration needed -- just have your project open in Xcode.
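Capturing errors and warnings from a build comes down to recognizing compiler diagnostic lines. The Python sketch below parses the common `path:line:column: severity: message` format emitted by Swift and Clang compilers. It is not Agent!'s parser; the `Diagnostic` type, the regular expression, and the sample output are assumptions made for illustration.

```python
import re
from typing import List, NamedTuple


class Diagnostic(NamedTuple):
    path: str
    line: int
    column: int
    severity: str
    message: str


# Matches lines such as "/path/File.swift:12:5: error: message".
_DIAGNOSTIC = re.compile(
    r"^(?P<path>[^:\n]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<sev>error|warning): (?P<msg>.*)$",
    re.MULTILINE,
)


def parse_diagnostics(build_output: str) -> List[Diagnostic]:
    """Extract errors and warnings from raw build output."""
    return [
        Diagnostic(m["path"], int(m["line"]), int(m["col"]), m["sev"], m["msg"])
        for m in _DIAGNOSTIC.finditer(build_output)
    ]


# Hypothetical build output for demonstration.
sample = """\
/tmp/App/ContentView.swift:12:5: error: cannot find 'foo' in scope
/tmp/App/Model.swift:3:9: warning: variable 'x' was never used
** BUILD FAILED **
"""
for d in parse_diagnostics(sample):
    print(f"{d.severity}: {d.path}:{d.line}:{d.column} {d.message}")
```

With the path, line, and column separated out, a diagnostic can be turned into a clickable entry that jumps to the right place in the source.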
🚀 iOS/iPadOS Support: Coming soon! Native support for building, running, and testing iOS and iPadOS apps directly from Agent! is in development.
Tip: For most coding workflows, the built-in tools are all you need. The MCP Xcode server below adds extras like SwiftUI Preview rendering and documentation search.
Agent! can be controlled via voice command "Agent!" using the Messages app. This feature allows users to send commands to Agent! through text messages, enabling remote control and automation of tasks on their macOS device.
- Voice Command Setup: Users can set up a voice command in macOS that triggers sending a message to a predefined contact or group chat.
- Message Reception: Agent! monitors incoming messages for specific keywords or phrases (e.g., "Agent!").
- Command Execution: Upon detecting the keyword, Agent! parses the message content and executes the corresponding task or command.
- Response: Agent! sends a reply message back to the sender with the results or status of the executed command.
- Remote Task Execution: Send a message like "Agent! open Finder" to remotely open the Finder application.
- System Commands: Execute system commands such as "Agent! restart" to restart the computer.
- File Operations: Perform file operations like "Agent! copy /path/to/file" to copy files to a specified location.
To enable this feature, allow Agent! to access and monitor incoming messages in the Messages app. Grant the permission in System Settings under Privacy & Security → Accessibility.
Note: Ensure that the Messages app is running and that the user has granted the necessary permissions for Agent! to interact with it.
The Services button (gear icon) provides quick access to project folder management and task configuration options:
- Move/Go Down: Navigate to a different project folder location
- New Folder: Create a new folder for your project
- Home: Quickly return to your home directory
- Close: Clear the current project folder selection
- Project Folder Input: Enter or paste a custom project folder path
- Folder Size Display: Shows the current folder size (e.g., "20.0M")
- User Prompt: Configure user prompts for tasks
- Cancel: Cancel the current operation
- Thinking Indicator: Shows when Agent! is processing a task
- Task Progress: Displays progress information during task execution
- Context Usage: Shows how much context is being used for the current task
- More Options: Additional configuration settings
- Dismiss: Close the popover (currently disabled when active)
- Steps: View the steps of the current task
- Screenshot: Take a screenshot to attach to your task
- Paste Image: Paste an image from clipboard into the task
- Cancel Task: Cancel a running task
- Dictation: Start voice dictation for entering tasks
- Hotword: Activate voice command with "Agent!"
- Run Task: Execute the current task (currently disabled when not ready)
- Task Input Field: Enter your task description here
This popover provides comprehensive control over your project environment and task execution workflow.
Agent! supports MCP servers for extended capabilities. Configure in Settings → MCP Servers.
Connect Agent! directly to Xcode for project-aware operations:
{
"mcpServers" : {
"xcode" : {
"command" : "xcrun",
"args" : [
"mcpbridge"
],
"transport" : "stdio"
}
}
}
Xcode MCP provides:
- Project-aware file operations (read/write/edit/delete)
- Build and test integration
- SwiftUI Preview rendering
- Code snippet execution
- Apple Developer Documentation search
- Real-time issue tracking
MIT - free and open source.
