AutoControl is a cross-platform Python GUI automation framework providing mouse control, keyboard input, image recognition, screen capture, action scripting, and report generation — all through a unified API that works on Windows, macOS, and Linux (X11).
- Features
- Architecture
- Installation
- Requirements
- Quick Start
- API Reference
- Mouse Control
- Keyboard Control
- Image Recognition
- Screen Operations
- Action Recording & Playback
- Action Scripting (JSON Executor)
- Report Generation
- Remote Automation (Socket Server)
- Shell Command Execution
- Screen Recording
- Callback Executor
- Package Manager
- Project Management
- Window Management
- GUI Application
- Command-Line Interface
- Platform Support
- Development
- License
- Mouse Automation — move, click, press, release, drag, and scroll with precise coordinate control
- Keyboard Automation — press/release individual keys, type strings, hotkey combinations, key state detection
- Image Recognition — locate UI elements on screen using OpenCV template matching with configurable threshold
- Screenshot & Screen Recording — capture full screen or regions as images, record screen to video (AVI/MP4)
- Action Recording & Playback — record mouse/keyboard events and replay them
- JSON-Based Action Scripting — define and execute automation flows using JSON action files
- Report Generation — export test records as HTML, JSON, or XML reports with success/failure status
- Remote Automation — start a TCP socket server to receive and execute automation commands from remote clients
- Shell Integration — execute shell commands within automation workflows with async output capture
- Callback Executor — trigger automation functions with callback hooks for chaining operations
- Dynamic Package Loading — extend the executor at runtime by importing external Python packages
- Project & Template Management — scaffold automation projects with keyword/executor directory structure
- Window Management — send keyboard/mouse events directly to specific windows (Windows/Linux)
- GUI Application — built-in PySide6 graphical interface for interactive automation
- Cross-Platform — unified API across Windows, macOS, and Linux (X11)
je_auto_control/
├── wrapper/ # Platform-agnostic API layer
│ ├── platform_wrapper.py # Auto-detects OS and loads the correct backend
│ ├── auto_control_mouse.py # Mouse operations
│ ├── auto_control_keyboard.py# Keyboard operations
│ ├── auto_control_image.py # Image recognition (OpenCV template matching)
│ ├── auto_control_screen.py # Screenshot, screen size, pixel color
│ └── auto_control_record.py # Action recording/playback
├── windows/ # Windows-specific backend (Win32 API / ctypes)
├── osx/ # macOS-specific backend (pyobjc / Quartz)
├── linux_with_x11/ # Linux-specific backend (python-Xlib)
├── gui/ # PySide6 GUI application
└── utils/
├── executor/ # JSON action executor engine
├── callback/ # Callback function executor
├── cv2_utils/ # OpenCV screenshot, template matching, video recording
├── socket_server/ # TCP socket server for remote automation
├── shell_process/ # Shell command manager
├── generate_report/ # HTML / JSON / XML report generators
├── test_record/ # Test action recording
├── json/ # JSON action file read/write
├── project/ # Project scaffolding & templates
├── package_manager/ # Dynamic package loading
├── logging/ # Logging
└── exception/ # Custom exception classes
The platform_wrapper.py module automatically detects the current operating system and imports the corresponding backend, so all wrapper functions work identically regardless of platform.
pip install je_auto_controlpip install je_auto_control[gui]On Linux, install the following system packages before installing:
sudo apt-get install cmake libssl-dev- Python >= 3.10
- pip >= 19.3
| Package | Purpose |
|---|---|
je_open_cv |
Image recognition (OpenCV template matching) |
pillow |
Screenshot capture |
mss |
Fast multi-monitor screenshot |
pyobjc |
macOS backend (auto-installed on macOS) |
python-Xlib |
Linux X11 backend (auto-installed on Linux) |
PySide6 |
GUI application (optional, install with [gui]) |
qt-material |
GUI theme (optional, install with [gui]) |
import je_auto_control
# Get current mouse position
x, y = je_auto_control.get_mouse_position()
print(f"Mouse at: ({x}, {y})")
# Move mouse to coordinates
je_auto_control.set_mouse_position(500, 300)
# Left click at current position (use key name)
je_auto_control.click_mouse("mouse_left")
# Right click at specific coordinates
je_auto_control.click_mouse("mouse_right", x=800, y=400)
# Scroll down
je_auto_control.mouse_scroll(scroll_value=5)import je_auto_control
# Press and release a single key
je_auto_control.type_keyboard("a")
# Type a whole string character by character
je_auto_control.write("Hello World")
# Hotkey combination (e.g., Ctrl+C)
je_auto_control.hotkey(["ctrl_l", "c"])
# Check if a key is currently pressed
is_pressed = je_auto_control.check_key_is_press("shift_l")import je_auto_control
# Find all occurrences of an image on screen
positions = je_auto_control.locate_all_image("button.png", detect_threshold=0.9)
# Returns: [[x1, y1, x2, y2], ...]
# Find a single image and get its center coordinates
cx, cy = je_auto_control.locate_image_center("icon.png", detect_threshold=0.85)
print(f"Found at: ({cx}, {cy})")
# Find an image and automatically click it
je_auto_control.locate_and_click("submit_button.png", mouse_keycode="mouse_left")import je_auto_control
# Take a full-screen screenshot and save to file
je_auto_control.pil_screenshot("screenshot.png")
# Take a screenshot of a specific region [x1, y1, x2, y2]
je_auto_control.pil_screenshot("region.png", screen_region=[100, 100, 500, 400])
# Get screen resolution
width, height = je_auto_control.screen_size()
# Get pixel color at coordinates
color = je_auto_control.get_pixel(500, 300)import je_auto_control
import time
# Start recording mouse and keyboard events
je_auto_control.record()
time.sleep(10) # Record for 10 seconds
# Stop recording and get the action list
actions = je_auto_control.stop_record()
# Replay the recorded actions
je_auto_control.execute_action(actions)Create a JSON action file (actions.json):
[
["AC_set_mouse_position", {"x": 500, "y": 300}],
["AC_click_mouse", {"mouse_keycode": "mouse_left"}],
["AC_write", {"write_string": "Hello from AutoControl"}],
["AC_screenshot", {"file_path": "result.png"}],
["AC_hotkey", {"key_code_list": ["ctrl_l", "s"]}]
]Execute it:
import je_auto_control
# Execute from file
je_auto_control.execute_action(je_auto_control.read_action_json("actions.json"))
# Or execute from a list directly
je_auto_control.execute_action([
["AC_set_mouse_position", {"x": 100, "y": 200}],
["AC_click_mouse", {"mouse_keycode": "mouse_left"}]
])Available action commands:
| Category | Commands |
|---|---|
| Mouse | AC_click_mouse, AC_set_mouse_position, AC_get_mouse_position, AC_press_mouse, AC_release_mouse, AC_mouse_scroll, AC_mouse_left, AC_mouse_right, AC_mouse_middle |
| Keyboard | AC_type_keyboard, AC_press_keyboard_key, AC_release_keyboard_key, AC_write, AC_hotkey, AC_check_key_is_press |
| Image | AC_locate_all_image, AC_locate_image_center, AC_locate_and_click |
| Screen | AC_screen_size, AC_screenshot |
| Record | AC_record, AC_stop_record |
| Report | AC_generate_html, AC_generate_json, AC_generate_xml, AC_generate_html_report, AC_generate_json_report, AC_generate_xml_report |
| Project | AC_create_project |
| Shell | AC_shell_command |
| Process | AC_execute_process |
| Executor | AC_execute_action, AC_execute_files |
import je_auto_control
# Enable test recording first
je_auto_control.test_record_instance.set_record_enable(True)
# ... perform automation actions ...
je_auto_control.set_mouse_position(100, 200)
je_auto_control.click_mouse("mouse_left")
# Generate reports
je_auto_control.generate_html_report("test_report") # -> test_report.html
je_auto_control.generate_json_report("test_report") # -> test_report.json
je_auto_control.generate_xml_report("test_report") # -> test_report.xml
# Or get report content as string
html_string = je_auto_control.generate_html()
json_string = je_auto_control.generate_json()
xml_string = je_auto_control.generate_xml()Reports include: function name, parameters, timestamp, and exception info (if any) for each recorded action. HTML reports display successful actions in cyan and failed actions in red.
Start a TCP server to receive JSON automation commands from remote clients:
import je_auto_control
# Start the server (default: localhost:9938)
server = je_auto_control.start_autocontrol_socket_server(host="localhost", port=9938)
# The server runs in a background thread
# Send JSON action commands via TCP to execute remotely
# Send "quit_server" to shut downClient example:
import socket
import json
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("localhost", 9938))
# Send an automation command
command = json.dumps([
["AC_set_mouse_position", {"x": 500, "y": 300}],
["AC_click_mouse", {"mouse_keycode": "mouse_left"}]
])
sock.sendall(command.encode("utf-8"))
# Receive response
response = sock.recv(8192).decode("utf-8")
print(response)
sock.close()import je_auto_control
# Using the default shell manager
je_auto_control.default_shell_manager.exec_shell("echo Hello")
je_auto_control.default_shell_manager.pull_text() # Print captured output
# Or create a custom ShellManager
shell = je_auto_control.ShellManager(shell_encoding="utf-8")
shell.exec_shell("ls -la")
shell.pull_text()
shell.exit_program()import je_auto_control
import time
# Method 1: ScreenRecorder (manages multiple recordings)
recorder = je_auto_control.ScreenRecorder()
recorder.start_new_record(
recorder_name="my_recording",
path_and_filename="output.avi",
codec="XVID",
frame_per_sec=30,
resolution=(1920, 1080)
)
time.sleep(10)
recorder.stop_record("my_recording")
# Method 2: RecordingThread (simple single recording, outputs MP4)
recording = je_auto_control.RecordingThread(video_name="my_video", fps=20)
recording.start()
time.sleep(10)
recording.stop()Execute an automation function and trigger a callback upon completion:
import je_auto_control
def my_callback():
print("Action completed!")
# Execute set_mouse_position then call my_callback
je_auto_control.callback_executor.callback_function(
trigger_function_name="AC_set_mouse_position",
callback_function=my_callback,
x=500, y=300
)
# With callback parameters
def on_done(message):
print(f"Done: {message}")
je_auto_control.callback_executor.callback_function(
trigger_function_name="AC_click_mouse",
callback_function=on_done,
callback_function_param={"message": "Click finished"},
callback_param_method="kwargs",
mouse_keycode="mouse_left"
)Dynamically load external Python packages into the executor at runtime:
import je_auto_control
# Add all functions/classes from a package to the executor
je_auto_control.package_manager.add_package_to_executor("os")
# Now you can use os functions in JSON action scripts:
# ["os_getcwd", {}]
# ["os_listdir", {"path": "."}]Scaffold a project directory structure with template files:
import je_auto_control
# Create a project structure
je_auto_control.create_project_dir(project_path="./my_project", parent_name="AutoControl")
# This creates:
# my_project/
# └── AutoControl/
# ├── keyword/
# │ ├── keyword1.json # Template action file
# │ ├── keyword2.json # Template action file
# │ └── bad_keyword_1.json # Error handling template
# └── executor/
# ├── executor_one_file.py # Execute single file example
# ├── executor_folder.py # Execute folder example
# └── executor_bad_file.py # Error handling exampleSend events directly to specific windows (Windows and Linux only):
import je_auto_control
# Send keyboard event to a window by title
je_auto_control.send_key_event_to_window("Notepad", keycode="a")
# Send mouse event to a window handle
je_auto_control.send_mouse_event_to_window(window_handle, mouse_keycode="mouse_left", x=100, y=50)Launch the built-in graphical interface (requires [gui] extra):
import je_auto_control
je_auto_control.start_autocontrol_gui()Or from the command line:
python -m je_auto_controlAutoControl can be used directly from the command line:
# Execute a single action file
python -m je_auto_control -e actions.json
# Execute all action files in a directory
python -m je_auto_control -d ./action_files/
# Execute a JSON string directly
python -m je_auto_control --execute_str '[["AC_screenshot", {"file_path": "test.png"}]]'
# Create a project template
python -m je_auto_control -c ./my_project| Platform | Status | Backend | Notes |
|---|---|---|---|
| Windows 10 / 11 | Supported | Win32 API (ctypes) | Full feature support |
| macOS 10.15+ | Supported | pyobjc / Quartz | Action recording not available; send_key_event_to_window / send_mouse_event_to_window not supported |
| Linux (X11) | Supported | python-Xlib | Full feature support |
| Linux (Wayland) | Not supported | — | May be added in a future release |
| Raspberry Pi 3B / 4B | Supported | python-Xlib | Runs on X11 |
git clone https://github.com/Intergration-Automation-Testing/AutoControl.git
cd AutoControl
pip install -r dev_requirements.txt# Unit tests
python -m pytest test/unit_test/
# Integration tests
python -m pytest test/integrated_test/- Homepage: https://github.com/Intergration-Automation-Testing/AutoControl
- Documentation: https://autocontrol.readthedocs.io/en/latest/
- PyPI: https://pypi.org/project/je_auto_control/
MIT License © JE-Chen