A Python syntax checker and linter written entirely in Go. It parses Python source files, reports syntax errors, detects lint issues concurrently, and fixes them in place using an operational transform engine.
Python's ecosystem has excellent linters (pylint, flake8, ruff), but they all require a Python runtime. gocheck is a single static binary with no dependencies. It leverages Go's concurrency model to check files in parallel and run lint rules concurrently within each file. The operational transform engine resolves conflicting fixes by priority so that all non-overlapping fixes are applied in a single pass.
go install gocheck/cmd/gocheck@latestOr build from source:
git clone <repo-url>
cd go-check-python
go build -o gocheck ./cmd/gocheck/gocheck [flags] <files or directories...>
Directories are walked recursively for .py files.
| Flag | Default | Description |
|---|---|---|
--fix |
false |
Apply fixes in-place instead of showing a diff |
--rules |
all | Comma-separated list of rules to enable |
--exclude |
none | Comma-separated list of rules to disable |
--workers |
CPU count | Maximum concurrent file workers |
--quiet |
false |
Only output errors and fix summaries |
--verbose |
false |
Show skipped fixes and per-rule details |
--no-color |
false |
Disable colored output |
--version |
Print version and exit |
| Code | Meaning |
|---|---|
0 |
No issues, or all issues fixed with --fix |
1 |
Syntax errors found |
2 |
Lint issues found (diff mode only) |
3 |
Internal error (bad arguments, I/O failure) |
Check a single file:
gocheck app.pyCheck an entire project:
gocheck src/Auto-fix all issues:
gocheck --fix src/Run only whitespace and import rules:
gocheck --rules trailing-whitespace,indentation,unused-import src/Disable a specific rule:
gocheck --exclude broad-except src/src/app.py:3:5: [trailing-whitespace] trailing whitespace
src/app.py:8:1: [unused-import] imported name 'os' is unused
--- src/app.py
+++ src/app.py (fixed)
@@ -1,5 +1,4 @@
-import os
import sys
-x = 1
+x = 1
y = 2
Fixed src/app.py:
- [trailing-whitespace] line 3: trailing whitespace
- [unused-import] line 1: imported name 'os' is unused
2 fixes applied
Each file goes through two passes:
Source bytes
--> Tokenizer (lexer)
--> Iterative parser --> syntax errors? --> report and stop
--> AST
--> Concurrent lint rules (goroutine per rule)
--> Operational Transform engine (conflict resolution + apply)
--> Fixed source bytes (write or diff)
Concurrency operates at two levels. A worker pool processes multiple files in parallel (bounded by --workers). Within each file, all lint rules run as concurrent goroutines (bounded by GOMAXPROCS).
| Rule | Detects | Fix |
|---|---|---|
trailing-whitespace |
Lines ending with spaces or tabs | Strips trailing whitespace |
blank-line-spacing |
Fewer or more than 2 blank lines between top-level function/class definitions | Normalizes to exactly 2 blank lines |
indentation |
Tab characters used for indentation | Replaces each tab with 4 spaces |
no-trailing-newline |
File missing a final newline | Appends \n |
| Rule | Detects | Fix |
|---|---|---|
unused-import |
Imported name never referenced anywhere in the file | Removes the import statement |
import-sorting |
Import blocks not sorted alphabetically | Re-sorts the entire import block |
duplicate-import |
Same module imported more than once | Removes the duplicate |
| Rule | Detects | Fix |
|---|---|---|
function-naming |
Function names not in snake_case (skips dunder methods) |
Renames to snake_case |
class-naming |
Class names not in PascalCase |
Renames to PascalCase |
constant-naming |
Module-level names with uppercase characters not in UPPER_SNAKE_CASE |
Renames to UPPER_SNAKE_CASE |
variable-naming |
Variables inside functions not in snake_case (skips _-prefixed and dunder names) |
Renames to snake_case |
| Rule | Detects | Fix |
|---|---|---|
none-comparison |
== None or != None |
Rewrites to is None / is not None |
bool-comparison |
== True, == False, != True, != False |
Simplifies to x, not x, etc. |
not-is-not-in |
not x in y or not x is y |
Rewrites to x not in y / x is not y |
unnecessary-parens |
Redundant parentheses after return, if, while, assert |
Removes the outer parentheses |
| Rule | Detects | Fix |
|---|---|---|
mutable-default-arg |
Mutable literals ([], {}) as default argument values |
Replaces default with None and inserts a guard at the top of the function body |
bare-except |
except: with no exception type |
Rewrites to except Exception: |
broad-except |
except Exception: (overly broad) |
Report only, no auto-fix |
| Rule | Detects | Fix |
|---|---|---|
unreachable-code |
Statements after return, break, continue, or raise in the same block |
Removes the unreachable statements |
unused-variable |
Assigned variable never read in the same function (skips _-prefixed names) |
Renames to _ |
The parser avoids recursion entirely. It maintains an explicit stack of frames, where each frame is a state machine for a single grammar production (e.g., if statement, function definition, binary expression). The main loop repeatedly peeks at the top frame, advances its state by consuming tokens or pushing child frames, and pops completed frames back to their parents.
This design eliminates stack overflow on deeply nested Python and makes error recovery straightforward: on a syntax error, the parser skips tokens until a synchronization point (a newline at indent level 0, or a keyword like def, class, if) and continues parsing from there to collect multiple errors in a single pass.
Expression parsing uses the Pratt algorithm (top-down operator precedence). Each token type maps to a binding power that determines how tightly it binds to its operands. The parser compares the current operator's binding power against the minimum binding power of the current frame to decide whether to extend the current expression or return it to the parent:
Binding power (lowest to highest):
lambda < ternary < or < and < not < comparisons
< bitwise or < xor < bitwise and < shifts
< add/sub < mul/div < unary < power < await
< call/subscript/attribute
Chained comparisons (a < b < c) are handled specially: instead of nesting binary operators, the parser collects all operators and comparators into a single Compare node, matching Python's semantics where each operand is evaluated once.
When multiple lint rules run concurrently, they each produce fix operations against the original source bytes. Fixes may overlap (e.g., an unused-import rule deleting a line that an import-sorting rule also wants to reorder). The OT engine resolves this in three steps:
1. Sort. Fixes are sorted by start offset (ascending), with ties broken by priority (descending). This ensures a deterministic left-to-right sweep.
2. Conflict resolution. A single sweep identifies clusters of overlapping fixes. Two fixes conflict when the second starts before the first ends. Within each cluster, the fix with the highest priority wins and the rest are discarded (reported as "skipped due to conflict"). After this pass, all remaining fixes are non-overlapping.
3. Apply. The non-overlapping fixes are applied left-to-right in a single pass over the source bytes. A cursor tracks the current position; for each fix, the engine copies unchanged bytes from the cursor to the fix's start offset, writes the replacement text, and advances the cursor past the fix's end offset. This produces the corrected source in a single allocation.
The entire process is O(n log n) for n fixes (dominated by the sort), plus a single O(m) pass over the source bytes.
The diff generator compares the original and fixed source to produce standard unified diff output. It uses the Longest Common Subsequence (LCS) algorithm to identify which lines are shared between the two versions.
LCS computation uses dynamic programming. A 2D table is built where table[i][j] holds the length of the longest common subsequence of the first i lines of the old file and the first j lines of the new file:
- If
old[i-1] == new[j-1]:table[i][j] = table[i-1][j-1] + 1 - Otherwise:
table[i][j] = max(table[i-1][j], table[i][j-1])
Backtracking from table[m][n] recovers the actual subsequence. Lines present in the old file but not in the LCS are removals; lines present in the new file but not in the LCS are additions; LCS lines are context.
Hunk generation groups changes with 3 lines of surrounding context. Adjacent changes close enough to share context are merged into a single hunk. The output follows the standard unified diff format compatible with patch and git diff.
The lint engine fans out each registered rule as a separate goroutine. A semaphore channel (capacity = GOMAXPROCS) bounds the number of rules executing simultaneously to avoid over-subscribing CPU cores. Each rule receives the AST and the raw source bytes and returns a slice of fix operations. A mutex protects the shared result slice during collection. After all goroutines complete, the combined fixes are passed to the OT engine.
Rules are pure functions with no shared mutable state, making concurrent execution safe without fine-grained locking.
go-check-python/
cmd/gocheck/ CLI entry point — flag parsing, file walking, output
token/ Token type definitions, positions, spans
lexer/ Python tokenizer — source bytes to token stream
ast/ AST node types and tree traversal utilities
parser/ Iterative stack-based Python parser
parser.go Infrastructure — stack, frames, main loop
expr.go Pratt expression parser
stmt.go Statement production state machines
linter/
engine.go Concurrent rule execution engine
rules/
rule.go Rule interface, Fix type, global registry
whitespace.go 4 whitespace rules
imports.go 3 import rules
naming.go 4 naming convention rules
style.go 4 style rules
safety.go 3 safety rules
deadcode.go 2 dead code rules
fixer/
ot.go Operational Transform — sort, conflict resolve, apply
diff.go Unified diff generation with LCS
testdata/ Test fixtures for integration tests
go test ./...The test suite covers:
- Lexer (21 tests): token types, positions, indentation, strings, numbers, operators, line joining
- Parser (44 tests): every expression and statement type, precedence, error recovery
- OT engine (8 tests): single/multiple fixes, conflict resolution, insert/delete, offset transform
- Diff (2 tests): unified diff output, no-change case
- Lint rules (57 tests): per-rule detection and fix verification
- Lint engine (4 tests): concurrent execution, rule filtering
- Integration (4 tests): end-to-end syntax check, diff mode, fix mode, clean file
MIT