SBNovaScript/go-check-python
gocheck

A Python syntax checker and linter written entirely in Go. It parses Python source files, reports syntax errors, detects lint issues concurrently, and fixes them in place using an operational transform engine.

Why gocheck exists

Python's ecosystem has excellent linters (pylint, flake8, ruff), but they all require a Python runtime. gocheck is a single static binary with no dependencies. It leverages Go's concurrency model to check files in parallel and run lint rules concurrently within each file. The operational transform engine resolves conflicting fixes by priority so that all non-overlapping fixes are applied in a single pass.

Installation

go install gocheck/cmd/gocheck@latest

Or build from source:

git clone <repo-url>
cd go-check-python
go build -o gocheck ./cmd/gocheck/

Usage

gocheck [flags] <files or directories...>

Directories are walked recursively for .py files.

Flags

| Flag | Default | Description |
|------|---------|-------------|
| `--fix` | `false` | Apply fixes in-place instead of showing a diff |
| `--rules` | all | Comma-separated list of rules to enable |
| `--exclude` | none | Comma-separated list of rules to disable |
| `--workers` | CPU count | Maximum concurrent file workers |
| `--quiet` | `false` | Only output errors and fix summaries |
| `--verbose` | `false` | Show skipped fixes and per-rule details |
| `--no-color` | `false` | Disable colored output |
| `--version` | | Print version and exit |

Exit codes

| Code | Meaning |
|------|---------|
| 0 | No issues, or all issues fixed with `--fix` |
| 1 | Syntax errors found |
| 2 | Lint issues found (diff mode only) |
| 3 | Internal error (bad arguments, I/O failure) |

Examples

Check a single file:

gocheck app.py

Check an entire project:

gocheck src/

Auto-fix all issues:

gocheck --fix src/

Run only whitespace and import rules:

gocheck --rules trailing-whitespace,indentation,unused-import src/

Disable a specific rule:

gocheck --exclude broad-except src/

Output: diff mode (default)

src/app.py:3:5: [trailing-whitespace] trailing whitespace
src/app.py:8:1: [unused-import] imported name 'os' is unused
--- src/app.py
+++ src/app.py (fixed)
@@ -1,5 +1,4 @@
-import os
 import sys
-x = 1
+x = 1
 y = 2

Output: fix mode (--fix)

Fixed src/app.py:
  - [trailing-whitespace] line 3: trailing whitespace
  - [unused-import] line 1: imported name 'os' is unused
  2 fixes applied

Pipeline

Each file goes through two passes:

Source bytes
  --> Tokenizer (lexer)
  --> Iterative parser --> syntax errors? --> report and stop
                       --> AST
  --> Concurrent lint rules (goroutine per rule)
  --> Operational Transform engine (conflict resolution + apply)
  --> Fixed source bytes (write or diff)

Concurrency operates at two levels. A worker pool processes multiple files in parallel (bounded by --workers). Within each file, all lint rules run as concurrent goroutines (bounded by GOMAXPROCS).
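
The two-level fan-out can be sketched as a bounded worker pool. Here `checkFile` is a hypothetical stand-in for the per-file pipeline; the real worker runs the full tokenize/parse/lint sequence:

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
)

// checkFile stands in for gocheck's per-file pipeline (hypothetical).
func checkFile(path string) string {
	return "checked " + path
}

// checkAll fans files out to a bounded pool of workers, mirroring --workers.
func checkAll(files []string, workers int) []string {
	jobs := make(chan string)
	results := make(chan string, len(files))

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for path := range jobs {
				results <- checkFile(path)
			}
		}()
	}
	for _, f := range files {
		jobs <- f
	}
	close(jobs) // workers drain the channel and exit
	wg.Wait()
	close(results)

	out := make([]string, 0, len(files))
	for r := range results {
		out = append(out, r)
	}
	sort.Strings(out) // deterministic output despite concurrent workers
	return out
}

func main() {
	fmt.Println(checkAll([]string{"a.py", "b.py", "c.py"}, runtime.NumCPU()))
}
```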

Lint rules

Whitespace

| Rule | Detects | Fix |
|------|---------|-----|
| `trailing-whitespace` | Lines ending with spaces or tabs | Strips trailing whitespace |
| `blank-line-spacing` | Fewer or more than 2 blank lines between top-level function/class definitions | Normalizes to exactly 2 blank lines |
| `indentation` | Tab characters used for indentation | Replaces each tab with 4 spaces |
| `no-trailing-newline` | File missing a final newline | Appends `\n` |

Imports

| Rule | Detects | Fix |
|------|---------|-----|
| `unused-import` | Imported name never referenced anywhere in the file | Removes the import statement |
| `import-sorting` | Import blocks not sorted alphabetically | Re-sorts the entire import block |
| `duplicate-import` | Same module imported more than once | Removes the duplicate |

Naming

| Rule | Detects | Fix |
|------|---------|-----|
| `function-naming` | Function names not in snake_case (skips dunder methods) | Renames to snake_case |
| `class-naming` | Class names not in PascalCase | Renames to PascalCase |
| `constant-naming` | Module-level names containing uppercase characters that are not in UPPER_SNAKE_CASE | Renames to UPPER_SNAKE_CASE |
| `variable-naming` | Variables inside functions not in snake_case (skips `_`-prefixed and dunder names) | Renames to snake_case |

Style

| Rule | Detects | Fix |
|------|---------|-----|
| `none-comparison` | `== None` or `!= None` | Rewrites to `is None` / `is not None` |
| `bool-comparison` | `== True`, `== False`, `!= True`, `!= False` | Simplifies to `x`, `not x`, etc. |
| `not-is-not-in` | `not x in y` or `not x is y` | Rewrites to `x not in y` / `x is not y` |
| `unnecessary-parens` | Redundant parentheses after `return`, `if`, `while`, `assert` | Removes the outer parentheses |

Safety

| Rule | Detects | Fix |
|------|---------|-----|
| `mutable-default-arg` | Mutable literals (`[]`, `{}`) as default argument values | Replaces default with `None` and inserts a guard at the top of the function body |
| `bare-except` | `except:` with no exception type | Rewrites to `except Exception:` |
| `broad-except` | `except Exception:` (overly broad) | Report only, no auto-fix |

Dead code

| Rule | Detects | Fix |
|------|---------|-----|
| `unreachable-code` | Statements after `return`, `break`, `continue`, or `raise` in the same block | Removes the unreachable statements |
| `unused-variable` | Assigned variable never read in the same function (skips `_`-prefixed names) | Renames to `_` |

Algorithms

Iterative parser with Pratt expression parsing

The parser avoids recursion entirely. It maintains an explicit stack of frames, where each frame is a state machine for a single grammar production (e.g., if statement, function definition, binary expression). The main loop repeatedly peeks at the top frame, advances its state by consuming tokens or pushing child frames, and pops completed frames back to their parents.

This design eliminates stack overflow on deeply nested Python and makes error recovery straightforward: on a syntax error, the parser skips tokens until a synchronization point (a newline at indent level 0, or a keyword like def, class, if) and continues parsing from there to collect multiple errors in a single pass.
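
A toy illustration of the explicit-stack idea, using nested parentheses in place of the Python grammar (the real frames carry full per-production state machines):

```go
package main

import "fmt"

// frame is one in-progress grammar production; state would index its
// state machine in a real parser.
type frame struct {
	kind     string
	state    int
	children []string
}

// parseGroups parses nested "(" ... ")" groups iteratively. Depth is
// bounded only by heap memory, never by the call stack.
func parseGroups(tokens []string) (maxDepth int, err error) {
	stack := []frame{{kind: "root"}}
	for _, tok := range tokens {
		switch tok {
		case "(":
			stack = append(stack, frame{kind: "group"}) // push child frame
			if d := len(stack) - 1; d > maxDepth {
				maxDepth = d
			}
		case ")":
			if len(stack) == 1 {
				return maxDepth, fmt.Errorf("unbalanced )")
			}
			done := stack[len(stack)-1]
			stack = stack[:len(stack)-1] // pop completed frame back to parent
			parent := &stack[len(stack)-1]
			parent.children = append(parent.children, done.kind)
		}
	}
	if len(stack) != 1 {
		return maxDepth, fmt.Errorf("unclosed (")
	}
	return maxDepth, nil
}

func main() {
	d, err := parseGroups([]string{"(", "(", ")", "(", ")", ")"})
	fmt.Println(d, err) // 2 <nil>
}
```

The same input nested 100,000 levels deep would overflow a recursive-descent parser's call stack but runs fine here.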

Expression parsing uses the Pratt algorithm (top-down operator precedence). Each token type maps to a binding power that determines how tightly it binds to its operands. The parser compares the current operator's binding power against the minimum binding power of the current frame to decide whether to extend the current expression or return it to the parent:

Binding power (lowest to highest):
  lambda < ternary < or < and < not < comparisons
  < bitwise or < xor < bitwise and < shifts
  < add/sub < mul/div < unary < power < await
  < call/subscript/attribute

Chained comparisons (a < b < c) are handled specially: instead of nesting binary operators, the parser collects all operators and comparators into a single Compare node, matching Python's semantics where each operand is evaluated once.

Operational Transform engine

When multiple lint rules run concurrently, they each produce fix operations against the original source bytes. Fixes may overlap (e.g., an unused-import rule deleting a line that an import-sorting rule also wants to reorder). The OT engine resolves this in three steps:

1. Sort. Fixes are sorted by start offset (ascending), with ties broken by priority (descending). This ensures a deterministic left-to-right sweep.

2. Conflict resolution. A single sweep identifies clusters of overlapping fixes. Two fixes conflict when the second starts before the first ends. Within each cluster, the fix with the highest priority wins and the rest are discarded (reported as "skipped due to conflict"). After this pass, all remaining fixes are non-overlapping.

3. Apply. The non-overlapping fixes are applied left-to-right in a single pass over the source bytes. A cursor tracks the current position; for each fix, the engine copies unchanged bytes from the cursor to the fix's start offset, writes the replacement text, and advances the cursor past the fix's end offset. This produces the corrected source in a single allocation.

The entire process is O(n log n) for n fixes (dominated by the sort), plus a single O(m) pass over the m source bytes.
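
The three steps can be sketched as below. The `Fix` shape and the simplified conflict sweep (which keeps the earlier fix, breaking same-offset ties by priority) are illustrative, not gocheck's exact types:

```go
package main

import (
	"fmt"
	"sort"
)

// Fix is one replacement against the original source bytes.
type Fix struct {
	Start, End int    // byte offsets in the original source
	Text       string // replacement text
	Priority   int    // higher wins on overlap
	Rule       string
}

// applyFixes performs the three OT steps: sort, drop conflicts, apply.
func applyFixes(src []byte, fixes []Fix) (out []byte, skipped []Fix) {
	// 1. Sort by start offset ascending, ties broken by priority descending.
	sort.Slice(fixes, func(i, j int) bool {
		if fixes[i].Start != fixes[j].Start {
			return fixes[i].Start < fixes[j].Start
		}
		return fixes[i].Priority > fixes[j].Priority
	})
	// 2. Single sweep: a fix conflicts if it starts before the last kept fix ends.
	var kept []Fix
	end := -1
	for _, f := range fixes {
		if f.Start < end {
			skipped = append(skipped, f) // "skipped due to conflict"
			continue
		}
		kept = append(kept, f)
		end = f.End
	}
	// 3. Left-to-right apply with a cursor over the original bytes.
	cur := 0
	for _, f := range kept {
		out = append(out, src[cur:f.Start]...)
		out = append(out, f.Text...)
		cur = f.End
	}
	out = append(out, src[cur:]...)
	return out, skipped
}

func main() {
	src := []byte("import os\nx=1 \n")
	fixes := []Fix{
		{Start: 13, End: 14, Text: "", Rule: "trailing-whitespace"},
		{Start: 0, End: 10, Text: "", Priority: 1, Rule: "unused-import"},
	}
	out, skipped := applyFixes(src, fixes)
	fmt.Printf("%q %d\n", out, len(skipped)) // "x=1\n" 0
}
```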

Unified diff with LCS

The diff generator compares the original and fixed source to produce standard unified diff output. It uses the Longest Common Subsequence (LCS) algorithm to identify which lines are shared between the two versions.

LCS computation uses dynamic programming. A 2D table is built where table[i][j] holds the length of the longest common subsequence of the first i lines of the old file and the first j lines of the new file:

  • If old[i-1] == new[j-1]: table[i][j] = table[i-1][j-1] + 1
  • Otherwise: table[i][j] = max(table[i-1][j], table[i][j-1])

Backtracking from table[m][n] recovers the actual subsequence. Lines present in the old file but not in the LCS are removals; lines present in the new file but not in the LCS are additions; LCS lines are context.
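
A straightforward sketch of the DP table and backtracking, operating on illustrative line slices:

```go
package main

import "fmt"

// lcsLines returns the longest common subsequence of two line slices
// using the classic dynamic-programming table plus backtracking.
func lcsLines(a, b []string) []string {
	m, n := len(a), len(b)
	table := make([][]int, m+1)
	for i := range table {
		table[i] = make([]int, n+1)
	}
	for i := 1; i <= m; i++ {
		for j := 1; j <= n; j++ {
			if a[i-1] == b[j-1] {
				table[i][j] = table[i-1][j-1] + 1
			} else if table[i-1][j] >= table[i][j-1] {
				table[i][j] = table[i-1][j]
			} else {
				table[i][j] = table[i][j-1]
			}
		}
	}
	// Backtrack from table[m][n] to recover the shared lines (in reverse).
	var rev []string
	for i, j := m, n; i > 0 && j > 0; {
		switch {
		case a[i-1] == b[j-1]:
			rev = append(rev, a[i-1])
			i--
			j--
		case table[i-1][j] >= table[i][j-1]:
			i--
		default:
			j--
		}
	}
	lcs := make([]string, len(rev))
	for k, s := range rev {
		lcs[len(rev)-1-k] = s
	}
	return lcs
}

func main() {
	old := []string{"import os", "import sys", "x = 1 ", "y = 2"}
	fixed := []string{"import sys", "x = 1", "y = 2"}
	fmt.Println(lcsLines(old, fixed)) // [import sys y = 2]
}
```

In the example, `x = 1 ` (trailing space) and `x = 1` differ, so that line shows up as a removal plus an addition; the two lines in the LCS become context.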

Hunk generation groups changes with 3 lines of surrounding context. Adjacent changes close enough to share context are merged into a single hunk. The output follows the standard unified diff format compatible with patch and git diff.

Concurrent rule engine

The lint engine fans out each registered rule as a separate goroutine. A semaphore channel (capacity = GOMAXPROCS) bounds the number of rules executing simultaneously to avoid over-subscribing CPU cores. Each rule receives the AST and the raw source bytes and returns a slice of fix operations. A mutex protects the shared result slice during collection. After all goroutines complete, the combined fixes are passed to the OT engine.

Rules are pure functions with no shared mutable state, making concurrent execution safe without fine-grained locking.
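
A minimal sketch of this fan-out pattern, with a hypothetical `rule` type standing in for the real interface:

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
)

// rule is a pure function from source text to findings (hypothetical shape;
// the real interface receives the AST as well).
type rule func(src string) []string

func runRules(src string, rules []rule) []string {
	sem := make(chan struct{}, runtime.GOMAXPROCS(0)) // bounds concurrency
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		findings []string
	)
	for _, r := range rules {
		wg.Add(1)
		go func(r rule) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			out := r(src)
			mu.Lock() // the mutex protects the shared result slice
			findings = append(findings, out...)
			mu.Unlock()
		}(r)
	}
	wg.Wait()
	sort.Strings(findings) // deterministic order despite concurrent rules
	return findings
}

func main() {
	rules := []rule{
		func(src string) []string { return []string{"rule-a: ok"} },
		func(src string) []string { return []string{"rule-b: ok"} },
	}
	fmt.Println(runRules("x = 1\n", rules))
}
```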

Project structure

go-check-python/
  cmd/gocheck/         CLI entry point — flag parsing, file walking, output
  token/               Token type definitions, positions, spans
  lexer/               Python tokenizer — source bytes to token stream
  ast/                 AST node types and tree traversal utilities
  parser/              Iterative stack-based Python parser
    parser.go            Infrastructure — stack, frames, main loop
    expr.go              Pratt expression parser
    stmt.go              Statement production state machines
  linter/
    engine.go            Concurrent rule execution engine
    rules/
      rule.go            Rule interface, Fix type, global registry
      whitespace.go      4 whitespace rules
      imports.go         3 import rules
      naming.go          4 naming convention rules
      style.go           4 style rules
      safety.go          3 safety rules
      deadcode.go        2 dead code rules
  fixer/
    ot.go                Operational Transform — sort, conflict resolve, apply
    diff.go              Unified diff generation with LCS
  testdata/              Test fixtures for integration tests

Testing

go test ./...

The test suite covers:

  • Lexer (21 tests): token types, positions, indentation, strings, numbers, operators, line joining
  • Parser (44 tests): every expression and statement type, precedence, error recovery
  • OT engine (8 tests): single/multiple fixes, conflict resolution, insert/delete, offset transform
  • Diff (2 tests): unified diff output, no-change case
  • Lint rules (57 tests): per-rule detection and fix verification
  • Lint engine (4 tests): concurrent execution, rule filtering
  • Integration (4 tests): end-to-end syntax check, diff mode, fix mode, clean file

License

MIT
