diff --git a/plugins/pr-review/agent-canvas-automation/README.md b/plugins/pr-review/agent-canvas-automation/README.md new file mode 100644 index 00000000..11ec8853 --- /dev/null +++ b/plugins/pr-review/agent-canvas-automation/README.md @@ -0,0 +1,65 @@ +# agent-canvas-automation + +This directory is the **source-of-truth copy** of the OpenHands Automations +tarball that powers a `github-pr-review`-style cron automation on a single +agent-canvas deployment. + +## What this is for + +The upstream `OpenHands/extensions` repo has a `plugins/pr-review` plugin +that produces good, inline-review-threads-style PR reviews (see +[`skills/github-pr-review/SKILL.md`](../skills/github-pr-review/SKILL.md)). +**Agent-canvas does not currently use that plugin directly.** Instead, it +runs a custom OpenHands automation (`9581078a-7fce-4157-8fb1-04acfd4e718a`) +that was set up before the upstream plugin reached a usable state, and the +custom automation lives in an uploaded tarball — a single `main.py` plus +the OpenHands Automations framework. + +This directory tracks that tarball's source so that: + +- The runtime is reproducible from a clean checkout. +- The history of the v1 → v2 fix is recorded. +- A future PR that retires this automation in favour of the upstream + plugin has a clear baseline to compare against. + +## Layout + +| Path | Purpose | +|---|---| +| `v2.7/main.py` | The current source-of-truth. The runtime tarball is `tar -czf` of this file. | +| `v2/README.md` | The v1 → v2.7 changelog, with the bug summary and the demonstration table. | + +## Why a fork, not upstream + +`OpenHands/extensions` is the public extensions registry. The +agent-canvas-automation is **specific to one user's agent-canvas +deployment** (a single OpenHands Automations instance on one host, with +one cron job and one automation ID). It is not general-purpose plugin +code — it's the wiring that talks to that specific deployment. + +So: + +- The "what should the agent prompt look like" lives in + `OpenHands/extensions` (skills/github-pr-review/SKILL.md, the + `###REVIEW_JSON###` contract, the priority labels) — that is general + knowledge. +- The "how to post the result, given the agent's output, on this + specific deployment, with these specific secrets" lives here, on the + fork, because it depends on the deployment. +- When the upstream `plugins/pr-review` plugin gets a deployment story + that doesn't need a custom tarball, this whole directory can be + deleted and the agent-canvas automation can be re-pointed at the + upstream plugin's `action.yml`. + +## See also + +- The "v1" tarball, still in agent-canvas but no longer recommended: + `oh-internal://uploads/80f203b3-0678-404a-9b60-401782a0e4b9` — + described in `v2/README.md` so anyone can understand why it was + replaced. +- The Automations API endpoints used to upload and patch tarballs: + - `POST /api/automation/v1/uploads` (returns `id`, `tarball_path`) + - `PATCH /api/automation/v1/{automation_id}` with `{"tarball_path": "..."}` +- The `m0nk111/extensions` `.pth` shim that points the agent-server at + this fork as the public-skills source: see the dotfiles-managed + `openhands_fork_extensions.{py,pth}` in the agent-server's site-packages. diff --git a/plugins/pr-review/agent-canvas-automation/v2.7/main.py b/plugins/pr-review/agent-canvas-automation/v2.7/main.py new file mode 100644 index 00000000..0d0c2c4b --- /dev/null +++ b/plugins/pr-review/agent-canvas-automation/v2.7/main.py @@ -0,0 +1,972 @@ +"""GitHub PR Reviewer — OpenHands Automation (v2.7 — last-marker parser). + +Cron-polls a GitHub repository for open pull requests carrying the configured +trigger label. Each PR gets exactly one OpenHands conversation, and the +resulting review is posted via the **GitHub Pull Request Reviews API** +(`POST /repos/{owner}/{repo}/pulls/{n}/reviews`) with one inline thread per +finding — never as a single issue comment. The format matches +`github-code-quality[bot]`: one review per call, `comments[]` array with +`path` + `line` + `side: "RIGHT"`, each thread body shaped as +`## <🔴 Critical|🟠 Important|🟡 Suggestion> ` and ending with a +```suggestion``` block for one-click apply when the fix is short and +contiguous. + +This is the v2.7 release. The only change vs v2.6 is in the JSON parser: +the parser now scans ALL occurrences of the `###REVIEW_JSON###` marker +(try from the last one backwards) so that descriptive prose mentioning the +marker doesn't shadow the real JSON contract. Earlier versions matched the +FIRST occurrence, which broke on responses that referenced the marker in +their preamble. +""" + +import json +import os +import re +import sys +import time +import urllib.error +import urllib.request +from pathlib import Path +from urllib.parse import urlencode + +# REPO is the deployment-specific target. It must be filled in before this +# script runs; the `/pr-reviewer:setup` skill does this for you in its +# "Apply the deployment constants" step (look for the `REPO = ...` line). +# Until REPO is set, `_verify_token_and_repo` will fail with a 404 against +# an empty path, which is the desired fail-fast behaviour. +REPO = "" +TRIGGER_LABEL = "openhands-review" +REVIEW_TONE = "thorough" +REVIEW_STYLE_INSTRUCTIONS = "" +DEFAULT_OPENHANDS_URL = "http://localhost:8000" + +DONE_DEBOUNCE = 15 +TERMINAL_STATUSES = {"idle", "finished", "error", "stuck"} + + +def _get_env_key() -> str: + return os.environ.get("SESSION_API_KEY") or os.environ.get("OH_SESSION_API_KEYS_0") or "" + + +def get_secret(name: str) -> str: + url = os.environ.get("AGENT_SERVER_URL", "").rstrip("/") + key = _get_env_key() + req = urllib.request.Request( + f"{url}/api/settings/secrets/{name}", + headers={"X-Session-API-Key": key}, + ) + with urllib.request.urlopen(req) as r: + return r.read().decode().strip() + + +def fire_callback( + status: str = "COMPLETED", + error: str | None = None, + conversation_id: str | None = None, +) -> None: + url = os.environ.get("AUTOMATION_CALLBACK_URL", "") + if not url: + return + body: dict = {"status": status, "run_id": os.environ.get("AUTOMATION_RUN_ID", "")} + if error: + body["error"] = error + if conversation_id: + body["conversation_id"] = conversation_id + req = urllib.request.Request( + url, + data=json.dumps(body).encode(), + headers={ + "Content-Type": "application/json", + "Authorization": f"Bearer {os.environ.get('AUTOMATION_CALLBACK_API_KEY', '')}", + }, + ) + try: + urllib.request.urlopen(req) + except Exception as exc: + print(f"Callback error (non-fatal): {exc}") + + +def _state_file_path() -> str: + workspace_base = os.environ.get("WORKSPACE_BASE", "") + event_payload = json.loads(os.environ.get("AUTOMATION_EVENT_PAYLOAD", "{}")) + automation_id = event_payload.get("automation_id", "default") + + if workspace_base: + root = Path(workspace_base).resolve().parent.parent + else: + root = Path.home() / ".openhands" / "workspaces" + + state_dir = root / "automation-state" + state_dir.mkdir(parents=True, exist_ok=True) + return str(state_dir / f"github_pr_reviewer_label_event_{automation_id}.json") + + +def load_state(path: str) -> dict: + if os.path.exists(path): + try: + with open(path) as f: + return json.load(f) + except (json.JSONDecodeError, OSError) as exc: + print(f"Warning: state file {path} unreadable ({exc}); starting fresh") + return { + "version": 2, + "repo": REPO, + "trigger_label": TRIGGER_LABEL, + "reviews": {}, + "prs": {}, + } + + +def _github_request(token: str, method: str, path: str, body: dict | None = None) -> tuple[dict, dict]: + url = f"https://api.github.com{path}" + headers = { + "Authorization": f"Bearer {token}", + "Accept": "application/vnd.github+json", + "X-GitHub-Api-Version": "2022-11-28", + "User-Agent": "openhands-pr-reviewer-automation/2.7", + } + data = json.dumps(body).encode() if body is not None else None + if data is not None: + headers["Content-Type"] = "application/json" + req = urllib.request.Request(url, data=data, headers=headers, method=method) + try: + with urllib.request.urlopen(req) as r: + raw = r.read() + return (json.loads(raw) if raw.strip() else {}), dict(r.headers) + except urllib.error.HTTPError as exc: + body_text = exc.read().decode(errors="replace") + raise RuntimeError(f"GitHub {method} {path} → {exc.code}: {body_text}") from exc + + +def _github_paginate(token: str, path: str, base_params: dict | None = None) -> list[dict]: + base_params = dict(base_params or {}) + base_params.setdefault("per_page", 100) + page = 1 + results: list[dict] = [] + while True: + params = dict(base_params) + params["page"] = page + url = f"{path}?{urlencode(params, doseq=True)}" + data, _ = _github_request(token, "GET", url) + if not isinstance(data, list): + break + results.extend(data) + if len(data) < base_params["per_page"]: + break + page += 1 + return results + + +def _resolve_github_token() -> str: + try: + token = get_secret("GITHUB_PERSONAL_ACCESS_TOKEN") + if token: + return token + except Exception: + pass + raise RuntimeError( + "GITHUB_PERSONAL_ACCESS_TOKEN secret is not set. " + "Go to OpenHands Settings → Secrets and add your GitHub Personal Access Token." + ) + + +def _verify_token_and_repo(token: str, repo: str) -> None: + try: + user_data, _ = _github_request(token, "GET", "/user") + except urllib.error.HTTPError as exc: + if exc.code == 401: + raise RuntimeError("GITHUB_PERSONAL_ACCESS_TOKEN is invalid or expired.") from exc + raise RuntimeError(f"GitHub /user check failed: {exc.code}") from exc + + print(f"Authenticated as GitHub user: {user_data.get('login', '?')}") + + try: + _github_request(token, "GET", f"/repos/{repo}") + except urllib.error.HTTPError as exc: + if exc.code == 404: + raise RuntimeError(f"Repository '{repo}' is not accessible with the current token.") from exc + raise RuntimeError(f"GitHub /repos/{repo} check failed: {exc.code}") from exc + + +def _list_open_prs(token: str, repo: str) -> list[dict]: + return _github_paginate( + token, + f"/repos/{repo}/pulls", + {"state": "open", "sort": "updated", "direction": "desc"}, + ) + + +def _get_pr(token: str, repo: str, pr_number: int) -> dict: + pr, _ = _github_request(token, "GET", f"/repos/{repo}/pulls/{pr_number}") + return pr + + +def _get_issue_events(token: str, repo: str, pr_number: int) -> list[dict]: + return _github_paginate(token, f"/repos/{repo}/issues/{pr_number}/events") + + +def _latest_trigger_label_event(token: str, repo: str, pr_number: int) -> dict | None: + events = _get_issue_events(token, repo, pr_number) + matching = [ + event for event in events + if event.get("event") == "labeled" + and (event.get("label") or {}).get("name", "").lower() == TRIGGER_LABEL.lower() + and event.get("id") is not None + ] + if not matching: + return None + return max(matching, key=lambda event: (event.get("created_at") or "", int(event.get("id") or 0))) + + +# --------------------------------------------------------------------------- +# Reviews API posting +# --------------------------------------------------------------------------- + +def _post_github_review( + token: str, + repo: str, + pr_number: int, + commit_id: str, + body: str, + event: str, + comments: list[dict], +) -> dict: + """Post a single Pull Request Review with inline comments.""" + payload = { + "commit_id": commit_id, + "body": body, + "event": event, + "comments": comments, + } + data, _ = _github_request( + token, + "POST", + f"/repos/{repo}/pulls/{pr_number}/reviews", + body=payload, + ) + return data + + +def _list_pr_files(token: str, repo: str, pr_number: int) -> set[str]: + try: + files, _ = _github_request( + token, "GET", f"/repos/{repo}/pulls/{pr_number}/files?per_page=100" + ) + if isinstance(files, list): + return {f.get("filename", "") for f in files if f.get("filename")} + except Exception as exc: + print(f" Warning: could not list PR files: {exc}") + return set() + + +def _filter_valid_comments( + comments: list[dict], valid_paths: set[str] +) -> tuple[list[dict], list[dict]]: + if not valid_paths: + return comments, [] + valid, invalid = [], [] + for c in comments: + if c["path"] in valid_paths: + valid.append(c) + else: + invalid.append(c) + return valid, invalid + + +def _list_existing_reviews( + token: str, repo: str, pr_number: int +) -> list[dict]: + try: + reviews, _ = _github_request( + token, "GET", f"/repos/{repo}/pulls/{pr_number}/reviews?per_page=100" + ) + if isinstance(reviews, list): + return reviews + except Exception as exc: + print(f" Warning: could not list existing reviews: {exc}") + return [] + + +_BOT_LOGIN_CACHE: str | None = None + + +def _get_bot_login(token: str) -> str: + """Resolve the GitHub login that owns `token` (the bot user). + + Used by the duplicate-review guard to recognise MCP-direct posts: any + review whose `user.login` matches the token owner is "one of ours". This + works for both `GITHUB_TOKEN` and `GITHUB_PERSONAL_ACCESS_TOKEN` because + both authenticate as the bot user that posts the review. + """ + global _BOT_LOGIN_CACHE + if _BOT_LOGIN_CACHE: + return _BOT_LOGIN_CACHE + try: + user, _ = _github_request(token, "GET", "/user") + if isinstance(user, dict): + login = (user.get("login") or "").strip() + if login: + _BOT_LOGIN_CACHE = login + return login + except Exception as exc: + print(f" Warning: could not resolve bot login from token: {exc}") + return "" + + +def _resolve_event(token: str, pr: dict, requested: str) -> str: + """Coalesce REQUEST_CHANGES → COMMENT when the reviewer is the PR author. + + GitHub rejects `REQUEST_CHANGES` on your own PR with HTTP 422. + """ + if requested != "REQUEST_CHANGES": + return requested + pr_author = (pr.get("user") or {}).get("login", "") + try: + me, _ = _github_request(token, "GET", "/user") + me_login = me.get("login", "") + except Exception: + me_login = "" + if me_login and pr_author and me_login.lower() == pr_author.lower(): + print( + f" Note: downgrading REQUEST_CHANGES → COMMENT because " + f"`{me_login}` is the PR author" + ) + return "COMMENT" + return requested + + +def _fallback_post_issue_comment(token: str, repo: str, pr_number: int, body: str) -> None: + _github_request( + token, + "POST", + f"/repos/{repo}/issues/{pr_number}/comments", + body={"body": body}, + ) + + +# --------------------------------------------------------------------------- +# REVIEW_JSON parser (v2.7 — last-marker wins) +# --------------------------------------------------------------------------- + +_REVIEW_JSON = re.compile(r"###REVIEW_JSON###", re.DOTALL) + + +def _extract_from(response: str, start: int) -> str | None: + i = start + while i < len(response) and response[i] in " \t\r\n": + i += 1 + if i < len(response) and response[i] == "`": + while i < len(response) and response[i] == "`": + i += 1 + while i < len(response) and response[i] not in "\n": + i += 1 + if i < len(response): + i += 1 + while i < len(response) and response[i] in " \t\r\n": + i += 1 + if i >= len(response) or response[i] != "{": + return None + depth = 0 + in_string = False + escape = False + j = i + while j < len(response): + c = response[j] + if in_string: + if escape: + escape = False + elif c == "\\": + escape = True + elif c == '"': + in_string = False + else: + if c == '"': + in_string = True + elif c == "{": + depth += 1 + elif c == "}": + depth -= 1 + if depth == 0: + return response[i : j + 1] + j += 1 + return None + + +def _extract_json_after_marker(response: str) -> str | None: + """Find the LAST `###REVIEW_JSON###` marker in the response and extract + the JSON object that follows it. Tries each marker from the end backwards + so that descriptive prose mentioning the marker doesn't shadow the real + JSON contract. + """ + if not response: + return None + matches = list(_REVIEW_JSON.finditer(response)) + for m in reversed(matches): + raw = _extract_from(response, m.end()) + if raw is not None: + try: + json.loads(raw) + return raw + except json.JSONDecodeError: + continue + return None + + +def _parse_review_payload(response: str) -> dict | None: + if not response: + return None + raw = _extract_json_after_marker(response) + if raw is None: + return None + try: + data = json.loads(raw) + except json.JSONDecodeError: + return None + if not isinstance(data, dict): + return None + if "comments" not in data or not isinstance(data["comments"], list): + return None + cleaned: list[dict] = [] + for c in data["comments"]: + if not isinstance(c, dict): + continue + if not c.get("path") or not c.get("line") or not c.get("body"): + continue + try: + line = int(c["line"]) + except (TypeError, ValueError): + continue + if line <= 0: + continue + cleaned.append({ + "path": str(c["path"]), + "line": line, + "side": str(c.get("side") or "RIGHT"), + "body": str(c["body"]), + }) + if not cleaned: + return None + return { + "body": str(data.get("body") or "").strip(), + "event": str(data.get("event") or "COMMENT"), + "comments": cleaned, + } + + +# --------------------------------------------------------------------------- +# Agent-server plumbing +# --------------------------------------------------------------------------- + +def _oh_request(agent_url: str, api_key: str, method: str, path: str, body: dict | None = None) -> dict: + url = f"{agent_url}{path}" + headers = {"X-Session-API-Key": api_key, "Content-Type": "application/json"} + data = json.dumps(body).encode() if body is not None else None + req = urllib.request.Request(url, data=data, headers=headers, method=method) + try: + with urllib.request.urlopen(req) as r: + raw = r.read() + return json.loads(raw) if raw.strip() else {} + except urllib.error.HTTPError as exc: + body_text = exc.read().decode() + raise RuntimeError(f"Agent API {method} {path} → {exc.code}: {body_text}") from exc + + +def _fetch_settings(agent_url: str, api_key: str) -> dict: + req = urllib.request.Request( + f"{agent_url}/api/settings", + headers={"X-Session-API-Key": api_key, "X-Expose-Secrets": "plaintext"}, + ) + with urllib.request.urlopen(req) as r: + return json.loads(r.read()) + + +def _get_agent_dict(agent_url: str, api_key: str) -> dict: + data = _fetch_settings(agent_url, api_key) + llm = data.get("agent_settings", {}).get("llm", {}) + return { + "kind": "Agent", + "llm": llm, + "tools": [{"name": "terminal"}, {"name": "file_editor"}], + } + + +def _get_mcp_config(agent_url: str, api_key: str) -> dict | None: + try: + data = _fetch_settings(agent_url, api_key) + mcp_config = data.get("agent_settings", {}).get("mcp_config") + if isinstance(mcp_config, dict) and mcp_config.get("mcpServers"): + return mcp_config + except Exception as exc: + print(f"Warning: could not fetch MCP config: {exc}") + return None + + +def _list_secret_names(agent_url: str, api_key: str) -> list[dict]: + try: + result = _oh_request(agent_url, api_key, "GET", "/api/settings/secrets") + return result.get("secrets", []) + except Exception as exc: + print(f"Warning: could not list secrets: {exc}") + return [] + + +def _build_secrets_payload(agent_url: str, api_key: str) -> dict: + secrets = {} + for secret in _list_secret_names(agent_url, api_key): + name = secret.get("name", "") + if not name: + continue + lookup: dict = { + "kind": "LookupSecret", + "url": f"/api/settings/secrets/{name}", + } + if api_key: + lookup["headers"] = {"X-Session-API-Key": api_key} + desc = secret.get("description") + if desc: + lookup["description"] = desc + secrets[name] = lookup + return secrets + + +def create_conversation(agent_url: str, api_key: str, initial_message: str) -> str: + workspace_dir = os.environ.get("WORKSPACE_BASE", "/workspace") + payload: dict = { + "workspace": {"working_dir": workspace_dir}, + "agent": _get_agent_dict(agent_url, api_key), + "initial_message": {"content": [{"text": initial_message}]}, + } + secrets = _build_secrets_payload(agent_url, api_key) + if secrets: + payload["secrets"] = secrets + mcp_config = _get_mcp_config(agent_url, api_key) + if mcp_config: + payload["mcp_config"] = mcp_config + result = _oh_request(agent_url, api_key, "POST", "/api/conversations", payload) + return result["id"] + + +def conversation_status(agent_url: str, api_key: str, conv_id: str) -> str: + result = _oh_request(agent_url, api_key, "GET", f"/api/conversations/{conv_id}") + return result.get("execution_status", "unknown") + + +def conversation_final_response(agent_url: str, api_key: str, conv_id: str) -> str: + result = _oh_request(agent_url, api_key, "GET", f"/api/conversations/{conv_id}/agent_final_response") + return result.get("response", "") + + +_TONE_INSTRUCTIONS = { + "thorough": ( + "Provide a comprehensive review. Cover correctness, security vulnerabilities, " + "missing or inadequate tests, code style, maintainability, and potential edge cases. " + "Reference specific files and line numbers where relevant." + ), + "concise": ( + "Provide a brief, high-signal review. Focus only on important bugs, security problems, " + "or significant design flaws. Omit minor style feedback." + ), + "friendly": ( + "Provide a constructive, encouraging review. Acknowledge what is done well before " + "raising concerns while still noting real issues." + ), +} + + +def _labels(pr: dict) -> list[str]: + return [label.get("name", "") for label in pr.get("labels", [])] + + +def _has_trigger_label(pr: dict) -> bool: + return any(label.lower() == TRIGGER_LABEL.lower() for label in _labels(pr)) + + +def _head_sha(pr: dict) -> str: + return ((pr.get("head") or {}).get("sha") or "").strip() + + +def _review_key(pr_number: int, label_event_id: int | str) -> str: + return f"{pr_number}:label:{label_event_id}" + + +def _with_ai_disclosure(body: str) -> str: + disclosure = "_This comment was posted by an AI agent (OpenHands)._" + body = (body or "").strip() + if disclosure.lower() in body.lower(): + return body + return f"{body}\n\n{disclosure}" if body else disclosure + + +# --------------------------------------------------------------------------- +# Prompt template +# --------------------------------------------------------------------------- + +def _build_review_prompt(pr: dict, head_sha: str, label_event: dict) -> str: + number = pr.get("number", "?") + title = pr.get("title", "(no title)") + body = (pr.get("body") or "").strip() or "(no description)" + html_url = pr.get("html_url", "") + author = (pr.get("user") or {}).get("login", "?") + base_branch = (pr.get("base") or {}).get("ref", "?") + head_branch = (pr.get("head") or {}).get("ref", "?") + label_str = ", ".join(_labels(pr)) or "(none)" + label_event_id = label_event.get("id", "?") + label_event_created_at = label_event.get("created_at", "?") + changed_files = pr.get("changed_files", "?") + additions = pr.get("additions", "?") + deletions = pr.get("deletions", "?") + clone_url = f"https://github.com/{REPO}.git" + tone = _TONE_INSTRUCTIONS.get(REVIEW_TONE, _TONE_INSTRUCTIONS["thorough"]) + extra = f"\n\nAdditional style instructions:\n{REVIEW_STYLE_INSTRUCTIONS}" if REVIEW_STYLE_INSTRUCTIONS.strip() else "" + + return ( + "/github-pr-review\n\n" + "## How to Post the Review (CRITICAL — read first)\n\n" + "You MUST post your review via the **GitHub Pull Request Reviews API** " + "(`POST /repos/{owner}/{repo}/pulls/{n}/reviews`) with one inline comment " + "per finding — never as a single issue comment. Do not call `gh pr comment`, " + "do not call `POST /repos/{owner}/{repo}/issues/{n}/comments`, and do not " + "produce a single Markdown blob. The automation script that wraps your " + "conversation will call the Reviews API for you based on the JSON you " + "return in the `###REVIEW_JSON###` block below.\n\n" + "Use exactly one fenced code block named `###REVIEW_JSON###` at the very end " + "of your final response. The block must parse as JSON and have this shape:\n\n" + "```\n" + "###REVIEW_JSON###\n" + "{\n" + " \"event\": \"COMMENT\",\n" + " \"body\": \"Brief 1–3 sentence summary. Inline comments below.\",\n" + " \"comments\": [\n" + " {\n" + " \"path\": \"api/auth.py\",\n" + " \"line\": 87,\n" + " \"side\": \"RIGHT\",\n" + " \"body\": \"## 🟠 Important: JWT signature not verified\\n\\n" + "Forged tokens pass because `jwt.decode()` is called without the secret.\\n\\n" + "---\\n\\n" + "`jwt.decode()` requires both the secret and the allowed `algorithms` list.\\n\\n" + "Best fix in this file (`api/auth.py:87`): pass `SECRET_KEY` and `algorithms=[\\\"HS256\\\"]`.\\n\\n" + "No new methods or dependencies needed.\\n\\n" + "```suggestion\\n" + "token_data = jwt.decode(token, SECRET_KEY, algorithms=[\\\"HS256\\\"])\\n" + "```\"\n" + " }\n" + " ]\n" + "}\n" + "```\n\n" + "Each `comments[i].body` starts with `## <🔴 Critical|🟠 Important|🟡 Suggestion> `, " + "then a one-line statement of the issue, a `---` separator, a prose fix explanation, " + "a \"Best fix in this file (``)\" anchor, a scope confirmation (\"No new " + "methods or dependencies needed.\"), and ends with a `\\`\\`\\`suggestion\\`\\`\\`` block for " + "one-click apply when the fix is ≤ 5 lines and contiguous. Never use `🟢` priority " + "labels — if the code is fine, do not comment on it.\n\n" + "`event` is one of `COMMENT` (default), `APPROVE`, or `REQUEST_CHANGES`.\n\n" + "## Pull Request Information\n\n" + f"- **Title**: {title}\n" + f"- **Description**: {body}\n" + f"- **Repository**: {REPO}\n" + f"- **Base Branch**: {base_branch}\n" + f"- **Head Branch**: {head_branch}\n" + f"- **PR Number**: {number}\n" + f"- **Head SHA**: {head_sha}\n" + f"- **Trigger**: latest `{TRIGGER_LABEL}` labeled event {label_event_id} at {label_event_created_at}\n" + f"- **Labels**: {label_str}\n" + f"- **Changes**: +{additions} -{deletions} across {changed_files} file(s)\n" + f"- **URL**: {html_url}\n\n" + "## Required Workflow\n\n" + "1. Clone the repository into a fresh working directory inside the workspace.\n" + f" Example: `git clone {clone_url} pr-review-{number}`.\n" + "2. Check out the exact pull request branch by PR number, then verify HEAD matches the SHA above.\n" + f" Example: `git fetch origin pull/{number}/head:openhands-pr-{number}` followed by `git checkout openhands-pr-{number}`.\n" + "3. Inspect the existing PR context before reviewing, including PR description, issue comments, " + "review comments, changed files, and the diff.\n" + " Prefer `gh pr view`, `gh pr diff`, `gh pr checkout`, or GitHub REST API calls with " + "`GITHUB_PERSONAL_ACCESS_TOKEN`; do not print secret values.\n" + "4. Use the checked-out repository to inspect relevant files and surrounding code, not just the patch.\n" + "5. Before producing the final response, delete only the cloned repository directory created in step 1.\n" + f" Example: `rm -rf pr-review-{number}`. Do not delete any other files or directories.\n" + "6. Produce the final review in **two parts**: a short Markdown summary, then the " + "`###REVIEW_JSON###` fenced block exactly as specified above. The summary is for " + "humans reading the agent-server log; the JSON is the contract the automation " + "script uses to call the Reviews API. **Important**: only the **last** " + "`###REVIEW_JSON###` marker in your response is used, so you can reference " + "the marker in your summary prose without ambiguity — the parser will pick " + "the last occurrence and parse the JSON that follows it.\n\n" + f"## Review Tone\n\n{tone}{extra}\n\n" + "End the JSON's `body` with a clear verdict on its own line: either " + "`✅ APPROVED` or `🔄 CHANGES REQUESTED`." + ) + + +# --------------------------------------------------------------------------- +# Review processing +# --------------------------------------------------------------------------- + +def _process_review_request( + github_token: str, + agent_url: str, + api_key: str, + openhands_url: str, + pr: dict, + label_event: dict, + reviews: dict, +) -> str | None: + number = pr["number"] + head_sha = _head_sha(pr) + label_event_id = label_event["id"] + key = _review_key(number, label_event_id) + title = pr.get("title", "(no title)") + html_url = pr.get("html_url", "") + + print(f" Queuing review for PR #{number} from `{TRIGGER_LABEL}` event {label_event_id} at {head_sha[:12]}: {title}") + prompt = _build_review_prompt(pr, head_sha, label_event) + + try: + conv_id = create_conversation(agent_url, api_key, prompt) + except Exception as exc: + print(f" Error creating conversation for PR #{number}: {exc}") + return None + + reviews[key] = { + "pr_number": number, + "head_sha": head_sha, + "trigger_label_event_id": label_event_id, + "trigger_label_event_created_at": label_event.get("created_at"), + "html_url": html_url, + "status": "active", + "conversation_id": conv_id, + "last_activity": time.time(), + } + print(f" Created review conversation {conv_id}") + + conv_url = f"{openhands_url}/conversations/{conv_id}" + _fallback_post_issue_comment( + github_token, + REPO, + number, + _with_ai_disclosure( + "🤖 **OpenHands is reviewing this PR.**\n\n" + f"Trigger label: `{TRIGGER_LABEL}`\n" + f"Label event: `{label_event_id}` at `{label_event.get('created_at', '?')}`\n" + f"Head commit: `{head_sha}`\n" + f"View the conversation: {conv_url}" + ), + ) + return conv_id + + +def _check_conversation_completion( + rec: dict, + latest_open_prs: dict[int, dict], + github_token: str, + agent_url: str, + api_key: str, +) -> None: + if (time.time() - rec.get("last_activity", 0.0)) < DONE_DEBOUNCE: + return + + conv_id = rec["conversation_id"] + pr_number = rec["pr_number"] + reviewed_sha = rec.get("head_sha", "") + current_pr = latest_open_prs.get(pr_number) + + if not current_pr: + rec["status"] = "closed" + print(f" PR #{pr_number} closed/merged — skipping result post") + return + + current_sha = _head_sha(current_pr) + if current_sha and reviewed_sha and current_sha != reviewed_sha: + rec["status"] = "stale" + rec["stale_reason"] = f"head changed from {reviewed_sha} to {current_sha}" + print(f" PR #{pr_number} advanced to {current_sha[:12]} — suppressing stale review {conv_id}") + return + + try: + status = conversation_status(agent_url, api_key, conv_id) + except Exception as exc: + print(f" Warning: could not get status for {conv_id}: {exc}") + return + + print(f" PR #{pr_number} conversation {conv_id} → status={status}") + if status not in TERMINAL_STATUSES: + return + + try: + final = conversation_final_response(agent_url, api_key, conv_id) + except Exception: + final = "" + + if status in {"error", "stuck"}: + body = ( + f"⚠️ **OpenHands PR Reviewer encountered a problem** at commit " + f"`{reviewed_sha[:12]}` (status: `{status}`).\n\n" + f"{(final or '').strip()}" + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure(body), + ) + else: + payload = _parse_review_payload(final) + if payload is None: + print(f" PR #{pr_number}: agent did not emit REVIEW_JSON block in final response") + existing = _list_existing_reviews(github_token, REPO, pr_number) + bot_login = _get_bot_login(github_token) + already_posted = any( + (r.get("user") or {}).get("login", "").lower() == bot_login.lower() + for r in existing + ) + + if already_posted: + print( + f" PR #{pr_number}: {bot_login} review(s) already present " + f"(agent used MCP directly) — closing" + ) + else: + msg = ( + "⚠️ **OpenHands completed the review for commit " + f"`{reviewed_sha[:12]}`** but did not produce a parseable " + f"`###REVIEW_JSON###` block and no `{bot_login}` review was " + "found on the PR. Falling back to issue comment.\n\n" + f"```\n{(final or '').strip()[:6000]}\n```" + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure(msg), + ) + print(f" PR #{pr_number}: fell back to issue comment") + else: + valid_paths = _list_pr_files(github_token, REPO, pr_number) + valid_comments, invalid_comments = _filter_valid_comments( + payload["comments"], valid_paths + ) + if invalid_comments: + print( + f" PR #{pr_number}: dropped {len(invalid_comments)} comment(s) " + f"with non-existent paths: " + f"{sorted({c['path'] for c in invalid_comments})}" + ) + + current_pr = _get_pr(github_token, REPO, pr_number) + event = _resolve_event( + github_token, current_pr, payload["event"] + ) + + try: + _post_github_review( + github_token, + REPO, + pr_number, + commit_id=reviewed_sha, + body=payload["body"] or "Automated review by OpenHands.", + event=event, + comments=valid_comments, + ) + print( + f" Posted PR Review on PR #{pr_number} at {reviewed_sha[:12]} " + f"with {len(valid_comments)} inline comment(s) " + f"(event={event})" + ) + except Exception as exc: + print(f" Error posting PR Review on PR #{pr_number}: {exc}") + invalid_note = "" + if invalid_comments: + invalid_note = ( + "\n\nThe following comments were dropped because their " + "file paths are not in the PR diff:\n\n" + + "\n".join( + f"- `{c['path']}:{c['line']}`" for c in invalid_comments + ) + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure( + "⚠️ **OpenHands could not post the inline review via the Reviews API.**\n\n" + f"Error: `{exc}`\n\n" + f"```\n{(final or '').strip()[:6000]}\n```" + f"{invalid_note}" + ), + ) + + rec["status"] = "closed" + rec["completed_at"] = time.time() + + +def main() -> str | None: + state_path = _state_file_path() + state = load_state(state_path) + agent_url = os.environ.get("AGENT_SERVER_URL", "").rstrip("/") + api_key = _get_env_key() + + github_token = _resolve_github_token() + _verify_token_and_repo(github_token, REPO) + + try: + openhands_url = get_secret("OPENHANDS_URL").rstrip("/") or DEFAULT_OPENHANDS_URL + except Exception: + openhands_url = DEFAULT_OPENHANDS_URL + + reviews: dict = state.setdefault("reviews", {}) + prs_state: dict = state.setdefault("prs", {}) + + open_prs = _list_open_prs(github_token, REPO) + latest_open_prs = {pr["number"]: pr for pr in open_prs} + print(f"Found {len(open_prs)} open PR(s) in {REPO}") + + last_conversation_id = None + + for pr in open_prs: + number = pr["number"] + head_sha = _head_sha(pr) + label_present = _has_trigger_label(pr) + prs_state[str(number)] = { + "head_sha": head_sha, + "label_present": label_present, + "labels": _labels(pr), + "last_seen": time.time(), + } + + if not label_present: + continue + if not head_sha: + print(f" PR #{number} missing head SHA, skipping") + continue + label_event = _latest_trigger_label_event(github_token, REPO, number) + if not label_event: + print(f" PR #{number} has label but no matching label event, skipping") + continue + key = _review_key(number, label_event["id"]) + existing = reviews.get(key) + if existing: + if existing.get("status") not in (None, "stale", "closed", "error"): + _check_conversation_completion( + existing, latest_open_prs, github_token, agent_url, api_key + ) + last_conversation_id = existing.get("conversation_id") + else: + last_conversation_id = existing.get("conversation_id") + continue + last_conversation_id = _process_review_request( + github_token, agent_url, api_key, openhands_url, pr, label_event, reviews + ) + + with open(state_path, "w") as f: + json.dump(state, f, indent=2) + + print("State saved to", state_path) + return last_conversation_id + + +if __name__ == "__main__": + try: + main() + except Exception as exc: + print(f"Fatal: {exc}") + sys.exit(1) diff --git a/plugins/pr-review/agent-canvas-automation/v2.8/main.py b/plugins/pr-review/agent-canvas-automation/v2.8/main.py new file mode 100644 index 00000000..6529a77e --- /dev/null +++ b/plugins/pr-review/agent-canvas-automation/v2.8/main.py @@ -0,0 +1,1015 @@ +"""GitHub PR Reviewer — OpenHands Automation (v2.8 — duplicate-review guard). + +Cron-polls a GitHub repository for open pull requests carrying the configured +trigger label. Each PR gets exactly one OpenHands conversation, and the +resulting review is posted via the **GitHub Pull Request Reviews API** +(`POST /repos/{owner}/{repo}/pulls/{n}/reviews`) with one inline thread per +finding — never as a single issue comment. The format matches +`github-code-quality[bot]`: one review per call, `comments[]` array with +`path` + `line` + `side: "RIGHT"`, each thread body shaped as +`## <🔴 Critical|🟠 Important|🟡 Suggestion> ` and ending with a +```suggestion``` block for one-click apply when the fix is short and +contiguous. + +This is the v2.8 release. The only change vs v2.7 is a duplicate-review +guard: the script now also checks for existing reviews posted by the bot +user (the GitHub login that owns `GITHUB_TOKEN` / `GITHUB_PERSONAL_ACCESS_TOKEN`, +resolved at runtime via `/user`) at the same commit_id BEFORE posting a +parsed payload, in addition to the "no JSON" path. v2.7 only did the check +in the no-JSON path, so a run where the agent posted via the GitHub MCP +AND the script also posted from the parsed JSON resulted in two reviews +with identical content. + +The v2.8 fix: if `_list_existing_reviews()` shows a review by the bot +user at the same `commit_id` we are about to post at, skip posting and +close the state. The script logs which path was taken ("MCP-direct" or +"script-posted") so the run log is unambiguous. +""" + +import json +import os +import re +import sys +import time +import urllib.error +import urllib.request +from pathlib import Path +from urllib.parse import urlencode + +# REPO is the deployment-specific target. It must be filled in before this +# script runs; the `/pr-reviewer:setup` skill does this for you in its +# "Apply the deployment constants" step (look for the `REPO = ...` line). +# Until REPO is set, `_verify_token_and_repo` will fail with a 404 against +# an empty path, which is the desired fail-fast behaviour. +REPO = "" +TRIGGER_LABEL = "openhands-review" +REVIEW_TONE = "thorough" +REVIEW_STYLE_INSTRUCTIONS = "" +DEFAULT_OPENHANDS_URL = "http://localhost:8000" + +DONE_DEBOUNCE = 15 +TERMINAL_STATUSES = {"idle", "finished", "error", "stuck"} + + +def _get_env_key() -> str: + return os.environ.get("SESSION_API_KEY") or os.environ.get("OH_SESSION_API_KEYS_0") or "" + + +def get_secret(name: str) -> str: + url = os.environ.get("AGENT_SERVER_URL", "").rstrip("/") + key = _get_env_key() + req = urllib.request.Request( + f"{url}/api/settings/secrets/{name}", + headers={"X-Session-API-Key": key}, + ) + with urllib.request.urlopen(req) as r: + return r.read().decode().strip() + + +def fire_callback( + status: str = "COMPLETED", + error: str | None = None, + conversation_id: str | None = None, +) -> None: + url = os.environ.get("AUTOMATION_CALLBACK_URL", "") + if not url: + return + body: dict = {"status": status, "run_id": os.environ.get("AUTOMATION_RUN_ID", "")} + if error: + body["error"] = error + if conversation_id: + body["conversation_id"] = conversation_id + req = urllib.request.Request( + url, + data=json.dumps(body).encode(), + headers={ + "Content-Type": "application/json", + "Authorization": f"Bearer {os.environ.get('AUTOMATION_CALLBACK_API_KEY', '')}", + }, + ) + try: + urllib.request.urlopen(req) + except Exception as exc: + print(f"Callback error (non-fatal): {exc}") + + +def _state_file_path() -> str: + workspace_base = os.environ.get("WORKSPACE_BASE", "") + event_payload = json.loads(os.environ.get("AUTOMATION_EVENT_PAYLOAD", "{}")) + automation_id = event_payload.get("automation_id", "default") + + if workspace_base: + root = Path(workspace_base).resolve().parent.parent + else: + root = Path.home() / ".openhands" / "workspaces" + + state_dir = root / "automation-state" + state_dir.mkdir(parents=True, exist_ok=True) + return str(state_dir / f"github_pr_reviewer_label_event_{automation_id}.json") + + +def load_state(path: str) -> dict: + if os.path.exists(path): + try: + with open(path) as f: + return json.load(f) + except (json.JSONDecodeError, OSError) as exc: + print(f"Warning: state file {path} unreadable ({exc}); starting fresh") + return { + "version": 2, + "repo": REPO, + "trigger_label": TRIGGER_LABEL, + "reviews": {}, + "prs": {}, + } + + +def _github_request(token: str, method: str, path: str, body: dict | None = None) -> tuple[dict, dict]: + url = f"https://api.github.com{path}" + headers = { + "Authorization": f"Bearer {token}", + "Accept": "application/vnd.github+json", + "X-GitHub-Api-Version": "2022-11-28", + "User-Agent": "openhands-pr-reviewer-automation/2.8", + } + data = json.dumps(body).encode() if body is not None else None + if data is not None: + headers["Content-Type"] = "application/json" + req = urllib.request.Request(url, data=data, headers=headers, method=method) + try: + with urllib.request.urlopen(req) as r: + raw = r.read() + return (json.loads(raw) if raw.strip() else {}), dict(r.headers) + except urllib.error.HTTPError as exc: + body_text = exc.read().decode(errors="replace") + raise RuntimeError(f"GitHub {method} {path} → {exc.code}: {body_text}") from exc + + +def _github_paginate(token: str, path: str, base_params: dict | None = None) -> list[dict]: + base_params = dict(base_params or {}) + base_params.setdefault("per_page", 100) + page = 1 + results: list[dict] = [] + while True: + params = dict(base_params) + params["page"] = page + url = f"{path}?{urlencode(params, doseq=True)}" + data, _ = _github_request(token, "GET", url) + if not isinstance(data, list): + break + results.extend(data) + if len(data) < base_params["per_page"]: + break + page += 1 + return results + + +def _resolve_github_token() -> str: + try: + token = get_secret("GITHUB_PERSONAL_ACCESS_TOKEN") + if token: + return token + except Exception: + pass + raise RuntimeError( + "GITHUB_PERSONAL_ACCESS_TOKEN secret is not set. " + "Go to OpenHands Settings → Secrets and add your GitHub Personal Access Token." + ) + + +def _verify_token_and_repo(token: str, repo: str) -> None: + try: + user_data, _ = _github_request(token, "GET", "/user") + except urllib.error.HTTPError as exc: + if exc.code == 401: + raise RuntimeError("GITHUB_PERSONAL_ACCESS_TOKEN is invalid or expired.") from exc + raise RuntimeError(f"GitHub /user check failed: {exc.code}") from exc + + print(f"Authenticated as GitHub user: {user_data.get('login', '?')}") + + try: + _github_request(token, "GET", f"/repos/{repo}") + except urllib.error.HTTPError as exc: + if exc.code == 404: + raise RuntimeError(f"Repository '{repo}' is not accessible with the current token.") from exc + raise RuntimeError(f"GitHub /repos/{repo} check failed: {exc.code}") from exc + + +def _list_open_prs(token: str, repo: str) -> list[dict]: + return _github_paginate( + token, + f"/repos/{repo}/pulls", + {"state": "open", "sort": "updated", "direction": "desc"}, + ) + + +def _get_pr(token: str, repo: str, pr_number: int) -> dict: + pr, _ = _github_request(token, "GET", f"/repos/{repo}/pulls/{pr_number}") + return pr + + +def _get_issue_events(token: str, repo: str, pr_number: int) -> list[dict]: + return _github_paginate(token, f"/repos/{repo}/issues/{pr_number}/events") + + +def _latest_trigger_label_event(token: str, repo: str, pr_number: int) -> dict | None: + events = _get_issue_events(token, repo, pr_number) + matching = [ + event for event in events + if event.get("event") == "labeled" + and (event.get("label") or {}).get("name", "").lower() == TRIGGER_LABEL.lower() + and event.get("id") is not None + ] + if not matching: + return None + return max(matching, key=lambda event: (event.get("created_at") or "", int(event.get("id") or 0))) + + +# --------------------------------------------------------------------------- +# Reviews API posting +# --------------------------------------------------------------------------- + +def _post_github_review( + token: str, + repo: str, + pr_number: int, + commit_id: str, + body: str, + event: str, + comments: list[dict], +) -> dict: + """Post a single Pull Request Review with inline comments.""" + payload = { + "commit_id": commit_id, + "body": body, + "event": event, + "comments": comments, + } + data, _ = _github_request( + token, + "POST", + f"/repos/{repo}/pulls/{pr_number}/reviews", + body=payload, + ) + return data + + +def _list_pr_files(token: str, repo: str, pr_number: int) -> set[str]: + try: + files, _ = _github_request( + token, "GET", f"/repos/{repo}/pulls/{pr_number}/files?per_page=100" + ) + if isinstance(files, list): + return {f.get("filename", "") for f in files if f.get("filename")} + except Exception as exc: + print(f" Warning: could not list PR files: {exc}") + return set() + + +def _filter_valid_comments( + comments: list[dict], valid_paths: set[str] +) -> tuple[list[dict], list[dict]]: + if not valid_paths: + return comments, [] + valid, invalid = [], [] + for c in comments: + if c["path"] in valid_paths: + valid.append(c) + else: + invalid.append(c) + return valid, invalid + + +def _list_existing_reviews( + token: str, repo: str, pr_number: int +) -> list[dict]: + try: + reviews, _ = _github_request( + token, "GET", f"/repos/{repo}/pulls/{pr_number}/reviews?per_page=100" + ) + if isinstance(reviews, list): + return reviews + except Exception as exc: + print(f" Warning: could not list existing reviews: {exc}") + return [] + + +def _existing_bot_review_for_commit( + reviews: list[dict], bot_login: str, commit_id: str +) -> dict | None: + """Return the bot's review on the given commit, or None. + + Matches by exact `commit_id` when possible (cleanest check), but also + returns the most recent bot review at the PR head if no exact match + exists — that handles the case where the head has advanced since the + agent posted. + """ + bot = bot_login.lower() + by_commit = [ + r for r in reviews + if (r.get("user") or {}).get("login", "").lower() == bot + and r.get("commit_id", "") == commit_id + ] + if by_commit: + return sorted(by_commit, key=lambda r: r.get("submitted_at", ""), reverse=True)[0] + bot_only = [ + r for r in reviews + if (r.get("user") or {}).get("login", "").lower() == bot + ] + if bot_only: + return sorted(bot_only, key=lambda r: r.get("submitted_at", ""), reverse=True)[0] + return None + + +_BOT_LOGIN_CACHE: str | None = None + + +def _get_bot_login(token: str) -> str: + """Resolve the GitHub login that owns `token` (the bot user). + + Used by the duplicate-review guard to recognise MCP-direct posts: any + review whose `user.login` matches the token owner is "one of ours". This + works for both `GITHUB_TOKEN` and `GITHUB_PERSONAL_ACCESS_TOKEN` because + both authenticate as the bot user that posts the review. + """ + global _BOT_LOGIN_CACHE + if _BOT_LOGIN_CACHE: + return _BOT_LOGIN_CACHE + try: + user, _ = _github_request(token, "GET", "/user") + if isinstance(user, dict): + login = (user.get("login") or "").strip() + if login: + _BOT_LOGIN_CACHE = login + return login + except Exception as exc: + print(f" Warning: could not resolve bot login from token: {exc}") + return "" + + +def _resolve_event(token: str, pr: dict, requested: str) -> str: + """Coalesce REQUEST_CHANGES → COMMENT when the reviewer is the PR author. + + GitHub rejects `REQUEST_CHANGES` on your own PR with HTTP 422. + """ + if requested != "REQUEST_CHANGES": + return requested + pr_author = (pr.get("user") or {}).get("login", "") + try: + me, _ = _github_request(token, "GET", "/user") + me_login = me.get("login", "") + except Exception: + me_login = "" + if me_login and pr_author and me_login.lower() == pr_author.lower(): + print( + f" Note: downgrading REQUEST_CHANGES → COMMENT because " + f"`{me_login}` is the PR author" + ) + return "COMMENT" + return requested + + +def _fallback_post_issue_comment(token: str, repo: str, pr_number: int, body: str) -> None: + _github_request( + token, + "POST", + f"/repos/{repo}/issues/{pr_number}/comments", + body={"body": body}, + ) + + +# --------------------------------------------------------------------------- +# REVIEW_JSON parser (v2.7 — last-marker wins) +# --------------------------------------------------------------------------- + +_REVIEW_JSON = re.compile(r"###REVIEW_JSON###", re.DOTALL) + + +def _extract_from(response: str, start: int) -> str | None: + i = start + while i < len(response) and response[i] in " \t\r\n": + i += 1 + if i < len(response) and response[i] == "`": + while i < len(response) and response[i] == "`": + i += 1 + while i < len(response) and response[i] not in "\n": + i += 1 + if i < len(response): + i += 1 + while i < len(response) and response[i] in " \t\r\n": + i += 1 + if i >= len(response) or response[i] != "{": + return None + depth = 0 + in_string = False + escape = False + j = i + while j < len(response): + c = response[j] + if in_string: + if escape: + escape = False + elif c == "\\": + escape = True + elif c == '"': + in_string = False + else: + if c == '"': + in_string = True + elif c == "{": + depth += 1 + elif c == "}": + depth -= 1 + if depth == 0: + return response[i : j + 1] + j += 1 + return None + + +def _extract_json_after_marker(response: str) -> str | None: + if not response: + return None + matches = list(_REVIEW_JSON.finditer(response)) + for m in reversed(matches): + raw = _extract_from(response, m.end()) + if raw is not None: + try: + json.loads(raw) + return raw + except json.JSONDecodeError: + continue + return None + + +def _parse_review_payload(response: str) -> dict | None: + if not response: + return None + raw = _extract_json_after_marker(response) + if raw is None: + return None + try: + data = json.loads(raw) + except json.JSONDecodeError: + return None + if not isinstance(data, dict): + return None + if "comments" not in data or not isinstance(data["comments"], list): + return None + cleaned: list[dict] = [] + for c in data["comments"]: + if not isinstance(c, dict): + continue + if not c.get("path") or not c.get("line") or not c.get("body"): + continue + try: + line = int(c["line"]) + except (TypeError, ValueError): + continue + if line <= 0: + continue + cleaned.append({ + "path": str(c["path"]), + "line": line, + "side": str(c.get("side") or "RIGHT"), + "body": str(c["body"]), + }) + if not cleaned: + return None + return { + "body": str(data.get("body") or "").strip(), + "event": str(data.get("event") or "COMMENT"), + "comments": cleaned, + } + + +# --------------------------------------------------------------------------- +# Agent-server plumbing +# --------------------------------------------------------------------------- + +def _oh_request(agent_url: str, api_key: str, method: str, path: str, body: dict | None = None) -> dict: + url = f"{agent_url}{path}" + headers = {"X-Session-API-Key": api_key, "Content-Type": "application/json"} + data = json.dumps(body).encode() if body is not None else None + req = urllib.request.Request(url, data=data, headers=headers, method=method) + try: + with urllib.request.urlopen(req) as r: + raw = r.read() + return json.loads(raw) if raw.strip() else {} + except urllib.error.HTTPError as exc: + body_text = exc.read().decode() + raise RuntimeError(f"Agent API {method} {path} → {exc.code}: {body_text}") from exc + + +def _fetch_settings(agent_url: str, api_key: str) -> dict: + req = urllib.request.Request( + f"{agent_url}/api/settings", + headers={"X-Session-API-Key": api_key, "X-Expose-Secrets": "plaintext"}, + ) + with urllib.request.urlopen(req) as r: + return json.loads(r.read()) + + +def _get_agent_dict(agent_url: str, api_key: str) -> dict: + data = _fetch_settings(agent_url, api_key) + llm = data.get("agent_settings", {}).get("llm", {}) + return { + "kind": "Agent", + "llm": llm, + "tools": [{"name": "terminal"}, {"name": "file_editor"}], + } + + +def _get_mcp_config(agent_url: str, api_key: str) -> dict | None: + try: + data = _fetch_settings(agent_url, api_key) + mcp_config = data.get("agent_settings", {}).get("mcp_config") + if isinstance(mcp_config, dict) and mcp_config.get("mcpServers"): + return mcp_config + except Exception as exc: + print(f"Warning: could not fetch MCP config: {exc}") + return None + + +def _list_secret_names(agent_url: str, api_key: str) -> list[dict]: + try: + result = _oh_request(agent_url, api_key, "GET", "/api/settings/secrets") + return result.get("secrets", []) + except Exception as exc: + print(f"Warning: could not list secrets: {exc}") + return [] + + +def _build_secrets_payload(agent_url: str, api_key: str) -> dict: + secrets = {} + for secret in _list_secret_names(agent_url, api_key): + name = secret.get("name", "") + if not name: + continue + lookup: dict = { + "kind": "LookupSecret", + "url": f"/api/settings/secrets/{name}", + } + if api_key: + lookup["headers"] = {"X-Session-API-Key": api_key} + desc = secret.get("description") + if desc: + lookup["description"] = desc + secrets[name] = lookup + return secrets + + +def create_conversation(agent_url: str, api_key: str, initial_message: str) -> str: + workspace_dir = os.environ.get("WORKSPACE_BASE", "/workspace") + payload: dict = { + "workspace": {"working_dir": workspace_dir}, + "agent": _get_agent_dict(agent_url, api_key), + "initial_message": {"content": [{"text": initial_message}]}, + } + secrets = _build_secrets_payload(agent_url, api_key) + if secrets: + payload["secrets"] = secrets + mcp_config = _get_mcp_config(agent_url, api_key) + if mcp_config: + payload["mcp_config"] = mcp_config + result = _oh_request(agent_url, api_key, "POST", "/api/conversations", payload) + return result["id"] + + +def conversation_status(agent_url: str, api_key: str, conv_id: str) -> str: + result = _oh_request(agent_url, api_key, "GET", f"/api/conversations/{conv_id}") + return result.get("execution_status", "unknown") + + +def conversation_final_response(agent_url: str, api_key: str, conv_id: str) -> str: + result = _oh_request(agent_url, api_key, "GET", f"/api/conversations/{conv_id}/agent_final_response") + return result.get("response", "") + + +_TONE_INSTRUCTIONS = { + "thorough": ( + "Provide a comprehensive review. Cover correctness, security vulnerabilities, " + "missing or inadequate tests, code style, maintainability, and potential edge cases. " + "Reference specific files and line numbers where relevant." + ), + "concise": ( + "Provide a brief, high-signal review. Focus only on important bugs, security problems, " + "or significant design flaws. Omit minor style feedback." + ), + "friendly": ( + "Provide a constructive, encouraging review. Acknowledge what is done well before " + "raising concerns while still noting real issues." + ), +} + + +def _labels(pr: dict) -> list[str]: + return [label.get("name", "") for label in pr.get("labels", [])] + + +def _has_trigger_label(pr: dict) -> bool: + return any(label.lower() == TRIGGER_LABEL.lower() for label in _labels(pr)) + + +def _head_sha(pr: dict) -> str: + return ((pr.get("head") or {}).get("sha") or "").strip() + + +def _review_key(pr_number: int, label_event_id: int | str) -> str: + return f"{pr_number}:label:{label_event_id}" + + +def _with_ai_disclosure(body: str) -> str: + disclosure = "_This comment was posted by an AI agent (OpenHands)._" + body = (body or "").strip() + if disclosure.lower() in body.lower(): + return body + return f"{body}\n\n{disclosure}" if body else disclosure + + +# --------------------------------------------------------------------------- +# Prompt template +# --------------------------------------------------------------------------- + +def _build_review_prompt(pr: dict, head_sha: str, label_event: dict) -> str: + number = pr.get("number", "?") + title = pr.get("title", "(no title)") + body = (pr.get("body") or "").strip() or "(no description)" + html_url = pr.get("html_url", "") + author = (pr.get("user") or {}).get("login", "?") + base_branch = (pr.get("base") or {}).get("ref", "?") + head_branch = (pr.get("head") or {}).get("ref", "?") + label_str = ", ".join(_labels(pr)) or "(none)" + label_event_id = label_event.get("id", "?") + label_event_created_at = label_event.get("created_at", "?") + changed_files = pr.get("changed_files", "?") + additions = pr.get("additions", "?") + deletions = pr.get("deletions", "?") + clone_url = f"https://github.com/{REPO}.git" + tone = _TONE_INSTRUCTIONS.get(REVIEW_TONE, _TONE_INSTRUCTIONS["thorough"]) + extra = f"\n\nAdditional style instructions:\n{REVIEW_STYLE_INSTRUCTIONS}" if REVIEW_STYLE_INSTRUCTIONS.strip() else "" + + return ( + "/github-pr-review\n\n" + "## How to Post the Review (CRITICAL — read first)\n\n" + "You MUST post your review via the **GitHub Pull Request Reviews API** " + "(`POST /repos/{owner}/{repo}/pulls/{n}/reviews`) with one inline comment " + "per finding — never as a single issue comment. Do not call `gh pr comment`, " + "do not call `POST /repos/{owner}/{repo}/issues/{n}/comments`, and do not " + "produce a single Markdown blob. The automation script that wraps your " + "conversation will call the Reviews API for you based on the JSON you " + "return in the `###REVIEW_JSON###` block below.\n\n" + "Use exactly one fenced code block named `###REVIEW_JSON###` at the very end " + "of your final response. The block must parse as JSON and have this shape:\n\n" + "```\n" + "###REVIEW_JSON###\n" + "{\n" + " \"event\": \"COMMENT\",\n" + " \"body\": \"Brief 1–3 sentence summary. Inline comments below.\",\n" + " \"comments\": [\n" + " {\n" + " \"path\": \"api/auth.py\",\n" + " \"line\": 87,\n" + " \"side\": \"RIGHT\",\n" + " \"body\": \"## 🟠 Important: JWT signature not verified\\n\\n" + "Forged tokens pass because `jwt.decode()` is called without the secret.\\n\\n" + "---\\n\\n" + "`jwt.decode()` requires both the secret and the allowed `algorithms` list.\\n\\n" + "Best fix in this file (`api/auth.py:87`): pass `SECRET_KEY` and `algorithms=[\\\"HS256\\\"]`.\\n\\n" + "No new methods or dependencies needed.\\n\\n" + "```suggestion\\n" + "token_data = jwt.decode(token, SECRET_KEY, algorithms=[\\\"HS256\\\"])\\n" + "```\"\n" + " }\n" + " ]\n" + "}\n" + "```\n\n" + "Each `comments[i].body` starts with `## <🔴 Critical|🟠 Important|🟡 Suggestion> `, " + "then a one-line statement of the issue, a `---` separator, a prose fix explanation, " + "a \"Best fix in this file (``)\" anchor, a scope confirmation (\"No new " + "methods or dependencies needed.\"), and ends with a `\\`\\`\\`suggestion\\`\\`\\`` block for " + "one-click apply when the fix is ≤ 5 lines and contiguous. Never use `🟢` priority " + "labels — if the code is fine, do not comment on it.\n\n" + "`event` is one of `COMMENT` (default), `APPROVE`, or `REQUEST_CHANGES`.\n\n" + "## Pull Request Information\n\n" + f"- **Title**: {title}\n" + f"- **Description**: {body}\n" + f"- **Repository**: {REPO}\n" + f"- **Base Branch**: {base_branch}\n" + f"- **Head Branch**: {head_branch}\n" + f"- **PR Number**: {number}\n" + f"- **Head SHA**: {head_sha}\n" + f"- **Trigger**: latest `{TRIGGER_LABEL}` labeled event {label_event_id} at {label_event_created_at}\n" + f"- **Labels**: {label_str}\n" + f"- **Changes**: +{additions} -{deletions} across {changed_files} file(s)\n" + f"- **URL**: {html_url}\n\n" + "## Required Workflow\n\n" + "1. Clone the repository into a fresh working directory inside the workspace.\n" + f" Example: `git clone {clone_url} pr-review-{number}`.\n" + "2. Check out the exact pull request branch by PR number, then verify HEAD matches the SHA above.\n" + f" Example: `git fetch origin pull/{number}/head:openhands-pr-{number}` followed by `git checkout openhands-pr-{number}`.\n" + "3. Inspect the existing PR context before reviewing, including PR description, issue comments, " + "review comments, changed files, and the diff.\n" + " Prefer `gh pr view`, `gh pr diff`, `gh pr checkout`, or GitHub REST API calls with " + "`GITHUB_PERSONAL_ACCESS_TOKEN`; do not print secret values.\n" + "4. Use the checked-out repository to inspect relevant files and surrounding code, not just the patch.\n" + "5. Before producing the final response, delete only the cloned repository directory created in step 1.\n" + f" Example: `rm -rf pr-review-{number}`. Do not delete any other files or directories.\n" + "6. Produce the final review in **two parts**: a short Markdown summary, then the " + "`###REVIEW_JSON###` fenced block exactly as specified above. The summary is for " + "humans reading the agent-server log; the JSON is the contract the automation " + "script uses to call the Reviews API. **Important**: only the **last** " + "`###REVIEW_JSON###` marker in your response is used, so you can reference " + "the marker in your summary prose without ambiguity — the parser will pick " + "the last occurrence and parse the JSON that follows it.\n\n" + f"## Review Tone\n\n{tone}{extra}\n\n" + "End the JSON's `body` with a clear verdict on its own line: either " + "`✅ APPROVED` or `🔄 CHANGES REQUESTED`." + ) + + +# --------------------------------------------------------------------------- +# Review processing +# --------------------------------------------------------------------------- + +def _process_review_request( + github_token: str, + agent_url: str, + api_key: str, + openhands_url: str, + pr: dict, + label_event: dict, + reviews: dict, +) -> str | None: + number = pr["number"] + head_sha = _head_sha(pr) + label_event_id = label_event["id"] + key = _review_key(number, label_event_id) + title = pr.get("title", "(no title)") + html_url = pr.get("html_url", "") + + print(f" Queuing review for PR #{number} from `{TRIGGER_LABEL}` event {label_event_id} at {head_sha[:12]}: {title}") + prompt = _build_review_prompt(pr, head_sha, label_event) + + try: + conv_id = create_conversation(agent_url, api_key, prompt) + except Exception as exc: + print(f" Error creating conversation for PR #{number}: {exc}") + return None + + reviews[key] = { + "pr_number": number, + "head_sha": head_sha, + "trigger_label_event_id": label_event_id, + "trigger_label_event_created_at": label_event.get("created_at"), + "html_url": html_url, + "status": "active", + "conversation_id": conv_id, + "last_activity": time.time(), + } + print(f" Created review conversation {conv_id}") + + conv_url = f"{openhands_url}/conversations/{conv_id}" + _fallback_post_issue_comment( + github_token, + REPO, + number, + _with_ai_disclosure( + "🤖 **OpenHands is reviewing this PR.**\n\n" + f"Trigger label: `{TRIGGER_LABEL}`\n" + f"Label event: `{label_event_id}` at `{label_event.get('created_at', '?')}`\n" + f"Head commit: `{head_sha}`\n" + f"View the conversation: {conv_url}" + ), + ) + return conv_id + + +def _check_conversation_completion( + rec: dict, + latest_open_prs: dict[int, dict], + github_token: str, + agent_url: str, + api_key: str, +) -> None: + if (time.time() - rec.get("last_activity", 0.0)) < DONE_DEBOUNCE: + return + + conv_id = rec["conversation_id"] + pr_number = rec["pr_number"] + reviewed_sha = rec.get("head_sha", "") + current_pr = latest_open_prs.get(pr_number) + + if not current_pr: + rec["status"] = "closed" + print(f" PR #{pr_number} closed/merged — skipping result post") + return + + current_sha = _head_sha(current_pr) + if current_sha and reviewed_sha and current_sha != reviewed_sha: + rec["status"] = "stale" + rec["stale_reason"] = f"head changed from {reviewed_sha} to {current_sha}" + print(f" PR #{pr_number} advanced to {current_sha[:12]} — suppressing stale review {conv_id}") + return + + try: + status = conversation_status(agent_url, api_key, conv_id) + except Exception as exc: + print(f" Warning: could not get status for {conv_id}: {exc}") + return + + print(f" PR #{pr_number} conversation {conv_id} → status={status}") + if status not in TERMINAL_STATUSES: + return + + try: + final = conversation_final_response(agent_url, api_key, conv_id) + except Exception: + final = "" + + if status in {"error", "stuck"}: + body = ( + f"⚠️ **OpenHands PR Reviewer encountered a problem** at commit " + f"`{reviewed_sha[:12]}` (status: `{status}`).\n\n" + f"{(final or '').strip()}" + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure(body), + ) + else: + # v2.8: Always check for existing bot review at the same commit + # BEFORE deciding what to do — this prevents duplicates whether + # the agent emitted the JSON contract or used the MCP directly. + existing_reviews = _list_existing_reviews(github_token, REPO, pr_number) + bot_login = _get_bot_login(github_token) + existing_bot = _existing_bot_review_for_commit( + existing_reviews, bot_login, reviewed_sha + ) + + payload = _parse_review_payload(final) + if payload is None: + # No JSON contract from the agent. If the bot already posted + # (via MCP), close the state without re-posting. + if existing_bot is not None: + print( + f" PR #{pr_number}: agent did not emit REVIEW_JSON block but " + f"{bot_login} review #{existing_bot['id']} already exists at " + f"commit {reviewed_sha[:12]} (agent used MCP directly) — closing" + ) + else: + msg = ( + "⚠️ **OpenHands completed the review for commit " + f"`{reviewed_sha[:12]}`** but did not produce a parseable " + f"`###REVIEW_JSON###` block and no `{bot_login}` review was " + "found on the PR. Falling back to issue comment.\n\n" + f"```\n{(final or '').strip()[:6000]}\n```" + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure(msg), + ) + print(f" PR #{pr_number}: fell back to issue comment") + else: + # v2.8: even with a parseable payload, skip if the bot already + # posted a review (e.g. via the GitHub MCP). Posting again would + # produce a duplicate Pull Request Review with identical content. + if existing_bot is not None: + print( + f" PR #{pr_number}: {bot_login} review #{existing_bot['id']} " + f"already exists at commit {reviewed_sha[:12]} (likely agent used " + f"the GitHub MCP directly) — skipping script post to avoid duplicate" + ) + else: + valid_paths = _list_pr_files(github_token, REPO, pr_number) + valid_comments, invalid_comments = _filter_valid_comments( + payload["comments"], valid_paths + ) + if invalid_comments: + print( + f" PR #{pr_number}: dropped {len(invalid_comments)} comment(s) " + f"with non-existent paths: " + f"{sorted({c['path'] for c in invalid_comments})}" + ) + + current_pr = _get_pr(github_token, REPO, pr_number) + event = _resolve_event( + github_token, current_pr, payload["event"] + ) + + try: + _post_github_review( + github_token, + REPO, + pr_number, + commit_id=reviewed_sha, + body=payload["body"] or "Automated review by OpenHands.", + event=event, + comments=valid_comments, + ) + print( + f" Posted PR Review on PR #{pr_number} at {reviewed_sha[:12]} " + f"with {len(valid_comments)} inline comment(s) " + f"(event={event})" + ) + except Exception as exc: + print(f" Error posting PR Review on PR #{pr_number}: {exc}") + invalid_note = "" + if invalid_comments: + invalid_note = ( + "\n\nThe following comments were dropped because their " + "file paths are not in the PR diff:\n\n" + + "\n".join( + f"- `{c['path']}:{c['line']}`" for c in invalid_comments + ) + ) + _fallback_post_issue_comment( + github_token, + REPO, + pr_number, + _with_ai_disclosure( + "⚠️ **OpenHands could not post the inline review via the Reviews API.**\n\n" + f"Error: `{exc}`\n\n" + f"```\n{(final or '').strip()[:6000]}\n```" + f"{invalid_note}" + ), + ) + + rec["status"] = "closed" + rec["completed_at"] = time.time() + + +def main() -> str | None: + state_path = _state_file_path() + state = load_state(state_path) + agent_url = os.environ.get("AGENT_SERVER_URL", "").rstrip("/") + api_key = _get_env_key() + + github_token = _resolve_github_token() + _verify_token_and_repo(github_token, REPO) + + try: + openhands_url = get_secret("OPENHANDS_URL").rstrip("/") or DEFAULT_OPENHANDS_URL + except Exception: + openhands_url = DEFAULT_OPENHANDS_URL + + reviews: dict = state.setdefault("reviews", {}) + prs_state: dict = state.setdefault("prs", {}) + + open_prs = _list_open_prs(github_token, REPO) + latest_open_prs = {pr["number"]: pr for pr in open_prs} + print(f"Found {len(open_prs)} open PR(s) in {REPO}") + + last_conversation_id = None + + for pr in open_prs: + number = pr["number"] + head_sha = _head_sha(pr) + label_present = _has_trigger_label(pr) + prs_state[str(number)] = { + "head_sha": head_sha, + "label_present": label_present, + "labels": _labels(pr), + "last_seen": time.time(), + } + + if not label_present: + continue + if not head_sha: + print(f" PR #{number} missing head SHA, skipping") + continue + label_event = _latest_trigger_label_event(github_token, REPO, number) + if not label_event: + print(f" PR #{number} has label but no matching label event, skipping") + continue + key = _review_key(number, label_event["id"]) + existing = reviews.get(key) + if existing: + if existing.get("status") not in (None, "stale", "closed", "error"): + _check_conversation_completion( + existing, latest_open_prs, github_token, agent_url, api_key + ) + last_conversation_id = existing.get("conversation_id") + else: + last_conversation_id = existing.get("conversation_id") + continue + last_conversation_id = _process_review_request( + github_token, agent_url, api_key, openhands_url, pr, label_event, reviews + ) + + with open(state_path, "w") as f: + json.dump(state, f, indent=2) + + print("State saved to", state_path) + return last_conversation_id + + +if __name__ == "__main__": + try: + main() + except Exception as exc: + print(f"Fatal: {exc}") + sys.exit(1) diff --git a/plugins/pr-review/agent-canvas-automation/v2/README.md b/plugins/pr-review/agent-canvas-automation/v2/README.md new file mode 100644 index 00000000..60336efa --- /dev/null +++ b/plugins/pr-review/agent-canvas-automation/v2/README.md @@ -0,0 +1,147 @@ +# agent-canvas-automation — v2 + +This directory holds the **standalone OpenHands automation** that the +agent-canvas `github-pr-review` automation runs on cron. It is the bit +of glue that: + +1. **Polls** the configured repo for open PRs carrying the + `openhands-review` label (configurable via `TRIGGER_LABEL`). +2. **Forks a fresh OpenHands conversation** per `(PR, label_event_id)` and + feeds it the prompt template (see `_build_review_prompt` in `main.py`). +3. **Waits** for the conversation to reach a terminal state + (`finished` / `error` / `stuck` / `idle`). +4. **Parses the agent's final response** for a `###REVIEW_JSON###` block + (see `v2.7/main.py`). +5. **Posts the result via the GitHub Pull Request Reviews API** — + `POST /repos/{owner}/{repo}/pulls/{n}/reviews` with one inline thread + per finding, in the same shape as `github-code-quality[bot]`. **Never** + `POST /repos/{owner}/{repo}/issues/{n}/comments` (issue-comments API). + +Versions are tracked as tarballs uploaded via +`POST /api/automation/v1/uploads` on the OpenHands Automations backend +(usually `http://localhost:18001/api/automation/v1/uploads` locally) and +attached to the automation by `PATCH +/api/automation/v1/{automation_id}` with +`{"tarball_path": "oh-internal://uploads/"}`. + +This directory is the source-of-truth copy of those tarballs so that the +runtime config is reproducible from a clean checkout. + +## v1 vs v2 — what changed and why + +### The bug + +The v1 tarball had two independent problems: + +1. **The prompt template told the agent not to use the Reviews API.** + From the v1 `_build_review_prompt`: + + > "You are an AI code reviewer. Review the GitHub pull request below and + > write a **single review comment**. Do not modify files, push commits, + > approve via the GitHub API, or request changes via the **review API**; + > only produce the final comment text." + + The agent followed the instruction and produced one Markdown blob. + +2. **The result-poster used the issue-comments API.** From v1 + `_post_github_comment`: + + ```python + def _post_github_comment(token, repo, pr_number, body): + _github_request(token, "POST", + f"/repos/{repo}/issues/{pr_number}/comments", + body={"body": body}) + ``` + + The agent's blob was posted as a single issue comment on the PR — no + inline threads, no `## 🟠 Important: ...` headings, no + ` ```suggestion ` blocks, no per-finding thread anchoring. From the + human's perspective the agent was pretending to review. + +This is exactly the bug the upstream `OpenHands/extensions` PR #339 +("fix(pr-review): post reviews via Pull Request Reviews API, not issue +comments") was supposed to fix, but **the agent-canvas automation does not +use the upstream `plugins/pr-review` plugin at all** — it has its own +tarball that was uploaded long before PR #339 landed, and that tarball +was never updated. + +### The v2 fix + +v2.1 through v2.6, then v2.7 (current), replace the prompt template and +the result-poster. Concretely: + +| Concern | v1 | v2.7 | +|---|---|---| +| Prompt tells the agent to use the Reviews API? | No — explicitly forbids it | Yes — mandates `POST /repos/{owner}/{repo}/pulls/{n}/reviews` with `comments[]` | +| Prompt contract with the wrapper script? | None — the script posts the agent's prose verbatim as an issue comment | `###REVIEW_JSON###` block at the end of the response with `{event, body, comments[]}` | +| Result-poster API | `POST /repos/.../issues/{n}/comments` | `POST /repos/.../pulls/{n}/reviews` with `comments[]` | +| Path validation | n/a | Inline comments with `path` not in the PR diff are dropped with a log line (GitHub returns 422 for unknown paths) | +| Author-equals-reviewer | n/a | `REQUEST_CHANGES` is auto-downgraded to `COMMENT` when the bot user is the PR author (GitHub returns 422 otherwise) | +| Re-review guard | n/a | Already-closed `(PR, label_event_id)` pairs are skipped on the next cron tick instead of starting a new conversation | +| MCP-direct detection | n/a | If the agent posts the review via the GitHub MCP instead of the JSON contract, the script queries GitHub for existing reviews by the bot user (resolved at runtime via `GET /user` with the same token that posts the review) and closes the state without double-posting | +| `###REVIEW_JSON###` parser | n/a | Brace-counting parser that handles both fenced ```json` and raw inline JSON, with a "last-marker wins" rule so descriptive prose mentioning the marker doesn't shadow the real JSON contract | + +### Real-world demonstration + +The version timeline below is the source of truth for what each release +changed. For concrete observed behaviour (PR-by-PR), see the deployment's +own review history — this README deliberately avoids hardcoded links to +specific PRs/threads so it stays portable across deployments. + +| Version stage | Expected behaviour on a fresh e2e test | +|---|---| +| v1 | Agent outputs a single Markdown blob; script posts it as one issue comment (no inline threads, no suggestion blocks). | +| v2.1–v2.6 | Agent emits `###REVIEW_JSON###`; script posts a single Pull Request Review with one inline thread per finding. Path-validation and PR-author coalescing are wired up. | +| v2.7 | Last-marker parser means agent prose that mentions the marker no longer shadows the JSON contract. MCP-direct post is detected in the "no JSON" path. | +| v2.8 | Duplicate-review guard runs in **both** the "no JSON" and "have JSON" paths. A run where the agent posts via MCP and the script also posts from the parsed JSON produces exactly one review, not two. | + +## How to apply this locally + +```bash +# 1. Pack the v2.8 main.py into a gzipped tarball +cd plugins/pr-review/agent-canvas-automation/v2.8 +tar -czf /tmp/pr-reviewer-v2.8.tar.gz main.py + +# 2. Upload via the Automations API (port 18001 locally) +curl -s -X POST \ + -H "X-Session-API-Key: $OPENHANDS_AUTOMATION_API_KEY" \ + -H "Content-Type: application/gzip" \ + --data-binary @/tmp/pr-reviewer-v2.8.tar.gz \ + "http://localhost:18001/api/automation/v1/uploads?name=pr-reviewer-v2.8-reviews-api&description=..." + +# 3. PATCH the automation to point at the new tarball +curl -s -X PATCH \ + -H "X-Session-API-Key: $OPENHANDS_AUTOMATION_API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"tarball_path": "oh-internal://uploads/"}' \ + "http://localhost:18001/api/automation/v1/9581078a7fce41578fb104acfd4e718a" +``` + +## Versions + +| Version | What changed | When | +|---|---|---| +| v2.0 | Initial v2 (prompt + Reviews API) | 2026-06-16 | +| v2.1 | First re-pack (no code change) | 2026-06-16 | +| v2.2 | Re-review guard for closed (PR, label_event_id) pairs | 2026-06-16 | +| v2.3 | Skipped (experimental) | 2026-06-16 | +| v2.4 | Skipped (experimental) | 2026-06-16 | +| v2.5 | Path validation, REQUEST_CHANGES→COMMENT for PR author, MCP-direct detection in "no JSON" path | 2026-06-16 | +| v2.6 | Re-pack (no code change) | 2026-06-16 | +| v2.7 | Last-marker parser fix (handles agent prose that mentions `###REVIEW_JSON###`) | 2026-06-16 | +| **v2.8** | **Duplicate-review guard in BOTH the "no JSON" and "have JSON" paths** — the v2.7 only had it in the "no JSON" path, so a run where the agent posted via MCP AND the script also posted from the parsed JSON would produce two reviews with identical content (as seen on PR 400). | 2026-06-16 | + +The new prompt template also requires the agent to use the +`/github-pr-review` trigger (matched against the `github-pr-review` skill +in the `OpenHands/extensions` registry) so the upstream plugin's +priority-label and ` ```suggestion ` block conventions are followed. + +## Why this isn't upstreamed + +`OpenHands/extensions` has the `plugins/pr-review` plugin and the +`github-pr-review` skill (with the new format). The v1 tarball predates +those and was never updated. The right long-term fix is to retire the +v1 tarball entirely and let agent-canvas run `plugins/pr-review`'s +action directly — see `OpenHands/extensions` issue tracker for the +ongoing effort. Until that lands, this `agent-canvas-automation/v2.7/` +tarball is the working solution. diff --git a/plugins/pr-review/scripts/prompt.py b/plugins/pr-review/scripts/prompt.py index 3cf931af..c71bb946 100644 --- a/plugins/pr-review/scripts/prompt.py +++ b/plugins/pr-review/scripts/prompt.py @@ -74,7 +74,37 @@ PROMPT = """{skill_trigger} /github-pr-review -When posting a review, keep the review body brief unless your active review instructions require a longer structured format. +## How to Post the Review (CRITICAL — read first) + +You MUST post your review via the **GitHub Pull Request Reviews API**, NOT the Issue Comments API. Use exactly one API call whose `comments[]` array contains one entry per finding: + +```bash +gh api -X POST repos/{{owner}}/{{repo}}/pulls/{pr_number}/reviews --input /tmp/review.json +``` + +where `/tmp/review.json` has this shape (one entry per finding, all findings bundled into the same call): + +```json +{{ + "commit_id": "", + "event": "COMMENT", + "body": "Brief 1–3 sentence summary. Inline comments below.", + "comments": [ + {{ + "path": "api/auth.py", + "line": 87, + "side": "RIGHT", + "body": "## 🟠 Important: JWT signature not verified\n\nForged tokens pass because `jwt.decode()` is called without the secret.\n\n---\n\n`jwt.decode()` requires both the secret and the allowed `algorithms` list.\n\nBest fix in this file (`api/auth.py:87`): pass `SECRET_KEY` and `algorithms=[\"HS256\"]`.\n\nNo new methods or dependencies needed.\n\n```suggestion\ntoken_data = jwt.decode(token, SECRET_KEY, algorithms=[\"HS256\"])\n```" + }} + ] +}} +``` + +Each `comments[i].body` starts with `## <🔴 Critical|🟠 Important|🟡 Suggestion> `, then a one-line statement of the issue, a `---` separator, a general fix explanation, a "Best fix in this file (``)" anchor, a scope confirmation ("No logic changes, new methods, or dependency changes are needed."), and ends with a ` ```suggestion ``` ` block for one-click apply. Never use `🟢` priority labels — if the code is fine, do not comment on it. + +**Do NOT** use `gh pr comment`, `POST /issues/{{n}}/comments`, or any path that produces a single issue-level comment. That yields a blob in the PR conversation with no diff anchoring, no suggestion blocks, and no per-finding threading — it is a bug, not a stylistic choice. The Pull Request Reviews API is the only acceptable submission path. Use the `line` (1-based) and `side: "RIGHT"` of the new file as shown in the diff. + +When posting a review, keep the top-level `body` brief unless your active review instructions require a longer structured format. For dependency update PRs, do **NOT** approve a target version that was published less than 7 days ago. diff --git a/skills/github-pr-review/SKILL.md b/skills/github-pr-review/SKILL.md index 706c5716..7f751d0a 100644 --- a/skills/github-pr-review/SKILL.md +++ b/skills/github-pr-review/SKILL.md @@ -7,38 +7,113 @@ triggers: # GitHub PR Review -Post structured code review feedback using the GitHub API with inline comments on specific lines. +Post a single **Pull Request Review** with one **inline comment per finding** — never a single blob comment. The body of each inline comment follows the `github-code-quality[bot]` format so reviewers can scan, discuss, and one-click-apply suggestions directly on the diff line. -## Key Rule: One API Call +## Pre-Review Checks (run before drafting the review) -Bundle ALL comments into a **single review API call**. Do not post comments individually. +1. **PR is still open:** + ```bash + gh pr view {pr_number} --json state,mergedAt,merged + ``` + - `state: OPEN` → proceed. + - `state: MERGED` → stop, nothing to review. + - `state: CLOSED` and `merged: false` → stop, do not review. The PR was abandoned; route back to planning. -## Posting a Review +2. **Scope guardrail:** review only within the issue/PR scope and regressions introduced by that scope. Do not flag unrelated cleanup, refactors, or wishlist improvements. If a finding is out of scope, skip it. + +3. **Severity filter:** only report findings that are medium severity or higher, have a concrete failure mechanism, and tie to user-visible impact, correctness risk, or metric distortion. Style nits belong to linters, not reviews. + +4. **Cap the number of findings.** A focused review with 5 strong findings beats a noisy review with 20 weak ones. If you have more than ~10, keep the top-severity ones and silently drop the rest. + +5. **Order by severity:** 🔴 Critical → 🟠 Important → 🟡 Suggestion. Within the same priority, the most user-visible issue goes first. + +## Key Rule: One PR Review, One API Call, Many Inline Comments + +Post exactly **one** Pull Request Review (`POST /repos/{owner}/{repo}/pulls/{pr_number}/reviews`) whose `comments[]` array contains **one entry per finding**. Each entry becomes a separate inline thread anchored to a `path` + `line` + `side`. + +**Do NOT:** +- Post a single big issue/PR comment that contains all findings as numbered sections. That bypasses the inline diff anchoring, breaks the suggestion-block UX, and is not what `github-code-quality[bot]` does. +- Post each finding as a separate API call. That fragments the review into N pending reviews and pollutes the timeline. + +## Inline Comment Body Format + +Each `comments[i].body` string follows this exact template (this is the `github-code-quality[bot]` shape, with the priority label folded into the heading and an optional one-click suggestion block appended): + +```markdown +## + + + +--- + + + +Best fix in this file (` + + + +```suggestion + +``` +``` + +Rules: +- **Heading** is `## `. The priority prefix is one of: + - `🔴 Critical:` — must fix: security, data loss, broken invariants. + - `🟠 Important:` — should fix: logic errors, performance, missing error handling. + - `🟡 Suggestion:` — worth considering: clarity, maintainability. + - **Never** `🟢 Nit` or `🟢 Acceptable`. If the code is fine, do not comment. +- **One-line statement** names the specific defect at that file/line. No "consider", no "could be better" — state what is wrong. +- **`---`** is a literal Markdown horizontal rule. It separates the diagnosis from the fix guidance and is what makes the format scannable in the GitHub UI. +- **General fix** is one or two sentences explaining the approach (e.g., "remove only the unused symbol from the import list while keeping used imports intact"). +- **Best fix here** is anchored to the actual file path and line number in backticks. It is the *one* edit the reviewer should make. +- **Scope confirmation** explicitly says no other code changes / no new imports / no new methods are needed. This is what differentiates a focused review from a sprawl. +- **` ```suggestion ``` ` block** is appended at the end when, and only when, the change can be expressed as a contiguous replacement of ≤ 5 lines on the new (right) side. For multi-region, architectural, or ambiguous changes, omit the block — describe the fix in prose instead. + +### Worked example (one finding, full template) + +``` +🟠 Important: Unused import + +Import of 'generate_macd_signal' is not used. + +--- + +To fix an unused import without changing behavior, remove only the unused symbol from the import statement and keep the rest intact. + +Best fix in this file (`core/analysis/indicator_correlation.py`): update line 23 so it imports only `compute_macd`, and remove `generate_macd_signal` from that line. No other code changes are needed, assuming the symbol is indeed unused throughout the file. + +```suggestion +from core.indicators.macd import compute_macd +``` +``` + +## Posting the Review Use the GitHub CLI (`gh`) with a JSON input file. The `GITHUB_TOKEN` is automatically available. -**Important**: Always use `--input` with a JSON file instead of `-F` flags. This avoids shell quoting issues with special characters in comment bodies (quotes, backticks, newlines, etc.) and eliminates the need for complex heredoc scripts. +**Important**: Always use `--input` with a JSON file instead of `-F` flags. This avoids shell-quoting issues with backticks, quotes, and newlines inside suggestion blocks. -### Step 1: Create a JSON file +### Step 1: Create the JSON file ```bash cat > /tmp/review.json << 'EOF' { "commit_id": "{commit_sha}", "event": "COMMENT", - "body": "Brief 1-3 sentence summary.", + "body": "Review summary: 2 important findings + 1 suggestion. Inline comments below.", "comments": [ { - "path": "path/to/file.py", - "line": 42, + "path": "core/analysis/indicator_correlation.py", + "line": 23, "side": "RIGHT", - "body": "🟠 Important: Your comment here." + "body": "🟠 Important: Unused import\n\nImport of 'generate_macd_signal' is not used.\n\n---\n\nTo fix an unused import without changing behavior, remove only the unused symbol from the import statement and keep the rest intact.\n\nBest fix in this file (`core/analysis/indicator_correlation.py`): update line 23 so it imports only `compute_macd`, and remove `generate_macd_signal` from that line. No other code changes are needed, assuming the symbol is indeed unused throughout the file.\n\n```suggestion\nfrom core.indicators.macd import compute_macd\n```" }, { - "path": "another/file.js", - "line": 15, + "path": "api/routes/health.py", + "line": 42, "side": "RIGHT", - "body": "🟡 Suggestion: Another comment." + "body": "🟠 Important: Division by None\n\n`uptime_seconds` divides by `(now - start_time)` without checking whether `start_time` is set, raising `TypeError` on first startup.\n\n---\n\nGuard the division with a `None` check and return `0` until the service is fully initialized.\n\nBest fix in this file (`api/routes/health.py`): add the guard at line 42 so the function returns `uptime_seconds=0` cleanly when `start_time` is None.\n\nNo new imports, methods, or external dependencies are needed.\n\n```suggestion\nif start_time is None:\n uptime_seconds = 0\nelse:\n uptime_seconds = int((now - start_time).total_seconds())\n```" } ] } @@ -55,75 +130,37 @@ gh api -X POST repos/{owner}/{repo}/pulls/{pr_number}/reviews --input /tmp/revie | Parameter | Description | |-----------|-------------| -| `commit_id` | Commit SHA to comment on (use `git rev-parse HEAD`) | -| `event` | `COMMENT`, `APPROVE`, or `REQUEST_CHANGES` | -| `path` | File path as shown in the diff | -| `line` | Line number in the NEW version (right side of diff) | -| `side` | `RIGHT` for new/added lines, `LEFT` for deleted lines | -| `body` | Comment text with priority label | +| `commit_id` | Commit SHA to comment on (use `git rev-parse HEAD`). | +| `event` | `COMMENT` (leave as comments), `APPROVE`, or `REQUEST_CHANGES`. | +| `body` | Brief 1–3 sentence summary. Keep it short — every detail belongs in an inline comment. | +| `comments[].path` | File path exactly as shown in the diff. | +| `comments[].line` | 1-based line number in the NEW version (right side of diff). | +| `comments[].side` | `RIGHT` for new/added lines, `LEFT` for deleted lines. | +| `comments[].body` | The formatted Markdown described in "Inline Comment Body Format" above. | +| `comments[].start_line` | (Optional) For multi-line suggestion ranges, see below. | -### Multi-Line Comments +## Multi-Line Suggestion Blocks -For comments spanning multiple lines, add `start_line` to specify the range: +For a fix that spans multiple lines, set `start_line` to the first line of the range. **`start_line`/`line` together define the range that will be REPLACED** when the reviewer clicks "Commit suggestion". ```json { - "path": "path/to/file.py", - "start_line": 10, - "line": 12, + "path": "api/routes/health.py", + "start_line": 41, + "line": 43, "side": "RIGHT", - "body": "🟡 Suggestion: Refactor this block:\n\n```suggestion\nline_one = \"new\"\nline_two = \"code\"\nline_three = \"here\"\n```" + "body": "🟠 Important: ...\n\n---\n\n...\n\n```suggestion\nif start_time is None:\n uptime_seconds = 0\nelse:\n uptime_seconds = int((now - start_time).total_seconds())\n```" } ``` -**`start_line`/`line` define the range that will be REPLACED.** The suggestion block may have any number of lines — it does **not** have to match the range size. See the next section for the exact semantics; getting this wrong is how suggestions silently delete or duplicate code. - -## Priority Labels - -Start each comment with a priority label. **Minimize nits** - leave minor style issues to linters. - -| Label | When to Use | -|-------|-------------| -| 🔴 **Critical** | Must fix: security vulnerabilities, bugs, data loss risks | -| 🟠 **Important** | Should fix: logic errors, performance issues, missing error handling | -| 🟡 **Suggestion** | Worth considering: significant improvements to clarity or maintainability | - -**Do NOT post 🟢 Nit or 🟢 Acceptable comments.** If code is fine, simply don't comment on it. Inline comments that say "this looks good" or "acceptable trade-off" are noise — they create review threads that must be resolved without providing actionable value. - -**Example:** -``` -🟠 Important: This function doesn't handle None, which could cause an AttributeError. - -```suggestion -if user is None: - raise ValueError("User cannot be None") -``` -``` - -## GitHub Suggestions - -For small code changes, use the suggestion syntax for one-click apply: - -~~~ -```suggestion -improved_code_here() -``` -~~~ - -Use suggestions for: renaming, typos, small refactors (1-5 lines), type hints, docstrings. - -Avoid for: large refactors, architectural changes, ambiguous improvements. - -### How Suggestions Actually Work (READ THIS BEFORE WRITING ONE) +## How Suggestions Actually Work (Mandatory Verification) -A suggestion block **replaces** the targeted range with its contents. The replaced range is: +A ` ```suggestion ``` ` block **replaces** the targeted range with its contents. The replaced range is: -- `line` only → the single line `line` (replaces 1 line) -- `start_line` + `line` → the inclusive range `start_line..line` (replaces `line - start_line + 1` lines) +- `line` only → the single line `line` (replaces 1 line). +- `start_line` + `line` → the inclusive range `start_line..line` (replaces `line − start_line + 1` lines). -The suggestion content can be **any number of lines** — 0 (deletion), 1, or many. It does not have to match the range size. Whatever is between the ` ```suggestion ` and closing ` ``` ` fences becomes the new content of those lines. - -Writing the wrong combination of `start_line`/`line` and suggestion body is what causes accepted suggestions to **duplicate** or **delete** code. Use the table below as your contract: +The suggestion body can be any number of lines — 0 (deletion), 1, or many. It does **not** have to match the range size. | Intent | `start_line` | `line` | Suggestion body must contain | |--------|--------------|--------|-------------------------------| @@ -135,36 +172,110 @@ Writing the wrong combination of `start_line`/`line` and suggestion body is what | **Delete** line N | omit | N | empty body (just an empty ` ```suggestion ``` ` block) | | **Delete** lines N..M | N | M | empty body | -### Common Mistakes That Break Code +### Common mistakes that silently corrupt code -1. **Duplicated lines.** You copy a neighboring line (N-1 or N+1) into the suggestion body as context — that line is still present in the file outside the replaced range, so accepting the suggestion inserts a second copy of it. Fix: only include lines that fall within the targeted range, plus any genuinely new content. -2. **Disappearing lines.** You target `start_line=10, line=12` to comment on a 3-line block, but your suggestion body only contains 1 line because you "only want to change line 11". Accepting that suggestion deletes lines 10 and 12. Fix: either narrow the range to just line 11, or include lines 10 and 12 verbatim in the body. -3. **Description does not match the suggestion.** The prose says "rename this variable" but the suggestion replaces an entire function. Or the prose says "add a None check" but the suggestion only contains the check (deleting the original code). Fix: after writing the suggestion, re-read the prose and confirm the resulting file would match it line-for-line. +1. **Duplicated lines.** You copy a neighboring line into the suggestion body as "context" — but that line is still present in the file outside the replaced range, so accepting the suggestion inserts a second copy. Fix: only include lines that fall within the targeted range, plus any genuinely new content. +2. **Disappearing lines.** You target `start_line=10, line=12` to comment on a 3-line block, but your suggestion body only contains 1 line because you "only want to change line 11". Accepting that suggestion deletes lines 10 and 12. Fix: narrow the range to just line 11, or include lines 10 and 12 verbatim in the body. +3. **Description does not match the suggestion.** The prose says "rename this variable" but the suggestion replaces an entire function. Fix: re-read the prose after writing the suggestion and confirm the resulting file matches it line-for-line. -### Mandatory Verification Before Posting +### Mandatory verification before posting -For every comment that contains a ` ```suggestion ``` ` block, do this check before adding it to the review JSON: +For every inline comment that contains a ` ```suggestion ``` ` block, do this check before adding it to the review JSON: 1. Read the actual file lines that will be replaced: `sed -n ',p' ` (or `sed -n 'p' ` for a single-line target). 2. Mentally apply the suggestion: drop those lines, splice in the suggestion body, and look at the result in context. 3. Confirm the resulting code matches **exactly** what your prose description promises — no extra duplicated line above/below, no original line accidentally dropped, no off-by-one. -4. If the change cannot be expressed cleanly as a contiguous replacement (e.g., it touches non-adjacent lines, or it depends on edits elsewhere in the file), do **not** use a suggestion block — describe the change in prose instead. +4. If the change cannot be expressed cleanly as a contiguous replacement (non-adjacent lines, depends on edits elsewhere), do **not** use a suggestion block — describe the change in prose instead. If you are not 100% sure the suggestion will produce the exact code you described, drop the ` ```suggestion ``` ` block and leave a regular inline comment. A correct prose comment is always better than a one-click suggestion that silently corrupts the file. +## When to Use a Suggestion Block (and When Not To) + +Use ` ```suggestion ``` ` for: small concrete changes — renames, typos, type hints, docstrings, 1–5 line refactors, removing a single unused import, adding one guard clause. + +Skip ` ```suggestion ``` ` for: large refactors, architectural changes, multi-region fixes, ambiguous improvements, anything that requires context from other parts of the file. Describe the fix in prose and let the human implement it. + ## Finding Line Numbers ```bash -# From diff header: @@ -old_start,old_count +new_start,new_count @@ -# Count from new_start for added/modified lines +# From a diff header: @@ -old_start,old_count +new_start,new_count @@ +# Count from new_start for added/modified lines. + +grep -n "pattern" filename # Find the line number +sed -n '40,50p' filename # Verify the surrounding context +``` + +## Good vs Bad Inline Comments + +### ✅ Good (specific failure mode, file/line anchored, actionable) + +``` +🟠 Important: JWT signature not verified + +The token is decoded without verifying the signature, allowing any forged token to pass authentication. + +--- + +`jwt.decode()` must be called with both the secret key and the allowed `algorithms` list, otherwise signature verification is skipped entirely. + +Best fix in this file (`api/auth.py`): pass the secret and the algorithm whitelist to `jwt.decode()` at line 87. + +No additional methods, definitions, or external imports are required. + +```suggestion +token_data = jwt.decode(token, SECRET_KEY, algorithms=["HS256"]) +``` +``` + +### ❌ Bad (vague, no failure mode, no anchor) + +``` +This could be improved. Consider refactoring. +``` + +### ❌ Bad (style nit — belongs to a linter, not a review) + +``` +Variable should be named `user_id` instead of `uid`. +``` + +### ❌ Bad (out of scope — skip it) -grep -n "pattern" filename # Find line number -head -n 42 filename | tail -1 # Verify line content ``` +Unrelated to this PR, but the README has a typo on line 12. +``` + +### ❌ Bad (no suggestion block where one is needed, or one where it isn't) + +For a 30-line architectural refactor: do not paste a 30-line ` ```suggestion ``` ` block. Describe the change in prose and let the human do the refactor in a follow-up. + +For a one-line unused-import removal: do not skip the suggestion block. The reviewer should be able to apply the fix with one click. + +## Edge Cases + +### Dependency-only PRs + +For PRs that only update `package-lock.json`, `requirements.txt`, or similar: +- Do not flag normal metadata churn (dev/devOptional flags, version bumps). +- Only flag issues that break install, build, or runtime loading. +- Only flag known security vulnerabilities in dependencies. + +### Documentation-only PRs + +For PRs that only modify `.md` files: +- Check for broken links and incorrect code examples. +- Do not flag typos or grammar unless they change meaning. + +### Test-only PRs + +For PRs that only add/modify tests: +- Verify the tests actually test what they claim. +- Check for test interdependencies. +- Do not flag test refactoring without functional impact. ## Fallback: curl -If `gh` is unavailable, use curl with the JSON file: +If `gh` is unavailable, use curl with the same JSON file: ```bash curl -X POST \ @@ -174,13 +285,11 @@ curl -X POST \ -d @/tmp/review.json ``` -## Summary +## Summary Checklist -1. Analyze the code and identify important issues (minimize nits) -2. Write review data to a JSON file (e.g., `/tmp/review.json`) -3. Post **ONE** review using `gh api --input /tmp/review.json` -4. Use priority labels (🔴🟠🟡) on every comment -5. Do NOT post comments for code that is acceptable — only comment when action is needed -6. Use suggestion syntax for concrete code changes, but only after verifying the resulting code matches your description (see "How Suggestions Actually Work") -7. Keep the review body brief (details go in inline comments) -8. If no issues: post a short approval message with no inline comments +1. Pre-review checks passed (PR open, scope ok, severity filtered, ≤ 10 findings, severity-ordered). +2. Review data written to `/tmp/review.json` with one entry per finding in `comments[]`. +3. Each `comments[i].body` follows the `## ` template. +4. Each suggestion block has been verified by mentally applying it to the file. +5. Post **ONE** review with `gh api --input /tmp/review.json`. +6. If no actionable findings: post a short approval with `"event": "APPROVE"` and **no** `comments[]` entries. diff --git a/skills/index.js b/skills/index.js index 8139845d..67dd55a2 100644 --- a/skills/index.js +++ b/skills/index.js @@ -177,7 +177,7 @@ export const SKILLS_CATALOG = [ "triggers": [ "/github-pr-review" ], - "content": "# GitHub PR Review\n\nPost structured code review feedback using the GitHub API with inline comments on specific lines.\n\n## Key Rule: One API Call\n\nBundle ALL comments into a **single review API call**. Do not post comments individually.\n\n## Posting a Review\n\nUse the GitHub CLI (`gh`) with a JSON input file. The `GITHUB_TOKEN` is automatically available.\n\n**Important**: Always use `--input` with a JSON file instead of `-F` flags. This avoids shell quoting issues with special characters in comment bodies (quotes, backticks, newlines, etc.) and eliminates the need for complex heredoc scripts.\n\n### Step 1: Create a JSON file\n\n```bash\ncat > /tmp/review.json << 'EOF'\n{\n \"commit_id\": \"{commit_sha}\",\n \"event\": \"COMMENT\",\n \"body\": \"Brief 1-3 sentence summary.\",\n \"comments\": [\n {\n \"path\": \"path/to/file.py\",\n \"line\": 42,\n \"side\": \"RIGHT\",\n \"body\": \"🟠 Important: Your comment here.\"\n },\n {\n \"path\": \"another/file.js\",\n \"line\": 15,\n \"side\": \"RIGHT\",\n \"body\": \"🟡 Suggestion: Another comment.\"\n }\n ]\n}\nEOF\n```\n\n### Step 2: Post the review\n\n```bash\ngh api -X POST repos/{owner}/{repo}/pulls/{pr_number}/reviews --input /tmp/review.json\n```\n\n### Parameters\n\n| Parameter | Description |\n|-----------|-------------|\n| `commit_id` | Commit SHA to comment on (use `git rev-parse HEAD`) |\n| `event` | `COMMENT`, `APPROVE`, or `REQUEST_CHANGES` |\n| `path` | File path as shown in the diff |\n| `line` | Line number in the NEW version (right side of diff) |\n| `side` | `RIGHT` for new/added lines, `LEFT` for deleted lines |\n| `body` | Comment text with priority label |\n\n### Multi-Line Comments\n\nFor comments spanning multiple lines, add `start_line` to specify the range:\n\n```json\n{\n \"path\": \"path/to/file.py\",\n \"start_line\": 10,\n \"line\": 12,\n \"side\": \"RIGHT\",\n \"body\": \"🟡 Suggestion: Refactor this block:\\n\\n```suggestion\\nline_one = \\\"new\\\"\\nline_two = \\\"code\\\"\\nline_three = \\\"here\\\"\\n```\"\n}\n```\n\n**`start_line`/`line` define the range that will be REPLACED.** The suggestion block may have any number of lines — it does **not** have to match the range size. See the next section for the exact semantics; getting this wrong is how suggestions silently delete or duplicate code.\n\n## Priority Labels\n\nStart each comment with a priority label. **Minimize nits** - leave minor style issues to linters.\n\n| Label | When to Use |\n|-------|-------------|\n| 🔴 **Critical** | Must fix: security vulnerabilities, bugs, data loss risks |\n| 🟠 **Important** | Should fix: logic errors, performance issues, missing error handling |\n| 🟡 **Suggestion** | Worth considering: significant improvements to clarity or maintainability |\n\n**Do NOT post 🟢 Nit or 🟢 Acceptable comments.** If code is fine, simply don't comment on it. Inline comments that say \"this looks good\" or \"acceptable trade-off\" are noise — they create review threads that must be resolved without providing actionable value.\n\n**Example:**\n```\n🟠 Important: This function doesn't handle None, which could cause an AttributeError.\n\n```suggestion\nif user is None:\n raise ValueError(\"User cannot be None\")\n```\n```\n\n## GitHub Suggestions\n\nFor small code changes, use the suggestion syntax for one-click apply:\n\n~~~\n```suggestion\nimproved_code_here()\n```\n~~~\n\nUse suggestions for: renaming, typos, small refactors (1-5 lines), type hints, docstrings.\n\nAvoid for: large refactors, architectural changes, ambiguous improvements.\n\n### How Suggestions Actually Work (READ THIS BEFORE WRITING ONE)\n\nA suggestion block **replaces** the targeted range with its contents. The replaced range is:\n\n- `line` only → the single line `line` (replaces 1 line)\n- `start_line` + `line` → the inclusive range `start_line..line` (replaces `line - start_line + 1` lines)\n\nThe suggestion content can be **any number of lines** — 0 (deletion), 1, or many. It does not have to match the range size. Whatever is between the ` ```suggestion ` and closing ` ``` ` fences becomes the new content of those lines.\n\nWriting the wrong combination of `start_line`/`line` and suggestion body is what causes accepted suggestions to **duplicate** or **delete** code. Use the table below as your contract:\n\n| Intent | `start_line` | `line` | Suggestion body must contain |\n|--------|--------------|--------|-------------------------------|\n| Change line N | omit | N | the new content for line N |\n| Change lines N..M | N | M | the new content for the whole block |\n| **Add** a line **after** line N (keep line N) | omit | N | line N's exact current text, then the new line(s) |\n| **Add** a line **before** line N (keep line N) | omit | N | the new line(s), then line N's exact current text |\n| **Insert** lines inside range N..M (keep N..M) | N | M | every original line in N..M plus the new lines, in the final desired order |\n| **Delete** line N | omit | N | empty body (just an empty ` ```suggestion ``` ` block) |\n| **Delete** lines N..M | N | M | empty body |\n\n### Common Mistakes That Break Code\n\n1. **Duplicated lines.** You copy a neighboring line (N-1 or N+1) into the suggestion body as context — that line is still present in the file outside the replaced range, so accepting the suggestion inserts a second copy of it. Fix: only include lines that fall within the targeted range, plus any genuinely new content.\n2. **Disappearing lines.** You target `start_line=10, line=12` to comment on a 3-line block, but your suggestion body only contains 1 line because you \"only want to change line 11\". Accepting that suggestion deletes lines 10 and 12. Fix: either narrow the range to just line 11, or include lines 10 and 12 verbatim in the body.\n3. **Description does not match the suggestion.** The prose says \"rename this variable\" but the suggestion replaces an entire function. Or the prose says \"add a None check\" but the suggestion only contains the check (deleting the original code). Fix: after writing the suggestion, re-read the prose and confirm the resulting file would match it line-for-line.\n\n### Mandatory Verification Before Posting\n\nFor every comment that contains a ` ```suggestion ``` ` block, do this check before adding it to the review JSON:\n\n1. Read the actual file lines that will be replaced: `sed -n ',p' ` (or `sed -n 'p' ` for a single-line target).\n2. Mentally apply the suggestion: drop those lines, splice in the suggestion body, and look at the result in context.\n3. Confirm the resulting code matches **exactly** what your prose description promises — no extra duplicated line above/below, no original line accidentally dropped, no off-by-one.\n4. If the change cannot be expressed cleanly as a contiguous replacement (e.g., it touches non-adjacent lines, or it depends on edits elsewhere in the file), do **not** use a suggestion block — describe the change in prose instead.\n\nIf you are not 100% sure the suggestion will produce the exact code you described, drop the ` ```suggestion ``` ` block and leave a regular inline comment. A correct prose comment is always better than a one-click suggestion that silently corrupts the file.\n\n## Finding Line Numbers\n\n```bash\n# From diff header: @@ -old_start,old_count +new_start,new_count @@\n# Count from new_start for added/modified lines\n\ngrep -n \"pattern\" filename # Find line number\nhead -n 42 filename | tail -1 # Verify line content\n```\n\n## Fallback: curl\n\nIf `gh` is unavailable, use curl with the JSON file:\n\n```bash\ncurl -X POST \\\n -H \"Authorization: token $GITHUB_TOKEN\" \\\n -H \"Accept: application/vnd.github+json\" \\\n \"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/reviews\" \\\n -d @/tmp/review.json\n```\n\n## Summary\n\n1. Analyze the code and identify important issues (minimize nits)\n2. Write review data to a JSON file (e.g., `/tmp/review.json`)\n3. Post **ONE** review using `gh api --input /tmp/review.json`\n4. Use priority labels (🔴🟠🟡) on every comment\n5. Do NOT post comments for code that is acceptable — only comment when action is needed\n6. Use suggestion syntax for concrete code changes, but only after verifying the resulting code matches your description (see \"How Suggestions Actually Work\")\n7. Keep the review body brief (details go in inline comments)\n8. If no issues: post a short approval message with no inline comments" + "content": "# GitHub PR Review\n\nPost a single **Pull Request Review** with one **inline comment per finding** — never a single blob comment. The body of each inline comment follows the `github-code-quality[bot]` format so reviewers can scan, discuss, and one-click-apply suggestions directly on the diff line.\n\n## Pre-Review Checks (run before drafting the review)\n\n1. **PR is still open:**\n ```bash\n gh pr view {pr_number} --json state,mergedAt,merged\n ```\n - `state: OPEN` → proceed.\n - `state: MERGED` → stop, nothing to review.\n - `state: CLOSED` and `merged: false` → stop, do not review. The PR was abandoned; route back to planning.\n\n2. **Scope guardrail:** review only within the issue/PR scope and regressions introduced by that scope. Do not flag unrelated cleanup, refactors, or wishlist improvements. If a finding is out of scope, skip it.\n\n3. **Severity filter:** only report findings that are medium severity or higher, have a concrete failure mechanism, and tie to user-visible impact, correctness risk, or metric distortion. Style nits belong to linters, not reviews.\n\n4. **Cap the number of findings.** A focused review with 5 strong findings beats a noisy review with 20 weak ones. If you have more than ~10, keep the top-severity ones and silently drop the rest.\n\n5. **Order by severity:** 🔴 Critical → 🟠 Important → 🟡 Suggestion. Within the same priority, the most user-visible issue goes first.\n\n## Key Rule: One PR Review, One API Call, Many Inline Comments\n\nPost exactly **one** Pull Request Review (`POST /repos/{owner}/{repo}/pulls/{pr_number}/reviews`) whose `comments[]` array contains **one entry per finding**. Each entry becomes a separate inline thread anchored to a `path` + `line` + `side`.\n\n**Do NOT:**\n- Post a single big issue/PR comment that contains all findings as numbered sections. That bypasses the inline diff anchoring, breaks the suggestion-block UX, and is not what `github-code-quality[bot]` does.\n- Post each finding as a separate API call. That fragments the review into N pending reviews and pollutes the timeline.\n\n## Inline Comment Body Format\n\nEach `comments[i].body` string follows this exact template (this is the `github-code-quality[bot]` shape, with the priority label folded into the heading and an optional one-click suggestion block appended):\n\n```markdown\n## \n\n\n\n---\n\n\n\nBest fix in this file (`\n\n\n\n```suggestion\n\n```\n```\n\nRules:\n- **Heading** is `## `. The priority prefix is one of:\n - `🔴 Critical:` — must fix: security, data loss, broken invariants.\n - `🟠 Important:` — should fix: logic errors, performance, missing error handling.\n - `🟡 Suggestion:` — worth considering: clarity, maintainability.\n - **Never** `🟢 Nit` or `🟢 Acceptable`. If the code is fine, do not comment.\n- **One-line statement** names the specific defect at that file/line. No \"consider\", no \"could be better\" — state what is wrong.\n- **`---`** is a literal Markdown horizontal rule. It separates the diagnosis from the fix guidance and is what makes the format scannable in the GitHub UI.\n- **General fix** is one or two sentences explaining the approach (e.g., \"remove only the unused symbol from the import list while keeping used imports intact\").\n- **Best fix here** is anchored to the actual file path and line number in backticks. It is the *one* edit the reviewer should make.\n- **Scope confirmation** explicitly says no other code changes / no new imports / no new methods are needed. This is what differentiates a focused review from a sprawl.\n- **` ```suggestion ``` ` block** is appended at the end when, and only when, the change can be expressed as a contiguous replacement of ≤ 5 lines on the new (right) side. For multi-region, architectural, or ambiguous changes, omit the block — describe the fix in prose instead.\n\n### Worked example (one finding, full template)\n\n```\n🟠 Important: Unused import\n\nImport of 'generate_macd_signal' is not used.\n\n---\n\nTo fix an unused import without changing behavior, remove only the unused symbol from the import statement and keep the rest intact.\n\nBest fix in this file (`core/analysis/indicator_correlation.py`): update line 23 so it imports only `compute_macd`, and remove `generate_macd_signal` from that line. No other code changes are needed, assuming the symbol is indeed unused throughout the file.\n\n```suggestion\nfrom core.indicators.macd import compute_macd\n```\n```\n\n## Posting the Review\n\nUse the GitHub CLI (`gh`) with a JSON input file. The `GITHUB_TOKEN` is automatically available.\n\n**Important**: Always use `--input` with a JSON file instead of `-F` flags. This avoids shell-quoting issues with backticks, quotes, and newlines inside suggestion blocks.\n\n### Step 1: Create the JSON file\n\n```bash\ncat > /tmp/review.json << 'EOF'\n{\n \"commit_id\": \"{commit_sha}\",\n \"event\": \"COMMENT\",\n \"body\": \"Review summary: 2 important findings + 1 suggestion. Inline comments below.\",\n \"comments\": [\n {\n \"path\": \"core/analysis/indicator_correlation.py\",\n \"line\": 23,\n \"side\": \"RIGHT\",\n \"body\": \"🟠 Important: Unused import\\n\\nImport of 'generate_macd_signal' is not used.\\n\\n---\\n\\nTo fix an unused import without changing behavior, remove only the unused symbol from the import statement and keep the rest intact.\\n\\nBest fix in this file (`core/analysis/indicator_correlation.py`): update line 23 so it imports only `compute_macd`, and remove `generate_macd_signal` from that line. No other code changes are needed, assuming the symbol is indeed unused throughout the file.\\n\\n```suggestion\\nfrom core.indicators.macd import compute_macd\\n```\"\n },\n {\n \"path\": \"api/routes/health.py\",\n \"line\": 42,\n \"side\": \"RIGHT\",\n \"body\": \"🟠 Important: Division by None\\n\\n`uptime_seconds` divides by `(now - start_time)` without checking whether `start_time` is set, raising `TypeError` on first startup.\\n\\n---\\n\\nGuard the division with a `None` check and return `0` until the service is fully initialized.\\n\\nBest fix in this file (`api/routes/health.py`): add the guard at line 42 so the function returns `uptime_seconds=0` cleanly when `start_time` is None.\\n\\nNo new imports, methods, or external dependencies are needed.\\n\\n```suggestion\\nif start_time is None:\\n uptime_seconds = 0\\nelse:\\n uptime_seconds = int((now - start_time).total_seconds())\\n```\"\n }\n ]\n}\nEOF\n```\n\n### Step 2: Post the review\n\n```bash\ngh api -X POST repos/{owner}/{repo}/pulls/{pr_number}/reviews --input /tmp/review.json\n```\n\n### Parameters\n\n| Parameter | Description |\n|-----------|-------------|\n| `commit_id` | Commit SHA to comment on (use `git rev-parse HEAD`). |\n| `event` | `COMMENT` (leave as comments), `APPROVE`, or `REQUEST_CHANGES`. |\n| `body` | Brief 1–3 sentence summary. Keep it short — every detail belongs in an inline comment. |\n| `comments[].path` | File path exactly as shown in the diff. |\n| `comments[].line` | 1-based line number in the NEW version (right side of diff). |\n| `comments[].side` | `RIGHT` for new/added lines, `LEFT` for deleted lines. |\n| `comments[].body` | The formatted Markdown described in \"Inline Comment Body Format\" above. |\n| `comments[].start_line` | (Optional) For multi-line suggestion ranges, see below. |\n\n## Multi-Line Suggestion Blocks\n\nFor a fix that spans multiple lines, set `start_line` to the first line of the range. **`start_line`/`line` together define the range that will be REPLACED** when the reviewer clicks \"Commit suggestion\".\n\n```json\n{\n \"path\": \"api/routes/health.py\",\n \"start_line\": 41,\n \"line\": 43,\n \"side\": \"RIGHT\",\n \"body\": \"🟠 Important: ...\\n\\n---\\n\\n...\\n\\n```suggestion\\nif start_time is None:\\n uptime_seconds = 0\\nelse:\\n uptime_seconds = int((now - start_time).total_seconds())\\n```\"\n}\n```\n\n## How Suggestions Actually Work (Mandatory Verification)\n\nA ` ```suggestion ``` ` block **replaces** the targeted range with its contents. The replaced range is:\n\n- `line` only → the single line `line` (replaces 1 line).\n- `start_line` + `line` → the inclusive range `start_line..line` (replaces `line − start_line + 1` lines).\n\nThe suggestion body can be any number of lines — 0 (deletion), 1, or many. It does **not** have to match the range size.\n\n| Intent | `start_line` | `line` | Suggestion body must contain |\n|--------|--------------|--------|-------------------------------|\n| Change line N | omit | N | the new content for line N |\n| Change lines N..M | N | M | the new content for the whole block |\n| **Add** a line **after** line N (keep line N) | omit | N | line N's exact current text, then the new line(s) |\n| **Add** a line **before** line N (keep line N) | omit | N | the new line(s), then line N's exact current text |\n| **Insert** lines inside range N..M (keep N..M) | N | M | every original line in N..M plus the new lines, in the final desired order |\n| **Delete** line N | omit | N | empty body (just an empty ` ```suggestion ``` ` block) |\n| **Delete** lines N..M | N | M | empty body |\n\n### Common mistakes that silently corrupt code\n\n1. **Duplicated lines.** You copy a neighboring line into the suggestion body as \"context\" — but that line is still present in the file outside the replaced range, so accepting the suggestion inserts a second copy. Fix: only include lines that fall within the targeted range, plus any genuinely new content.\n2. **Disappearing lines.** You target `start_line=10, line=12` to comment on a 3-line block, but your suggestion body only contains 1 line because you \"only want to change line 11\". Accepting that suggestion deletes lines 10 and 12. Fix: narrow the range to just line 11, or include lines 10 and 12 verbatim in the body.\n3. **Description does not match the suggestion.** The prose says \"rename this variable\" but the suggestion replaces an entire function. Fix: re-read the prose after writing the suggestion and confirm the resulting file matches it line-for-line.\n\n### Mandatory verification before posting\n\nFor every inline comment that contains a ` ```suggestion ``` ` block, do this check before adding it to the review JSON:\n\n1. Read the actual file lines that will be replaced: `sed -n ',p' ` (or `sed -n 'p' ` for a single-line target).\n2. Mentally apply the suggestion: drop those lines, splice in the suggestion body, and look at the result in context.\n3. Confirm the resulting code matches **exactly** what your prose description promises — no extra duplicated line above/below, no original line accidentally dropped, no off-by-one.\n4. If the change cannot be expressed cleanly as a contiguous replacement (non-adjacent lines, depends on edits elsewhere), do **not** use a suggestion block — describe the change in prose instead.\n\nIf you are not 100% sure the suggestion will produce the exact code you described, drop the ` ```suggestion ``` ` block and leave a regular inline comment. A correct prose comment is always better than a one-click suggestion that silently corrupts the file.\n\n## When to Use a Suggestion Block (and When Not To)\n\nUse ` ```suggestion ``` ` for: small concrete changes — renames, typos, type hints, docstrings, 1–5 line refactors, removing a single unused import, adding one guard clause.\n\nSkip ` ```suggestion ``` ` for: large refactors, architectural changes, multi-region fixes, ambiguous improvements, anything that requires context from other parts of the file. Describe the fix in prose and let the human implement it.\n\n## Finding Line Numbers\n\n```bash\n# From a diff header: @@ -old_start,old_count +new_start,new_count @@\n# Count from new_start for added/modified lines.\n\ngrep -n \"pattern\" filename # Find the line number\nsed -n '40,50p' filename # Verify the surrounding context\n```\n\n## Good vs Bad Inline Comments\n\n### ✅ Good (specific failure mode, file/line anchored, actionable)\n\n```\n🟠 Important: JWT signature not verified\n\nThe token is decoded without verifying the signature, allowing any forged token to pass authentication.\n\n---\n\n`jwt.decode()` must be called with both the secret key and the allowed `algorithms` list, otherwise signature verification is skipped entirely.\n\nBest fix in this file (`api/auth.py`): pass the secret and the algorithm whitelist to `jwt.decode()` at line 87.\n\nNo additional methods, definitions, or external imports are required.\n\n```suggestion\ntoken_data = jwt.decode(token, SECRET_KEY, algorithms=[\"HS256\"])\n```\n```\n\n### ❌ Bad (vague, no failure mode, no anchor)\n\n```\nThis could be improved. Consider refactoring.\n```\n\n### ❌ Bad (style nit — belongs to a linter, not a review)\n\n```\nVariable should be named `user_id` instead of `uid`.\n```\n\n### ❌ Bad (out of scope — skip it)\n\n```\nUnrelated to this PR, but the README has a typo on line 12.\n```\n\n### ❌ Bad (no suggestion block where one is needed, or one where it isn't)\n\nFor a 30-line architectural refactor: do not paste a 30-line ` ```suggestion ``` ` block. Describe the change in prose and let the human do the refactor in a follow-up.\n\nFor a one-line unused-import removal: do not skip the suggestion block. The reviewer should be able to apply the fix with one click.\n\n## Edge Cases\n\n### Dependency-only PRs\n\nFor PRs that only update `package-lock.json`, `requirements.txt`, or similar:\n- Do not flag normal metadata churn (dev/devOptional flags, version bumps).\n- Only flag issues that break install, build, or runtime loading.\n- Only flag known security vulnerabilities in dependencies.\n\n### Documentation-only PRs\n\nFor PRs that only modify `.md` files:\n- Check for broken links and incorrect code examples.\n- Do not flag typos or grammar unless they change meaning.\n\n### Test-only PRs\n\nFor PRs that only add/modify tests:\n- Verify the tests actually test what they claim.\n- Check for test interdependencies.\n- Do not flag test refactoring without functional impact.\n\n## Fallback: curl\n\nIf `gh` is unavailable, use curl with the same JSON file:\n\n```bash\ncurl -X POST \\\n -H \"Authorization: token $GITHUB_TOKEN\" \\\n -H \"Accept: application/vnd.github+json\" \\\n \"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/reviews\" \\\n -d @/tmp/review.json\n```\n\n## Summary Checklist\n\n1. Pre-review checks passed (PR open, scope ok, severity filtered, ≤ 10 findings, severity-ordered).\n2. Review data written to `/tmp/review.json` with one entry per finding in `comments[]`.\n3. Each `comments[i].body` follows the `## ` template.\n4. Each suggestion block has been verified by mentally applying it to the file.\n5. Post **ONE** review with `gh api --input /tmp/review.json`.\n6. If no actionable findings: post a short approval with `\"event\": \"APPROVE\"` and **no** `comments[]` entries." }, { "name": "github-pr-reviewer",