Skip to content

feat(backend): expand Omi MCP data surface (action items, goals, chat, people, screen activity, daily summaries)#7817

Merged
kodjima33 merged 5 commits into
mainfrom
kodjima33/omi-mcp-data-audit
Jun 11, 2026
Merged

feat(backend): expand Omi MCP data surface (action items, goals, chat, people, screen activity, daily summaries)#7817
kodjima33 merged 5 commits into
mainfrom
kodjima33/omi-mcp-data-audit

Conversation

@kodjima33

Copy link
Copy Markdown
Collaborator

What

Expands the Omi MCP server to expose six data domains it already stores but never surfaced, wired in both the REST API (`routers/mcp.py`) and the MCP tool surface AI clients see (`routers/mcp_sse.py`):

Tool / endpoint Data
`get_action_items` · `GET /v1/mcp/action-items` To-dos w/ due dates + completion (filter by status/due range); drops soft-deleted, truncates locked
`get_goals` · `GET /v1/mcp/goals` Active (or all) stated goals
`get_chat_messages` · `GET /v1/mcp/chat` Recent Omi chat history (decrypted via existing read decorator)
`get_people` · `GET /v1/mcp/people` Contacts/speakers — name + transcript samples; raw audio URLs & speaker embeddings stripped
`get_screen_activity` · `GET /v1/mcp/screen-activity` Desktop Rewind (app, window, OCR); `summary=true` for per-app aggregate
`get_daily_summaries` · `GET /v1/mcp/daily-summaries` Per-day life digests

Shared response shaping lives in `utils/mcp_data.py` so both routers reuse identical shapes without cross-importing (routers must not import each other). Each domain gets its own OAuth scope (`action_items.read`, `goals.read`, `chat.read`, `screen_activity.read`, `people.read`) so a future scoped-key model can gate them — today's `omi_mcp_` key still grants all.

Testing

  • New `tests/unit/test_mcp_data_endpoints.py` (19 tests) exercises the real dispatch + shaping for every new tool/endpoint with mocked DB (deleted-item filtering, locked truncation, bad-date rejection, and asserts person audio/embeddings never leak). Added to `test.sh`.
  • Full MCP suite green: 58 passed.
  • Verified end-to-end against real prod Firestore (read-only): all six endpoints return real data for a live user; chat confirmed to decrypt to plaintext; `/people` confirmed to omit audio URLs/embeddings.

🤖 Generated with Claude Code

kodjima33 and others added 5 commits June 11, 2026 03:07
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ity, daily summaries via MCP REST

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…people, screen activity, daily summaries

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR expands the Omi MCP server with six new data domains (action items, goals, chat, people, screen activity, daily summaries), wiring each into both the REST router and the MCP SSE tool surface via a shared utils/mcp_data.py shaping layer. The privacy-sensitive stripping of audio URLs and speaker embeddings from person records is correct, and the clean_* helpers are reused consistently across both routers.

  • Missing try/except around parse_mcp_bool in the get_goals and get_screen_activity execute_tool branches — an invalid boolean value escapes as an unhandled ValueError rather than a proper ToolExecutionError(-32602), unlike every other tool in the same function.
  • Raw, unvalidated date strings reach Firestore in the get_daily_summaries MCP tool branch (and via loose Optional[str] typing in the REST endpoint), while all other date-filtering tools pass through _parse_mcp_date(); bad input silently returns wrong results instead of a clear error.
  • get_daily_summaries reuses CONVERSATIONS_READ_SECURITY instead of its own daily_summaries.read scope, contrary to the PR's stated per-domain scoping design.

Confidence Score: 3/5

Needs fixes before merging — two tool dispatch branches have unguarded parse calls that leak exceptions to callers, and the daily-summaries path bypasses date validation that all other tools enforce.

Three tool branches in mcp_sse.py have correctness gaps: get_goals and get_screen_activity both call parse_mcp_bool() without a try/except, so a bad boolean argument produces an unhandled ValueError instead of a proper MCP error; get_daily_summaries forwards raw date strings to Firestore without running them through _parse_mcp_date(), silently returning wrong data on invalid input. The REST endpoint compounds the last issue by typing its date params as Optional[str] rather than Optional[datetime]. The privacy-sensitive people-record stripping and the overall scope design are solid, but these dispatch gaps affect real runtime behavior on the new tool surface.

backend/routers/mcp_sse.py (the get_goals, get_screen_activity, and get_daily_summaries dispatch branches) and the get_daily_summaries endpoint signature in backend/routers/mcp.py.

Important Files Changed

Filename Overview
backend/utils/mcp_data.py New shared shaping helpers for MCP responses; correctly strips audio URLs and speaker embeddings from person docs, truncates locked action-item descriptions, and normalises screen-activity field names. No issues found.
backend/routers/mcp.py Six new REST endpoints added; start_date/end_date in get_daily_summaries are typed as Optional[str] so FastAPI never validates they are real dates, unlike every other date-filtering endpoint in this file that uses Optional[datetime].
backend/routers/mcp_sse.py Adds six MCP tool dispatch branches; get_goals and get_screen_activity call parse_mcp_bool() without a try/except ValueError, so an invalid bool input escapes as an unhandled exception; get_daily_summaries skips _parse_mcp_date() validation and reuses CONVERSATIONS_READ_SECURITY instead of a dedicated scope.
backend/tests/unit/test_mcp_data_endpoints.py 19 new unit tests covering all six domains; deleted-item filtering, locked truncation, bad-date rejection, and audio/embedding stripping are all verified. The bad-date test only exercises get_action_items; get_daily_summaries bad-date and invalid-bool paths for get_goals/get_screen_activity are not covered.
backend/test.sh New test file correctly wired into the test script.

Sequence Diagram

sequenceDiagram
    participant C as MCP Client / REST caller
    participant SSE as mcp_sse.execute_tool
    participant Shape as utils/mcp_data.py
    participant DB as database modules

    C->>SSE: get_action_items(completed, due_start_date, limit)
    SSE->>SSE: parse_mcp_int / parse_optional_mcp_bool / _parse_mcp_date
    SSE->>DB: action_items_db.get_action_items(uid, ...)
    DB-->>SSE: raw Firestore docs
    SSE->>Shape: clean_action_item(item)
    SSE-->>C: "{action_items: [...]}"

    C->>SSE: "get_goals(include_inactive=maybe)"
    SSE->>SSE: parse_mcp_bool raises unhandled ValueError
    SSE-->>C: unhandled exception / 500

    C->>SSE: "get_screen_activity(summary=bad)"
    SSE->>SSE: parse_mcp_bool raises unhandled ValueError
    SSE-->>C: unhandled exception / 500

    C->>SSE: "get_daily_summaries(start_date=not-a-date)"
    SSE->>SSE: No _parse_mcp_date - raw string forwarded
    SSE->>DB: "daily_summaries_db.get_daily_summaries(start_date=not-a-date)"
    DB-->>SSE: empty / wrong results silent
    SSE-->>C: "{daily_summaries: []}"
Loading

Reviews (1): Last reviewed commit: "test(backend): run test_mcp_data_endpoin..." | Re-trigger Greptile

Comment on lines +815 to +817
elif tool_name == "get_goals":
include_inactive = parse_mcp_bool(arguments.get("include_inactive"), "include_inactive", default=False)
return {"goals": goals_db.get_all_goals(user_id, include_inactive=include_inactive)}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 parse_mcp_bool can raise ValueError for invalid inputs (e.g., "maybe"), but the get_goals branch doesn't wrap it in a try/except. Every other tool that calls parsing helpers wraps them in try/except ValueError as e: raise ToolExecutionError(str(e), code=-32602). Without that guard, a malformed bool argument escapes execute_tool as an unhandled ValueError rather than producing the expected MCP error response.

Suggested change
elif tool_name == "get_goals":
include_inactive = parse_mcp_bool(arguments.get("include_inactive"), "include_inactive", default=False)
return {"goals": goals_db.get_all_goals(user_id, include_inactive=include_inactive)}
elif tool_name == "get_goals":
try:
include_inactive = parse_mcp_bool(arguments.get("include_inactive"), "include_inactive", default=False)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)
return {"goals": goals_db.get_all_goals(user_id, include_inactive=include_inactive)}

Comment on lines +831 to +841
elif tool_name == "get_screen_activity":
start = _parse_mcp_date(arguments.get("start_date"), "start_date")
end = _parse_mcp_date(arguments.get("end_date"), "end_date")
app = arguments.get("app")
summary = parse_mcp_bool(arguments.get("summary"), "summary", default=False)
if summary:
return screen_activity_db.get_screen_activity_summary(user_id, start_date=start, end_date=end)
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=200, minimum=1, maximum=1000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 parse_mcp_bool for summary can raise ValueError on a bad input and is unguarded here. Wrapping it in a try/except that covers all the parsing in this branch keeps the error path consistent with every other tool.

Suggested change
elif tool_name == "get_screen_activity":
start = _parse_mcp_date(arguments.get("start_date"), "start_date")
end = _parse_mcp_date(arguments.get("end_date"), "end_date")
app = arguments.get("app")
summary = parse_mcp_bool(arguments.get("summary"), "summary", default=False)
if summary:
return screen_activity_db.get_screen_activity_summary(user_id, start_date=start, end_date=end)
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=200, minimum=1, maximum=1000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)
elif tool_name == "get_screen_activity":
try:
start = _parse_mcp_date(arguments.get("start_date"), "start_date")
end = _parse_mcp_date(arguments.get("end_date"), "end_date")
app = arguments.get("app")
summary = parse_mcp_bool(arguments.get("summary"), "summary", default=False)
except (ValueError, ToolExecutionError) as e:
raise ToolExecutionError(str(e), code=-32602)
if summary:
return screen_activity_db.get_screen_activity_summary(user_id, start_date=start, end_date=end)
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=200, minimum=1, maximum=1000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)

Comment on lines +847 to +860
elif tool_name == "get_daily_summaries":
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=30, minimum=1, maximum=100)
offset = parse_mcp_int(arguments.get("offset"), "offset", default=0, minimum=0, maximum=100000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)
summaries = daily_summaries_db.get_daily_summaries(
user_id,
limit=limit,
offset=offset,
start_date=arguments.get("start_date"),
end_date=arguments.get("end_date"),
)
return {"daily_summaries": summaries}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 start_date and end_date are forwarded to daily_summaries_db.get_daily_summaries() as raw, unvalidated strings. The DB function passes them straight into Firestore FieldFilter string comparisons, so an invalid value like "not-a-date" silently produces wrong results instead of a proper MCP error. Every other tool with date params calls _parse_mcp_date() first.

Suggested change
elif tool_name == "get_daily_summaries":
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=30, minimum=1, maximum=100)
offset = parse_mcp_int(arguments.get("offset"), "offset", default=0, minimum=0, maximum=100000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)
summaries = daily_summaries_db.get_daily_summaries(
user_id,
limit=limit,
offset=offset,
start_date=arguments.get("start_date"),
end_date=arguments.get("end_date"),
)
return {"daily_summaries": summaries}
elif tool_name == "get_daily_summaries":
try:
limit = parse_mcp_int(arguments.get("limit"), "limit", default=30, minimum=1, maximum=100)
offset = parse_mcp_int(arguments.get("offset"), "offset", default=0, minimum=0, maximum=100000)
except ValueError as e:
raise ToolExecutionError(str(e), code=-32602)
start = _parse_mcp_date(arguments.get("start_date"), "start_date")
end = _parse_mcp_date(arguments.get("end_date"), "end_date")
summaries = daily_summaries_db.get_daily_summaries(
user_id,
limit=limit,
offset=offset,
start_date=start.strftime("%Y-%m-%d") if start else None,
end_date=end.strftime("%Y-%m-%d") if end else None,
)
return {"daily_summaries": summaries}

Comment thread backend/routers/mcp.py
Comment on lines +486 to +493
@router.get("/v1/mcp/daily-summaries", tags=["mcp"])
def get_daily_summaries(
limit: int = 30,
offset: int = 0,
start_date: Optional[str] = None,
end_date: Optional[str] = None,
uid: str = Depends(get_uid_from_mcp_api_key),
):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Every other date-filtering endpoint in this file uses Optional[datetime] so FastAPI validates the format automatically. Using Optional[str] here means a caller can pass "not-a-date" and it reaches Firestore unchanged, silently returning wrong results instead of a 422.

Suggested change
@router.get("/v1/mcp/daily-summaries", tags=["mcp"])
def get_daily_summaries(
limit: int = 30,
offset: int = 0,
start_date: Optional[str] = None,
end_date: Optional[str] = None,
uid: str = Depends(get_uid_from_mcp_api_key),
):
@router.get("/v1/mcp/daily-summaries", tags=["mcp"])
def get_daily_summaries(
limit: int = 30,
offset: int = 0,
start_date: Optional[datetime] = None,
end_date: Optional[datetime] = None,
uid: str = Depends(get_uid_from_mcp_api_key),
):

Comment on lines +90 to +94
ACTION_ITEMS_READ_SECURITY = [{"type": "oauth2", "scopes": ["action_items.read"]}]
GOALS_READ_SECURITY = [{"type": "oauth2", "scopes": ["goals.read"]}]
CHAT_READ_SECURITY = [{"type": "oauth2", "scopes": ["chat.read"]}]
SCREEN_ACTIVITY_READ_SECURITY = [{"type": "oauth2", "scopes": ["screen_activity.read"]}]
PEOPLE_READ_SECURITY = [{"type": "oauth2", "scopes": ["people.read"]}]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The PR description says each domain gets its own OAuth scope, and five new scopes are correctly added. But get_daily_summaries reuses CONVERSATIONS_READ_SECURITY, so any OAuth client with only conversations.read can also read daily life summaries — a distinct, high-sensitivity data domain. A dedicated daily_summaries.read scope should be added to match the stated design.

Suggested change
ACTION_ITEMS_READ_SECURITY = [{"type": "oauth2", "scopes": ["action_items.read"]}]
GOALS_READ_SECURITY = [{"type": "oauth2", "scopes": ["goals.read"]}]
CHAT_READ_SECURITY = [{"type": "oauth2", "scopes": ["chat.read"]}]
SCREEN_ACTIVITY_READ_SECURITY = [{"type": "oauth2", "scopes": ["screen_activity.read"]}]
PEOPLE_READ_SECURITY = [{"type": "oauth2", "scopes": ["people.read"]}]
ACTION_ITEMS_READ_SECURITY = [{"type": "oauth2", "scopes": ["action_items.read"]}]
GOALS_READ_SECURITY = [{"type": "oauth2", "scopes": ["goals.read"]}]
CHAT_READ_SECURITY = [{"type": "oauth2", "scopes": ["chat.read"]}]
SCREEN_ACTIVITY_READ_SECURITY = [{"type": "oauth2", "scopes": ["screen_activity.read"]}]
PEOPLE_READ_SECURITY = [{"type": "oauth2", "scopes": ["people.read"]}]
DAILY_SUMMARIES_READ_SECURITY = [{"type": "oauth2", "scopes": ["daily_summaries.read"]}]

@kodjima33 kodjima33 merged commit 3199113 into main Jun 11, 2026
3 checks passed
@kodjima33 kodjima33 deleted the kodjima33/omi-mcp-data-audit branch June 11, 2026 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant