fix(backend): stop expiring speech profiles in has_profile check (#5128) #7821
The /v3/speech-profile endpoint applied a 90-day expiry (get_user_has_speech_profile(uid, max_age_days=90), added in 34be170), but nothing else honors that expiry: the /v4/listen pipeline (routers/transcribe.py) enables speaker identification and downloads the profile for embedding extraction regardless of age. So users whose profile is older than 90 days, and actively used for diarization, are told has_profile=false and nagged to "Teach Omi your voice" again.
Remove the age cutoff so the banner endpoint agrees with the pipeline: an existing profile is a profile. Also drops the per-request blob metadata round-trip (blob.reload) the age check needed.
Adds tests/unit/test_speech_profile_existence.py (registered in test.sh) covering existence semantics, the signature, and a structural guard that the router passes no age cutoff.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Greptile Summary
This PR fixes a regression where users with speech profiles older than 90 days were incorrectly shown the "Teach Omi your voice" re-enrollment banner on every launch, because the
Confidence Score: 4/5
Safe to merge: the change removes a one-line age filter that was never applied by the transcription pipeline, restoring consistency between the API and the actual diarization behavior.
The core logic change is minimal and verifiably correct: both call sites in transcribe.py and speech_profile.py already passed no age limit, so collapsing to a plain blob.exists() call only removes the divergence. The one point of minor attention is the test test_endpoint_does_not_pass_age_cutoff, which derives the router path from storage_mod.__file__ at runtime. That is correct given the backend/utils/other/storage.py nesting, but it would silently break if the module path ever changes.
No files require special attention. The structural test in test_speech_profile_existence.py warrants a glance if the directory layout changes in the future.
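The concern about a structural guard breaking silently can be illustrated with a small sketch. This is not the test in the PR: the router source below is a hypothetical stand-in, and the helper names calls_with_keywords and assert_no_age_cutoff are invented for illustration. It shows one way to check, with Python's ast module, that no call passes max_age_days, while failing loudly if the call cannot be found at all.

```python
import ast

# Hypothetical router source, standing in for the real endpoint module.
ROUTER_SOURCE = '''
def get_speech_profile(uid):
    return {"has_profile": get_user_has_speech_profile(uid)}
'''


def calls_with_keywords(source, func_name):
    """Return the keyword names passed to every call of func_name in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            target = node.func
            # Handle both bare names (f(...)) and attribute access (mod.f(...)).
            if isinstance(target, ast.Name):
                name = target.id
            else:
                name = getattr(target, "attr", None)
            if name == func_name:
                found.append([kw.arg for kw in node.keywords])
    return found


def assert_no_age_cutoff(source):
    calls = calls_with_keywords(source, "get_user_has_speech_profile")
    # Require at least one call, so a moved or renamed module cannot make
    # the guard pass vacuously.
    assert calls, "expected at least one call to get_user_has_speech_profile"
    for keywords in calls:
        assert "max_age_days" not in keywords, "router passes an age cutoff"


assert_no_age_cutoff(ROUTER_SOURCE)
```

Parsing the source rather than matching on text keeps the guard from tripping on comments or docstrings that merely mention max_age_days.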
Sequence Diagram
sequenceDiagram
participant App as Mobile App
participant EP as /v3/speech-profile
participant Storage as utils/other/storage.py
participant GCS as Google Cloud Storage
participant TP as Transcription Pipeline
Note over App,TP: Before fix (regression)
App->>EP: GET /v3/speech-profile
EP->>Storage: "get_user_has_speech_profile(uid, max_age_days=90)"
Storage->>GCS: blob.exists()
Storage->>GCS: blob.reload() [fetch metadata]
Storage-->>EP: "False (profile >90 days old)"
EP-->>App: "{has_profile: false}"
App-->>App: Show re-enroll banner
TP->>Storage: get_user_has_speech_profile(uid)
Storage->>GCS: blob.exists()
Storage-->>TP: True
TP-->>TP: Uses profile for diarization
Note over App,TP: After fix (this PR)
App->>EP: GET /v3/speech-profile
EP->>Storage: get_user_has_speech_profile(uid)
Storage->>GCS: blob.exists()
Storage-->>EP: True
EP-->>App: "{has_profile: true}"
TP->>Storage: get_user_has_speech_profile(uid)
Storage->>GCS: blob.exists()
Storage-->>TP: True
TP-->>TP: Uses profile for diarization
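The two flows in the diagram can be reproduced with a minimal Python sketch. Everything here is an assumption made for illustration: FakeBlob is a hypothetical stand-in for a storage blob, and has_profile_before and has_profile_after only approximate the shape of the check before and after this PR; the real function lives in utils/other/storage.py and may differ in detail.

```python
from datetime import datetime, timedelta, timezone


class FakeBlob:
    """Hypothetical stand-in for a storage blob; only what this sketch needs."""

    def __init__(self, exists, age_days=0):
        self._exists = exists
        self.time_created = datetime.now(timezone.utc) - timedelta(days=age_days)
        self.reload_calls = 0

    def exists(self):
        return self._exists

    def reload(self):
        # In a real client this would be a metadata round-trip to storage.
        self.reload_calls += 1


def has_profile_before(blob, max_age_days=None):
    """Assumed shape of the old check: existence plus an optional age cutoff."""
    if not blob.exists():
        return False
    if max_age_days is not None:
        blob.reload()
        age = datetime.now(timezone.utc) - blob.time_created
        if age > timedelta(days=max_age_days):
            return False
    return True


def has_profile_after(blob):
    """Assumed shape of the new check: an existing profile is a profile."""
    return blob.exists()


old_profile = FakeBlob(exists=True, age_days=120)

# Before the fix: the endpoint and the pipeline disagree about the same blob.
banner_before = has_profile_before(old_profile, max_age_days=90)
pipeline_before = has_profile_before(old_profile)

# After the fix: both callers use the same existence-only check.
banner_after = has_profile_after(old_profile)
```

With a 120-day-old profile, the age-gated call reports no profile while the ungated call reports one, which is the divergence the first half of the diagram shows.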
Bug (#5128, Bug 1)
Users who already have a speech profile are re-prompted to "Teach Omi your voice". Live on prod for every user whose profile is older than 90 days (i.e. most long-time users).
Root cause
Commit 34be170 (#3891, Dec 24 2025) added a 90-day expiry to the /v3/speech-profile endpoint only, via get_user_has_speech_profile(uid, max_age_days=90). Nothing else honors that expiry:
- routers/transcribe.py:764 enables speaker identification via get_user_has_speech_profile(uid), with no age limit
- routers/transcribe.py:1833 downloads the profile for embedding extraction regardless of age
- get_profile_audio_if_exists() has no age limit
So a >90-day profile is actively used for diarization while the app is told has_profile: false and shows the re-teach banner every launch. The expiry never had the (presumably intended) effect of refreshing what the pipeline uses; its only production effect is the spurious nag.
Fix
Remove the age cutoff from get_user_has_speech_profile so the banner endpoint agrees with the listen pipeline: an existing profile is a profile. Zero pipeline behavior change. Bonus: drops the extra per-request GCS metadata round-trip (blob.reload()) the age check required. max_age_days had no other callers, so the dead parameter is removed too.
Tests
tests/unit/test_speech_profile_existence.py (registered in test.sh): existence semantics (old profile counts, missing blob/bucket don't), no blob.reload() metadata fetch, signature guard (uid only), and a structural guard that the router passes no age cutoff. All assertions verified locally on Python 3.11 against the real modules (pytest not installed locally; CI runs the suite). black clean, lint_async_blockers clean.
Note: Bug 2 in #5128 (recording UX glitches) is speculative/multi-cause and is not addressed here.
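The signature guard and the no-reload check described under Tests could look roughly like the sketch below. It is not the contents of test_speech_profile_existence.py: make_checker, the injected bucket, and the blob path are all hypothetical, introduced only so the example runs without a storage client.

```python
import inspect
from unittest.mock import MagicMock


def make_checker(bucket):
    """Build an existence-only check bound to a bucket (hypothetical helper)."""

    def get_user_has_speech_profile(uid):
        # The blob path is an assumption made for this sketch.
        blob = bucket.blob(f"{uid}/speech_profile.wav")
        return blob.exists()

    return get_user_has_speech_profile


bucket = MagicMock()
blob = bucket.blob.return_value
blob.exists.return_value = True

check = make_checker(bucket)
result = check("user-123")

# Existence semantics: an existing blob counts, whatever its age.
assert result is True

# Signature guard: the only parameter is uid, so no age cutoff can be passed.
params = list(inspect.signature(check).parameters)
assert params == ["uid"]

# No metadata round-trip: the check never calls blob.reload().
blob.reload.assert_not_called()

# A missing blob is still reported as no profile.
blob.exists.return_value = False
assert check("user-123") is False
```

Asserting on the parameter list means that reintroducing a max_age_days argument would fail the test before any behavior changes.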
🤖 automated by hourly watchdog; opened for review, not merged.
Fixes #5128
🤖 Generated with Claude Code