component search v2 #2308
Draft
Mbeaulne wants to merge 1 commit into

Description
Adds an experimental Components V2 page with natural-language search over the component library, behind the `component-search-v2` beta flag. The current Components page has no search, so finding the right component in a large library is painful. This is the start of a real fix.
Architecture: two layers, not one
The search uses a lexical index + optional LLM rerank pattern rather than sending every query to an LLM. This keeps the common case fast and cheap.
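A minimal sketch of the lexical layer, assuming a simplified component shape. The `ComponentDoc` interface, the field weights, the prefix-matching rule, and the `rerankCandidates()` helper are illustrative assumptions, not the contents of `componentSearchIndex.ts`; only the name `lexicalSearch()`, the indexed fields, and the top-20 cutoff come from this PR.

```typescript
// Hypothetical shape for an indexed component (assumption for illustration).
interface ComponentDoc {
  id: string;
  name: string;
  description: string;
  ioNames: string[]; // input/output names
  command: string[]; // container image, command, and args
}

type Field = "name" | "description" | "inputs/outputs" | "command";

interface LexicalHit {
  id: string;
  score: number;
  matched: Field[]; // drives the "matched: ..." badges
}

// Split on anything that is not a letter or digit, so `train_test_split`
// becomes ["train", "test", "split"] and `--epochs` becomes ["epochs"].
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

// Assumed weights: a name match counts more than a description match.
const FIELD_WEIGHTS: Record<Field, number> = {
  name: 4,
  description: 1,
  "inputs/outputs": 2,
  command: 2,
};

function fieldTokens(doc: ComponentDoc): Record<Field, string[]> {
  return {
    name: tokenize(doc.name),
    description: tokenize(doc.description),
    "inputs/outputs": tokenize(doc.ioNames.join(" ")),
    command: tokenize(doc.command.join(" ")),
  };
}

function lexicalSearch(docs: ComponentDoc[], query: string): LexicalHit[] {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return [];

  const hits: LexicalHit[] = [];
  for (const doc of docs) {
    const fields = fieldTokens(doc);
    let score = 0;
    const matched: Field[] = [];
    for (const field of Object.keys(fields) as Field[]) {
      // Prefix match, so partial names such as "trai" still hit "train".
      const count = queryTokens.filter((q) =>
        fields[field].some((t) => t.startsWith(q)),
      ).length;
      if (count > 0) {
        score += count * FIELD_WEIGHTS[field];
        matched.push(field);
      }
    }
    if (score > 0) hits.push({ id: doc.id, score, matched });
  }
  return hits.sort((a, b) => b.score - a.score);
}

// The optional second layer only ever receives the top candidates.
const RERANK_CANDIDATES = 20;

function rerankCandidates(hits: LexicalHit[]): LexicalHit[] {
  return hits.slice(0, RERANK_CANDIDATES);
}
```

A production index would probably precompute the per-field tokens once rather than on every query; the sketch recomputes them to stay short.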
- Lexical index (`componentSearchIndex.ts`) runs entirely in the browser. It tokenizes component name, description, input/output names, and container command/args (image, args, flags). Sub-10ms for hundreds of components. No API call, no key needed. Works for code-style queries (`pandas`, `train_test_split`, `--epochs`) and partial names.
- LLM rerank (`naturalLanguageComponentSearchService.ts`) is opt-in via a ✨ button. It takes the top 20 lexical hits and asks an LLM to reorder them by intent and write a one-sentence reason per match. Reranking 20 candidates is cheap and fast; the model never sees the whole library.
BYOK
AI rerank requires the user's own OpenAI-compatible API key (any provider: OpenAI, Anthropic via gateway, Gemini, Shopify LLM proxy, local Ollama, etc.). The key is stored in `localStorage` only: no shared key is bundled in the app, and nothing is proxied through Tangle. Configured at Settings → Agent Configuration. Lexical search works with no key.
What's intentionally NOT in this PR
- Embedding-based semantic search (e.g. `reduce dimensionality` → `pca_decomposition`). Requires build-step changes; deferred to a follow-up. The architecture has a clean hook point: add `semanticSearch()` next to `lexicalSearch()` and merge.
Notable design decisions
- AI rerank is a `useMutation`, not a `useQuery`. Reranking is an explicit user action. Auto-firing on keystrokes would burn tokens for nothing.
- Lexical search uses `useDeferredValue` (React 19) instead of a debounce timer. Input stays snappy.
- The flag is enforced by a `beforeLoad` redirect, not just hidden in the sidebar. Direct URL navigation to `/components-v2` redirects to `/components` if the flag is off.
Related Issue and Pull requests
N/A — experimental beta feature.
Type of Change
Checklist
Screenshots
TODO: add screenshots of the search page (empty / lexical results / AI rerank) and the Agent Configuration settings page.
Test Instructions
Setup
Test lexical search (no key required)
- With an empty query, the page shows "N components indexed. Start typing to search."
- `train` → components with "train" in the name surface first. Each result has a `matched: name` badge.
- `pandas` → components whose container command imports pandas surface, with a `matched: command` badge.
- `train test split` → `train_test_split` (or similar) ranks at the top.
- An input or output name (e.g. `dataset`) → matches show `matched: inputs/outputs`.
- `asdfqwer` → "No components matched" message.
Test AI rerank (BYOK required)
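- In Settings → Agent Configuration, enter `https://api.openai.com/v1` + your `sk-...` key.
- Search for `clean up my data` (assuming you have components like `dedupe_rows`, `drop_nulls`).
- Click ✨: each reranked result shows a `Why: ...` line explaining the match.
Test flag gating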
- With the flag off, enter `/components-v2` in the URL bar → redirects to `/components`.
Test error handling
- Set a base URL that will fail (e.g. `https://example.com/v1`) and save.
Regression
- `/runs`, `/pipelines`, `/favorites`, and the editor still work.
Additional Comments
This is intentionally scoped as an MVP behind a flag. Treat the lexical layer as the load-bearing piece — it works without any API key, and the architecture is set up so embeddings can slot in next to it without restructuring. The LLM rerank is the optional cherry on top, not the core mechanic.
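One way the embeddings slot could look, sketched under assumptions: `semanticSearch()` does not exist yet, so the example only shows the merge step. Reciprocal rank fusion is a technique chosen here for illustration (this PR does not specify a merge strategy), and `mergeRanked()`, `RankedHit`, and the constant `k = 60` are hypothetical names and values.

```typescript
// Minimal hit shape; both layers would only need to expose an id in rank order.
interface RankedHit {
  id: string;
}

// Reciprocal rank fusion: every list contributes 1 / (k + rank) for each id
// it contains, and ids are ordered by the summed contribution.
function mergeRanked(lists: RankedHit[][], k = 60): string[] {
  const fused = new Map<string, number>();
  for (const list of lists) {
    list.forEach((hit, index) => {
      const rank = index + 1; // ranks are 1-based
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(fused.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Usage: an empty semantic list leaves the lexical order untouched, so the
// lexical layer keeps working on its own until embeddings exist.
const lexicalHits: RankedHit[] = [{ id: "dedupe_rows" }, { id: "drop_nulls" }];
const semanticHits: RankedHit[] = [];
const mergedIds = mergeRanked([lexicalHits, semanticHits]);
```

Rank-based fusion avoids having to put lexical scores and embedding similarities on a common scale, which is the main reason it is used in this sketch.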
Open questions for reviewers:
- `ComponentSearchConfig` (`model` + `thinkingModel`): currently only `thinkingModel` is used (for rerank). The `model` field is vestigial from the prior "fast vs thinking" toggle. Worth simplifying to one field in a follow-up, with a tiny localStorage migration.
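To make the last question concrete, here is a rough sketch of what such a migration might look like. It assumes the config is stored as a JSON string; the single-field target shape, the `migrateConfig()` name, and the fallback parameter are all hypothetical and not part of this PR.

```typescript
// Assumed legacy shape, based on the two fields named above.
interface LegacyComponentSearchConfig {
  model?: string;
  thinkingModel?: string;
}

// Hypothetical simplified shape with a single field.
interface SimplifiedComponentSearchConfig {
  model: string;
}

// Takes the raw stored string (or null when nothing is stored) and returns
// the simplified config. Pure function, so it can be tested without a browser.
function migrateConfig(
  raw: string | null,
  fallbackModel: string,
): SimplifiedComponentSearchConfig {
  if (raw === null) return { model: fallbackModel };

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { model: fallbackModel }; // corrupt value: fall back
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { model: fallbackModel };
  }

  const legacy = parsed as LegacyComponentSearchConfig;
  // thinkingModel is the field rerank actually used, so it takes priority.
  const model = [legacy.thinkingModel, legacy.model].find(
    (m): m is string => typeof m === "string" && m.length > 0,
  );
  return { model: model ?? fallbackModel };
}
```

In the app this would wrap a `localStorage.getItem(...)` / `setItem(...)` pair under whatever key the config already uses. Because an already-migrated value passes through unchanged, running it on every load would be safe.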