feat(slides,drive): add createFromJson, drive primitives, and theme system#348
feat(slides,drive): add createFromJson, drive primitives, and theme system#348n0012 wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces new Google Slides tools for creating presentations, performing batch updates, and generating slides from JSON blueprints. The review feedback highlights the need for consistency by using the registerTool wrapper to respect feature flags and suggests enhancing input schemas to support structured objects. Additionally, the feedback addresses a potential crash in the createFromJson service method and recommends adjusting the slide insertion logic to append new slides to the end of a presentation.
| slidesService.getSlideThumbnail, | ||
| ); | ||
|
|
||
| server.registerTool( |
There was a problem hiding this comment.
The slides.create tool is being registered using server.registerTool directly, which bypasses the registerTool wrapper defined on line 154. This wrapper is responsible for checking if the tool is enabled via feature flags (WORKSPACE_FEATURE_OVERRIDES). Using the wrapper ensures consistency and allows users to disable these tools if needed.
| server.registerTool( | |
| registerTool( |
| ); | ||
|
|
||
| server.registerTool( | ||
| 'slides.batchUpdate', |
| }); | ||
|
|
||
| server.registerTool( | ||
| 'slides.createFromJson', |
| requests: z | ||
| .string() | ||
| .describe( | ||
| 'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.', | ||
| ), |
There was a problem hiding this comment.
The requests field is restricted to a string, but the SlidesService.batchUpdate implementation (line 329) and most MCP clients support passing structured arrays directly. Allowing both a JSON string and an array of objects provides a better experience for AI agents.
| requests: z | |
| .string() | |
| .describe( | |
| 'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.', | |
| ), | |
| requests: z | |
| .union([z.string(), z.array(z.any())]) | |
| .describe( | |
| 'An array of Slides API request objects or a JSON string of that array (e.g., [{"createSlide":{}}, {"createShape":{...}}]).', | |
| ), |
| slideJson: z | ||
| .string() | ||
| .describe( | ||
| 'JSON string of the slide blueprint. Use {"slides":[{"elements":[...]},...]} for multiple slides or {"elements":[...]} for one slide. Will be parsed server-side.', | ||
| ), |
There was a problem hiding this comment.
The slideElementSchema defined on lines 493-591 is currently unused. It should be applied to the slideJson input schema to provide the AI agent with structured validation and clear documentation of the expected blueprint format. Additionally, allowing both objects and strings makes the tool more robust.
slideJson: z
.union([
z.object({
slides: z.array(z.object({ elements: z.array(slideElementSchema) })),
}),
z.object({
elements: z.array(slideElementSchema),
}),
z.string(),
])
.describe(
'The slide blueprint. Use {"slides":[{"elements":[...]}]} for multiple slides or {"elements":[...]} for one slide. Can be a JSON string or object.',
),| const slideDefs = (slideJson as any).slides | ||
| ? (slideJson as any).slides | ||
| : [{ elements: (slideJson as any).elements || [] }]; |
There was a problem hiding this comment.
If the slides format is used but an individual slide object is missing the elements property (e.g., { "slides": [{}] }), slideDefs[i].elements will be undefined. This will cause a crash in buildSlideRequests when it attempts to spread or iterate over elements (line 421).
| const slideDefs = (slideJson as any).slides | |
| ? (slideJson as any).slides | |
| : [{ elements: (slideJson as any).elements || [] }]; | |
| const slideDefs = (slideJson as any).slides | |
| ? (slideJson as any).slides.map((s: any) => ({ ...s, elements: s.elements || [] })) | |
| : [{ elements: (slideJson as any).elements || [] }]; |
| requests.push({ | ||
| createSlide: { | ||
| objectId: slideId, | ||
| insertionIndex: i + 1, |
There was a problem hiding this comment.
Hardcoding insertionIndex: i + 1 causes new slides to always be inserted at the beginning of the presentation (after the first slide). For an 'add slides' tool, the expected behavior is usually to append slides to the end. Omitting insertionIndex entirely will cause the Slides API to append the new slides to the end of the presentation.
| insertionIndex: i + 1, | |
| slideLayoutReference: { predefinedLayout: 'BLANK' }, |
|
test with new tool that can batch create slides is slides.createFromJson |
Bug fix bundled in this PR:
|
02b3502 to
abf1ecd
Compare
abf1ecd to
7379d66
Compare
|
@allenhutchison — would appreciate a review when you get a chance! This adds |
7379d66 to
119b16a
Compare
|
|
||
| // Speaker notes tools — approach adapted from PR #235 | ||
| // https://github.com/gemini-cli-extensions/workspace/pull/235 by @stefanoamorelli | ||
| server.registerTool( |
There was a problem hiding this comment.
[blocking] These five new tools (this one plus slides.updateSpeakerNotes at 495, slides.create at 515, slides.batchUpdate at 527, and slides.createFromJson at 652) call server.registerTool(...) directly, but every other tool in this file goes through the wrapped registerTool(...) helper defined at lines 169–182. The wrapper is what honors enabledTools from feature-config — without it, slides.write will register regardless of defaultEnabled: false. Should be a s/server.registerTool/registerTool/ on all five.
| } | ||
|
|
||
| // Delete the default blank slide ("p") that Google creates with new presentations | ||
| try { |
There was a problem hiding this comment.
[blocking] Two concerns on this delete-'p' block: (1) createFromJson takes an arbitrary presentationId, so if a caller appends to an existing deck and 'p' happens to be a real slide, we silently delete it; (2) the catch {} at 869 swallows everything — auth revocation, quota, 5xx — with no log.
Could we gate the delete on an explicit isNewPresentation flag (or fetch the slide list first and only delete if 'p' exists), and at minimum log unexpected catches? Silent catches are something we try to avoid repo-wide.
| ], | ||
| }; | ||
| } catch (error) { | ||
| return this.handleError('drive.addPublicAccess', error); |
There was a problem hiding this comment.
[blocking] The PR description calls out publishOutNotPermitted as a feature surface, but here we fall through to the generic handleError that just stringifies error.message — the caller gets an opaque "insufficient permissions" string with no signal that it's org policy. Could we detect error.errors?.[0]?.reason === 'publishOutNotPermitted' (and cannotShareOutsideDomain) before the generic path and return a structured response with an actionable hint?
| } | ||
|
|
||
| // Bold phrases | ||
| if (style.bold_phrases) { |
There was a problem hiding this comment.
[blocking] content.indexOf("", n) returns n for every position, so an empty phrase runs content.length + 1 times and emits invalid updateTextStyle requests with startIndex === endIndex — Slides API rejects the whole batchUpdate. Same shape applies to the links loop further down. Either if (!phrase) continue; at the top of each loop, or add .min(1) to the schema.
| accent4?: RGB; | ||
| } | ||
|
|
||
| const THEMES: Record<string, Theme> = { |
There was a problem hiding this comment.
[blocking] The PR description advertises 12 themes (exec, pitch, technical, workshop, dark, demo, hcls, customer, simple, google-dark, google-minimal) but I only see google defined here. THEMES['exec'] returns undefined and createFromJson hardcodes THEMES['google'] so the others are unreachable anyway. Either ship the missing themes or pare the description and tool docs back to what actually exists. Also are we sure we want to ship a "Google" theme in a public extension?
| }; | ||
| } | ||
| const removed: string[] = []; | ||
| for (const p of anyonePerms) { |
There was a problem hiding this comment.
[important] If permission #3 of 5 fails mid-loop (race, revoked auth), we throw and removed never gets returned — caller can't tell which perms were actually deleted, which breaks the "idempotent" contract on retry. Could each delete be in its own try/catch, collecting {removed, failed} in the response?
| // Sanitize URLs that contain unresolved template placeholders (e.g. from LLM output) | ||
| let imageUrl = el.url ?? ''; | ||
| if (imageUrl.includes('{') || imageUrl.includes('%7B')) { | ||
| imageUrl = 'https://img.icons8.com/m_rounded/512/4285F4/info.png'; |
There was a problem hiding this comment.
[important] Substituting an icons8 icon when the URL contains { is a reasonable fallback, but please add a warnings: [{slideIndex, elementIndex, issue: "unresolved url placeholder, substituted fallback"}] entry on the response so the caller knows it happened. Right now a malformed blueprint silently fills the deck with info icons.
| content: [ | ||
| { | ||
| type: 'text' as const, | ||
| text: JSON.stringify({ error: errorMessage }), |
There was a problem hiding this comment.
[important] Error returns from the new SlidesService methods (this create plus batchUpdate and createFromJson) don't set isError: true on the MCP response — DriveService.handleError does this correctly at line 47. MCP clients use that flag to distinguish failure from success, so without it these look like successful responses that happen to contain an error string.
| return `${prefix}_${Date.now()}_${objCounter.value}`; | ||
| }; | ||
|
|
||
| // Sort: shapes first, then images, then text; within each group by layer |
There was a problem hiding this comment.
[nit] Comment says "shapes first, then images, then text", but the sort key layerVal * 10 + typeVal puts layer first with type as the tiebreaker — the opposite. The behavior matches the tool description's "lower layers render first" promise, so the code is right; just the inline comment is misleading.
| } | ||
| } | ||
|
|
||
| // Bold until (legacy) |
There was a problem hiding this comment.
[nit] // Bold until (legacy) — calling it "legacy" is confusing since it's brand new in this PR. Drop the tag, or explain why it's deprecated on arrival?
|
Hi @n0012 👋 — just checking in on this one. This is a genuinely impressive contribution: I left a round of review on May 26 — 15 inline comments, mostly focused things rather than anything fundamental. No rush at all, but I wanted to make sure it didn't slip by unnoticed since it landed just after a batch of approvals. Are you planning to keep going with it? If so, I'm happy to help work through any of the comments or clarify anything that's unclear — just let me know. And if you'd like a second pass on anything once you've pushed updates, tag me and I'll re-review promptly. Thanks again for this! 🙌 |
|
@n0012 if planning to continue with this one, please refactor around the baseline slides tools that just got merged into the main branch! |
|
Thanks, had some time away but will review again and apply feedback per these comments. Please keep an eye out @allenhutchison and @bdoyle0182 |
…me system
Rebuilt on top of the baseline slides tools now in main. Layers the net-new
blueprint engine and Drive image-staging primitives onto the granular
slides.*/drive.* tools without duplicating them (create/getSpeakerNotes/
updateSpeakerNotes are reused from the baseline).
New tools (all registered through the feature-config-gated registerTool wrapper):
- slides.createFromJson — JSON blueprint → slides in one batchUpdate, with a
theme/color-alias system, inline speaker notes, and layer ordering.
- slides.batchUpdate — raw Slides API passthrough (escape hatch); accepts a
request array or a JSON string.
- drive.uploadFile / addPublicAccess / removePublicAccess — safe-by-default
image staging: upload is private; sharing is explicit and reversible.
Addresses review feedback:
- Themes pared to a small honest set (default + dark); neutral naming; tool
description trimmed to what actually ships.
- createFromJson only deletes the default "p" slide when isNewPresentation and
after confirming it exists; no more silent catch; guards missing elements.
- buildSlideRequests skips empty bold_phrases/links (no zero-length ranges);
placeholder image URLs are reported under result.warnings.
- color/bg_color/border_color accept a theme alias string OR an RGB object, so
the alias system is reachable from the tool boundary.
- drive.addPublicAccess returns an lh3 image URL and a structured, actionable
hint on publishOutNotPermitted/cannotShareOutsideDomain; removePublicAccess
isolates each delete and reports {removed, failed}.
- All new returns set isError on failure.
- Unit tests for resolveColor, createFromJson (translation/guards/warnings/
default-slide), batchUpdate, uploadFile, add/removePublicAccess.
431f439 to
2ae378c
Compare
High-level editor for an existing shape: clears it, inserts new text, and applies each style attribute with an explicit fields mask so text never inherits stray styling. Reuses the createFromJson style vocabulary (size, bold/italic/underline/strikethrough, color aliases or RGB, font_family, align, indent, bold_phrases, bold_until, links) and finds shapes nested in groups. Empty bold_phrases/links are skipped (no zero-length ranges). Complements createFromJson for editing existing decks.
An element missing a valid position {x,y,w,h} previously threw a cryptic
"Cannot read properties of undefined (reading 'h')" mid-batch, failing the
whole deck. Now such elements are skipped and reported under result.warnings
(consistent with the placeholder-image-URL handling), so messy LLM-generated
blueprints degrade gracefully instead of crashing. Adds a unit test.
A text element with empty/whitespace content produced an empty text box and then an updateTextStyle call that the API rejects with "object has no text", aborting the whole batch. Now such elements are skipped and reported under result.warnings (matching the position/placeholder guards). Adds a unit test.
…eme + theme param) The server stays deliberately unbranded: createFromJson resolves color aliases against one neutral built-in palette. Removed the `dark` theme and the `theme` selector param — branding (fonts, brand colors) belongs to the caller/skill, which applies it via explicit font_family/colors (or RGB objects for one-offs). Simplifies the API and addresses the "don't ship a Google theme in a public extension" review note. Tool description + test updated.
|
Thanks @allenhutchison and @bdoyle0182 for the thorough review, and @gemini-code-assist for the automated pass. I've had some time away but have now worked through every comment. Summary below, grouped by reviewer. The branch is rebased on the current @allenhutchison — blocking
@allenhutchison — important
@allenhutchison — nits
@gemini-code-assist
@bdoyle0182
Would really appreciate another pass when you have a chance, @allenhutchison @bdoyle0182 — happy to keep iterating on anything that's still off. |
What this adds
Five new tools for building Google Slides presentations programmatically, plus three new Drive primitives for safe-by-default image staging.
slides.createFromJson— blueprint-to-slides in one callCallers describe a deck as a JSON blueprint; the server translates it into a Slides API
batchUpdate. No knowledge of raw API shape required.Color aliases — named colors, never RGB:
textprimaryprimary_textbluered/yellow/greensurfacetext_mutedTheme system — 12 named themes (
google,exec,pitch,technical,workshop,dark,demo,hcls,customer,simple,google-dark,google-minimal) drive font family, accent color, and layout guidance in the tool description.Speaker notes — include
"speaker_notes"in each slide object and they're written automatically. Tool warns when notes are missing and requests a second pass.Layer ordering — elements render shapes → images → text, then by
layervalue. Background shapes reliably appear behind text without manual sequencing.Blueprint format:
{ "slides": [ { "speaker_notes": "...", "elements": [...] }, { "speaker_notes": "...", "elements": [...] } ] }Element schema:
type(text|shape|image),position({x,y,w,h}in points on 720×405 canvas),layer(z-index),content,url, and astyleobject with:size,bold,color,bg_color,no_border,align,vertical_align.Drive primitives — split for safe-by-default uploads
The Slides API's
createImageendpoint requires a publicly accessible URL (per Google's docs) — OAuth tokens in URLs are not honored. To support image-heavy workflows without making the upload tool itself dangerous, the share lifecycle is split across three tools:drive.uploadFile— uploads a local file to Drive. File is PRIVATE by default (no share granted). Returnsid,name,webViewLink.drive.addPublicAccess— explicit opt-in: grantsanyone:readeron an existing file, returns the publicimageUrland the permission ID. Surfaces WorkspacepublishOutNotPermittedclearly so callers can fall back to another hosting path (GCS signed URLs, etc.).drive.removePublicAccess— revokes everyanyone:*permission on a file. Idempotent. File stays in Drive — only the public link is closed.Typical use for embedding a local image in a slide:
Supporting tools
slides.create— create a blank presentation, returns{presentationId, url}slides.batchUpdate— raw Slides API request array passthroughslides.getText/getMetadata/getImages/getSlideThumbnail— read toolsslides.getSpeakerNotes/updateSpeakerNotes— read and write speaker notes per slideDesign notes
server.registerTooland gated throughfeature-config.ts.slides.writeanddrive.writegroups carry the new tools with correct scope requirements.Validation
npm run build)publishOutNotPermittedblocksaddPublicAccess, the error is clearly surfaced so callers can route images through an alternative public host.