feat: add max_rows parameter to .app() (#194) #234

Draft
cpsievert wants to merge 8 commits into main from feature/max-rows-194

Conversation

Contributor

@cpsievert cpsievert commented May 14, 2026

Summary

Large datasets can overwhelm the data table in .app(). The new max_rows parameter (default 1000) truncates displayed rows while leaving the full dataset available via df() for charts, summaries, and other downstream views.

For lazy sources (Polars LazyFrame, R's tbl_sql), truncation is applied before collection — the backend only transfers max_rows rows instead of loading the full dataset into memory. For eager/in-memory sources, head() is applied at display time.

Closes #194.

Python:

  • max_rows parameter on .app() for all four frameworks (Shiny, Streamlit, Dash, Gradio)
  • maybe_truncate() uses as_narwhals(df, lazy=True) to preserve lazy semantics — one code path handles all source types
  • Card footer shows row/column info
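The benefit of truncating before collection can be illustrated without any dataframe library. The sketch below is not querychat's code: `truncate_rows` is a hypothetical stand-in for maybe_truncate(), and a plain Python generator plays the role of a lazy source such as a Polars LazyFrame. It shows that taking the first max_rows rows before materializing means only that many rows are ever produced, and that one code path serves both eager and lazy inputs.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def truncate_rows(rows: Iterable[dict], max_rows: Optional[int]) -> list:
    """Materialize at most max_rows rows (hypothetical stand-in for maybe_truncate()).

    Works the same for eager inputs (lists) and lazy inputs (generators):
    islice stops pulling from the source once max_rows rows have been taken.
    """
    if max_rows is None:
        return list(rows)  # no limit: collect everything
    return list(islice(rows, max_rows))


produced = 0  # counts how many rows the pretend backend actually generated


def lazy_source(n: int) -> Iterator[dict]:
    """A generator standing in for a lazy backend; rows are built on demand."""
    global produced
    for i in range(n):
        produced += 1
        yield {"id": i, "value": i * i}


shown = truncate_rows(lazy_source(1_000_000), max_rows=1000)
print(len(shown), produced)  # 1000 1000
```

Unlike the helper described in this PR, the sketch does not report the total row count; with a real lazy backend that requires a separate count query.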

R:

  • max_rows parameter on $app(), $app_obj(), and querychat_app()
  • maybe_truncate() detects tbl_sql and uses dplyr::tally() (COUNT query) + head() (LIMIT) before collect()
  • Card footer shows row/column info
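In SQL terms, the tally() + head() approach amounts to one COUNT query and one LIMIT query. The Python sketch below uses the standard-library sqlite3 module to show the same pattern. It is an illustration only: `count_and_head` and the `cars` table are hypothetical, and the SQL that dbplyr actually generates depends on the backend.

```python
import sqlite3


def count_and_head(conn: sqlite3.Connection, table: str, max_rows: int):
    """Return (total_rows, first max_rows rows) using two small queries.

    `table` is interpolated into the SQL text, so it must be a trusted identifier.
    """
    # COUNT query: only a single number crosses the connection.
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    # LIMIT query: at most max_rows rows are transferred.
    rows = conn.execute(
        f'SELECT * FROM "{table}" LIMIT ?', (max_rows,)
    ).fetchall()
    return total, rows


# Hypothetical in-memory table with 5000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER, mpg REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)", [(i, 20.0 + i) for i in range(5000)]
)

total, rows = count_and_head(conn, "cars", max_rows=1000)
print(total, len(rows))  # 5000 1000
```

Without an ORDER BY clause, which rows a LIMIT query returns is up to the backend, so the displayed rows are "some first max_rows rows" rather than a guaranteed ordering.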

Test plan

  • Python: 15 unit tests — eager (pandas, polars, narwhals, native) + lazy (Polars LazyFrame)
  • R: 19 unit tests — data.frame path + tbl_sql lazy path + info messages
  • All 165 existing R tests pass
  • Manual: querychat_app(mtcars) shows "Data has 32 rows and 11 columns."
  • Manual: large dataset shows truncation message + developer warning
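The footer logic exercised by the manual checks above might look roughly like the following sketch. `footer_message` is a hypothetical helper, not a function from this PR. Only the untruncated wording ("Data has 32 rows and 11 columns.") comes from the test plan; the wording for the truncated case is an assumption made for illustration.

```python
from typing import Optional


def footer_message(total_rows: int, n_cols: int, max_rows: Optional[int]) -> str:
    """Build the card footer text (hypothetical helper)."""
    if max_rows is None or total_rows <= max_rows:
        # Wording taken from the manual test in the PR description.
        return f"Data has {total_rows} rows and {n_cols} columns."
    # Assumed wording for the truncated case; the PR's actual text may differ.
    return f"Showing first {max_rows} of {total_rows} rows and {n_cols} columns."


print(footer_message(32, 11, max_rows=1000))
print(footer_message(50_000, 11, max_rows=1000))
```

The truncation branch triggers only when the total exceeds max_rows, matching the commit message's "when the limit is exceeded".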

cpsievert and others added 2 commits May 14, 2026 16:20
…194)

Add maybe_truncate() helper and max_rows parameter (default=1000) to
.app() methods for Shiny, Streamlit, Dash, and Gradio. Truncates
displayed data with a user-facing info message when the limit is
exceeded. This does not affect the number of rows the LLM can query.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Large datasets can overwhelm the data table display. The new max_rows
parameter (default 1000) truncates displayed rows while leaving the
full dataset available for LLM queries. A card footer shows row/column
counts and indicates when truncation is active.

Also fixes ruff S101 lint error in Python maybe_truncate().
@cpsievert cpsievert requested a review from Copilot May 14, 2026 21:32
@cpsievert cpsievert marked this pull request as draft May 14, 2026 21:33

…dling

For lazy sources (Polars LazyFrame, Ibis Table, R tbl_sql), truncation
is now applied before collection so the backend only transfers max_rows
rows — avoiding loading the full dataset into memory just to display
the first 1000 rows.

Python: uses narwhals lazy path (head + collect) for Polars LazyFrame,
and ibis count() + head() for Ibis Tables. Callers now pass raw data
directly to maybe_truncate instead of pre-collecting via as_narwhals.

R: detects tbl_sql and uses dplyr::tally() (COUNT query) + head()
(LIMIT query) before collect(). Removes manual collect() from app_obj.

Tests added for Polars LazyFrame (Python) and tbl_sql (R) paths.
@cpsievert cpsievert force-pushed the feature/max-rows-194 branch from 7afcc46 to 1ad488f Compare May 14, 2026 21:56
cpsievert and others added 5 commits May 14, 2026 21:58
The previous version had three separate code paths (ibis, lazy narwhals,
eager). This collapses them: use as_narwhals(lazy=True) when max_rows is
set (which preserves laziness for Polars LazyFrames and is harmless for
eager frames), and as_narwhals() when it's None. For R, tbl_sql gets
tally() + head() before collect(); everything else uses nrow() + head().

Net reduction of ~150 lines.

Development

Successfully merging this pull request may close these issues.

Add max_rows parameter to .app() for large data handling