Commit e9cbabd
feat(parquet): batch consecutive null/empty rows in write_list (#9752)
# Which issue does this PR close?
- Spun off from #9653
- Contributes to #9731
# Rationale for this change
See #9731
# What changes are included in this PR?
Restructure `write_list()` to accumulate consecutive null and empty rows
and flush them in a single `visit_leaves()` call using
`extend(repeat_n(...))`, instead of calling `visit_leaves()` per row.
With sparse data (99% nulls), a 4096-row batch previously triggered
~4000 individual tree traversals, each pushing a single value per leaf.
Now consecutive null/empty runs are collapsed into one traversal that
extends all leaf level buffers in bulk.
This follows the same pattern already used by `write_struct()`. The
`write_non_null_slice` path is unchanged since each non-null row has
different offsets and cannot be batched.
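
The batching idea can be sketched on a toy model. This is not the arrow-rs implementation: `Row`, `Leaf`, `Writer`, and `visit_leaf` are hypothetical stand-ins, the column is assumed to be a nullable list with a single required leaf (definition level 0 for a null list, 1 for an empty list, 2 for an element), and `iter::repeat(..).take(n)` is used as a portable equivalent of `repeat_n`.

```rust
use std::iter;

/// Hypothetical row shape for a nullable list column with one leaf.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Row {
    Null,
    Empty,
    Values(usize), // element count of a non-null, non-empty list (must be > 0)
}

/// Level buffers of the single leaf in this toy model.
#[derive(Default, Debug, PartialEq)]
struct Leaf {
    def_levels: Vec<i16>,
    rep_levels: Vec<i16>,
}

#[derive(Default)]
struct Writer {
    leaf: Leaf,
    visits: usize, // number of simulated tree traversals
}

impl Writer {
    /// Stand-in for `visit_leaves()`: each call counts as one traversal.
    fn visit_leaf(&mut self, f: impl FnOnce(&mut Leaf)) {
        self.visits += 1;
        f(&mut self.leaf);
    }

    /// Flush `count` consecutive null (def 0) or empty (def 1) rows at once.
    fn write_run(&mut self, def: i16, count: usize) {
        if count == 0 {
            return;
        }
        self.visit_leaf(|leaf| {
            leaf.def_levels.extend(iter::repeat(def).take(count));
            leaf.rep_levels.extend(iter::repeat(0i16).take(count));
        });
    }

    /// A non-null row: written individually, since its offsets are its own.
    fn write_values(&mut self, n: usize) {
        debug_assert!(n > 0);
        self.visit_leaf(|leaf| {
            leaf.def_levels.extend(iter::repeat(2i16).take(n));
            leaf.rep_levels.push(0i16);
            leaf.rep_levels.extend(iter::repeat(1i16).take(n - 1));
        });
    }
}

/// Baseline: one traversal per row.
fn write_per_row(rows: &[Row]) -> Writer {
    let mut w = Writer::default();
    for &row in rows {
        match row {
            Row::Null => w.write_run(0, 1),
            Row::Empty => w.write_run(1, 1),
            Row::Values(n) => w.write_values(n),
        }
    }
    w
}

/// Batched: accumulate runs of equal null/empty rows, flush each run once.
fn write_batched(rows: &[Row]) -> Writer {
    let mut w = Writer::default();
    let mut pending: Option<(i16, usize)> = None; // (def level, run length)
    for &row in rows {
        let def = match row {
            Row::Null => 0i16,
            Row::Empty => 1i16,
            Row::Values(n) => {
                if let Some((d, c)) = pending.take() {
                    w.write_run(d, c);
                }
                w.write_values(n);
                continue;
            }
        };
        match pending {
            Some((d, c)) if d == def => pending = Some((d, c + 1)),
            Some((d, c)) => {
                w.write_run(d, c);
                pending = Some((def, 1));
            }
            None => pending = Some((def, 1)),
        }
    }
    if let Some((d, c)) = pending {
        w.write_run(d, c);
    }
    w
}
```

Both functions fill identical level buffers; only the number of traversals differs, and the gap grows with the length of the null/empty runs.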
# Are these changes tested?
All tests passing; existing tests give 100% coverage.
# Are there any user-facing changes?
N/A
Signed-off-by: Hippolyte Barraud <hippolyte.barraud@datadoghq.com>

1 parent 73ceb1d, commit e9cbabd
1 file changed
Lines changed: 49 additions & 19 deletions
Diff spans original lines 336-386 (new lines 336-416); line contents were not captured.