
Separate BatchGenerator into standalone Slicer and Batcher components? #172

Description

@weiji14

What is your issue?

Current state

Currently, xbatcher v0.3.0's BatchGenerator is an all-in-one class that does too many things, and more features are planned. The 400+ lines of code at https://github.com/xarray-contrib/xbatcher/blob/v0.3.0/xbatcher/generators.py are hard to understand and contribute to without spending a few hours reading them first. To make the code more maintainable and future-proof, we might need a major refactor.

Proposal

Split BatchGenerator into 2 (or more) subcomponents. Specifically:

  1. A Slicer that does the slicing/subsetting/cropping/tiling/chipping from a multi-dimensional xarray object.
  2. A Batcher that groups together the pieces from the Slicer into batches of data.

These are the parameters from the current BatchGenerator that would be handled by each component:

Slicer:

  • input_dims
  • input_overlap

Batcher:

  • batch_dims
  • concat_input_dims
  • preload_batch
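To make the split above concrete, here is a minimal sketch of how the two components could be separated. It is an illustration only, not xbatcher's actual or planned API: the `Slicer` and `Batcher` classes, the `dim_sizes` argument, and the `batch_size` argument are all hypothetical. The `Slicer` only computes index windows from `input_dims` and `input_overlap`; the `Batcher` only groups whatever it receives. The real semantics of `batch_dims`, `concat_input_dims` and `preload_batch` are not modelled here, and `batch_size` is a simplified stand-in.

```python
from dataclasses import dataclass, field
from itertools import islice, product
from typing import Dict, Iterable, Iterator, List


@dataclass
class Slicer:
    """Hypothetical component: yields index windows over named dimensions."""

    dim_sizes: Dict[str, int]    # assumed input, e.g. the sizes of an xarray object
    input_dims: Dict[str, int]   # window length along each sliced dimension
    input_overlap: Dict[str, int] = field(default_factory=dict)

    def __iter__(self) -> Iterator[Dict[str, slice]]:
        per_dim = []
        for dim, size in self.input_dims.items():
            stride = size - self.input_overlap.get(dim, 0)
            if stride <= 0:
                raise ValueError(f"overlap must be smaller than window for {dim!r}")
            # Only full windows are produced; a trailing remainder is dropped.
            starts = range(0, self.dim_sizes[dim] - size + 1, stride)
            per_dim.append([(dim, slice(s, s + size)) for s in starts])
        for combo in product(*per_dim):
            yield dict(combo)


@dataclass
class Batcher:
    """Hypothetical component: groups items from any iterable into lists."""

    source: Iterable
    batch_size: int  # simplified stand-in, not the real batch_dims semantics

    def __iter__(self) -> Iterator[List]:
        iterator = iter(self.source)
        while True:
            batch = list(islice(iterator, self.batch_size))
            if not batch:
                return
            yield batch


slicer = Slicer(
    dim_sizes={"x": 6, "y": 4},
    input_dims={"x": 3, "y": 2},
    input_overlap={"x": 1},
)
windows = list(slicer)
batches = list(Batcher(slicer, batch_size=3))
```

Each window is a plain mapping of dimension name to `slice`, so with an xarray object it could in principle be applied with something like `ds.isel(window)`. Because the `Batcher` accepts any iterable, it could be tested or replaced without touching the slicing logic.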

Benefits

Cons

  • May result in the current one-liner becoming a multi-liner
  • Could lead to some backwards incompatibility/breaking changes
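One possible way to soften both points, which this issue does not itself propose, would be to keep a thin convenience wrapper that composes the two steps. The sketch below is one-dimensional and uses hypothetical function names (`slice_windows`, `batch`, `batch_generator`) and a hypothetical `batch_size` argument; it only shows that a one-liner can survive the split.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def slice_windows(length: int, size: int, overlap: int = 0) -> Iterator[slice]:
    """Slicing step (hypothetical): full windows of `size` along one dimension."""
    stride = size - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than size")
    for start in range(0, length - size + 1, stride):
        yield slice(start, start + size)


def batch(items: Iterable, batch_size: int) -> Iterator[List]:
    """Batching step (hypothetical): group any iterable into lists."""
    iterator = iter(items)
    while True:
        group = list(islice(iterator, batch_size))
        if not group:
            return
        yield group


def batch_generator(length: int, size: int, overlap: int = 0,
                    batch_size: int = 1) -> Iterator[List[slice]]:
    """Convenience wrapper (hypothetical) that keeps the one-line call."""
    return batch(slice_windows(length, size, overlap), batch_size)


result = list(batch_generator(10, 4, overlap=2, batch_size=2))
```

Users who need finer control could call the two steps separately, while existing one-line usage would keep a single entry point.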
