fix: lazy polars import for Ivy Bridge CPU compatibility #117

Open
LuJiansen wants to merge 1 commit into mcvickerlab:master from LuJiansen:fix/ivy-bridge-compat

@LuJiansen

Problem

On Intel Ivy Bridge CPUs (Xeon E5-2670 v2), the polars-runtime-32 shared library crashes with "Illegal instruction" because it is compiled with AVX2/AVX-512 instructions that these CPUs lack (Ivy Bridge supports AVX but not AVX2). This affects both wasp2-map make-reads and wasp2-count count-variants.

Clusters with mixed CPU architectures (Ivy Bridge + Haswell in the same Slurm partition) cannot use --constraint to avoid these nodes.
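To confirm whether a given node is affected, one option is to inspect the CPU feature flags before running the pipeline. The following is a minimal sketch, not part of this PR. It assumes an x86 Linux node where /proc/cpuinfo exposes a "flags" line, and the helper names cpu_flags and supports_avx2 are hypothetical.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU found in /proc/cpuinfo text.

    Assumes the x86 Linux layout, where each CPU block has a "flags : ..." line.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_avx2(cpuinfo_text: str) -> bool:
    """True if 'avx2' is listed; Ivy Bridge reports 'avx' but not 'avx2'."""
    return "avx2" in cpu_flags(cpuinfo_text)
```

On a compute node, the text would come from reading /proc/cpuinfo, for example with pathlib.Path("/proc/cpuinfo").read_text().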

Solution

Make polars imports lazy — move import polars as pl from module top-level into the functions that actually use polars. Add TYPE_CHECKING guards so static analysis can still resolve pl.DataFrame annotations without triggering a runtime import.
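The pattern described above can be sketched roughly as follows. This is an illustration of the general technique, not the code in this PR: build_frame and count_rows are hypothetical names, and the string annotation is one way to keep the pl.DataFrame type hint without evaluating it at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers; never executed at runtime,
    # so importing this module does not load the polars shared library.
    import polars as pl


def build_frame(rows: list[dict[str, int]]) -> "pl.DataFrame":
    """Hypothetical polars-backed helper: polars loads on first call only."""
    import polars as pl  # deferred import

    return pl.DataFrame(rows)


def count_rows(rows: list[dict[str, int]]) -> int:
    """Hypothetical polars-free helper: usable even where polars cannot load."""
    return len(rows)
```

With this layout, importing the module and calling count_rows never touches polars; only a call to build_frame triggers the import.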

For count-variants (the most common counting path), add a polars-free alternative make_count_df_no_polars() using pure Python csv + the existing Rust BamCounter.
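As a rough illustration of a polars-free counting path, the sketch below reads variants with the standard csv module and writes a counts table. It is not the implementation of make_count_df_no_polars(): the four-column TSV layout, the output column names, and the count_fn callable (a stand-in for the Rust BamCounter, whose API is not shown here) are all assumptions.

```python
import csv
from typing import Callable, Iterable, TextIO

Variant = tuple[str, int, str, str]  # chrom, pos, ref, alt (assumed layout)
CountFn = Callable[[str, int, str, str], tuple[int, int, int]]


def read_variants(lines: Iterable[str]) -> list[Variant]:
    """Parse tab-separated variant lines, skipping blanks and '#' comments."""
    variants: list[Variant] = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        chrom, pos, ref, alt = row[:4]
        variants.append((chrom, int(pos), ref, alt))
    return variants


def write_counts(variants: list[Variant], count_fn: CountFn, out: TextIO) -> None:
    """Write one row per variant with (ref, alt, other) counts from count_fn."""
    fields = ["chrom", "pos", "ref", "alt", "ref_count", "alt_count", "other_count"]
    writer = csv.DictWriter(out, fieldnames=fields, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for chrom, pos, ref, alt in variants:
        ref_n, alt_n, other_n = count_fn(chrom, pos, ref, alt)
        writer.writerow(
            {
                "chrom": chrom, "pos": pos, "ref": ref, "alt": alt,
                "ref_count": ref_n, "alt_count": alt_n, "other_count": other_n,
            }
        )
```

Passing the counter in as a callable keeps the CSV handling testable without a BAM file or the compiled extension.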

Files changed (5 files, +101 −29)

| File | Change |
| --- | --- |
| src/counting/count_alleles.py | TYPE_CHECKING guard + lazy import + new make_count_df_no_polars() |
| src/counting/run_counting.py | Route non-gene paths through make_count_df_no_polars; remove unused imports and dead assignments |
| src/counting/filter_variant_data.py | TYPE_CHECKING guard + lazy import |
| src/counting/parse_gene_data.py | TYPE_CHECKING guard + lazy import |
| src/counting/count_alleles_sc.py | TYPE_CHECKING guard + lazy import |

What is NOT broken

  • make_count_df() (polars path) still works on CPUs with AVX2 support
  • Gene intersection mode (GTF/GFF) still uses polars
  • mapping/intersect_variant_data.py was already lazy-imported — no changes needed

Testing

  • ruff check: All checks passed
  • ruff format --check: 5 files already formatted
  • py_compile: 5/5 files pass
  • Full ATAC-seq WASP pipeline (fastp → bwa → make-reads → filter → haplotag → counting) completes on Ivy Bridge nodes
  • _parse_intersect_tsv unit tested
  • Backward compatible: existing polars-dependent code paths unchanged

Make polars imports lazy in counting/ modules to prevent
'Illegal instruction' crashes on Intel Ivy Bridge (Xeon E5-2670 v2)
nodes that lack AVX2/AVX-512 support.

Changes:
- counting/count_alleles.py:
  * Add TYPE_CHECKING guard for pl.DataFrame annotation
  * Lazy import polars inside make_count_df()
  * Add make_count_df_no_polars() using pure Python csv + Rust counting
- counting/run_counting.py:
  * Route non-gene paths through make_count_df_no_polars
  * Remove unused parse_intersect_region_new import
  * Remove unused region_col_name assignments
- counting/filter_variant_data.py:
  * Add TYPE_CHECKING guard + lazy import in gtf_to_bed()
- counting/parse_gene_data.py:
  * Add TYPE_CHECKING guard + lazy import in parse_gene_file()
- counting/count_alleles_sc.py:
  * Add TYPE_CHECKING guard + lazy import in make_count_matrix()

ruff check + ruff format: clean