[ array indexing fails for non-contiguous array indices #880

@LTLA

First, let's mock up a dataset:

library(tiledb)

path <- "mock"
dom <- tiledb_domain(dims = list(
    tiledb_dim("d1", c(1L, 2000L), 2000L, type = "INT32"),
    tiledb_dim("d2", c(1L, 1500L), 1500L, type = "INT32")
))
schema <- tiledb_array_schema(
    dom,
    attrs = list(tiledb_attr("x", type = "FLOAT64")),
    sparse = TRUE
)
tiledb_array_create(path, schema)

smat <- Matrix::rsparsematrix(2000, 1500, density=0.2, repr="T")
arr <- tiledb_array(path, query_type = "WRITE")
arr[] <- data.frame(d1 = smat@i + 1L, d2 = smat@j + 1L, x = smat@x)
tiledb_array_close(arr)

Now, extracting a submatrix with consecutive indices is fine:

arr <- tiledb_array(path, query_type = "READ")
arr[1:10,1:10]
## $d1
##  [1]  4  7 10  4  7  9  1  7  3  5  7 10  9  1 10  5  8  2  3  4  5  6  8  1  6
## [26]  5  7 10
## 
## $d2
##  [1]  1  1  1  2  2  2  3  3  4  4  4  4  5  6  6  7  7  8  8  8  8  8  8  9  9
## [26] 10 10 10
## 
## $x
##  [1] -0.00081 -0.27000 -1.00000  1.30000 -0.31000  1.60000 -1.40000  0.02100
##  [9] -0.99000 -0.82000 -0.81000 -0.81000  1.10000  1.40000  0.49000 -1.40000
## [17] -0.17000 -0.79000  0.80000  0.03000  0.47000  0.36000 -1.40000  1.60000
## [25] -0.30000  1.00000  0.79000  0.76000
## 
## attr(,"query_status")
## [1] "COMPLETE"

But with non-contiguous indices, we get an incorrect result. Note the odd numbers in d1, even though only even indices were requested in the first dimension. It looks like the request is improperly collapsed to the full range spanned by the requested indices (i.e., [2, 10]).

arr[c(2,4,6,8,10),1:10]
## $d1
##  [1]  4  7 10  4  7  9  7  3  5  7 10  9 10  5  8  2  3  4  5  6  8  6  5  7 10
## 
## $d2
##  [1]  1  1  1  2  2  2  3  4  4  4  4  5  6  7  7  8  8  8  8  8  8  9 10 10 10
## 
## $x
##  [1] -0.00081 -0.27000 -1.00000  1.30000 -0.31000  1.60000  0.02100 -0.99000
##  [9] -0.82000 -0.81000 -0.81000  1.10000  0.49000 -1.40000 -0.17000 -0.79000
## [17]  0.80000  0.03000  0.47000  0.36000 -1.40000 -0.30000  1.00000  0.79000
## [25]  0.76000
## 
## attr(,"query_status")
## [1] "COMPLETE"
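
To make the "collapsed to a range" guess concrete, here is a small base-R check that uses only the d1 values printed above (no tiledb needed). The variable names are made up for illustration. It agrees with the collapse explanation, though matching on d1 alone does not prove what the method does internally.

```r
# d1 values copied from the two outputs printed in this report.
d1_contiguous <- c(4, 7, 10, 4, 7, 9, 1, 7, 3, 5, 7, 10, 9, 1, 10, 5, 8,
                   2, 3, 4, 5, 6, 8, 1, 6, 5, 7, 10)   # arr[1:10, 1:10]
d1_noncontig <- c(4, 7, 10, 4, 7, 9, 7, 3, 5, 7, 10, 9, 10, 5, 8,
                  2, 3, 4, 5, 6, 8, 6, 5, 7, 10)       # arr[c(2,4,6,8,10), 1:10]
requested <- c(2, 4, 6, 8, 10)

# Values that came back but were never requested.
unexpected <- sort(setdiff(d1_noncontig, requested))

# What we would get if the request were treated as min(requested):max(requested).
in_span <- d1_contiguous >= min(requested) & d1_contiguous <= max(requested)
collapsed <- d1_contiguous[in_span]

# TRUE if the non-contiguous result equals the "collapsed range" result.
matches_collapse <- identical(collapsed, d1_noncontig)
```

Here `unexpected` holds the odd values 3, 5, 7 and 9, and `matches_collapse` is TRUE: dropping only the d1 = 1 entries from the contiguous result reproduces the non-contiguous one exactly.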

FWIW selected_ranges() does the right thing, but if [ is not intended to support non-consecutive indices, then the method should throw an error rather than silently returning the wrong result.

selected_ranges(arr) <- list(cbind(c(2,4,6,8,10), c(2,4,6,8,10)), cbind(1, 10))
arr[]
## $d1
##  [1]  4 10  4 10 10  8  2  4  6  8  6 10
## 
## $d2
##  [1]  1  1  2  4  6  7  8  8  8  8  9 10
## 
## $x
##  [1] -0.00081 -1.00000  1.30000 -0.81000  0.49000 -0.17000 -0.79000  0.03000
##  [9]  0.36000 -1.40000 -0.30000  0.76000
## 
## attr(,"query_status")
## [1] "COMPLETE"
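
As a stopgap, the two-column matrices passed to selected_ranges() above can be built from an arbitrary index vector. The helper below is a sketch in base R; `index_to_ranges` is a hypothetical name, not part of tiledb. It merges runs of consecutive indices into single (start, end) rows and returns doubles to mirror the matrices used in the call above. Whether this is the intended way to work around the `[` behaviour is an assumption on my part.

```r
# Hypothetical helper: collapse an index vector into a two-column matrix of
# inclusive (start, end) ranges, one row per run of consecutive indices.
index_to_ranges <- function(idx) {
    idx <- sort(unique(as.numeric(idx)))
    if (length(idx) == 0L) {
        return(cbind(numeric(0), numeric(0)))
    }
    # Positions where a run ends, i.e. the next index is not idx + 1.
    breaks <- c(0L, which(diff(idx) != 1), length(idx))
    starts <- idx[head(breaks, -1L) + 1L]
    ends <- idx[breaks[-1L]]
    unname(cbind(starts, ends))
}

rows <- index_to_ranges(c(2, 4, 6, 8, 10))  # five single-index ranges
cols <- index_to_ranges(1:10)               # one range covering 1 to 10

# Usage with the array from this report (requires tiledb, so left commented):
# selected_ranges(arr) <- list(rows, cols)
# arr[]
```

For the example in this report, `rows` has the same values as cbind(c(2,4,6,8,10), c(2,4,6,8,10)) and `cols` the same as cbind(1, 10).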
Session information
R Under development (unstable) (2026-02-19 r89439)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/trunk/lib/libRblas.so 
LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so;  LAPACK version 3.12.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Australia/Sydney
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RcppSpdlog_0.0.28 tiledb_0.33.0    

loaded via a namespace (and not attached):
 [1] zoo_1.8-15          bit_4.6.0           compiler_4.6.0     
 [4] Matrix_1.7-5        tools_4.6.0         RcppCCTZ_0.2.14    
 [7] spdl_0.0.5          Rcpp_1.1.1          nanoarrow_0.8.0    
[10] bit64_4.6.0-1       nanotime_0.3.13     grid_4.6.0         
[13] data.table_1.18.2.1 lattice_0.22-9     
