Changes in version 1.0                         

  - ~~Parallelization~~ (basic support complete. See below)
  - Native python library (re-using C++ backend)
  - Peak-gene correlations
  - MACS peak calling

Contributions welcome :)

                        Changes in version 0.2.1                        

Features

  - apply_by_col() and apply_by_row() allow providing custom R functions
    to compute per row/col summaries. In initial tests calculating
    row/col means using R functions is ~2x slower than the C++-based
    implementation but memory usage remains low.
  - Add rowMaxs() and colMaxs() functions, which return the maximum
    value in each row or column of a matrix. If matrixStats or
    MatrixGenerics packages are installed, BPCells::rowMaxs() will fall
    back to their implementations for non-BPCells objects. Thanks to
    @immanuelazn for their first contribution as a new lab hire!
  - Add regress_out() to allow removing unwanted sources of variation
    via least squares linear regression models. Thanks to @ycli1995 for
    pull request #110
  - Add trackplot_genome_annotation() for plotting peaks, with options
    for directional arrows, colors, labels, and peak widths. (pull
    request #113)

Improvements

  - trackplot_loop() now accepts discrete color scales

Bug-fixes

  - Fixed error message when a matrix is too large to be converted to
    dgCMatrix. (Thanks to @RookieA1 for reporting issue #95)
  - Fixed forgetting dimnames when subsetting after certain sets of
    operations. (Thanks to @Yunuuuu for reporting issues #97 and #100)
  - Fixed plotting crashes when running trackplot_coverage with
    fragments from a single cluster. (Thanks to @sjessa for directly
    reporting this bug and coming up with a fix)

                        Changes in version 0.2.0                        

We are finally declaring a new release version, covering a large amount
of changes and improvements over the past year. Among the major features
here are parallelization options for svds() and matrix_stats(), improved
genomic track plots, and runtime CPU feature detection for SIMD code
(enables higher performance, more portable builds). Full details of
changes below.

This version also comes with a new installation path, which is done in
preparation for a future Python package release. (So we can have one
folder for R and one for Python, rather than having all the R files sit
in the root folder). This is a breaking change and requires a slightly
modified installation command.

Thanks to @brgew, @ycli1995, and @Yunuuuu for pull requests that
contributed to this release, as well as all users who submitted github
issues to help identify and fix bugs.

Breaking changes

  - Installation location has changed, to make room for a future python
    package release. New installs will have to use
    remotes::install_github("bnprks/BPCells/r") (note the additional /r)
      - r-universe mirrors will have to add "subdir": "r" to their
        packages.json config.
  - New slots have been added to 10x matrix objects, so any saved RDS
    files may need to have their 10x matrix inputs re-opened and
    replaced by calling all_matrix_inputs(). Outside of loading old RDS
    files no changes should be needed.
  - trackplot_gene() now returns a plot with a facet label to match the
    new trackplot system. This label can be removed by by calling
    trackplot_gene(...) + ggplot2::facet_null() to be equivalent to the
    old function's output.

Deprecations

  - draw_trackplot_grid() deprecated, replaced by trackplot_combine()
    with simplified arguments
  - trackplot_bulk() has been deprecated, replaced by
    trackplot_coverage() with equivalent functionality
  - The old function names will output deprecation warnings, but
    otherwise work as before.

Features

  - New svds() function, based on the excellent Spectra C++ library
    (used in RSpectra) by Yixuan Qiu. This should ensure lower memory
    usage compared to irlba, while achieving similar speed + accuracy.
  - Limited parallelization is now supported. This is easiest to use via
    the threads argument to matrix_stats() and svds().
      - All normalizations are supported, but a few operations like
        marker_features() and writing a matrix to disk remain
        single-threaded.
      - Running svds() with many threads on gene-major matrices can
        result in high memory usage for now. This problem is not present
        for cell-major matrices.
  - Reading text-based MatrixMarket inputs (e.g. from 10x or Parse) is
    now supported via import_matrix_market() and the convenience
    function import_matrix_market_10x(). Our implementation uses
    disk-backed sorting to allow importing large files with low memory
    usage.
  - Added binarize() function and associated generics <, <=, >, and >=.
    This only supports comparison with non-negative numbers currently.
    (Thanks to contribution from @brgew)
  - Added round() matrix transformation (Thanks to contributions from
    @brgew)
  - Add getter/setter function all_matrix_inputs() to help enable
    relocating the underlying storage for BPCells matrix transform
    objects.
  - All hdf5-writing functions now support a gzip_level parameter, which
    will enable a shuffle + gzip filter for compression. This is
    generally much slower than bitpacking compression, but it adds
    improved storage options for files that must be read by outside
    programs. Thanks to @ycli1995 for submitting this improvement in
    pull #42.
  - AnnData export now supported via write_matrix_anndata_hdf5() (issue
    #49)
  - Re-licensed code base to use dual-licensed Apache V2 or MIT instead
    of GPLv3
  - Assigning to a subset is now supported (e.g. m1[i,j] <- m2). Note
    that this does not modify data on disk. Instead, it uses a series of
    subsetting and concatenation operations to provide the appearance of
    overwriting the appropriate entries.
  - Added knn_to_geodesic_graph(), which matches the Scanpy default
    construction for graph-based clustering
  - Add checksum(), which allows for calculating an MD5 checksum of a
    matrix contents. Thanks to @brgrew for submitting this improvement
    in pull request #83
  - write_insertion_bedgraph() allows exporting pseudobulk insertion
    data to bedgraph format

Improvements

  - Merging fragments with c() now handles inputs with mismatched
    chromosome names.
  - Merging fragments is now 2-3.5x faster
  - SNN graph construction in knn_to_snn_graph() should work more
    smoothly on large datasets due to C++ implementation
  - Reduced memory usage in marker_features() for samples with millions
    of cells and a large number of clusters to compare.
  - On Windows, increased the maximum number of files that can be
    simultaneously open. Previously, opening >63 compressed counts
    matrices simultaneously would hit the limit. Now at least 1,000
    simultaneous matrices should be possible.
  - Subsetting peak or tile matrices with [ now propagates through so we
    always avoid computing parts of the peak/tile matrix that have been
    discarded by our subset. Subsetting a tile matrix will automatically
    convert into a peak matrix when possible for improved efficiency.
  - Subsetting RowBindMatrices and ColBindMatrices now propagates
    through so we avoid touching matrices with no selected indices
  - Added logic to help reduce cases where subsetting causes BPCells to
    fall back to a less efficient matrix-vector multiply algorithm. This
    affects most math transforms. As part of this, the filtering part of
    a subset will propagate to earlier transformation steps, while the
    reordering will not. Thanks to @nimanouri-nm for raising issue #65
    to fix a bug in the initial implementation.
  - Additional C++17 filesystem backwards compatibility that should
    allow slightly older compilers such as GCC 7.5 to build BPCells.
  - as.matrix() will produce integer matrices when appropriate (Thanks
    to @Yunuuuu in pull #77)
  - 10x HDF5 matrices can now read and write non-integer types when
    requested (Thanks to @ycli1995 in pull #75)
  - Old-style 10x files from cellranger v2 can now read multi-genome
    files, which are returned as a list (Thanks to @ycli1995 in pull
    #75)
  - Trackplots have received several improvements
      - Trackplots now use faceting to provide per-plot labels, leading
        to an easier-to-use trackplot_combine()
      - trackplot_gene() now draws arrows for the direction of
        transcription
      - trackplot_loop() is a new track type allows plotting
        interactions between genomic regions, for instance peak-gene
        correlations or loop calls from Hi-C
      - trackplot_scalebar() is added to show genomic scale
      - All trackplot functions now return ggplot objects with
        additional metadata stored for the plotting height of each track
      - Labels and heights for trackplots can be adjusted using
        set_trackplot_label() and set_trackplot_height()
      - The getting started pbmc 3k vignette now includes the updated
        trackplot APIs in its final example
  - Add rowVars() and colVars() functions, as convenience wrappers
    around matrix_stats(). If matrixStats or MatrixGenerics packages are
    installed, BPCells::rowVars() will fall back to their
    implementations for non-BPCells objects. Unfortunately,
    matrixStats::rowVars() is not generic, so either BPCells::rowVars()
    or BPCells::colVars()
  - Optimize mean and variance calculations for matrices added to a
    per-row or per-column constant.
  - Migrate SIMD code to use highway.
      - Adds run-time detection of CPU features to eliminate
        architecture-specific compilation
      - For now, the Pow SIMD implementation is removed, but Square gets
        a new SIMD implementation
      - Empirically, most operations using SIMD math instructions are
        about 2x faster. This includes log1p(), and
        sctransform_pearson()
      - Minor speedups on dense-sparse matrix multiply functions
        (1.1-1.5x faster)

Bug-fixes

  - Fixed a few fragment transforms where using chrNames(frags) <- val
    or cellNames(frags) <- val could cause downstream errors.
  - Fixed errors in transpose_storage_order() for matrices with >4
    billion non-zero entries.
  - Fixed error in transpose_storage_order() for matrices with no
    non-zero entries.
  - Fixed bug writing fragment files with >512 chromosomes.
  - Fixed bug when reading fragment files with >4 billion fragments.
  - Fixed file permissions errors when using read-only hdf5 files (Issue
    #26 reported thanks to @ttumkaya)
  - Renaming rownames() or colnames() is now propagated when saving
    matrices (Issue #29 reported thanks to @realzehuali, with an
    additional fix after report thanks to @Dario-Rocha)
  - Fixed 64-bit integer overflow (!) that could cause incorrect p-value
    calculations in marker_features() for features with more than 2.6
    million zeros.
  - Improved robustness of the Windows installation process for setups
    that do not need the -lsz linker flag to compile hdf5
  - Fixed possible memory safety bug where wrapped R objects (such as
    dgCMatrix) could be potentially garbage collected while C++ was
    still trying to access the data in rare circumstances.
  - Fixed case when dimnames were not preserved when calling
    convert_matrix_type() twice in a row such that it cancels out (e.g.
    double -> uint32_t -> double). Thanks to @brgrew reporting issue #43
  - Caused and fixed issue resulting in unusably slow performance
    reading matrices from HDF5 files. Broken versions range from
    commit 21f8dcf until the fix in 3711a40
    (October 18-November 3, 2023). Thanks to @abhiachoudhary for
    reporting this in issue #53
  - Fixed error with svds() not handling row-major matrices correctly.
    Thanks to @ycli1995 for reporting this in issue #55
  - Fixed error with row/col name handling for AnnData matrices. Thanks
    to @lisch7 for reporting this in issue #57
  - Fixed error with merging matrices of different data types. Thanks to
    @Yunuuuu for identifying the issue and providing a fix (#68 and #70)
  - Fixed issue with losing dimnames on subset assignment [<-. Thanks to
    @Yunuuuu for identifying the issue #67
  - Fixed incorrect results with some cases of scaling matrix after
    shifting. Thanks to @Yunuuuu for identifying the issue #72
  - Fixed infinite loop bug when calling transpose_storage_order() on a
    densely-transformed matrix. Thanks to @Yunuuuu for reporting this in
    issue #71
  - h5ad outputs will now subset properly when loaded by the Python
    anndata package (Thanks to issue described by @ggruenhagen3 in issue
    #49 and fixed by @ycli1995 in pull #81)
  - Disk-backed fragment objects now load via absolute path, matching
    the behavior of matrices and making it so objects loaded via
    readRDS() can be used from different working directories.
  - footprints() now respects user interrupts via Ctrl-C

                        Changes in version 0.1.0                        

Features

  - ATAC-seq Analysis
      - Reading/writing 10x fragment files on disk
      - Reading/writing compressed fragments on disk (in folder or hdf5
        group)
      - Interconversion of fragments objects with GRanges / data.frame
      - Merging of multiple source fragment files transparently at run
        time
      - Calculation of Cell x Peak matrices, and Cell x Tile matrices
      - ArchR-compatible QC calculations
      - ArchR-compatible gene activity score calculations
      - Filtering fragments by chromosmes, cells, lengths, or genomic
        region
      - Fast peak calling approximation via overlapping tiles
  - Single cell matrices
      - Conversion to/from R sparse matrices
      - Read-write access to 10x hdf5 feature matrices, and read-only
        access to AnnData files
      - Reading/writing of compressed matrices on disk (in folder or
        hdf5 group)
      - Support for integer or single/double-precision floating point
        matrices on disk
      - Fast transposition of storage order, to switch between indexing
        by cell or by gene/feature.
      - Concatenation of multiple source matrix files transparently at
        run time
      - Single-pass calculation of row/column mean and variance
      - Wilcoxon marker feature calculation
      - Transparent handling of vector +, -, *, /, and log1p for
        streaming normalization, along with other less common
        operations. This allows implementation of ATAC-seq LSI and
        Seurat default normalization, along with most published
        log-based normalizations.
      - SCTransform pearson residual calculation
      - Multiplication of sparse matrices
  - Single cell plotting utilities
      - Read count knee cutoffs
      - UMAP embeddings
      - Dot plots
      - Transcription factor footprinting / TSS profile plotting
      - Fragments vs. TSS Enrichment ATAC-seq QC plot
      - Pseudobulk genome track plots, with gene annotation plots
  - Additional utility functions
      - Matching gene symbols/IDs to canonical symbols
      - Download transcript annotations from Gencode or GTF files
      - Download + parse UCSC chromosome sizes
      - Parse peak files BED format; Download ENCODE blacklist region
      - Wrappers for knn graph calculation + clustering

Note: All operations interoperate with all storage formats. For example,
all matrix operations can be applied directly to an AnnData or 10x
matrix file. In many cases the bitpacking-compressed formats will
provide performance/space advantages, but are not required to use the
computations.