flox

This project explores strategies for fast GroupBy reductions with dask.array. It used to be called dask_groupby. It was motivated by:

Dask Dataframe GroupBy blogpost
numpy_groupies in Xarray issue

(See a presentation about this package, from the Pangeo Showcase).

Acknowledgements

This work was funded in part by

NASA-ACCESS 80NSSC18M0156 "Community tools for analysis of NASA Earth Observing System Data in the Cloud" (PI J. Hamman, NCAR),
NASA-OSTFL 80NSSC22K0345 "Enhancing analysis of NASA data with the open-source Python Xarray Library" (PIs Scott Henderson, University of Washington; Deepak Cherian, NCAR; Jessica Scheick, University of New Hampshire), and
NCAR's Earth System Data Science Initiative.

It was motivated by many discussions in the Pangeo community.

API

There are two main functions:

flox.groupby_reduce(dask_array, by_dask_array, "mean"): the "pure" dask array interface.
flox.xarray.xarray_reduce(xarray_object, by_dataarray, "mean"): the "pure" xarray interface, though work is ongoing to integrate this package in xarray.

Implementation

See the documentation for details on the implementation.

Custom reductions

In aggregations.py, flox implements all common reductions provided by numpy_groupies. It also allows you to specify a custom Aggregation (again inspired by dask.dataframe), though this might not be fully functional at the moment. See aggregations.py for examples.

import numpy as np
from flox.aggregations import Aggregation

mean = Aggregation(
    # name used for dask tasks
    name="mean",
    # operation to use for pure-numpy inputs
    numpy="mean",
    # blockwise reduction
    chunk=("sum", "count"),
    # combine intermediate results: sum the sums, sum the counts
    combine=("sum", "sum"),
    # generate final result as sum / count
    finalize=lambda sum_, count: sum_ / count,
    # Used when "reindexing" at combine-time
    fill_value=0,
    # Used when any member of `expected_groups` is not found
    final_fill_value=np.nan,
)
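
To make the chunk, combine and finalize stages concrete, here is a rough NumPy-only sketch of a grouped mean computed over two blocks. It is not flox's implementation: the helper names `chunk_sum_count`, `combine` and `finalize` are hypothetical, and `np.bincount` stands in for the blockwise reduction.

```python
import numpy as np


def chunk_sum_count(values, labels, ngroups):
    """Blockwise step: per-group sum and count for a single block."""
    # Groups absent from this block get 0, mirroring fill_value=0.
    sums = np.bincount(labels, weights=values, minlength=ngroups)
    counts = np.bincount(labels, minlength=ngroups)
    return sums, counts


def combine(intermediates):
    """Combine step: sum the sums and sum the counts across blocks."""
    sums = np.sum([s for s, _ in intermediates], axis=0)
    counts = np.sum([c for _, c in intermediates], axis=0)
    return sums, counts


def finalize(sums, counts, final_fill_value=np.nan):
    """Finalize step: sum / count, filling groups that were never found."""
    out = np.full(sums.shape, final_fill_value, dtype=float)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out


# One array split into two "blocks"; group 3 is expected but never appears.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
labels = np.array([0, 0, 1, 1, 2, 2])
blocks = [(values[:3], labels[:3]), (values[3:], labels[3:])]

intermediates = [chunk_sum_count(v, l, ngroups=4) for v, l in blocks]
result = finalize(*combine(intermediates))
print(result)
```

Group 1 straddles both blocks, so its mean only comes out right because sums and counts are combined separately before dividing. The missing group 3 ends up as NaN, which is the role `final_fill_value` plays in the Aggregation above.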
