|
142 | 142 | ### PyTorch
|
143 | 143 | * [PyTorch](https://github.com/pytorch/pytorch) - Tensors and Dynamic neural networks in Python with strong GPU acceleration. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
144 | 144 | * [pytorch-lightning](https://github.com/Lightning-AI/lightning) - PyTorch Lightning is just organized PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
145 |
| -* [torchvision](https://github.com/pytorch/vision) - Datasets, Transforms, and Models specific to Computer Vision. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
146 |
| -* [torchtext](https://github.com/pytorch/text) - Data loaders and abstractions for text and NLP. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
147 |
| -* [torchaudio](https://github.com/pytorch/audio) - An audio library for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
148 | 145 | * [ignite](https://github.com/pytorch/ignite) - High-level library to help with training neural networks in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
149 | 146 | * [skorch](https://github.com/dnouri/skorch) - A scikit-learn compatible neural network library that wraps PyTorch. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
150 | 147 | * [Catalyst](https://github.com/catalyst-team/catalyst) - High-level utils for PyTorch DL & RL research. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
|
172 | 169 | ### MXNet
|
173 | 170 | * [MXNet](https://github.com/apache/incubator-mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler. <img height="20" src="img/mxnet_big.png" alt="MXNet based">
|
174 | 171 | * [Gluon](https://github.com/gluon-api/gluon-api) - A clear, concise, simple yet powerful and efficient API for deep learning (now included in MXNet). <img height="20" src="img/mxnet_big.png" alt="MXNet based">
|
175 |
| -* [gluon-cv](https://github.com/dmlc/gluon-cv) - Provides implementations of the state-of-the-art deep learning models in computer vision. <img height="20" src="img/mxnet_big.png" alt="MXNet based"> |
176 |
| -* [gluon-nlp](https://github.com/dmlc/gluon-nlp) - NLP made easy. <img height="20" src="img/mxnet_big.png" alt="MXNet based"> |
177 | 172 | * [Xfer](https://github.com/amzn/xfer) - Transfer Learning library for Deep Neural Networks. <img height="20" src="img/mxnet_big.png" alt="MXNet based">
|
178 | 173 | * [MXNet](https://github.com/ROCmSoftwarePlatform/mxnet) - HIP Port of MXNet. <img height="20" src="img/mxnet_big.png" alt="MXNet based"> <img height="20" src="img/amd_big.png" alt="Possible to run on AMD GPU">
|
179 | 174 |
|
180 |
| -### Others |
| 175 | +### JAX |
181 | 176 | * [jax](https://github.com/google/jax) - Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more.
|
| 177 | + |
| 178 | +### Others |
| 179 | +* [transformers](https://github.com/huggingface/transformers) - State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. |
182 | 180 | * [Tangent](https://github.com/google/tangent) - Source-to-Source Debuggable Derivatives in Pure Python.
|
183 | 181 | * [autograd](https://github.com/HIPS/autograd) - Efficiently computes derivatives of numpy code.
|
184 | 182 | * [Myia](https://github.com/mila-udem/myia) - Deep Learning framework (pre-alpha).
|
185 | 183 | * [nnabla](https://github.com/sony/nnabla) - Neural Network Libraries by Sony.
|
186 | 184 | * [Caffe](https://github.com/BVLC/caffe) - A fast open framework for deep learning.
|
187 |
| -* [hipCaffe](https://github.com/ROCmSoftwarePlatform/hipCaffe) - The HIP port of Caffe. <img height="20" src="img/amd_big.png" alt="Possible to run on AMD GPU"> |
188 | 185 |
|
189 | 186 | ## Reinforcement Learning
|
190 | 187 | * [Gymnasium](https://github.com/Farama-Foundation/Gymnasium) - An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly [Gym](https://github.com/openai/gym)).
|
|
231 | 228 | * [GPyTorch](https://github.com/cornellius-gp/gpytorch) - A highly efficient and modular implementation of Gaussian Processes in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible">
|
232 | 229 | * [sklearn-crfsuite](https://github.com/TeamHG-Memex/sklearn-crfsuite) - A scikit-learn-inspired API for CRFsuite. <img height="20" src="img/sklearn_big.png" alt="sklearn">
|
233 | 230 |
|
234 |
| -## Data Manipulation |
235 |
| - |
236 |
| -### Data Frames |
237 |
| -* [pandas](https://pandas.pydata.org/pandas-docs/stable/) - Powerful Python data analysis toolkit. |
238 |
| -* [polars](https://github.com/pola-rs/polars) - A fast multi-threaded, hybrid-out-of-core DataFrame library. |
239 |
| -* [koalas](https://github.com/databricks/koalas) - pandas API on Apache Spark. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
240 |
| -* [Arctic](https://github.com/manahl/arctic) - High-performance datastore for time series and tick data. |
241 |
| -* [datatable](https://github.com/h2oai/datatable) - Data.table for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
242 |
| -* [pandas_profiling](https://github.com/pandas-profiling/pandas-profiling) - Create HTML profiling reports from pandas DataFrame objects |
243 |
| -* [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
244 |
| -* [blaze](https://github.com/blaze/blaze) - NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
245 |
| -* [pandasql](https://github.com/yhat/pandasql) - Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
246 |
| -* [pandas-gbq](https://github.com/pydata/pandas-gbq) - pandas Google Big Query. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
247 |
| -* [xpandas](https://github.com/alan-turing-institute/xpandas) - Universal 1d/2d data containers with Transformers .functionality for data analysis by [The Alan Turing Institute](https://www.turing.ac.uk/). |
248 |
| -* [pysparkling](https://github.com/svenkreiss/pysparkling) - A pure Python implementation of Apache Spark's RDD and DStream interfaces. <img height="20" src="img/spark_big.png" alt="Apache Spark based"> |
249 |
| -* [modin](https://github.com/modin-project/modin) - Speed up your pandas workflows by changing a single line of code. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
250 |
| -* [swifter](https://github.com/jmcarpenter2/swifter) - A package that efficiently applies any function to a pandas dataframe or series in the fastest available manner. |
251 |
| -* [pandas-log](https://github.com/eyaltrabelsi/pandas-log) - A package that allows providing feedback about basic pandas operations and finds both business logic and performance issues. |
252 |
| -* [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data at a billion rows per second. |
253 |
| -* [xarray](https://github.com/pydata/xarray) - Xarray combines the best features of NumPy and pandas for multidimensional data selection by supplementing numerical axis labels with named dimensions for more intuitive, concise, and less error-prone indexing routines. |
254 |
| - |
255 |
| - |
256 |
| -### Pipelines |
257 |
| -* [pdpipe](https://github.com/shaypal5/pdpipe) - Sasy pipelines for pandas DataFrames. |
258 |
| -* [SSPipe](https://sspipe.github.io/) - Python pipe (|) operator with support for DataFrames and Numpy, and Pytorch. |
259 |
| -* [pandas-ply](https://github.com/coursera/pandas-ply) - Functional data manipulation for pandas. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
260 |
| -* [Dplython](https://github.com/dodger487/dplython) - Dplyr for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
261 |
| -* [sklearn-pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - pandas integration with sklearn. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
262 |
| -* [Dataset](https://github.com/analysiscenter/dataset) - Helps you conveniently work with random or sequential batches of your data and define data processing. |
263 |
| -* [pyjanitor](https://github.com/ericmjl/pyjanitor) - Clean APIs for data cleaning. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
264 |
| -* [meza](https://github.com/reubano/meza) - A Python toolkit for processing tabular data. |
265 |
| -* [Prodmodel](https://github.com/prodmodel/prodmodel) - Build system for data science pipelines. |
266 |
| -* [dopanda](https://github.com/dovpanda-dev/dovpanda) - Hints and tips for using pandas in an analysis environment. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
267 |
| -* [Hamilton](https://github.com/DAGWorks-Inc/hamilton) - A microframework for dataframe generation that applies Directed Acyclic Graphs specified by a flow of lazily evaluated Python functions. |
268 |
| - |
269 |
| -### Data-centric AI |
270 |
| -* [cleanlab](https://github.com/cleanlab/cleanlab) - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels. |
271 |
| -* [snorkel](https://github.com/snorkel-team/snorkel) - A system for quickly generating training data with weak supervision. |
272 |
| -* [dataprep](https://github.com/sfu-db/dataprep) - Collect, clean, and visualize your data in Python with a few lines of code. |
273 |
| - |
274 |
| -### Synthetic Data |
275 |
| - |
276 |
| -* [ydata-synthetic](https://github.com/ydataai/ydata-synthetic) - A package to generate synthetic tabular and time-series data leveraging the state-of-the-art generative models. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
277 |
| - |
278 | 231 | ## Feature Engineering
|
279 | 232 |
|
280 | 233 | ### General
|
|
415 | 368 | * [Chaos Genius](https://github.com/chaos-genius/chaos_genius) - ML-powered analytics engine for outlier/anomaly detection and root cause analysis.
|
416 | 369 |
|
417 | 370 | ## Natural Language Processing
|
| 371 | +* [torchtext](https://github.com/pytorch/text) - Data loaders and abstractions for text and NLP. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
| 372 | +* [gluon-nlp](https://github.com/dmlc/gluon-nlp) - NLP made easy. <img height="20" src="img/mxnet_big.png" alt="MXNet based"> |
| 373 | +* [KerasNLP](https://github.com/keras-team/keras-nlp) - Modular Natural Language Processing workflows with Keras. <img height="20" src="img/keras_big.png" alt="Keras based/compatible"> |
418 | 374 | * [spaCy](https://spacy.io/) - Industrial-Strength Natural Language Processing.
|
419 | 375 | * [NLTK](https://github.com/nltk/nltk) - Modules, data sets, and tutorials supporting research and development in Natural Language Processing.
|
420 | 376 | * [CLTK](https://github.com/cltk/cltk) - The Classical Language Toolkit.
|
|
426 | 382 |
|
427 | 383 |
|
428 | 384 | ## Computer Audition
|
| 385 | +* [torchaudio](https://github.com/pytorch/audio) - An audio library for PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
429 | 386 | * [librosa](https://github.com/librosa/librosa) - Python library for audio and music analysis.
|
430 | 387 | * [Yaafe](https://github.com/Yaafe/Yaafe) - Audio features extraction.
|
431 | 388 | * [aubio](https://github.com/aubio/aubio) - A library for audio and music analysis.
|
|
436 | 393 | * [madmom](https://github.com/CPJKU/madmom) - Python audio and music signal processing library.
|
437 | 394 |
|
438 | 395 | ## Computer Vision
|
| 396 | +* [torchvision](https://github.com/pytorch/vision) - Datasets, Transforms, and Models specific to Computer Vision. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
| 397 | +* [gluon-cv](https://github.com/dmlc/gluon-cv) - Provides implementations of the state-of-the-art deep learning models in computer vision. <img height="20" src="img/mxnet_big.png" alt="MXNet based"> |
| 398 | +* [KerasCV](https://github.com/keras-team/keras-cv) - Industry-strength Computer Vision workflows with Keras. <img height="20" src="img/keras_big.png" alt="Keras based/compatible"> |
439 | 399 | * [OpenCV](https://github.com/opencv/opencv) - Open Source Computer Vision Library.
|
440 | 400 | * [scikit-image](https://github.com/scikit-image/scikit-image) - Image Processing SciKit (Toolbox for SciPy).
|
441 | 401 | * [imgaug](https://github.com/aleju/imgaug) - Image augmentation for machine learning experiments.
|
|
452 | 412 | * [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) - Pairwise Multiple Comparisons Post-hoc Tests.
|
453 | 413 | * [Alphalens](https://github.com/quantopian/alphalens) - Performance analysis of predictive (alpha) stock factors.
|
454 | 414 |
|
| 415 | + |
| 416 | +## Data Manipulation |
| 417 | + |
| 418 | +### Data Frames |
| 419 | +* [pandas](https://pandas.pydata.org/pandas-docs/stable/) - Powerful Python data analysis toolkit. |
| 420 | +* [polars](https://github.com/pola-rs/polars) - A fast multi-threaded, hybrid-out-of-core DataFrame library. |
| 421 | +* [koalas](https://github.com/databricks/koalas) - pandas API on Apache Spark. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 422 | +* [Arctic](https://github.com/manahl/arctic) - High-performance datastore for time series and tick data. |
| 423 | +* [datatable](https://github.com/h2oai/datatable) - Data.table for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
| 424 | +* [pandas_profiling](https://github.com/pandas-profiling/pandas-profiling) - Create HTML profiling reports from pandas DataFrame objects. |
| 425 | +* [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
| 426 | +* [blaze](https://github.com/blaze/blaze) - NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 427 | +* [pandasql](https://github.com/yhat/pandasql) - Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 428 | +* [pandas-gbq](https://github.com/pydata/pandas-gbq) - pandas Google Big Query. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 429 | +* [xpandas](https://github.com/alan-turing-institute/xpandas) - Universal 1d/2d data containers with Transformers functionality for data analysis by [The Alan Turing Institute](https://www.turing.ac.uk/). |
| 430 | +* [pysparkling](https://github.com/svenkreiss/pysparkling) - A pure Python implementation of Apache Spark's RDD and DStream interfaces. <img height="20" src="img/spark_big.png" alt="Apache Spark based"> |
| 431 | +* [modin](https://github.com/modin-project/modin) - Speed up your pandas workflows by changing a single line of code. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 432 | +* [swifter](https://github.com/jmcarpenter2/swifter) - A package that efficiently applies any function to a pandas dataframe or series in the fastest available manner. |
| 433 | +* [pandas-log](https://github.com/eyaltrabelsi/pandas-log) - A package that allows providing feedback about basic pandas operations and finds both business logic and performance issues. |
| 434 | +* [vaex](https://github.com/vaexio/vaex) - Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data at a billion rows per second. |
| 435 | +* [xarray](https://github.com/pydata/xarray) - Xarray combines the best features of NumPy and pandas for multidimensional data selection by supplementing numerical axis labels with named dimensions for more intuitive, concise, and less error-prone indexing routines. |
| 436 | + |
| 437 | +### Pipelines |
| 438 | +* [pdpipe](https://github.com/shaypal5/pdpipe) - Easy pipelines for pandas DataFrames. |
| 439 | +* [SSPipe](https://sspipe.github.io/) - Python pipe (|) operator with support for DataFrames, NumPy, and PyTorch. |
| 440 | +* [pandas-ply](https://github.com/coursera/pandas-ply) - Functional data manipulation for pandas. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 441 | +* [Dplython](https://github.com/dodger487/dplython) - Dplyr for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
| 442 | +* [sklearn-pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - pandas integration with sklearn. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 443 | +* [Dataset](https://github.com/analysiscenter/dataset) - Helps you conveniently work with random or sequential batches of your data and define data processing. |
| 444 | +* [pyjanitor](https://github.com/ericmjl/pyjanitor) - Clean APIs for data cleaning. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 445 | +* [meza](https://github.com/reubano/meza) - A Python toolkit for processing tabular data. |
| 446 | +* [Prodmodel](https://github.com/prodmodel/prodmodel) - Build system for data science pipelines. |
| 447 | +* [dovpanda](https://github.com/dovpanda-dev/dovpanda) - Hints and tips for using pandas in an analysis environment. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 448 | +* [Hamilton](https://github.com/DAGWorks-Inc/hamilton) - A microframework for dataframe generation that applies Directed Acyclic Graphs specified by a flow of lazily evaluated Python functions. |
| 449 | + |
| 450 | +### Data-centric AI |
| 451 | +* [cleanlab](https://github.com/cleanlab/cleanlab) - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels. |
| 452 | +* [snorkel](https://github.com/snorkel-team/snorkel) - A system for quickly generating training data with weak supervision. |
| 453 | +* [dataprep](https://github.com/sfu-db/dataprep) - Collect, clean, and visualize your data in Python with a few lines of code. |
| 454 | + |
| 455 | +### Synthetic Data |
| 456 | + |
| 457 | +* [ydata-synthetic](https://github.com/ydataai/ydata-synthetic) - A package to generate synthetic tabular and time-series data leveraging the state-of-the-art generative models. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
| 458 | + |
455 | 459 | ## Distributed Computing
|
456 | 460 | * [Horovod](https://github.com/uber/horovod) - Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. <img height="20" src="img/tf_big2.png" alt="TensorFlow based">
|
457 | 461 | * [PySpark](https://spark.apache.org/docs/0.9.0/python-programming-guide.html) - Exposes the Spark programming model to Python. <img height="20" src="img/spark_big.png" alt="Apache Spark based">
|
|