|
16 | 16 |
|
17 | 17 | > Probably the best curated list of data science software in Python |
18 | 18 |
|
19 | | -[cp]: https://raw.githubusercontent.com/krzjoa/awesome-python-datascience/master/img/cupy.png 'CuPy based' |
20 | | - |
21 | | -[mx]: https://raw.githubusercontent.com/krzjoa/awesome-python-datascience/master/img/mxnet.png 'MXNet based' |
22 | | - |
23 | | -[r]: https://raw.githubusercontent.com/krzjoa/awesome-python-datascience/master/img/R.png 'R inspired/ported lib' |
24 | | - |
25 | | -[gpu]: https://raw.githubusercontent.com/krzjoa/awesome-python-datascience/master/img/gpu.png 'GPU accelerated' |
26 | | - |
27 | 19 | ## Contents |
28 | 20 | * [Machine Learning](#ml) |
29 | 21 | * [Deep Learning](#dl) |
|
55 | 47 | * [scikit-learn](http://scikit-learn.org/stable/) - Machine learning in Python. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
56 | 48 | * [Shogun](http://www.shogun-toolbox.org/) - Machine learning toolbox. |
57 | 49 | * [xLearn](https://github.com/aksnzhy/xlearn) - High Performance, Easy-to-use, and Scalable Machine Learning Package. |
58 | | -* [cuML](https://github.com/rapidsai/cuml) - RAPIDS Machine Learning Library. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
| 50 | +* [cuML](https://github.com/rapidsai/cuml) - RAPIDS Machine Learning Library. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
59 | 51 | * [modAL](https://github.com/cosmic-cortex/modAL) - Modular active learning framework for Python3. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
60 | 52 | * [Sparkit-learn](https://github.com/lensacom/sparkit-learn) - PySpark + scikit-learn = Sparkit-learn. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/spark_big.png" alt="Apache Spark based"> |
61 | 53 | * [mlpack](https://github.com/mlpack/mlpack) - A scalable C++ machine learning library (Python bindings). |
|
108 | 100 | ### Extreme Learning Machine |
109 | 101 | * [Python-ELM](https://github.com/dclambert/Python-ELM) - Extreme Learning Machine implementation in Python. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
110 | 102 | * [Python Extreme Learning Machine (ELM)](https://github.com/acba/elm) - A machine learning technique used for classification/regression tasks. |
111 | | -* [hpelm](https://github.com/akusok/hpelm) - High performance implementation of Extreme Learning Machines (fast randomized neural networks). ![alt text][gpu] |
| 103 | +* [hpelm](https://github.com/akusok/hpelm) - High performance implementation of Extreme Learning Machines (fast randomized neural networks). <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
112 | 104 |
|
113 | 105 | <a name="ml-fm"></a> |
114 | 106 | ### Kernel Methods |
|
117 | 109 | * [tffm](https://github.com/geffy/tffm) - TensorFlow implementation of an arbitrary order Factorization Machine. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/tf_big2.png" alt="sklearn"> |
118 | 110 | * [liquidSVM](https://github.com/liquidSVM/liquidSVM) - An implementation of SVMs. |
119 | 111 | * [scikit-rvm](https://github.com/JamesRitchie/scikit-rvm) - Relevance Vector Machine implementation using the scikit-learn API. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
120 | | -* [ThunderSVM](https://github.com/Xtra-Computing/thundersvm) - A fast SVM Library on GPUs and CPUs. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
| 112 | +* [ThunderSVM](https://github.com/Xtra-Computing/thundersvm) - A fast SVM Library on GPUs and CPUs. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
121 | 113 |
|
122 | 114 | <a name="ml-gbt"></a> |
123 | 115 | ### Gradient Boosting |
124 | | -* [XGBoost](https://github.com/dmlc/xgboost) - Scalable, Portable and Distributed Gradient Boosting. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
125 | | -* [LightGBM](https://github.com/Microsoft/LightGBM) - A fast, distributed, high performance gradient boosting. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
126 | | -* [CatBoost](https://github.com/catboost/catboost) - An open-source gradient boosting on decision trees library. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
127 | | -* [ThunderGBM](https://github.com/Xtra-Computing/thundergbm) - Fast GBDTs and Random Forests on GPUs. <img height="20" src="img/sklearn_big.png" alt="sklearn"> ![alt text][gpu] |
| 116 | +* [XGBoost](https://github.com/dmlc/xgboost) - Scalable, Portable and Distributed Gradient Boosting. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
| 117 | +* [LightGBM](https://github.com/Microsoft/LightGBM) - A fast, distributed, high performance gradient boosting. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
| 118 | +* [CatBoost](https://github.com/catboost/catboost) - An open-source gradient boosting on decision trees library. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
| 119 | +* [ThunderGBM](https://github.com/Xtra-Computing/thundergbm) - Fast GBDTs and Random Forests on GPUs. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
128 | 120 |
|
129 | 121 | <a name="dl"></a> |
130 | 122 | ## Deep Learning |
|
172 | 164 |
|
173 | 165 | <a name="dl-mxnet"></a> |
174 | 166 | ### MXNet |
175 | | -* [MXNet](https://github.com/apache/incubator-mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler. ![alt text][mx] |
176 | | -* [Gluon](https://github.com/gluon-api/gluon-api) - A clear, concise, simple yet powerful and efficient API for deep learning (now included in MXNet). ![alt text][mx] |
177 | | -* [MXbox](https://github.com/Lyken17/mxbox) - Simple, efficient and flexible vision toolbox for mxnet framework. ![alt text][mx] |
178 | | -* [gluon-cv](https://github.com/dmlc/gluon-cv) - Provides implementations of the state-of-the-art deep learning models in computer vision. ![alt text][mx] |
179 | | -* [gluon-nlp](https://github.com/dmlc/gluon-nlp) - NLP made easy. ![alt text][mx] |
180 | | -* [Xfer](https://github.com/amzn/xfer) - Transfer Learning library for Deep Neural Networks. ![alt text][mx] |
181 | | -* [MXNet](https://github.com/ROCmSoftwarePlatform/mxnet) - HIP Port of MXNet. ![alt text][mx] <img height="20" src="img/amd_big.png" alt="Possible to run on AMD GPU"> |
| 167 | +* [MXNet](https://github.com/apache/incubator-mxnet) - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler. <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 168 | +* [Gluon](https://github.com/gluon-api/gluon-api) - A clear, concise, simple yet powerful and efficient API for deep learning (now included in MXNet). <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 169 | +* [MXbox](https://github.com/Lyken17/mxbox) - Simple, efficient and flexible vision toolbox for mxnet framework. <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 170 | +* [gluon-cv](https://github.com/dmlc/gluon-cv) - Provides implementations of the state-of-the-art deep learning models in computer vision. <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 171 | +* [gluon-nlp](https://github.com/dmlc/gluon-nlp) - NLP made easy. <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 172 | +* [Xfer](https://github.com/amzn/xfer) - Transfer Learning library for Deep Neural Networks. <img height="20" src="img/R_big.png" alt="MXNet based"> |
| 173 | +* [MXNet](https://github.com/ROCmSoftwarePlatform/mxnet) - HIP Port of MXNet. <img height="20" src="img/R_big.png" alt="MXNet based"> <img height="20" src="img/amd_big.png" alt="Possible to run on AMD GPU"> |
182 | 174 |
|
183 | 175 | <!--a name="dl-cntk"></a--> |
184 | 176 | <a name="dl-chainer"></a> |
|
217 | 209 | <a name="dm-cont"></a> |
218 | 210 | ### Data Containers |
219 | 211 | * [pandas](https://pandas.pydata.org/pandas-docs/stable/) - Powerful Python data analysis toolkit. |
220 | | -* [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> ![alt text][gpu] |
| 212 | +* [cuDF](https://github.com/rapidsai/cudf) - GPU DataFrame Library. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
221 | 213 | * [blaze](https://github.com/blaze/blaze) - NumPy and pandas interface to Big Data. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
222 | 214 | * [pandasql](https://github.com/yhat/pandasql) - Allows you to query pandas DataFrames using SQL syntax. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
223 | 215 | * [pandas-gbq](https://github.com/pydata/pandas-gbq) - pandas Google Big Query. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
224 | 216 | * [xpandas](https://github.com/alan-turing-institute/xpandas) - Universal 1d/2d data containers with Transformers functionality for data analysis by [The Alan Turing Institute](https://www.turing.ac.uk/). |
225 | 217 | * [pysparkling](https://github.com/svenkreiss/pysparkling) - A pure Python implementation of Apache Spark's RDD and DStream interfaces. <img height="20" src="img/spark_big.png" alt="Apache Spark based"> |
226 | 218 | * [Arctic](https://github.com/manahl/arctic) - High performance datastore for time series and tick data. |
227 | | -* [datatable](https://github.com/h2oai/datatable) - Data.table for Python. ![alt text][r] |
| 219 | +* [datatable](https://github.com/h2oai/datatable) - Data.table for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
228 | 220 | * [koalas](https://github.com/databricks/koalas) - pandas API on Apache Spark. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
229 | 221 | * [modin](https://github.com/modin-project/modin) - Speed up your pandas workflows by changing a single line of code. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
230 | 222 | * [swifter](https://github.com/jmcarpenter2/swifter) - A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner. |
|
234 | 226 | * [pdpipe](https://github.com/shaypal5/pdpipe) - Easy pipelines for pandas DataFrames. |
235 | 227 | * [SSPipe](https://sspipe.github.io/) - Python pipe (|) operator with support for DataFrames and Numpy and Pytorch. |
236 | 228 | * [pandas-ply](https://github.com/coursera/pandas-ply) - Functional data manipulation for pandas. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
237 | | -* [Dplython](https://github.com/dodger487/dplython) - Dplyr for Python. ![alt text][r] |
| 229 | +* [Dplython](https://github.com/dodger487/dplython) - Dplyr for Python. <img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
238 | 230 | * [sklearn-pandas](https://github.com/scikit-learn-contrib/sklearn-pandas) - pandas integration with sklearn. <img height="20" src="img/sklearn_big.png" alt="sklearn"> <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
239 | 231 | * [Dataset](https://github.com/analysiscenter/dataset) - Helps you conveniently work with random or sequential batches of your data and define data processing. |
240 | 232 | * [pyjanitor](https://github.com/ericmjl/pyjanitor) - Clean APIs for data cleaning. <img height="20" src="img/pandas_big.png" alt="pandas compatible"> |
|
285 | 277 | * [FairML](https://github.com/adebayoj/fairml) - FairML is a python toolbox auditing the machine learning models for bias. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
286 | 278 | * [L2X](https://github.com/Jianbo-Lab/L2X) - Code for replicating the experiments in the paper *Learning to Explain: An Information-Theoretic Perspective on Model Interpretation*. |
287 | 279 | * [PDPbox](https://github.com/SauceCat/PDPbox) - Partial dependence plot toolbox. |
288 | | -* [pyBreakDown](https://github.com/MI2DataLab/pyBreakDown) - Python implementation of R package breakDown. <img height="20" src="img/sklearn_big.png" alt="sklearn">![alt text][r] |
| 280 | +* [pyBreakDown](https://github.com/MI2DataLab/pyBreakDown) - Python implementation of R package breakDown. <img height="20" src="img/sklearn_big.png" alt="sklearn"><img height="20" src="img/R_big.png" alt="R inspired/ported lib"> |
289 | 281 | * [PyCEbox](https://github.com/AustinRochford/PyCEbox) - Python Individual Conditional Expectation Plot Toolbox. |
290 | 282 | * [Skater](https://github.com/datascienceinc/Skater) - Python Library for Model Interpretation. |
291 | 283 | * [model-analysis](https://github.com/tensorflow/model-analysis) - Model analysis tools for TensorFlow. <img height="20" src="img/tf_big2.png" alt="sklearn"> |
|
298 | 290 | * [Netron](https://github.com/lutzroeder/Netron) - Visualizer for deep learning and machine learning models (no Python code, but visualizes models from most Python Deep Learning frameworks). |
299 | 291 | * [FlashLight](https://github.com/dlguys/flashlight) - Visualization Tool for your NeuralNetwork. |
300 | 292 | * [tensorboard-pytorch](https://github.com/lanpa/tensorboard-pytorch) - Tensorboard for pytorch (and chainer, mxnet, numpy, ...). |
301 | | -* [mxboard](https://github.com/awslabs/mxboard) - Logging MXNet data for visualization in TensorBoard. ![alt text][mx] |
| 293 | +* [mxboard](https://github.com/awslabs/mxboard) - Logging MXNet data for visualization in TensorBoard. <img height="20" src="img/R_big.png" alt="MXNet based"> |
302 | 294 |
|
303 | 295 | <a name="rl"></a> |
304 | 296 | ## Reinforcement Learning |
|
317 | 309 |
|
318 | 310 | <a name="bayes"></a> |
319 | 311 | ## Probabilistic Methods |
320 | | -* [pomegranate](https://github.com/jmschrei/pomegranate) - Probabilistic and graphical models for Python. ![alt text][cp] |
| 312 | +* [pomegranate](https://github.com/jmschrei/pomegranate) - Probabilistic and graphical models for Python. <img height="20" src="img/gpu_big.png" alt="GPU accelerated"> |
321 | 313 | * [pyro](https://github.com/uber/pyro) - A flexible, scalable deep probabilistic programming library built on PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
322 | 314 | * [ZhuSuan](http://zhusuan.readthedocs.io/en/latest/) - Bayesian Deep Learning. <img height="20" src="img/tf_big2.png" alt="sklearn"> |
323 | 315 | * [PyMC](https://github.com/pymc-devs/pymc) - Bayesian Stochastic Modelling in Python. |
|
339 | 331 | * [hsmmlearn](https://github.com/jvkersch/hsmmlearn) - A library for hidden semi-Markov models with explicit durations. |
340 | 332 | * [pyhsmm](https://github.com/mattjj/pyhsmm) - Bayesian inference in HSMMs and HMMs. |
341 | 333 | * [GPyTorch](https://github.com/cornellius-gp/gpytorch) - A highly efficient and modular implementation of Gaussian Processes in PyTorch. <img height="20" src="img/pytorch_big2.png" alt="PyTorch based/compatible"> |
342 | | -* [MXFusion](https://github.com/amzn/MXFusion) - Modular Probabilistic Programming on MXNet ![alt text][mx] |
| 334 | +* [MXFusion](https://github.com/amzn/MXFusion) - Modular Probabilistic Programming on MXNet <img height="20" src="img/R_big.png" alt="MXNet based"> |
343 | 335 | * [sklearn-crfsuite](https://github.com/TeamHG-Memex/sklearn-crfsuite) - A scikit-learn inspired API for CRFsuite. <img height="20" src="img/sklearn_big.png" alt="sklearn"> |
344 | 336 |
|
345 | 337 | <a name="gp"></a> |
|