.circleci
dev
docs
java-jsi-clus-fire
java-jsi-clus-fr
java-jsi-clus-pct-ts
java-jsi-clus-pct
java-jsi-clus-rm
java-jsi-streams-modeltree
java-jsi-streams-regressiontree
java-rapidminer-knn
java-rapidminer-naivebayes
ontology-jsi-neurodegenerative
python-anova
python-correlation-heatmap
python-distributed-kmeans
python-histograms
python-jsi-hedwig
python-jsi-hinmine
python-knn
python-linear-regression
python-longitudinal
python-sgd-regression
python-summary-statistics
python-tsne
r-3c
r-ggparci
r-heatmaply
r-linear-regression
scripts
.gitignore
.gitmodules
.pre-commit-config.yaml
Guidelines.md
README.md
after-git-clone.sh
after-update.sh
build.sh
cleanup.sh
setup.sh

Algorithm repository

This is the repository of algorithms for the MIP.

Algorithms, written in their native language (R, Matlab, Python, Java...), are each encapsulated in a Docker container that provides the runtime environment needed to execute them.

The environment variables provided to the Docker container are used as parameters to the function or algorithm to execute.

Currently, we expect the Docker containers to be autonomous:

they should connect to a database and retrieve the dataset to process
they should process the data, taking into account the parameters given as environment variables to the Docker container
they should store the results into the results database.
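The three steps above can be sketched as a container entrypoint. This is a minimal illustration, not the platform's actual code: the `PARAM_` prefix, the `variable` parameter, and the `dataset` and `job_result` table names are all hypothetical, and in-memory SQLite stands in for the real databases.

```python
import json
import sqlite3


def read_parameters(environ, prefix="PARAM_"):
    """Collect algorithm parameters from environment variables.

    The PARAM_ prefix is a hypothetical convention used only in this sketch.
    """
    return {
        name[len(prefix):].lower(): value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


def run(environ, input_db, results_db):
    """Fetch the dataset, process it, and store the result."""
    params = read_parameters(environ)
    column = params["variable"]  # hypothetical parameter name

    # 1. Retrieve the dataset to process (table name is an assumption).
    #    The column name is interpolated for brevity; validate it in real code.
    rows = input_db.execute(f'SELECT "{column}" FROM dataset').fetchall()
    values = [row[0] for row in rows if row[0] is not None]

    # 2. Process the data (here: a trivial mean).
    result = {"variable": column, "mean": sum(values) / len(values)}

    # 3. Store the result in the results database.
    results_db.execute("CREATE TABLE IF NOT EXISTS job_result (data TEXT)")
    results_db.execute("INSERT INTO job_result VALUES (?)", (json.dumps(result),))
    results_db.commit()
    return result


# Made-up data; inside a container, environ would be os.environ.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE dataset (age REAL)")
source.executemany("INSERT INTO dataset VALUES (?)", [(60.0,), (70.0,), (80.0,)])
target = sqlite3.connect(":memory:")
result = run({"PARAM_VARIABLE": "age"}, source, target)
print(result)
```

Passing the environment and both connections as arguments keeps the processing step testable outside Docker.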

Results should be stored in a format that is easy to share.

For algorithms providing statistical analysis or machine learning, we require the results to be in PFA format in its YAML or JSON form.
For algorithms providing visualisations, we support different formats, including Highcharts, Vis.js, PNG and SVG.
For algorithms providing tabular data, we expect a JSON output in this format: Tabular Data Resource
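For the tabular case, the sketch below builds a JSON document with inline data, assuming "Tabular Data Resource" refers to the Frictionless Data specification of that name. The resource name, column names, and values are made up for illustration.

```python
import json


def to_tabular_data_resource(name, fields, rows):
    """Wrap rows in a Tabular Data Resource descriptor with inline data.

    fields is a list of (column name, Table Schema type) pairs.
    """
    column_names = [field_name for field_name, _ in fields]
    return {
        "name": name,
        "profile": "tabular-data-resource",
        # Inline data as a list of objects keyed by column name.
        "data": [dict(zip(column_names, row)) for row in rows],
        "schema": {
            "fields": [{"name": n, "type": t} for n, t in fields],
        },
    }


resource = to_tabular_data_resource(
    "summary",  # hypothetical resource name
    [("group", "string"), ("count", "integer"), ("mean", "number")],
    [("a", 12, 71.5), ("b", 20, 68.0)],  # illustrative values only
)
print(json.dumps(resource, indent=2))
```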

List of algorithms

hbpmip/python-anova: Anova algorithm

This is a Python implementation of Anova.

hbpmip/python-correlation-heatmap: Correlation heatmap and PCA

Calculates a correlation heatmap; it only works for real variables. Run it on a single node or in distributed mode. In distributed mode, intermediate mode first calculates the covariance matrix on each node, then aggregate mode combines the statistics from the intermediate jobs and produces the final graph.
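One way to split this work between the two modes is to exchange sufficient statistics (counts, sums, and sums of products) rather than raw data. The sketch below illustrates that idea under this assumption; it is not taken from the python-correlation-heatmap code, and the function names and data are made up.

```python
import math


def intermediate(rows):
    """Per-node step: sufficient statistics for the covariance matrix."""
    d = len(rows[0])
    sums = [0.0] * d
    products = [[0.0] * d for _ in range(d)]
    for row in rows:
        for i in range(d):
            sums[i] += row[i]
            for j in range(d):
                products[i][j] += row[i] * row[j]
    return {"n": len(rows), "sums": sums, "products": products}


def aggregate(stats_list):
    """Combine per-node statistics and return the correlation matrix."""
    d = len(stats_list[0]["sums"])
    n = sum(s["n"] for s in stats_list)
    sums = [sum(s["sums"][i] for s in stats_list) for i in range(d)]
    products = [
        [sum(s["products"][i][j] for s in stats_list) for j in range(d)]
        for i in range(d)
    ]
    # Population covariance: cov(i, j) = E[x_i x_j] - E[x_i] E[x_j].
    cov = [
        [products[i][j] / n - (sums[i] / n) * (sums[j] / n) for j in range(d)]
        for i in range(d)
    ]
    # A constant column has zero variance and would divide by zero here.
    return [
        [cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(d)]
        for i in range(d)
    ]


node_a = [(1.0, 2.0), (2.0, 4.0)]  # made-up rows held by one node
node_b = [(3.0, 6.0), (4.0, 8.5)]  # made-up rows held by another node
corr = aggregate([intermediate(node_a), intermediate(node_b)])
print(corr)
```

Because the statistics are additive, aggregating two nodes gives the same matrix as running the intermediate step once on the pooled rows.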

python-distributed-kmeans

Implementation of distributed k-means clustering (https://github.com/MRN-Code/dkmeans) in Python. It uses Single-Shot Decentralized Lloyd (https://github.com/MRN-Code/dkmeans#single-shot-decentralized-lloyd).

Intermediate mode calculates clusters on a single node, while aggregate mode merges the clusters according to the least merging error (e.g. the smallest distance between centroids).
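The aggregate step can be pictured as a greedy merge of the centroids collected from all nodes. The sketch below takes "least merging error" to mean the smallest Euclidean distance between centroids, as in the example above, and merges by sample-weighted mean; both choices are assumptions, and this is not the dkmeans implementation.

```python
import math
from itertools import combinations


def merge_centroids(centroids, weights, k):
    """Greedily merge the two closest centroids until only k remain.

    centroids: points (tuples) gathered from all nodes.
    weights: number of samples assigned to each centroid.
    """
    centroids = [tuple(c) for c in centroids]
    weights = list(weights)
    while len(centroids) > k:
        # Pick the pair with the smallest Euclidean distance.
        i, j = min(
            combinations(range(len(centroids)), 2),
            key=lambda pair: math.dist(centroids[pair[0]], centroids[pair[1]]),
        )
        total = weights[i] + weights[j]
        # Replace the pair by its sample-weighted mean.
        merged = tuple(
            (a * weights[i] + b * weights[j]) / total
            for a, b in zip(centroids[i], centroids[j])
        )
        # j > i, so deleting j leaves index i valid.
        del centroids[j], weights[j]
        centroids[i], weights[i] = merged, total
    return centroids, weights


# Made-up local results from two nodes, each with two clusters.
local_centroids = [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0), (9.0, 9.0)]
local_weights = [10, 10, 30, 10]
merged_centroids, merged_weights = merge_centroids(local_centroids, local_weights, k=2)
print(merged_centroids, merged_weights)
```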

python-histograms

Calculates the histogram of a nominal or real variable, grouped by the nominal variables among the independent variables. Histogram edges are taken from the minValue and maxValue properties of the dependent variable. If these are not available, the edges are calculated dynamically from the dependent values (this does not work in distributed mode, though).
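Fixed edges matter in distributed mode presumably because per-node counts can only be summed when every node uses the same bins. The sketch below shows that idea with equal-width bins; the function names, the number of bins, and the data are hypothetical and not taken from python-histograms.

```python
def bin_edges(min_value, max_value, bins):
    """Equal-width bin edges between the declared minimum and maximum."""
    width = (max_value - min_value) / bins
    return [min_value + i * width for i in range(bins + 1)]


def local_histogram(values, groups, min_value, max_value, bins):
    """Count values per bin, separately for each group label."""
    width = (max_value - min_value) / bins
    counts = {}
    for value, group in zip(values, groups):
        if not min_value <= value <= max_value:
            continue  # outside the declared range
        # The last bin is closed on the right so max_value is counted.
        index = min(int((value - min_value) / width), bins - 1)
        counts.setdefault(group, [0] * bins)[index] += 1
    return counts


def merge_histograms(parts, bins):
    """Sum per-node counts; valid only because all nodes share the edges."""
    total = {}
    for part in parts:
        for group, counts in part.items():
            merged = total.setdefault(group, [0] * bins)
            for i, count in enumerate(counts):
                merged[i] += count
    return total


# Made-up data on two nodes, with minValue=0, maxValue=100 and 4 bins.
node_a = local_histogram([10, 30, 100], ["f", "m", "f"], 0, 100, 4)
node_b = local_histogram([60, 20], ["f", "m"], 0, 100, 4)
combined = merge_histograms([node_a, node_b], 4)
print(bin_edges(0, 100, 4), combined)
```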

python-jsi-hedwig

Hedwig method for semantic subgroup discovery (https://github.com/anzev/hedwig).

python-jsi-hinmine

The HINMINE algorithm for network-based propositionalization is a data analysis algorithm based on network analysis methods.

The input for the algorithm is a data set containing instances with real-valued features. The purpose of the algorithm is to construct a new set of features for further analysis by other data mining algorithms. The algorithm outputs a data set with features, generated for each data instance in the input data set. The features represent how close a given instance is to the other instances in the data set. The closeness of instances is measured using the PageRank algorithm, calculated on a network constructed from instance similarities.

python-knn

Implementation of the k-nearest neighbors algorithm (https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) in Python.

Run it on a single node or in distributed mode.

python-linear-regression

Python implementation of multivariate linear regression. It supports both continuous and categorical independent variables. Run it on a single node or in distributed mode. It also includes a Python implementation of logistic regression of one class versus the others, for which only single-node mode is supported.

python-sgd-regression

This is a Python implementation built on scikit-learn estimators (http://scikit-learn.org/stable/modules/scaling_strategies.html) that use the partial_fit method for distributed learning.

Implemented methods:

linear_model - calls SGDRegressor or SGDClassifier
neural_network - calls MLPRegressor or MLPClassifier
naive_bayes - calls MixedNB (mix of GaussianNB and MultinomialNB), only works for classification tasks
gradient_boosting - calls GradientBoostingRegressor or GradientBoostingClassifier, does not support distributed training
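The partial_fit pattern lets a model be updated batch by batch, so each node can contribute its data without the batches being pooled. The sketch below shows the pattern with a tiny pure-Python stand-in; `TinySGDRegressor` is hypothetical and is not scikit-learn's SGDRegressor, and how python-sgd-regression passes model state between nodes is not described here.

```python
class TinySGDRegressor:
    """Hypothetical one-feature stand-in for an estimator with partial_fit."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weight = 0.0

    def partial_fit(self, xs, ys):
        """Update the weight from one batch, keeping what earlier batches taught."""
        for x, y in zip(xs, ys):
            error = y - self.weight * x
            self.weight += self.learning_rate * error * x
        return self

    def predict(self, xs):
        return [self.weight * x for x in xs]


# Each tuple stands for the data held by one node (made-up values, y = 2x).
node_batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0], [6.0])]

model = TinySGDRegressor()
for _ in range(20):  # several passes over all nodes
    for xs, ys in node_batches:
        model.partial_fit(xs, ys)

print(model.weight)
```

Estimators without an incremental update need the whole training set at once, which is consistent with gradient_boosting not supporting distributed training.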

python-summary-statistics

Calculates various summary statistics for the entire dataset and for all subgroups created by combining all possible values of the nominal covariates. Run it on a single node or in distributed mode.
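The subgroup construction can be sketched with a Cartesian product over the covariate values. This is an illustration under assumptions: the chosen statistics, the `summary_statistics` signature, and the sample rows are made up, and the actual algorithm may report more statistics or further groupings.

```python
from itertools import product
from statistics import mean


def summary(values):
    """A few basic statistics for one group of values."""
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def summary_statistics(rows, variable, nominal_levels):
    """Summarise `variable` overall and for every combination of levels.

    rows: list of dicts; nominal_levels: {covariate: [possible values]}.
    """
    results = {"all": summary([row[variable] for row in rows])}
    names = list(nominal_levels)
    for combo in product(*(nominal_levels[name] for name in names)):
        subset = [
            row[variable]
            for row in rows
            if all(row[name] == value for name, value in zip(names, combo))
        ]
        if subset:  # skip combinations with no matching rows
            results[combo] = summary(subset)
    return results


rows = [  # made-up records
    {"age": 60.0, "sex": "f", "smoker": "yes"},
    {"age": 70.0, "sex": "f", "smoker": "no"},
    {"age": 80.0, "sex": "m", "smoker": "no"},
    {"age": 90.0, "sex": "m", "smoker": "no"},
]
stats = summary_statistics(rows, "age", {"sex": ["f", "m"], "smoker": ["yes", "no"]})
print(stats)
```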

python-tsne

python-tsne is a wrapper for the A-tSNE algorithm developed by N. Pezzotti. The underlying algorithm is an improvement on the Barnes-Hut tSNE (http://lvdmaaten.github.io/publications/papers/JMLR_2014.pdf) using an approximated k-nearest neighbor calculation.

Acknowledgements

This work has been funded by the European Union Seventh Framework Program (FP7/2007-2013) under grant agreement no. 604102 (HBP).

This work is part of SP8 of the Human Brain Project (SGA1).
