krasserm
diff --git a/‎README.md‎
Lines changed: 1 addition & 1 deletion b/‎README.md‎
Lines changed: 1 addition & 1 deletion
diff --git a/‎gaussian-processes/gaussian_processes.ipynb‎
Lines changed: 1 addition & 1 deletion b/‎gaussian-processes/gaussian_processes.ipynb‎
Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ some of the notebooks via [nbviewer](https://nbviewer.jupyter.org/) to ensure a
  scikit-learn and GPy ([requirements.txt](gaussian-processes/requirements.txt)). 
 
 - [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/krasserm/bayesian-machine-learning/blob/dev/gaussian-processes/gaussian_processes_classification.ipynb)
- [Gaussian processes for classification](https://nbviewer.jupyter.org/github/krasserm/bayesian-machine-learning/blob/dev/gaussian-processes/gaussian_processes_classification.ipynb). 
+ [Gaussian processes for classification](https://github.com/krasserm/bayesian-machine-learning/blob/dev/gaussian-processes/gaussian_processes_classification.ipynb). 
  Introduction to Gaussian processes for classification. Example implementations with plain NumPy/SciPy as well as with 
  scikit-learn ([requirements.txt](gaussian-processes/requirements.txt)). 
 
 
@@ -42,7 +42,7 @@
  "\n",
  "Observations that are closer to $\\mathbf{x}$ have a higher weight than observations that are further away. Weights are computed from $\\mathbf{x}$ and observed $\\mathbf{x}_i$ with a kernel $\\kappa$. A special case is k-nearest neighbors (KNN) where the $k$ closest observations have a weight $1/k$, and all others have weight $0$. Non-parametric methods often need to process all training data for prediction and are therefore slower at inference time than parametric methods. On the other hand, training is usually faster as non-parametric models only need to remember training data. \n",
  "\n",
- "Another example of non-parametric methods are [Gaussian processes](https://en.wikipedia.org/wiki/Gaussian_process) (GPs). Instead of inferring a distribution over the parameters of a parametric function Gaussian processes can be used to infer a distribution over functions directly. A Gaussian process defines a prior over functions. After having observed some function values it can be converted into a posterior over functions. Inference of continuous function values in this context is known as GP regression but GPs can also be used for [classification](https://nbviewer.jupyter.org/github/krasserm/bayesian-machine-learning/blob/dev/gaussian-processes/gaussian_processes_classification.ipynb). \n",
+ "Another example of non-parametric methods are [Gaussian processes](https://en.wikipedia.org/wiki/Gaussian_process) (GPs). Instead of inferring a distribution over the parameters of a parametric function Gaussian processes can be used to infer a distribution over functions directly. A Gaussian process defines a prior over functions. After having observed some function values it can be converted into a posterior over functions. Inference of continuous function values in this context is known as GP regression but GPs can also be used for [classification](https://github.com/krasserm/bayesian-machine-learning/blob/dev/gaussian-processes/gaussian_processes_classification.ipynb). \n",
  "\n",
  "A Gaussian process is a [random process](https://en.wikipedia.org/wiki/Stochastic_process) where any point $\\mathbf{x} \\in \\mathbb{R}^d$ is assigned a random variable $f(\\mathbf{x})$ and where the joint distribution of a finite number of these variables $p(f(\\mathbf{x}_1),...,f(\\mathbf{x}_N))$ is itself Gaussian:\n",
  "\n",