(C) 2016-2025 by Damir Cavar
NLP-Lab at Indiana University.
The notebooks below are my personal course material, used in classes from 2016 to the present (2025). Feel free to use these notebooks in your own classes and tutorials, and let me know if you do.
- Ollama Communication from Jupyter Notebooks
- Anthropic / VoyageAI Embeddings
- OpenAI Embeddings
- BERT Embeddings
- Claude 3 Interaction using the Anthropic API
- Claude 4 Interaction using the Anthropic API
- GPT-4 interaction using the OpenAI API
- Gemini Communication via API
- N-gram Model for Text Generation
- Neo4j interaction
- Allegro Graph example
- RDFlib Graphs
- YAGO Knowledge Graph Endpoint
- Virtuoso Graph Server Endpoint
- Processing IAB Taxonomies
- Converting SEC CIKs to a Knowledge Graph
- Simple Transformer-based Text Classification
- Bayesian Classification for Machine Learning for Computational Linguistics
- Scikit-learn Logistic Regression 1
- Scikit-learn Logistic Regression 2
- Simple Feed Forward Network 1
- Simple Feed Forward Network 2
- spaCy Tutorial
- spaCy 3.x Tutorial: Transformers Spanish
- spaCy Named Entity Recognition Bootstrapping Model
- spaCy Model from CoNLL Data
- Train spaCy Model for Marathi (mr)
- Linear Algebra and Embeddings - spaCy
- NLTK: Texts and Frequencies - N-gram models and frequency profiles
- Parsing with NLTK
- Parsing with NLTK and Foma
- Categorial Grammar Parsing in NLTK
- Dependency Grammar in NLTK
- Document Classification Tutorial 1 - Amazon Reviews
- WordNet using NLTK
- WordNet and NLTK
- FrameNet in NLTK
- FrameNet Examples using NLTK
- PropBank in NLTK
- Machine Translation in Python 3 with NLTK
- N-gram Models from Text for Language Models
- Probabilistic Context-free Grammar (PCFG) Parsing using NLTK
- Python for Text Similarities 1
- Python Tutorial 1: Part-of-Speech Tagging 1
- Lexical Clustering
- Linear Algebra
- Neural Network Example with Keras
- Computing Finite State Automata
- Parallel Processing on Multiple Threads
- Perceptron Learning in Python
- Clustering with Scikit-learn
- Simple Language ID with N-grams
- Support Vector Machine (SVM) Classifier Example
- Scikit-Learn for Computational Linguists
- Tutorial: Tokens and N-grams
- Tutorial 1: Part-of-Speech Tagging 1
- Tutorial 2: Hidden Markov Models
- Word Sense Disambiguation
- Python examples and notes for Machine Learning for Computational Linguistics
See the licensing details in the individual documents and in the LICENSE file in the code folder.
The files in this folder are Jupyter-based tutorials for NLP, ML, AI in Python for classes I teach in Computational Linguistics, Natural Language Processing (NLP), Machine Learning (ML), and Artificial Intelligence (AI) at Indiana University.
If you find this material useful, please cite the author and source (that is, Damir Cavar and all the sources cited in the relevant notebooks). Please let me know if you have any suggestions for correcting or improving the notebooks, or for adding material and explanations.
The instructions below are somewhat outdated; I now use just JupyterLab. Follow the instructions here to set it up on different machine types and operating systems.
To run this material in Jupyter you need Python 3.x and Jupyter installed. You can save yourself some trouble by using the Anaconda Python 3.x distribution.
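As a quick sanity check before proceeding, you can verify that Python 3 is on your PATH and whether Jupyter is already installed (a sketch, not part of the original instructions):

```shell
# Print the Python 3 version; print a hint instead of failing
# if Jupyter is not installed yet.
python3 --version
jupyter --version 2>/dev/null || echo "Jupyter not found -- install it as described below"
```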
Clone the project folder using:
```
git clone https://github.com/dcavar/python-tutorial-for-ipython.git
```

Some of the notebooks contain code that requires various Python modules, sometimes in specific versions. Some of these installations can be complicated and problematic. I am working on a more detailed description of installation procedures and dependencies for each notebook. Stay tuned, this is coming soon.
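Since different notebooks depend on different modules, a small shell loop can report which ones are missing before you open a notebook. This is a sketch; the module names listed are examples, not a complete dependency list — adjust them for the notebook you plan to run:

```shell
# Report which of the listed Python modules are not importable.
# The module list is an example -- edit it for the notebook you plan to run.
for m in nltk spacy sklearn; do
  python3 -c "import $m" 2>/dev/null || echo "missing: $m"
done
```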
Jupyter is a great tool for computational publications, tutorials, and exercises. I set up my favorite components for Jupyter on Linux (for example Ubuntu) this way:
Assuming that some basic development tools are already installed, for example gcc, make, etc., I install the packages python3-pip and python3-dev:
```
sudo apt install python3-pip python3-dev
```

After that, I update the global system version of pip to the newest version:
```
sudo -H pip3 install -U pip
```

Then I install the newest Jupyter and JupyterLab modules globally, updating any previously installed versions:
```
sudo -H pip3 install -U jupyter jupyterlab
```

One module we should not forget is plotly:
```
sudo -H pip3 install -U plotly
```

Scala, Clojure, and Groovy are extremely interesting languages as well, and I love working with Apache Spark, so I install BeakerX too. BeakerX requires two additional Python modules, py4j and pandas, and presupposes an existing Java JDK version 8 or newer on the system. I install all the BeakerX-related packages:
```
sudo -H pip3 install -U py4j
sudo -H pip3 install -U pandas
sudo -H pip3 install -U beakerx
```

To configure and install all BeakerX components, I run:
```
sudo -H beakerx install
```

Some of the components I like to use require Node.js. On Ubuntu I usually add the newest Node.js as a PPA rather than via Ubuntu Snap; instructions on how to achieve that can be found here. To install Node.js on Ubuntu, simply run:
```
sudo apt install nodejs
```

The following commands add plugins and extensions to Jupyter globally:
```
sudo -H jupyter labextension install @jupyter-widgets/jupyterlab-manager
sudo -H jupyter labextension install @jupyterlab/plotly-extension
sudo -H jupyter labextension install beakerx-jupyterlab
```

Another useful package is Voilà, which allows you to turn Jupyter notebooks into standalone web applications. I install it using:
```
sudo -H pip3 install voila
```

Now the initial version of the platform is ready to go.
To start the Jupyter notebook viewer/editor on your local machine, change into the notebooks folder within the cloned project folder and run the following command:

```
jupyter notebook
```

A browser window should open, giving you full access to the notebooks.
Alternatively, check out the instructions on how to launch JupyterLab, BeakerX, etc.
Enjoy!