Spark MLlib is a machine learning library designed for parallel execution on clusters, featuring algorithms for various learning tasks like classification, regression, and clustering. It consists of two packages: spark.mllib, based on RDDs, and the recommended spark.ml, which utilizes DataFrames for greater flexibility. MLlib supports a range of supervised and unsupervised algorithms, along with model-based collaborative filtering for recommender systems using the ALS algorithm.