This document discusses computationally intensive machine learning at large scale. It compares the algorithmic perspective of computer scientists with the statistical perspective of statisticians on big-data analysis. It describes three science applications that apply linear algebra techniques to large datasets: principal component analysis (PCA), non-negative matrix factorization (NMF), and CX decompositions. Experiments compare the performance of these techniques as implemented in Spark and in MPI on different high-performance computing (HPC) platforms. The results show that Spark can be 4-26x slower than optimized MPI codes. Proposed next steps include developing Alchemist to interface Spark and MPI more efficiently, and exploring communication-avoiding machine learning algorithms.
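For readers unfamiliar with the factorizations named above, the following is a minimal single-node sketch in NumPy of two of them, PCA and a CX decomposition. It is not the Spark or MPI code that the experiments benchmark. The function names `pca_scores` and `cx_decomposition` are hypothetical, and the CX variant shown keeps the columns with the largest leverage scores deterministically, which is an assumption made for simplicity; randomized variants instead sample columns with probability proportional to those scores.

```python
import numpy as np


def pca_scores(A, k):
    """Project the rows of A onto its top-k principal components."""
    centered = A - A.mean(axis=0)
    # Thin SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:k].T


def cx_decomposition(A, k, c):
    """Approximate A by C @ X, where C holds c actual columns of A.

    Assumption for this sketch: columns are chosen deterministically as
    the c columns with the largest rank-k leverage scores.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    # Leverage score of column j: squared norm of column j of the
    # top-k right singular vectors.
    leverage = np.sum(Vt[:k] ** 2, axis=0)
    cols = np.sort(np.argsort(leverage)[-c:])
    C = A[:, cols]
    # Least-squares optimal X for the chosen columns.
    X = np.linalg.pinv(C) @ A
    return cols, C, X
```

Calling `cx_decomposition(A, k, c)` returns the selected column indices along with `C` and `X`. Because `C` consists of real columns of the data, the factors stay interpretable in terms of the original measurements, which is the usual motivation for CX over PCA in scientific settings.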