MLlib

Spark library for running machine learning algorithms. Supports a range of algorithms, including classification, regression, decision trees, recommendation, clustering and topic modelling, and handles iterative algorithms. As of Spark 2.0, the primary API is DataFrame (Spark SQL) based, with the original RDD-based API now in maintenance mode only. First introduced in Spark 0.8, having been developed in collaboration with the UC Berkeley MLbase project, and still under active development.

Technology Information

Type: Sub-Project
Parent Project: Apache Spark
Last Updated: August 2017

Blog Posts