Alternating Least Squares (ALS) System
An Alternating Least Squares (ALS) System is a least squares system that implements an alternating least squares algorithm to solve an alternating least squares task (typically supporting a collaborative filtering task).
- Context:
- It can range from being a Single-CPU ALS System to being a Distributed ALS System.
- It can (often) be a Matrix Factorization-based Recommender System.
- …
- Example(s):
- Counter-Example(s):
- See: SVD Matrix Factorization, Weighted Regularized Matrix Factorization.
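As a minimal illustration of the alternating update that such a system implements, the following is a hedged single-machine NumPy sketch. It is an assumption-laden simplification for illustration only: the ratings matrix is treated as dense (zero entries count as observed ratings), unlike the weighted and implicit-feedback variants used in practice, and the function name and toy data are invented here.

import numpy as np

def als(R, rank=10, num_iterations=10, lam=0.01):
    # Approximate R (users x items) as U @ V.T by alternately holding one
    # factor fixed and solving a ridge-regression problem for the other.
    num_users, num_items = R.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((num_users, rank)) * 0.1
    V = rng.standard_normal((num_items, rank)) * 0.1
    reg = lam * np.eye(rank)
    for _ in range(num_iterations):
        # Fix V, solve (V^T V + lam*I) u = V^T r_u for every user row u.
        U = np.linalg.solve(V.T @ V + reg, V.T @ R.T).T
        # Fix U, solve (U^T U + lam*I) v = U^T r_i for every item column i.
        V = np.linalg.solve(U.T @ U + reg, U.T @ R).T
    return U, V

# Toy example: a dense 4x5 ratings matrix (hypothetical values).
R = np.array([[5, 3, 0, 1, 4],
              [4, 0, 0, 1, 3],
              [1, 1, 0, 5, 4],
              [0, 1, 5, 4, 0]], dtype=float)
U, V = als(R, rank=2, num_iterations=20, lam=0.1)
print(np.round(U @ V.T, 2))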
References
from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel, Rating
# Load and parse the data
data = sc.textFile("data/mllib/als/test.data")
ratings = data.map(lambda l: l.split(','))\
    .map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))

# Build the recommendation model using Alternating Least Squares
rank = 10
numIterations = 10
model = ALS.train(ratings, rank, numIterations)

# Evaluate the model on training data
testdata = ratings.map(lambda p: (p[0], p[1]))
predictions = model.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))
ratesAndPreds = ratings.map(lambda r: ((r[0], r[1]), r[2])).join(predictions)
MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).mean()
print("Mean Squared Error = " + str(MSE))

# Save and load model
model.save(sc, "target/tmp/myCollaborativeFilter")
sameModel = MatrixFactorizationModel.load(sc, "target/tmp/myCollaborativeFilter")
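Assuming the SparkContext sc and the model trained in the snippet above, a possible follow-up sketch produces top-N recommendations with MatrixFactorizationModel.recommendProducts; the user id 1 and the count 5 are illustrative values.

# Recommend the 5 highest-scoring products for user 1;
# each result is a Rating(user, product, rating) with a predicted score.
topRecs = model.recommendProducts(1, 5)
for rec in topRecs:
    print(rec.user, rec.product, rec.rating)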
2017
- http://blog.datumbox.com/drilling-into-sparks-als-recommendation-algorithm/
- QUOTE: The ALS algorithm introduced by Hu et al., is a very popular technique used in Recommender System problems, especially when we have implicit datasets (for example clicks, likes etc). It can handle large volumes of data reasonably well and we can find many good implementations in various Machine Learning frameworks. Spark includes the algorithm in the MLlib component which has recently been refactored to improve the readability and the architecture of the code.
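As a hedged illustration of the implicit-feedback case mentioned in the quote, a minimal spark.mllib sketch using ALS.trainImplicit, which applies the confidence weighting of Hu et al. The input path data/clicks.csv, the user,item,count column layout, and the parameter values are assumptions for illustration; an existing SparkContext sc is also assumed.

from pyspark.mllib.recommendation import ALS, Rating

# Hypothetical implicit-feedback data: user,item,count rows (e.g. click counts).
events = sc.textFile("data/clicks.csv") \
    .map(lambda l: l.split(',')) \
    .map(lambda l: Rating(int(l[0]), int(l[1]), float(l[2])))

# alpha scales how strongly observed interactions are trusted as positive signals.
implicitModel = ALS.trainImplicit(events, rank=10, iterations=10, lambda_=0.01, alpha=40.0)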
2017
- http://mahout.apache.org/release-notes/Apache-Mahout-0.13.0-Release-Notes.pdf
- QUOTE: Mahout has historically focused on highly scalable algorithms, and since moving on from MapReduce-based jobs, Mahout now includes some Mahout-Samsara based implementations:
- Distributed Alternating Least Squares (ALS)
2016
- http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
- QUOTE: In this chapter, we will use MLlib to make personalized movie recommendations tailored for you. …
… In particular, we implement the alternating least squares (ALS) algorithm to learn these latent factors.
…
import org.apache.spark.mllib.recommendation.{ALS, Rating, MatrixFactorizationModel}
… We will use MLlib’s ALS to train a MatrixFactorizationModel, which takes a RDD[Rating] object as input. ALS has training parameters such as rank for matrix factors and regularization constants.
object ALS {
  def train(ratings: RDD[Rating], rank: Int, iterations: Int, lambda: Double): MatrixFactorizationModel = {
    // ...
  }
}
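For comparison with the Python snippet earlier on this page, a hedged sketch of the corresponding call through the spark.mllib Python API, with illustrative values for the rank and regularization parameters the quote refers to (the ratings RDD is assumed to be built as shown above):

# 10 latent factors, regularization constant lambda_ = 0.01, 10 iterations (illustrative values).
model = ALS.train(ratings, rank=10, iterations=10, lambda_=0.01)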