This article walks through how to use the popular XGBoost library for learning-to-rank (LTR) problems.
The most common use cases of LTR are search engines and recommender systems. The ultimate goal of ranking is to order items in a meaningful way.
In this article, we will apply XGBoost to movie recommendations.
When I started working on LTR, my first question was: what is the difference between traditional machine learning and ranking problems? Here is what I found. In a traditional machine learning problem, each instance has a single target class or value. For example, in a churn prediction problem, you have a feature set for each customer and the corresponding class, and the output is a customer id with a predicted class or probability score. In LTR, we don’t have a single class or value per instance. Instead, each instance comes with multiple items and their ground-truth relevance values, and the output is an optimal ordering of those items. For example, given a user’s past interactions with items, our aim is to build a model that predicts the best ordering of items for that user.
Now it’s time to get into the coding part. For simplicity, I’ll use the MovieLens¹ small dataset. You can download the dataset using the link below.
Let’s load the dataset and do some basic preprocessing.
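Here is a minimal loading sketch, assuming the ml-latest-small archive (ratings.csv and movies.csv) has been extracted into the working directory:

```python
import pandas as pd

# Assumes the ml-latest-small archive is extracted in the working directory.
ratings = pd.read_csv("ratings.csv")   # userId, movieId, rating, timestamp
movies = pd.read_csv("movies.csv")     # movieId, title, genres

# Convert the Unix timestamp (seconds) into a datetime for time-based features.
ratings["datetime"] = pd.to_datetime(ratings["timestamp"], unit="s")

df = ratings.merge(movies, on="movieId", how="left")
print(df.shape)
df.head()
```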
In this dataset, we have 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users.
Let’s quickly look at the rating column.
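For instance, a quick look at the rating distribution and the rating volume over time (a sketch using pandas and matplotlib):

```python
import matplotlib.pyplot as plt

# Distribution of the rating values (0.5 to 5.0 in steps of 0.5).
df["rating"].value_counts().sort_index().plot(kind="bar")
plt.xlabel("rating")
plt.ylabel("count")
plt.show()

# Number of ratings per day, to see how interaction volume varies over time.
df.set_index("datetime")["rating"].resample("D").count().plot()
plt.ylabel("ratings per day")
plt.show()
```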
After looking at the above plots, I decided to add time-based (day-level) features for the modeling, along with user-level and item-level features. For example, for a movie “X”, I compute the total number of users who interacted with it and the counts of 5, 4, 3, 2, and 1-star ratings it received. I also add the number of ratings received per day and the number of ratings received after 5 PM.
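A rough sketch of such aggregations is shown below; the exact feature set and column names are my assumptions, intended only to illustrate the idea:

```python
# Item-level features: interaction counts and per-star counts for each movie.
item_feats = df.groupby("movieId").agg(
    item_n_users=("userId", "nunique"),
    item_n_ratings=("rating", "count"),
    item_mean_rating=("rating", "mean"),
)
for star in [1, 2, 3, 4, 5]:
    item_feats[f"item_{star}_star"] = (
        df[df["rating"] == star].groupby("movieId")["rating"].count()
    )

# Time-based features: average ratings per day and ratings submitted after 5 PM.
df["date"] = df["datetime"].dt.date
daily = df.groupby(["movieId", "date"])["rating"].count()
item_feats["item_ratings_per_day"] = daily.groupby("movieId").mean()
item_feats["item_after_5pm"] = (
    df[df["datetime"].dt.hour >= 17].groupby("movieId")["rating"].count()
)
item_feats = item_feats.fillna(0)

# User-level features: activity and average rating per user.
user_feats = df.groupby("userId").agg(
    user_n_ratings=("rating", "count"),
    user_mean_rating=("rating", "mean"),
)

df = (
    df.merge(item_feats.reset_index(), on="movieId")
      .merge(user_feats.reset_index(), on="userId")
)
```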
Let’s split the dataset into train and test sets. I’ll use the older data for training and the most recent data to evaluate the model.
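A simple time-based split can look like the following; the 80/20 cutoff is an assumption:

```python
# Use the oldest 80% of interactions for training, the newest 20% for testing.
cutoff = df["timestamp"].quantile(0.8)
train = df[df["timestamp"] <= cutoff].copy()
test = df[df["timestamp"] > cutoff].copy()

# XGBRanker expects rows belonging to the same group (user) to be contiguous.
train = train.sort_values("userId")
test = test.sort_values("userId")
print(train.shape, test.shape)
```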
Now it’s time to create the model inputs. Since a ranking model differs from a traditional supervised model, we have to pass some additional information to it. We will use xgboost’s XGBRanker; let’s focus on its .fit method. Below is the docstring for XGBRanker().fit().
Signature: model.fit(X, y, group, sample_weight=None, eval_set=None, sample_weight_eval_set=None, eval_group=None, eval_metric=None, early_stopping_rounds=None, verbose=False, xgb_model=None, callbacks=None)
Docstring: Fit the gradient boosting model
X : array_like
    Feature matrix
y : array_like
    Labels
group : array_like
    Group size of training data
sample_weight : array_like
    Group weights
    Note: weights are per-group for ranking tasks. In a ranking task, one weight is assigned to each group (not to each data point). This is because we only care about the relative ordering of data points within each group, so it does not make sense to assign weights to individual data points.
As per the docstring above, we have to pass a group array for both the training and test samples. So the question is how to create the group array in ranking models. I have seen many people struggle to understand this group parameter.
In simple terms, the group parameter indicates the number of interactions per user. In the example below, you can see that user 1 has interacted with two items (11 and 12), so the group size for user 1 is 2. Furthermore, the length of the group array should equal the number of unique users in the dataset, and the sum of the group sizes should equal the total number of records in the dataset. In this example, the group parameter is [2, 1, 4].
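As a small illustration with hypothetical data (three users with 2, 1, and 4 interactions):

```python
import pandas as pd

toy = pd.DataFrame({
    "userId":  [1, 1, 2, 3, 3, 3, 3],
    "movieId": [11, 12, 21, 31, 32, 33, 34],
})

# One group size per user, in the order the users appear in the sorted data.
group = toy.groupby("userId", sort=False).size().to_list()
print(group)                    # [2, 1, 4]
print(sum(group) == len(toy))   # sum of group sizes == number of rows
```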
Let’s create model inputs. We can use the below code for that.
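A sketch of building X, y, and the group arrays from the train and test frames above (the feature columns are the ones assumed in the earlier feature-engineering sketch):

```python
feature_cols = [
    "item_n_users", "item_n_ratings", "item_mean_rating",
    "item_1_star", "item_2_star", "item_3_star", "item_4_star", "item_5_star",
    "item_ratings_per_day", "item_after_5pm",
    "user_n_ratings", "user_mean_rating",
]

X_train, y_train = train[feature_cols], train["rating"]
X_test, y_test = test[feature_cols], test["rating"]

# One group size per user; the data must already be sorted by userId.
group_train = train.groupby("userId", sort=False).size().to_list()
group_test = test.groupby("userId", sort=False).size().to_list()
```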
Now we have the train and test inputs to feed into the model. It’s time to train and evaluate it. Before doing that, I have a few terms to explain for the article’s completeness.
When building a model, measuring the quality of its predictions is essential. What measures are available for evaluating recommendation models? There are a few, but the most common are Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP). Here I will use NDCG as the evaluation metric.

NDCG is an enhanced version of Cumulative Gain (CG). In CG, the order of the recommendations does not matter: if the results contain relevant items in any order, CG gives a high value, suggesting the predictions are good. In the real world, that is not enough; the most relevant items should come first. To achieve this, we penalize low-relevance items that appear early in the results, which is what Discounted Cumulative Gain (DCG) does. However, DCG still suffers when different users have different numbers of items or interactions. That is where NDCG comes into play: it normalizes DCG so scores are comparable across users.
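To make the metric concrete, here is a small sketch that computes DCG and NDCG for a single ranked list, using the common logarithmic discount:

```python
import numpy as np

def dcg(relevances):
    """Discounted cumulative gain of a list of relevance values in ranked order."""
    relevances = np.asarray(relevances, dtype=float)
    positions = np.arange(1, len(relevances) + 1)
    return np.sum(relevances / np.log2(positions + 1))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Relevance of items in the order the model ranked them:
print(ndcg([3, 2, 3, 0, 1]))   # close to 1.0 -> good ordering
print(ndcg([0, 1, 2, 3, 3]))   # same items, worst ordering -> lower NDCG
```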
Now we can move to the model part.
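A training sketch using XGBRanker follows; the objective and hyperparameters are illustrative assumptions rather than tuned values:

```python
import xgboost as xgb

model = xgb.XGBRanker(
    objective="rank:pairwise",   # assumption: a pairwise ranking objective
    learning_rate=0.1,
    n_estimators=200,
    max_depth=6,
    eval_metric="ndcg@10",
)

model.fit(
    X_train, y_train,
    group=group_train,
    eval_set=[(X_test, y_test)],
    eval_group=[group_test],
    verbose=True,
)
```

With rank:pairwise the model minimizes pairwise ordering errors within each group; rank:ndcg is another built-in objective if we want to optimize NDCG more directly.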
Now we can generate some predictions.
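XGBRanker.predict returns a relevance score per row, so a ranked list is obtained by sorting each user’s candidate items by that score. A sketch (the recommend helper is a hypothetical convenience function):

```python
def recommend(model, frame, feature_cols, top_k=10):
    """Return the top_k items per user, ranked by the predicted score."""
    scored = frame.copy()
    scored["score"] = model.predict(scored[feature_cols])
    return (
        scored.sort_values(["userId", "score"], ascending=[True, False])
              .groupby("userId")
              .head(top_k)[["userId", "movieId", "score"]]
    )

recommendations = recommend(model, test, feature_cols)
print(recommendations.head(10))
```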
Here are some generated predictions.
It’s always good to evaluate the coverage of your recommender model. Coverage measures the percentage of items seen in training that actually appear in the recommendations generated for the test set. The higher the coverage, the better the model. In some cases, a model keeps recommending popular items to maximize NDCG and MAP@k. I ran into this issue when I was working on starpoints product recommendations. When we have doubts about our evaluation metric, we can quickly check the coverage of the model. This model achieved only around 2% coverage, which indicates it needs further improvement.
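A simple way to compute this coverage metric, assuming the recommendations frame produced in the prediction step:

```python
# Fraction of items seen in training that show up in the recommendations
# generated for the test users.
train_items = set(train["movieId"])
recommended_items = set(recommendations["movieId"])

coverage = len(recommended_items & train_items) / len(train_items)
print(f"coverage: {coverage:.1%}")
```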
Additionally, we can plot feature importance as follows.
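For example, with xgboost’s built-in plotting helper:

```python
from xgboost import plot_importance
import matplotlib.pyplot as plt

# Rank features by total gain contributed across all splits.
plot_importance(model, importance_type="gain", max_num_features=15)
plt.show()
```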
In this article, we went through the basics of learning-to-rank problems, how to model ranking problems, and a few tips for evaluating recommendation models. Although the article showed how to use xgboost for product ranking, the same approach can be applied to other ranking problems.
¹ GroupLens: F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19. https://doi.org/10.1145/2827872
Thanks for reading. Connect with me on LinkedIn.