The Two-Edged Sword
In the last post, we discussed how to target users for a given app using personas. In this post, we look at the other side: given a user, how can we find the best set of apps based on their recent history? The two use cases are connected because both rely on users’ behavior, but they differ in the AI (Artificial Intelligence) approach and in the entity under consideration.
In user targeting, we talked about the Type II error (false negative) in statistics. When recommending the next apps a user might download, however, we should focus on not recommending apps the user won’t like, i.e. on reducing the Type I error (false positive). So let us treat them as two separate use cases for the business as well as for data science.
Next Apps to Download
Through extensive research and data analysis, we observed that the top 5 most popular apps are common across device models, but beyond that the differences are significant. The top 50 apps used are 60% common across device models, and the order of preference is only moderately correlated, at 0.56. Showing everyone the same set of apps won’t make sense, since these differences become even more pronounced at the user level. To learn the set of apps a user is likely to download next, we can divide the problem into three parts:
- A batch recommender model, based on the user’s app usage over the last day.
- A real-time sequential model that adjusts the batch recommendations based on more recent user behavior.
- A cold-start module that serves recommendations when no information is available, i.e. for a new user.
The Batch Recommender Model
For the batch model, the recommender system can be reduced to two steps:
- Look for users who share the same app usage/install pattern as the active user (the user for whom the prediction is made).
- Use the relevance from those like-minded users found in step 1 to calculate a prediction for the active user.
This falls under the category of user-based collaborative filtering. We can train an ALS (alternating least squares) matrix factorization model daily on recent data; the output is a set of latent features for users and apps in the same vector space. The dot product of a user vector and an app vector gives the relevance score for that user and app.
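To make the factorization step concrete, here is a minimal NumPy sketch of ALS on a toy relevance matrix. It is an illustration under simplifying assumptions, not a production setup: the matrix is dense, every cell (including zeros) is treated as observed, and the names `train_als`, `recommend` and the toy values are hypothetical.

```python
import numpy as np

def train_als(relevance, n_factors=2, reg=0.1, n_iters=20, seed=0):
    """Factorize a dense user x app relevance matrix with alternating least squares.

    Simplification: every cell is treated as observed, so 0 means "no relevance".
    """
    rng = np.random.default_rng(seed)
    n_users, n_apps = relevance.shape
    user_f = rng.normal(scale=0.1, size=(n_users, n_factors))
    app_f = rng.normal(scale=0.1, size=(n_apps, n_factors))
    ridge = reg * np.eye(n_factors)
    for _ in range(n_iters):
        # Hold app factors fixed and solve a ridge regression for user factors.
        user_f = np.linalg.solve(app_f.T @ app_f + ridge, app_f.T @ relevance.T).T
        # Hold user factors fixed and solve a ridge regression for app factors.
        app_f = np.linalg.solve(user_f.T @ user_f + ridge, user_f.T @ relevance).T
    return user_f, app_f

def recommend(scores, relevance, user_idx, top_k=2):
    """Return the top_k app indices the user has no relevance for yet."""
    unseen = np.flatnonzero(relevance[user_idx] == 0)
    ranked = unseen[np.argsort(-scores[user_idx, unseen])]
    return ranked[:top_k].tolist()

# Toy data: rows are users, columns are apps, values are relevance.
relevance = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])
user_f, app_f = train_als(relevance)
scores = user_f @ app_f.T  # dot product = relevance score per (user, app)
print(recommend(scores, relevance, user_idx=0))
```

At real scale one would more likely use a sparse, implicit-feedback ALS implementation from an existing library rather than this dense version.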
We can also combine these recommendations with content-based filtering (similar apps), but if users install two similar apps together significantly often, that should ideally be captured by the latter approach. Through research, analysis and feature engineering, we should be very careful about how we define relevance, i.e. how much a user liked the app, as this determines how well the model adapts to real-world scenarios. If we get explicit user feedback on an app, we can use it directly as the relevance.
The Real-time Sequential Model
The real-time model can be a rule-based (logical) model that uses session-level feedback data to adjust the relevance scores. Both positive signals (such as installing an app) and negative signals are collected and applied as feature vectors to the pre-computed relevance scores.
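One way such a logical adjustment might look is sketched below. The signal names, their weights, and the idea of spreading a signal to other apps in the same category are all assumptions made for illustration; the post does not prescribe them.

```python
# Hypothetical weights per session signal; real values would come from analysis.
SIGNAL_WEIGHTS = {"install": 0.5, "click": 0.1, "dismiss": -0.3, "uninstall": -0.6}

def adjust_scores(batch_scores, session_events, app_category):
    """Shift pre-computed relevance scores using signals from the current session.

    batch_scores:   {app_id: relevance score from the batch model}
    session_events: list of (app_id, signal) pairs seen in this session
    app_category:   {app_id: category}, used to spread a signal to similar apps
    """
    # Accumulate one boost (or penalty) per category from the session signals.
    boost = {}
    for app_id, signal in session_events:
        category = app_category.get(app_id)
        if category is not None:
            boost[category] = boost.get(category, 0.0) + SIGNAL_WEIGHTS.get(signal, 0.0)

    # Apps installed in this session no longer need to be recommended.
    installed = {app_id for app_id, signal in session_events if signal == "install"}
    return {
        app_id: score + boost.get(app_category.get(app_id), 0.0)
        for app_id, score in batch_scores.items()
        if app_id not in installed
    }

batch_scores = {"chess": 0.6, "sudoku": 0.5, "news": 0.7, "weather": 0.4}
app_category = {"chess": "games", "sudoku": "games", "checkers": "games",
                "news": "info", "weather": "info"}
session_events = [("checkers", "install"), ("news", "dismiss")]

adjusted = adjust_scores(batch_scores, session_events, app_category)
ranking = sorted(adjusted, key=adjusted.get, reverse=True)
print(ranking)
```

Here the game install lifts the other games above the info apps, even though "news" had the highest batch score.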
Another way is to treat this as a supervised machine learning problem and predict the probability of app installation from session-level data. Using these probabilities, we can adjust the relevance scores. This approach takes both historical and current user behavior into account when ranking the apps.
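A rough sketch of the supervised variant follows. The hand-set logistic weights stand in for a trained classifier, and the feature names, the blending rule and `alpha` are hypothetical; the sketch also assumes the batch relevance has been scaled to the range 0 to 1 so it can be mixed with a probability.

```python
import math

# Stand-in for a trained classifier: hypothetical logistic-regression weights.
WEIGHTS = {"viewed_detail_page": 1.2, "same_category_installs": 0.8, "dismissed": -2.0}
BIAS = -1.0

def install_probability(features):
    """Predict P(install) from session-level features with a logistic model."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def blend(relevance, probability, alpha=0.5):
    """Mix historical relevance (assumed in [0, 1]) with the session-based probability."""
    return (1 - alpha) * relevance + alpha * probability

# App A: the user opened its detail page and installed two apps in its category.
p_a = install_probability({"viewed_detail_page": 1, "same_category_installs": 2})
# App B: the user dismissed it earlier in the session.
p_b = install_probability({"dismissed": 1})

score_a = blend(relevance=0.5, probability=p_a)
score_b = blend(relevance=0.6, probability=p_b)
print(score_a > score_b)
```

App B starts with the higher batch relevance, but the session evidence pushes App A ahead after blending.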
The final way is to use a deep learning model such as an LSTM (Long Short-Term Memory) network, a recurrent neural network that performs well in real-world scenarios owing to a powerful update equation and backpropagation dynamics. It takes sequential data as input: for example, the first 10 or 15 apps a user installed are given as the initial sequence, and the next 5 or 10 apps they installed are then fed in one at a time, with each step correcting the model by comparing its output with the app actually installed.
User cold-start problems can be reduced to some extent using an LSTM. It can run independently alongside the batch recommendations, which act as the first-time recommendations, with the real-time recommendations progressively taking over as we keep collecting data. When processing long sequences, a transformer or additive-attention model may be preferable to an LSTM, provided the computational cost stays low.
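The sketch below shows only how the sequential training examples for such a model might be prepared, not the network itself. It assumes a fixed-length sliding window over each user's install history; the function name, window size and app names are hypothetical.

```python
def make_training_pairs(install_sequence, window=3):
    """Turn one user's ordered install history into (context, next_app) pairs.

    Each pair asks the model to predict the app installed right after
    the `window` apps that preceded it.
    """
    pairs = []
    for end in range(window, len(install_sequence)):
        context = install_sequence[end - window:end]
        target = install_sequence[end]
        pairs.append((context, target))
    return pairs

history = ["maps", "mail", "chat", "music", "video"]
pairs = make_training_pairs(history, window=3)
for context, target in pairs:
    print(context, "->", target)
```

During training, the model's prediction for each context is compared with the target app, and the error drives the weight update at that step.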
Cold Start Module
Whenever no user information is present, we can show default next-app download recommendations based on:
- Top popular apps based on a cluster (city, state, device_type)
- Trending apps
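As an illustration of the cluster-based fallback, the sketch below counts installs per (city, state, device_type) cluster and falls back to overall popularity when a cluster is unknown or too small. The function names, the `min_installs` threshold and the sample log are assumptions made for the example.

```python
from collections import Counter

def build_popularity(install_log):
    """install_log: iterable of (cluster, app_id); cluster = (city, state, device_type)."""
    by_cluster = {}
    overall = Counter()
    for cluster, app_id in install_log:
        by_cluster.setdefault(cluster, Counter())[app_id] += 1
        overall[app_id] += 1
    return by_cluster, overall

def cold_start(cluster, by_cluster, overall, top_k=2, min_installs=2):
    """Top apps for the new user's cluster, or overall top apps if data is too thin."""
    counts = by_cluster.get(cluster)
    if counts is None or sum(counts.values()) < min_installs:
        counts = overall
    return [app_id for app_id, _ in counts.most_common(top_k)]

cluster_a = ("city_a", "state_a", "phone")
cluster_b = ("city_b", "state_b", "tablet")
install_log = [
    (cluster_a, "chat"), (cluster_a, "chat"), (cluster_a, "maps"),
    (cluster_b, "video"), (cluster_b, "video"), (cluster_b, "video"),
    (cluster_b, "video"), (cluster_b, "chat"),
]
by_cluster, overall = build_popularity(install_log)

known = cold_start(cluster_a, by_cluster, overall)
unknown = cold_start(("city_c", "state_c", "phone"), by_cluster, overall)
print(known, unknown)
```

Trending apps could be handled the same way by restricting the log to a recent time window before counting.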
Final Note
- Always start by understanding user behavior through EDA (exploratory data analysis).
- Run A/B tests on different logical, machine learning and deep learning models, or combinations of models, along with the UI.
- The way we show content is as important as the content itself.
- Sometimes, simple logic-driven recommendations can outperform more sophisticated algorithms.
“Personalization is the automatic tailoring of sites and messages to the individuals viewing them so that we can feel that somewhere there’s a piece of software that loves us for who we are.”
— David Weinberger