GER (Good Enough Recommendations) is a recommendations engine that could directly add value and increase user engagement for many existing applications. GER is an open source npm module that you could download and start using right now. However, you probably want to know how GER works and how to use it to get good recommendations out of it.

In this post I describe GER’s core model, its practical features and its limitations to help you use GER to get good enough recommendations.

The GER Model

I am sorry for the formality, but formal models are the easiest way to remove ambiguity and describe precisely what is going on.

The core sets of GER are:

  1. P people
  2. T things
  3. A actions

Events are in the set that is the Cartesian product of these, i.e. P × A × T. For example, when bob likes the hobbit movie, this is represented with the event <bob, like, hobbit>.

The history of any given person is all the things they have actioned in the form <action,thing>, i.e. a subset of A × T. The function H takes a person and returns their history. For example, the history for bob after he liked the hobbit would be H(bob) = {<like, hobbit>}.
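To make events and histories concrete, here is a minimal Python sketch. It is only an illustration of the model, not GER’s code (GER is written in Coffee-Script), and the names Event and history are my own for this example.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """One element of P x A x T, e.g. <bob, like, hobbit>."""
    person: str
    action: str
    thing: str


def history(events, person):
    """H(person): the set of <action, thing> pairs the person has actioned."""
    return {(e.action, e.thing) for e in events if e.person == person}


events = [Event("bob", "like", "hobbit")]
print(history(events, "bob"))  # {('like', 'hobbit')}
```

Using a set means that repeating the same event does not change a person’s history, which matches the set-based definition above.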

The Jaccard similarity metric, defined as the function J, is used to calculate the similarity between people using their histories. That is, the similarity between two people p1, p2 is the Jaccard metric between their two histories J(p1,p2) = (|H(p1) INTERSECTION H(p2)| / |H(p1) UNION H(p2)|).

For example, suppose bob liked the hobbit and hated the x-men, while alice only hated the x-men:

  1. H(alice) = {<hate, x-men>}
  2. H(bob) = {<like, hobbit>, <hate, x-men>}
  3. H(bob) INTERSECTION H(alice) = {<hate, x-men>} with cardinality 1
  4. H(bob) UNION H(alice) = {<like, hobbit>, <hate, x-men>} with cardinality 2
  5. The similarity between bob and alice is therefore J(bob,alice) = 1/2
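The same calculation can be sketched in a few lines of Python. This is an illustration of the formula rather than GER’s implementation; treating two empty histories as having similarity 0 is an assumption of the sketch.

```python
def jaccard(h1, h2):
    """Jaccard similarity |h1 ∩ h2| / |h1 ∪ h2| of two history sets."""
    union = h1 | h2
    if not union:
        return 0.0  # assumption: two empty histories are not similar
    return len(h1 & h2) / len(union)


alice = {("hate", "x-men")}
bob = {("like", "hobbit"), ("hate", "x-men")}
print(jaccard(bob, alice))  # 0.5
```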

Jaccard similarity comes with useful properties like symmetry, where J(bob, alice) = J(alice, bob). Strictly speaking it is its complement, the Jaccard distance 1 - J, that is a proper metric.

It is also useful to define similarity:

  1. Two people are said to be similar if they have a non-zero Jaccard similarity

Recommendations

Recommendations are a set of weighted things which are calculated for a person p and action a using the function R(p,a). The weight of a thing t is the sum of the similarities between the person p and all people who have <a, t> in their history. One additional constraint on R is that it only returns non-zero weighted recommendations.

For example, given that:

  1. bob likes the x-men but hates harry potter.
  2. alice hates harry potter, and likes the x-men and avengers
  3. carl likes the x-men, the avengers and batman

Which movies should be recommended for bob to like, i.e. what is R(bob,like)? We can calculate that:

  1. J(bob,bob) = 1
  2. J(bob,alice) = 2/3
  3. J(bob,carl) = 1/4

bob has three potential recommendations to like: x-men, avengers and batman. For each of these we can calculate the weight that bob will like them:

  1. x-men is J(bob,bob) + J(bob,alice) + J(bob,carl) = 1 + 2/3 + 1/4 ≈ 1.92
  2. avengers is J(bob,alice) + J(bob,carl) = 2/3 + 1/4 ≈ 0.92
  3. batman is J(bob,carl) = 0.25

Therefore, the recommendations for bob to like are R(bob,like) = {<x-men, 1.92>, <avengers, 0.92>, <batman, 0.25>}. Even though bob has seen x-men it has been included in the recommendations because he does like it. This would make sense if the recommendations were for something that could be consumed multiple times, like food or music.
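The worked example can be reproduced with a short Python sketch of R(p,a). This follows the simple model only; the function name recommend and the dictionary layout are assumptions made for the illustration.

```python
from collections import defaultdict


def jaccard(h1, h2):
    union = h1 | h2
    return len(h1 & h2) / len(union) if union else 0.0


def recommend(histories, person, action):
    """R(person, action): each thing weighted by the summed similarity
    of every person who has <action, thing> in their history."""
    weights = defaultdict(float)
    for other_history in histories.values():
        similarity = jaccard(histories[person], other_history)
        if similarity == 0:
            continue  # only non-zero weighted recommendations are returned
        for act, thing in other_history:
            if act == action:
                weights[thing] += similarity
    return dict(weights)


histories = {
    "bob": {("like", "x-men"), ("hate", "harry potter")},
    "alice": {("hate", "harry potter"), ("like", "x-men"), ("like", "avengers")},
    "carl": {("like", "x-men"), ("like", "avengers"), ("like", "batman")},
}
recs = recommend(histories, "bob", "like")
for thing, weight in sorted(recs.items(), key=lambda kv: -kv[1]):
    print(thing, round(weight, 2))
# x-men 1.92
# avengers 0.92
# batman 0.25
```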

Practical Changes to the Model

The above model is simple and cannot deal with some real world requirements and limitations. Therefore, some additional features and some necessary limitations have been applied to the model to make it practical.

Additional Features

In the simple model each action is treated equally when measuring a person’s similarity to another; this is not the case in reality. If two people liked the same thing they may be more similar than if they hated the same thing. By weighting each action, finding the Jaccard similarity per action, and then combining the results with respect to each action’s weight, the similarity function can more accurately represent reality.
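One way to read this is as a weighted mean of per-action Jaccard similarities. The Python sketch below assumes exactly that; the normalisation by the total weight and the example weights (like = 5, hate = 1) are my assumptions for illustration, not values taken from GER.

```python
def jaccard(s1, s2):
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


def weighted_similarity(h1, h2, action_weights):
    """Weighted mean of the Jaccard similarity computed separately per action."""
    total_weight = sum(action_weights.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for action, weight in action_weights.items():
        things1 = {thing for act, thing in h1 if act == action}
        things2 = {thing for act, thing in h2 if act == action}
        score += weight * jaccard(things1, things2)
    return score / total_weight


bob = {("like", "x-men"), ("hate", "harry potter")}
alice = {("hate", "harry potter"), ("like", "x-men"), ("like", "avengers")}
# like: 1/2, hate: 1/1, so (5 * 0.5 + 1 * 1.0) / 6
print(round(weighted_similarity(bob, alice, {"like": 5, "hate": 1}), 3))  # 0.583
```

With these weights the shared hate counts for less than it did in the simple model, where J(bob,alice) was 2/3.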

The time at which an event occurs is a very important concept that the simple model ignores. If a person liked the hobbit today and x-men last year, they are probably more receptive to recommendations similar to the hobbit. To handle this, every event has an attached date of when it most recently occurred and:

  • The most recent events (defined using a variable for a number of days) are weighted higher than past events, done by calculating multiple Jaccard similarities and combining them with a weighted mean. Note: this may break the symmetry of our similarity function; further mathematicians are required
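As a rough Python sketch of the idea, the function below computes one Jaccard similarity over recent events and one over the whole history, then takes a weighted mean. The parameter names (recent_days, recent_weight, past_weight) and their defaults are hypothetical, and GER’s actual calculation may differ.

```python
from datetime import datetime, timedelta


def jaccard(s1, s2):
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


def recency_similarity(events1, events2, now,
                       recent_days=7, recent_weight=2.0, past_weight=1.0):
    """events1/events2 map (action, thing) -> datetime it most recently occurred."""
    cutoff = now - timedelta(days=recent_days)
    recent1 = {key for key, when in events1.items() if when >= cutoff}
    recent2 = {key for key, when in events2.items() if when >= cutoff}
    recent = jaccard(recent1, recent2)             # similarity on recent events
    overall = jaccard(set(events1), set(events2))  # similarity on everything
    return (recent_weight * recent + past_weight * overall) / (recent_weight + past_weight)


now = datetime(2015, 1, 10)
bob = {("like", "hobbit"): datetime(2015, 1, 9),
       ("like", "x-men"): datetime(2014, 1, 10)}
alice = {("like", "hobbit"): datetime(2015, 1, 8)}
carl = {("like", "x-men"): datetime(2015, 1, 8)}
print(round(recency_similarity(bob, alice, now), 2))  # 0.83
print(round(recency_similarity(bob, carl, now), 2))   # 0.17
```

In the simple model bob would be equally similar to alice and carl (1/2 each); here alice ranks higher because she shares bob’s recent like.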

Recommending something that a person has already actioned (e.g. bought) could be undesirable. By providing a list of actions to filter on, recommendations can be removed if the thing appears with one of those actions in the person’s history. For example, it makes sense to filter on hate actions to stop recommending things the person clearly doesn’t want. However, they could still receive recommendations for things they have already liked, because they might like to re-watch a movie every year.
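A small Python sketch of this filtering step, assuming recommendations are held as a thing-to-weight dictionary; the function name filter_recommendations is made up for the example.

```python
def filter_recommendations(recommendations, person_history, filter_actions):
    """Remove things the person has already actioned with any filter action."""
    excluded = {thing for action, thing in person_history
                if action in filter_actions}
    return {thing: weight for thing, weight in recommendations.items()
            if thing not in excluded}


bob_history = {("like", "x-men"), ("hate", "harry potter")}
recs = {"x-men": 1.92, "avengers": 0.92, "harry potter": 0.5}

# Filtering only on hate keeps x-men, which bob might happily re-watch.
print(filter_recommendations(recs, bob_history, ["hate"]))
# {'x-men': 1.92, 'avengers': 0.92}
```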

Limitations

When dealing with large sets of data practical limitations are necessary to ensure performance. Here is the list of limitations imposed on the above model and features.

Model Limitations

The first limitation is to not generate recommendations for a person who has less than a minimum amount of history. For example, if a person has only liked one movie, their generated recommendations will probably be random. In this case GER returns no recommendations and lets the client handle the situation.

The most expensive aspect of GER is finding and calculating similarity between people. This is especially expensive for any person who has a large history and every person they are similar to. Given that a person with a large history is similar to many people, only a few such people can significantly decrease the performance of the entire engine. To ensure this is not the case, a few limitations were put in place:

  1. Limit the number of similar people to find, while attempting to find the most similar people for a user’s recent activity
  2. Limit the size of the history when calculating similarities

Finding and weighting every potential recommendation from all similar people may also be expensive and returning every recommendation is likely superfluous. For this the limitations in place are:

  1. Only recommend the most recent events from the similar users
  2. Only return a number of the best recommendations

Limiting the number of similar people, the length of their history, and the number of recommendations to find all have different performance and accuracy impacts per data-set. Finding the best values for these is a learning process through trial and error.

An important aspect to note about these limits is that they may create the potential for abuse and malicious manipulation of the recommendations. A way to see this is by considering a person who hates all movies, but only likes one. The implications of such a user are:

  1. They will be similar to all people who have hated anything
  2. Due to limiting history size, they may have a much higher similarity than they would otherwise have had
  3. Every person would include in their potential recommendations the movie the malicious person likes

Therefore, a person who profits from manipulating the recommendations of other users may attempt to manipulate the system this way.

Data-set Compacting Limitations

Given the above description, it is clear that some events will never be used. For example, if events are old or belong to a user who has a long history, they will not be used in any calculations. These events just loiter, take up space and slow calculations down. Identifying these events with some basic heuristics and removing them can dramatically speed up performance and decrease the size of the data-set. I call these compacting algorithms.

Currently there are two main compacting algorithms:

  1. Limit the number of events per person, per action, e.g. ensure bob has a maximum of 1000 hates.
  2. Limit the number of events per thing, per action, e.g. ensure that the hobbit has a maximum of 1000 hates.

These compacting algorithms delete the oldest events first, as newer events carry more practical importance. They also help with the problem stated above about the malicious user who hates everything, as their history will be reduced and they will be similar to fewer people.
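Both compacting algorithms have the same shape: group events by a key, then keep only the newest few in each group. The Python sketch below assumes events are (person, action, thing, timestamp) tuples held in memory; the function and key names are hypothetical and this is not how GER’s Event Store Managers implement compaction.

```python
from collections import defaultdict


def compact(events, key, limit):
    """Keep at most `limit` events per group, deleting the oldest first."""
    groups = defaultdict(list)
    for event in events:
        groups[key(event)].append(event)
    kept = []
    for group in groups.values():
        group.sort(key=lambda e: e[3], reverse=True)  # newest first
        kept.extend(group[:limit])
    return kept


def per_person_action(event):
    return (event[0], event[1])  # e.g. limit bob's hates


def per_thing_action(event):
    return (event[2], event[1])  # e.g. limit the hobbit's hates


events = [
    ("bob", "hate", "a", 1),
    ("bob", "hate", "b", 2),
    ("bob", "hate", "c", 3),
    ("bob", "like", "hobbit", 4),
]
compacted = compact(events, per_person_action, limit=2)
print(sorted(e[2] for e in compacted))  # ['b', 'c', 'hobbit']
```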

Like the other limitations, the numbers associated with the compacting limitations are data-set specific, and can probably be best found through trial and error.

The Algorithmic Description

The API for recommendations follows the core model and accepts a person and an action and returns a list of weighted things by following these steps:

  1. Find similar people to person by looking at their history (limiting the number of returned similar people)
  2. Calculate the similarities from person to the list of people (limiting the amount of history)
  3. Find a list of the most recent things the similar people have actioned (limiting the number returned)
  4. Calculate the weights of things using the similarity of the people (filtering based on filter actions and returning the highest weighted)
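The four steps can be sketched end to end in Python. Everything here is illustrative: the function name, the keyword parameters and their defaults are hypothetical, events are assumed to be (person, action, thing, timestamp) tuples in memory, and the sketch assumes the person is not counted as similar to themselves. GER’s real implementation delegates most of this work to its event store.

```python
import heapq
from collections import defaultdict


def jaccard(s1, s2):
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0


def recommendations_for(events, person, action, *, max_similar=10,
                        history_limit=100, recent_things=20,
                        filter_actions=(), max_results=5):
    # Build newest-first histories, truncated to history_limit events each.
    histories = defaultdict(list)
    for p, a, t, _ts in sorted(events, key=lambda e: e[3], reverse=True):
        if len(histories[p]) < history_limit:
            histories[p].append((a, t))
    mine = set(histories.get(person, []))

    # Steps 1 and 2: find similar people and keep only the most similar.
    scored = ((jaccard(mine, set(h)), p)
              for p, h in histories.items() if p != person)
    similar = heapq.nlargest(max_similar, (sp for sp in scored if sp[0] > 0))

    # Step 3: take each similar person's most recent things for the action.
    # Step 4: weight them by similarity, filter, and return the best.
    excluded = {t for a, t in mine if a in filter_actions}
    weights = defaultdict(float)
    for similarity, p in similar:
        recent = [t for a, t in histories[p] if a == action][:recent_things]
        for thing in recent:
            if thing not in excluded:
                weights[thing] += similarity
    return heapq.nlargest(max_results, weights.items(), key=lambda kv: kv[1])


events = [
    ("bob", "like", "x-men", 1), ("bob", "hate", "harry potter", 2),
    ("alice", "hate", "harry potter", 3), ("alice", "like", "x-men", 4),
    ("alice", "like", "avengers", 5), ("carl", "like", "x-men", 6),
    ("carl", "like", "avengers", 7), ("carl", "like", "batman", 8),
]
result = recommendations_for(events, "bob", "like", filter_actions=("like",))
print([(thing, round(weight, 2)) for thing, weight in result])
# [('avengers', 0.92), ('batman', 0.25)]
```

Because like is passed as a filter action, x-men is dropped from bob’s results, unlike in the unfiltered example earlier.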

Technology

GER is implemented in Coffee-Script on top of Node.js (here are my reasons for using Coffee-Script).

A core abstraction is the Event Store Manager (ESM), which implements the persistence and the similarity calculation. Currently there is an in-memory ESM and a PostgreSQL ESM. There is also a RethinkDB ESM in the works, being implemented by the awesome linuxlich.

Help

Now that you know how GER works, you can help out. Please consider using it and testing it out. If you are able to contribute, consider creating an ESM for your favourite database. The links are:

  1. NPM package: https://www.npmjs.org/package/ger
  2. GitHub repo: https://github.com/grahamjenson/ger

I am open to suggestions and improvements (especially if they are in the form of a pull-request or fork!).

The overall goal is a recommendations engine that will be good enough for most users.