Recommendation engines can benefit many applications by directly adding value and increasing user engagement. However, implementing a custom solution carries significant overhead, and many off-the-shelf engines have overcomplicated APIs or try to be infinitely scalable, which most applications do not need.
In this post I introduce the Good Enough Recommendation (GER) engine. GER (pronounced like this) is built to be easily usable through a simple API, as well as being reasonably fast and scalable, to let developers focus on their applications and not a recommendation engine.
Good Enough is All You Need (Right Now)
A recommendation engine is a feature (not a product) — Why You Should Not Build a Recommendation Engine
When developing a product, the recommendation engine is a secondary consideration (unless the product is a recommendation engine). Building a custom engine is a difficult and time-consuming challenge, and many existing engines are complex to set up and get running. These problems lead many developers to choose not to use a recommendation engine, even if one could benefit their application.
GER’s goal is to let developers easily integrate a recommendation engine that is satisfactory for their product, and not overly complex to get up and running. As your product grows and becomes successful, if your recommendations need finer configuration or greater scale, then other solutions like Apache Mahout can be used. But right now, if you want to add a recommendation engine, then GER will be good enough.
GER and its API
GER is a collaborative filtering calculator. Events go into GER and predictions about future events come out.
Its API includes only four ‘types’: a person:String, a thing:String, and an action:String that has a weight:Integer.
An event is a person performing an action on/to a thing, e.g. “ann” “buys” “product_1”:
event(person, action, thing)
Each action has a weight (defaulting to 1) which determines how important it is to GER’s predictions, e.g. buying is more important than viewing. The weight of an action can be altered with:
set_action_weight(action, weight)
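To make events and weights concrete, here is a minimal in-memory sketch in Python. It is not GER's implementation (GER is written in CoffeeScript on Node.js), and the class name `EventStore` and the helper `weighted_history` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class EventStore:
    """Toy in-memory stand-in for the event and weight calls described above.

    Hypothetical illustration only; this is not GER's code or storage model.
    """

    def __init__(self):
        # person -> set of (action, thing) pairs that person has performed
        self.events = defaultdict(set)
        # action -> weight; actions without an entry use the default of 1
        self.action_weights = {}

    def event(self, person, action, thing):
        """Record that `person` performed `action` on `thing`."""
        self.events[person].add((action, thing))

    def set_action_weight(self, action, weight):
        """Change how much `action` counts in later calculations."""
        self.action_weights[action] = weight

    def weighted_history(self, person):
        """Return {thing: total weight of the actions `person` did on it}."""
        history = defaultdict(int)
        for action, thing in self.events.get(person, ()):
            history[thing] += self.action_weights.get(action, 1)
        return dict(history)


store = EventStore()
store.event("ann", "views", "product_1")
store.event("ann", "buys", "product_1")
store.event("ann", "views", "product_2")
store.set_action_weight("buys", 100)

print(store.weighted_history("ann"))
```

In this sketch "product_1" ends up with a total weight of 101 (1 for the view plus 100 for the purchase), while "product_2" keeps the default weight of 1.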
GER can return an ordered list of recommended things for a person to action, e.g. recommend things for “ann” to “buy”:
reccommendations_for_person(person, action)
GER calculates these recommendations by:
- finding a list of similar people to the person
- then finding things those similar people have actioned
- each thing is then scored and sorted based on the number and similarity of the people who have actioned it.
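The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GER's actual code: it uses Jaccard similarity between people's event sets as the similarity measure, skips things the person has already actioned, and leaves out action weights for brevity. The sample data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: person -> set of (action, thing) events.
EVENTS = {
    "ann": {("buys", "product_1"), ("views", "product_2")},
    "bob": {("buys", "product_1"), ("buys", "product_3"), ("views", "product_2")},
    "cat": {("buys", "product_1"), ("buys", "product_4")},
    "dan": {("buys", "product_5")},
}


def jaccard(a, b):
    """Overlap of two sets as a value between 0 and 1 (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(events, person, action):
    """Return things for `person` to `action`, best first."""
    mine = events.get(person, set())
    already_done = {thing for act, thing in mine if act == action}

    # Step 1: find people similar to `person`, based on shared events.
    similar = {}
    for other, theirs in events.items():
        if other == person:
            continue
        similarity = jaccard(mine, theirs)
        if similarity > 0:
            similar[other] = similarity

    # Steps 2 and 3: collect the things those people have actioned and add up
    # the similarity of everyone who actioned each thing.
    scores = defaultdict(float)
    for other, similarity in similar.items():
        for act, thing in events[other]:
            if act == action and thing not in already_done:
                scores[thing] += similarity

    # Sort by score (highest first), breaking ties by name for stable output.
    return sorted(scores, key=lambda thing: (-scores[thing], thing))


print(recommend(EVENTS, "ann", "buys"))  # → ['product_3', 'product_4']
```

Here "bob" shares two of three distinct events with "ann" and "cat" shares one of three, so bob's purchase ranks first. "dan" has nothing in common with "ann" and contributes nothing.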
GER contains no business rules, has limited configuration, and requires almost no setup. The benefit is that it is fast and easy to understand, at the cost of the infinite scalability and endless configuration options of other engines.
Technology
I plan to write more posts about the inner workings and development of GER, as it was really challenging and fun to create. Here is a brief description of GER:
- GER is implemented in Node.js
- using CoffeeScript
- with Q promises and knex for understandable, asynchronous code.
- GER was developed in a TDD style, and is well tested using Mocha, Chai and Sinon.
Accuracy
It would be impossible to state how accurate GER is for a specific application. However, in the interest of testing GER, I have performed some of my own little experiments.
I took 7 months’ worth of events where users viewed or purchased products. I then took the first month of events and added them to GER. I weighted the purchasing action to 100, and the viewing action to 1. These were intuitive weights gained through trial and error, not rigorous optimisation.
I then selected user events from the last 6 months where:
- the user had at least 10 events in the first month (to select users with a reasonable chance for prediction)
- the user purchased at least one thing in the last 6 months that GER knew about
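One way to read this selection step is sketched below in Python. The event layout (person, action, thing, month) and the function name are assumptions made for illustration, and "GER knew about" is interpreted here as "the thing appeared in the first month of events".

```python
from collections import Counter

# Hypothetical events: (person, action, thing, month), with month from 1 to 7.
EVENTS = [
    ("ann", "views", "p1", 1),
    ("ann", "views", "p2", 1),
    ("bob", "views", "p1", 1),
    ("ann", "buys", "p2", 3),
    ("ann", "buys", "p9", 4),
    ("bob", "buys", "p1", 2),
]


def eligible_purchases(events, min_events=10):
    """Return (person, thing) purchases from later months that pass both filters."""
    first_month = [e for e in events if e[3] == 1]
    later_months = [e for e in events if e[3] > 1]

    # Filter 1: how many events each person had in the first month.
    events_per_person = Counter(person for person, _, _, _ in first_month)
    # Filter 2: things seen in the first month (assumed meaning of "known").
    known_things = {thing for _, _, thing, _ in first_month}

    return [
        (person, thing)
        for person, action, thing, _ in later_months
        if action == "buys"
        and events_per_person[person] >= min_events
        and thing in known_things
    ]


# The toy data is tiny, so the threshold is lowered from 10 to 2 here.
print(eligible_purchases(EVENTS, min_events=2))  # → [('ann', 'p2')]
```

Only ann's purchase of "p2" survives: her purchase of "p9" involves a thing that never appeared in the first month, and "bob" had too few first-month events.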
From there I selected 500 random purchase events and compared each of them to GER’s top 10 recommendations for the purchasing person. The results were:
- about 16% of the purchases were among GER’s 10 predicted things
- The mean position for a correct prediction was 2.5
- GER recommendations took on average 124.8ms to complete
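The first two figures can be computed as sketched below. This Python snippet is an illustration, not the script used for the experiment. It assumes positions are counted from 1, and the `recommend` callable and the sample data are hypothetical.

```python
def evaluate(purchases, recommend, k=10):
    """Return (hit rate, mean position of hits) over a list of purchases.

    purchases: list of (person, thing) pairs that actually happened.
    recommend: callable taking a person and returning an ordered list of things.
    """
    positions = []
    for person, thing in purchases:
        top_k = recommend(person)[:k]
        if thing in top_k:
            # Positions are 1-based: the first recommendation is position 1.
            positions.append(top_k.index(thing) + 1)

    hit_rate = len(positions) / len(purchases) if purchases else 0.0
    mean_position = sum(positions) / len(positions) if positions else None
    return hit_rate, mean_position


# Hypothetical recommendations and held-out purchases.
RECOMMENDATIONS = {
    "ann": ["p1", "p2", "p3"],
    "bob": ["p1"],
    "cat": ["p2"],
}
PURCHASES = [("ann", "p2"), ("bob", "p1"), ("cat", "p7"), ("ann", "p3")]

hit_rate, mean_position = evaluate(
    PURCHASES, lambda person: RECOMMENDATIONS.get(person, [])
)
print(hit_rate, mean_position)  # → 0.75 2.0
```

Three of the four sample purchases appear in the recommendations, at positions 2, 1 and 3, giving a hit rate of 0.75 and a mean position of 2.0.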
For a control, I compared GER’s recommendations against just always recommending the top 10 purchased things. The results for this control were:
- 5% of the purchases were in the top 10
- The mean position for a purchase in the top 10 was 3.5
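A control like this is simple to build. The Python sketch below counts purchases in the training events and always recommends the most purchased things, regardless of who is asking. The data layout and function names are assumptions for illustration.

```python
from collections import Counter

# Hypothetical training events: (person, action, thing).
TRAINING = [
    ("ann", "buys", "p1"),
    ("bob", "buys", "p1"),
    ("cat", "buys", "p2"),
    ("ann", "views", "p3"),
    ("bob", "buys", "p2"),
    ("cat", "buys", "p1"),
    ("dan", "buys", "p4"),
]


def most_purchased(events, k=10):
    """Return the k most purchased things, most purchased first."""
    counts = Counter(thing for _, action, thing in events if action == "buys")
    return [thing for thing, _ in counts.most_common(k)]


def control_hit_rate(purchases, top_things):
    """Fraction of held-out purchases that appear in the fixed top list."""
    if not purchases:
        return 0.0
    hits = sum(1 for _, thing in purchases if thing in top_things)
    return hits / len(purchases)


top_2 = most_purchased(TRAINING, k=2)
held_out = [("ann", "p2"), ("bob", "p4"), ("cat", "p1"), ("dan", "p5")]

print(top_2)                              # → ['p1', 'p2']
print(control_hit_rate(held_out, top_2))  # → 0.5
```

Because the list ignores the person entirely, any improvement a recommender shows over this number comes from personalisation rather than from popular items alone.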
These are very positive results. They show that with limited bootstrapping (only one month’s worth of data) and a naive configuration, GER was able to:
- predict a significant number of user actions
- have high accuracy in what was predicted
- return results significantly better than a top 10 list
Things Left To Do
GER is not finished; there are many things left to complete, and this post is just an introduction to get some feedback. I still need to make GER an easily deployable micro-service and improve the documentation and tools that support GER’s use.
GER would be more usable if it were wrapped into a micro-service. To easily deploy GER it could be further wrapped into a Docker container which would greatly simplify integration.
Documentation for GER is lacking. I will iteratively improve this as I continue to develop it.
Conclusion
Watch this space, as GER continues to be developed. Go try it out, and as always, comments and feedback are welcome.
Thanks to Cam Evans for the thumbnail image
Other Recommendation Engines and Links
There are some great recommendation engines and other resources I came across when developing GER:
A Comparative Study of Collaborative Filtering Algorithms is a paper going over many different collaborative filtering algorithms, implementations and characteristics.
The Algorithm Design Manual is a really useful book when thinking about designing and implementing algorithms.
PredictionIO is a larger, more product-based recommendation application.
Apache Mahout is a massively scalable engine that can use clusters of computers executing many different algorithms.
Raccoon, a recommendation engine written in Node.js using Redis, was the initial inspiration for GER.