Personalisation with Recombee

The following is a transcript taken from Murray Woodman's talk at DrupalSouth 2019. It has been edited and enhanced with images for clarity.

10 December 2019

 

Transcript

Thanks all for coming along. My name is Murray Woodman and today I am going to be showing you how to personalize your website using a recommender as a service called Recombee.

The topic of personalization has interested me for the last 10 years or so. Back then I was getting interested in collaborative filtering and collective intelligence, but you had to write the code yourself, there was a lot of CPU required to manage the data, and frankly it was too much like hard work to write your own recommender. These days it's different. Ten years have passed and there are a lot of services out there providing these kinds of things. Recombee is a service that is very easy to use and I'll be showing you how to use it today.

Before we jump into that I do want to cover a couple of areas.
 

Content management is a solved problem

The first one is that content management is essentially a solved problem. The things we're doing today in Drupal are very similar to the things we were doing ten years ago. We were performing CRUD operations and doing the presentation, and more recently we've got RESTful web services and things such as JSON:API. But really things haven't changed so much.

In the marketplace Drupal does have a lot of competition. Of course there are other CMSs out there, and conferences like this are designed to make Drupal better so it can compete against them. Drupal still does have its strengths and in my mind these are largely based around content modeling and the ability for editors to edit that content in a nice UI. We've got entities, we've got fields, we can make relationships. This is one of the things that really got me into Drupal all those years ago. Now we've got JSON:API making it super easy to build decoupled services. And of course Drupal's great at doing integrations.

So Drupal has a lot of strengths and we're in an excellent position to compete in this marketplace. But I think we really do have to start thinking about how we put systems together to be relevant going forward.
 

The CMS is only one part of the puzzle

Furthermore, the CMS is only one part of the puzzle. I think as Drupal people we probably think of problems in terms of a CMS, what we can do with it and how we can solve these issues. But really there are a lot more pieces to the puzzle.

Personalisation Channels

This particular diagram was created by a group of researchers called the Real Story Group, and a lot of their work is quite good. They've analyzed the CMS market and other markets. You can see here that the web content management system, the green one, is just one part of a number of different engagement services.

Importantly, we have this thing called marketing automation living alongside the web content management system, and this is an area which has been exploding in the last ten years. There are hordes of marketers out there doing marketing automation using tools and they're not even thinking about a CMS. They're thinking about how to communicate with their customers. The CMS is not necessarily the first thing they're thinking about.

We also have the data backbone (CDPs), which is another area that's been flourishing in the last five years.

All of these things are starting to build out content management services to help them operate and that potentially is going to start competing with Drupal as well. So I think as Drupal people when we start building sites we need to shift our thinking.

We need to stop thinking about just publishing stuff out from one to many and start thinking about communicating with people one-to-one. What do we know about those people? How can we personalize content for them?

We also have to think about user experience not just being in the website. As we've seen, there's a variety of touch points such as mobile, social and email. Our users are communicating with our organizations on a number of platforms, and the user experience we're orchestrating has to operate across all of those, not just the website.

A lot of the smarts of the system and the way we are personalizing is going to be living outside the CMS. It could be living in a CDP or a marketing automation system. A lot of the decisions that are being made on how things are going to be personalized are not necessarily going to be happening in Drupal. 

I think in the Drupal world as well we’ve got this dichotomous view of how we treat users. They're either logged in and receiving a customized experience with heavy page loads or on the other side of the coin we have anonymous users who are receiving pages that have been cached. A decoupled way of thinking is going to break down this dichotomy. It's something that we really have to think about a lot. How can we provide personalized experiences for anonymous users?

I think to answer these questions we have to reconsider how we build out our systems. A lot of the ways we've thought about web sites potentially have to change as we try to deliver these personalized experiences to anonymous people.
 

Personalisation

When we are building sites in Drupal we love modeling our data and our content. We have the user experience as well where we're defining different personas and different user journeys and what will happen on a website. We imagine how those users are flowing through. 

However, there's a third part of the picture here which I’ve generically called “marketing”. What are the users doing? What are they interested in and how can we start talking to those users?

Personalisation Venn Diagram

That bit in the middle is personalization and for me this is the most exciting part about web development these days. How can we bring all of these parts together so that they form a coherent picture and so that we can start personalizing things for users?

It requires design thinking. How do we model the content? What do we know about the personas and the people who are accessing the site? How can we take a data-driven approach to personalizing for these people?

I have presented the big picture, but today I'm taking a much smaller piece of the puzzle - one that we can solve. And that is to provide recommendations to users. I'm not going to show you how to build out a whole marketing stack. I'm just showing you how to do recommendations to users. This will include recommendations to a user on the homepage, for example, where you want to show them content relevant to them, as well as recommendations when a user is on a particular item, which gives Recombee context when providing recommendations.
 

Recombee

Recombee bills itself as an “artificial intelligence powered recommender as a service with an intuitive RESTful API and SDK tailored by data scientists”.

The really important takeaway here is that it's a SaaS service. It has a friendly API to use and it's got a lot of smarts behind it. It's scalable and it's doing some cool things under the hood that you don't necessarily have to worry about. It's very easy to interact with.

So how does Recombee do it? In two main ways. There are a number of other strategies but the main two would be collaborative filtering and content similarity.

Collaborative filtering is where we're tracking user behavior across a number of different users and then providing recommendations. So if we have users that like similar things it’s a fair bet that someone who is similar to those users is also going to like the same kinds of content.

In order to do this we need to track user behavior on the site.
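As a rough illustration of the idea, here is a toy sketch of user-based collaborative filtering. It is not Recombee's actual algorithm, which is far more sophisticated, and the user IDs, item IDs and function names are all made up for the example. It scores items a user has not yet seen by how similar the other viewers of those items are to that user.

```javascript
// Toy user-based collaborative filtering. An illustration only,
// not Recombee's actual algorithm. All user and item IDs are made up.

// Jaccard similarity between two sets: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const intersection = [...a].filter((x) => b.has(x)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

// views: Map of userId -> Set of itemIds that user has viewed.
// Scores each item the target user has not seen by summing the
// similarity of every other user who has viewed it.
function recommendForUser(views, userId, count) {
  const seen = views.get(userId) || new Set();
  const scores = new Map();
  for (const [otherId, otherItems] of views) {
    if (otherId === userId) continue;
    const similarity = jaccard(seen, otherItems);
    if (similarity === 0) continue; // nothing in common, ignore this user
    for (const item of otherItems) {
      if (!seen.has(item)) {
        scores.set(item, (scores.get(item) || 0) + similarity);
      }
    }
  }
  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1]) // highest score first
    .slice(0, count)
    .map(([item]) => item);
}

const views = new Map([
  ["user-a", new Set(["node-1", "node-2", "node-3"])],
  ["user-b", new Set(["node-1", "node-2", "node-4"])],
  ["user-c", new Set(["node-5"])],
]);
const suggestions = recommendForUser(views, "user-a", 5);
console.log(suggestions);
```

Here user-a and user-b overlap on two items, so the one item user-b has seen that user-a has not is suggested. User-c shares nothing with user-a and contributes nothing.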

The second way that recommendations can be derived is by content-based mechanisms and in this case we're building up recommendations based on item similarity.

Say we have two articles: Article A and Article B. They both share similar tags and categories. Maybe the titles are similar or they're written by the same author. Obviously they're going to be very similar so Recombee is able to use these fundamental properties of the content to provide the recommendations. It's very similar to “more like this” with Solr which you may be familiar with. Solr is taking a similar approach there where it's lining up the different facets and working out similarities.
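To make the item similarity idea concrete, here is a toy "more like this" sketch based on shared tags. It is an illustration only and not how Recombee scores items. The articles, tag values and function names are assumptions made up for the example.

```javascript
// Toy content-based similarity. An illustration of the idea only, not how
// Recombee scores items. The articles and tags below are made up.

// Collect every tag-like property of an item into one set.
function tagsOf(item) {
  return new Set([
    ...(item.topics || []),
    ...(item.products || []),
    ...(item.authors || []),
  ]);
}

// Share of tags two items have in common (0 = nothing shared, 1 = identical).
function itemSimilarity(itemA, itemB) {
  const a = tagsOf(itemA);
  const b = tagsOf(itemB);
  const shared = [...a].filter((tag) => b.has(tag)).length;
  const total = new Set([...a, ...b]).size;
  return total === 0 ? 0 : shared / total;
}

// Rank candidates by similarity to the item being viewed, most similar first.
function moreLikeThis(current, candidates, count) {
  return candidates
    .filter((item) => item.id !== current.id)
    .map((item) => ({ id: item.id, score: itemSimilarity(current, item) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, count);
}

const articleA = { id: "node-1", topics: ["topic:hosting"], products: ["product:wordpress"] };
const articleB = { id: "node-2", topics: ["topic:hosting"], products: ["product:wordpress"] };
const articleC = { id: "node-3", topics: ["topic:hosting"], products: ["product:drupal"] };
const articleD = { id: "node-4", topics: ["topic:design"], products: [] };

const similar = moreLikeThis(articleA, [articleA, articleB, articleC, articleD], 2);
console.log(similar);
```

Article B shares every tag with Article A, so it ranks first. Article C shares only the topic, and Article D shares nothing and drops off the list.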

In order for this one to work we've got to get the data from Drupal over into Recombee. How do we do that? Well we use their private API to push that stuff across and a little bit later I'll be showing a module that we've built at Morpht to allow this to take place.
 

Using Recombee clientside

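// Initialise the client with your database ID and PUBLIC token.
// Both values here are placeholders.
var client = new recombee.ApiClient('your-database-id', 'your-public-token');

// Track a detail view: the user ID, then the item ID (example IDs).
client.send(new recombee.AddDetailView('user-13434', 'beta-node-371'));

// Get some recommendations:
var callback = function (err, res) {
  if (err) {
    console.log(err);
    // use fallback ...
    return;
  }
  console.log(res.recomms);
};
// Get 5 recommendations for user-13434
client.send(new recombee.RecommendItemsToUser('user-13434', 5), callback);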

This is what it looks like. It's simply tracking things that a user is doing around the site. In this case we're tracking a detail view. There are a number of other events you could listen to, like when someone purchases an item or someone saw a [partial] view. There are a few different events there but essentially we're just pushing over the user’s interaction with an item and then getting the data back.

We are saying “hey Recombee, give me five items which are recommended for this particular user”. That request goes to Recombee, a response comes back as JSON and our callback fires. If there's an error, that can be handled; otherwise the recommendations are there for you to consume. Generally you would be consuming JSON, populating it into an HTML template and then writing that back down into the DOM.

{
  "recommId": "e8bdec4c-6357-47bf-be4f-8bed5f739db2",     
  "recomms": [
    {
      "id": "beta-node-371", 
      "values": {
        "url": "https://beta.site-showcase.com/article/wordpress-hosts-australia", 
        "site": "beta", 
        "image": null, 
        "authors": [], 
        "summary": "WordPress hosts in Australia.", 
        "products": ["product:wordpress"], 
        "title": "WordPress hosts in Australia",
        "type": "article", 
        "audiences": [], 
        "topics": ["topic:hosting"]
      }
    },
    ...
  ]
}

And here is a little example payload of some data that's coming back. All of these values that we see have been pushed over into the Recombee back-end. You can see this is pretty easy to iterate through and build up recommendations that will be displayed on the website.
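As a minimal sketch of that iteration, the function below turns a response of this shape into an HTML list. It assumes each recommendation carries a "values" object with "url" and "title" properties, as in the example payload. The function names are made up for illustration, and in the browser you would write the resulting string into the DOM.

```javascript
// Escape text before placing it into HTML, so that titles or URLs
// containing special characters cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an unordered list of links from a recommendation response.
// Assumes every entry in res.recomms has values.url and values.title.
function renderRecomms(res) {
  const items = res.recomms.map(
    (r) =>
      `<li><a href="${escapeHtml(r.values.url)}">${escapeHtml(r.values.title)}</a></li>`
  );
  return `<ul>${items.join("")}</ul>`;
}

const exampleResponse = {
  recommId: "e8bdec4c-6357-47bf-be4f-8bed5f739db2",
  recomms: [
    {
      id: "beta-node-371",
      values: {
        url: "https://beta.site-showcase.com/article/wordpress-hosts-australia",
        title: "WordPress hosts in Australia",
      },
    },
  ],
};

const html = renderRecomms(exampleResponse);
console.log(html);
// In a browser, something like this would then place it on the page:
// document.querySelector(".recombee").innerHTML = html;
```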

[On the backend] there's a separate API. You can see that there's a private token for your back-end to talk to Recombee. You do not want to have anonymous users using this because they're going to be able to mess up your database essentially. So in this case we have an example of us pushing the values of a node over into Recombee.
 

Search API Recombee

This is the module we've built. The module is called Search API Recombee. 

We're using a Search API back-end in order to get that data across. It may seem a little bit strange to use Search API. We're not actually doing any searching but what we are doing is indexing the content. We're using the indexing abilities of Search API so that whenever a node is updated the node index knows about it and says to the backend “hey go update this node” and it gets pushed over into Recombee. So we're not using any search or facets. Once that data is in there we can see it and we can access it via the Recombee API, just not through search or facets.

A really cool feature we've built in is support for federated indexes. So let's say you have a suite of marketing sites, Alpha and Beta. They can both be pushing their nodes into the same index in the back-end of Recombee. By getting these two into one we're then able to do recommendations between both of the sites.

This is a very nice feature because if you have a suite of marketing sites and the user is only interacting on one you can then start pushing content to them on another website and get them to jump across and start interacting with another marketing site. The way we do this is to send a “site ID” across into the index which allows us to filter these things. 
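A rough sketch of how items from several sites might share one index is shown below. The node shape, the function names and the "site-node-nid" ID pattern are assumptions made for illustration and are not taken from the Search API Recombee module. The filter string uses Recombee's filter syntax, with property names in single quotes and string literals in double quotes, and assumes the site ID is a simple machine name. Sending the item to Recombee with the private token is left out.

```javascript
// Hypothetical mapping from a CMS node to an item in a shared Recombee index.
// Prefixing the ID with the site keeps node 12 on "alpha" distinct from
// node 12 on "beta", and the "site" value lets requests filter by site.
function toRecombeeItem(siteId, node) {
  return {
    id: `${siteId}-node-${node.nid}`,
    values: {
      site: siteId,
      type: node.type,
      title: node.title,
      url: node.url,
      topics: node.topics || [],
    },
  };
}

// Build a filter expression limiting recommendations to one site.
// Assumes siteId is a simple machine name with no quote characters.
function siteFilter(siteId) {
  return `'site' == "${siteId}"`;
}

const alphaItem = toRecombeeItem("alpha", {
  nid: 12,
  type: "article",
  title: "Drupal hosting",
  url: "https://alpha.example.com/node/12",
  topics: ["topic:hosting"],
});
const betaItem = toRecombeeItem("beta", {
  nid: 12,
  type: "article",
  title: "WordPress hosting",
  url: "https://beta.example.com/node/12",
});

console.log(alphaItem.id, betaItem.id, siteFilter("alpha"));
```

Leaving the site filter off a request is what would allow recommendations to cross from one site to the other.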

Personalisation Backend

This is what it looks like in the back-end of Drupal. It's just a Search API Server we're looking at there and you can see we have the Recombee back-end. You configure the connection and away you go. It's exactly like a Database or Solr back-end.

Personalisation Recombee

This is what the data looks like when you are over in Recombee. You can see here I'm just filtering on “Drupal” and we've just got a few IDs coming back. You can see we've got Alpha and Beta websites there on the left-hand side.
 

Ontology

We're indexing content from both sites. This is where ontology comes in. Getting your taxonomies right is super important. The content modeling that you do up front in the research part of the project gets reflected in the metadata we've got inside Drupal and how it is then stored in Recombee.

There are two important things with this. Firstly, Recombee's able to use this information to improve the item similarity. It’s able to look at these various columns and properties and work out the similarity for its recommendations so that will improve the results you're getting back. Secondly when you're getting that JSON payload back all of the data is there that you want. You don't have to hit Drupal to get the data back. This makes displaying the response a much more efficient process.
 

Orchestration

I do want to say a few words on what I'm calling orchestration - how do we bring this whole system together? When we sat down to solve the problem of how we could get a nice simple personalization system up and running, we threw around a few ideas... “Oh can we build a super module to rule them all” or “How should we get all this together?” We basically said “Okay, we're going to be using best-of-breed services and we're going to have them loosely coupled, and this is going to allow us to put personalization solutions together in the way that we want.”

We didn't want to necessarily buy into a monolithic stack. This is an area where there's a lot of change and there's a lot of new services coming up. Obviously people have different needs and will have different services they want to use.

We've taken a decoupled architecture as I've mentioned and the solution is largely Drupal agnostic. All of the code you are about to see will work on any site. It will work on Drupal, WordPress or a static site. The personalization approach we've taken will essentially work anywhere. The only Drupal stuff that we've seen is the Search API backend to help with indexing the data.

We also want it to be scalable. Whilst Drupal is going to be a super important service for modeling the data and holding it, we don't necessarily want to be hitting it every time. So that ability for Recombee to give us a nice payload with everything there allows us to get a result to the page in a couple of hundred milliseconds, i.e. as fast as possible, because you want those results coming back quickly.

The way we've decided to do this is with Google Tag Manager.

We didn't want to put JavaScript into the Drupal site or into the module or into a theme. We wanted it sitting outside and that's going to give us a lot more flexibility. The fact that it's sitting outside of Drupal opens it up to data specialists and to marketers and these are the kinds of people who may want to wire together solutions of their own.

Google Tag Manager handles things such as revisions and it's got variables and environments. It handles the asynchronous nature of the internet with tag management where tags can fire after other tags. You can put together some quite complicated things relatively easily. It is the glue between the systems.

We're not talking a no code or low code approach here. We are actually using code in the most efficient way - and that's to wire the systems together. Because sometimes a few lines of code can replace a whole heavy system.

So this is what the snippet looks like:

<div class="recombee" 
  data-booster="if ''topic:php'' in 'topics' then 2 else 1" 
  data-count="5" 
  data-filter="'title' != null and 'site' == ''alpha''" 
  data-type="items-user">

  <script type="text/x-handlebars-template">
    <ul>
      {{#each recomms ~}}
        <li><a href="{{values.url}}">{{values.title}}</a></li>
      {{~/each}}
    </ul>
  </script>
</div> 

We've tried to go as low tech as possible:

  • We have a “div” here with the Recombee class
  • We have a number of data attributes that can be passed as parameters to Recombee
  • In this case we can see some boosting components here where we can boost certain topics if we want.
  • We've got a count of five for the results that are coming back and…  
  • We can filter out results that we don't want and in this case also require that the site equals alpha.
  • Finally we've got the data type where we want some recommendations for a user.

We've also got a Handlebars template there in that script tag. It's a low-tech approach that's going to work in a number of environments. This gives the site builder or editor a really quick and ready way to get in and transform those results that are coming back. Essentially it makes it very easy to mold the HTML that's being produced into something that's going to work with whatever design system or theming system you use.

So that's the snippet:

  • It is just going on the page in a block and… 
  • Google Tag Manager is going down and processing that and… 
  • Then accessing the Recombee API.
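The first of those processing steps, reading the data attributes off the div, can be sketched as a small pure function. This is a hypothetical sketch and not the actual tag code, which is not shown in the talk. The function name, the default count and the shape of the returned object are assumptions. The "dataset" argument mirrors what "element.dataset" would hold for the snippet's div in a browser.

```javascript
// Turn the data attributes of a ".recombee" div into request options.
// "dataset" mirrors element.dataset in a browser, where every value is a string.
// The default count of 5 and the option names are assumptions for this sketch.
function buildRequestOptions(dataset) {
  const parsedCount = parseInt(dataset.count, 10);
  const options = {
    type: dataset.type || "items-user",
    count: Number.isNaN(parsedCount) ? 5 : parsedCount,
  };
  // Only pass the filter and booster along when they were provided.
  if (dataset.filter) options.filter = dataset.filter;
  if (dataset.booster) options.booster = dataset.booster;
  return options;
}

const options = buildRequestOptions({
  type: "items-user",
  count: "5",
  filter: "'title' != null",
});
console.log(options);

// In a browser the remaining steps would be roughly:
//   1. find each div with document.querySelectorAll(".recombee")
//   2. call buildRequestOptions(div.dataset)
//   3. request recommendations from Recombee with those options
//   4. render the response through the div's Handlebars template
```

Keeping this step free of DOM and network calls makes it easy to check on its own before wiring it into a tag.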
     

Conclusion

Personalization is an active area of research for us and we're really looking to come up with cost-effective ways to do this personalization. We are looking to build our integrations with other services. We've seen that we're using Recombee. I've mentioned CRMs and email marketing and that would be the next thing.

We've seen the rise of the CMO. Marketing budgets are getting bigger and marketers are deciding what technology they're going to be using. Increasingly this means looking at automation systems where the CMS is not necessarily foremost in their mind.

We know we've got challenges from within the CMS space but also from CDPs, CRMs and email marketing. Just last week MailChimp announced that they are releasing a website builder. The week before that Salesforce said they've got a website builder as well. So we can see these other systems are getting into the content game.

Where is that going to leave Drupal? If Drupal is not addressing these problems we're just going to be a CMS doing CRUD operations and the challenge is how can we get out of that. Drupal needs to leverage its strengths: its content modeling and JSON:API are two core things that we can use. I think if we do that we're going to remain relevant.

For the Drupal developers out there, when we're talking about “decoupled” I think it is often related to the concept of decoupling the presentation from the data in Drupal. But really decoupled is much bigger than that. It's taking a data-first approach and thinking about how we can build systems that use data in a decoupled way. So don't just think of it as presentation. Think of it as how we can put systems together where Drupal is playing a really important role in providing that data.

And I think if we do that Drupal will find a place as a first class citizen in the marketing stack of the future.


 
