Facebook - I spoke about pizza. Now I’m seeing adverts for pizza. Why?

Intro

You’ve probably heard about Facebook a lot in the news recently, and this has probably caused you some privacy concerns. I’ve even heard of some people who have come off Facebook completely because of this. Maybe some understanding of how Facebook works would have either stopped people doing this or made them run a mile. We’ll see…

This blog post will talk about how natural language processing can be used to make targeted advertisements work effectively (1,2). While Facebook are likely doing lots more to make their targeted adverts far more effective than this example, I felt it would be fun to see natural language processing in action and see what kind of results my quick experiment would get.

Why?

At earthware, we ask “why” on a regular basis – you’ll see it all over our website.

So why does Facebook use targeted advertisements? Revenue. Advertisements are their main source of income, as companies both small and large pay a lot of money to have their adverts shown to the right people – “right” being whoever the product or service is aimed at and whoever is most likely to part with their money for it. Sure, Facebook Inc have many subsidiaries including WhatsApp, Instagram, Oculus VR and many others, but advertisement on facebook.com is their biggest source of income.

[Image: Facebook Messenger ads]

With more than 1 billion daily active users (3), it’s pretty impressive that Facebook are able to target me for Nvidia deep learning, Adobe CC, React courses and various clothing stores (with specific items I’ve been looking at)!

But when you think about it, with that many regular users of your service, you’d probably be mad not to try and monetize them in some way that doesn’t break any laws, breach your own terms of service and privacy policy, or raise any moral or ethical issues against you…

How?

Well, firstly, Facebook and other platforms have many, many ways of building a fairly accurate picture of you for advertisement purposes. I will just be looking at natural language processing (NLP) using keywords in instant messaging. WhatsApp, for example, is a service (owned by Facebook) which is end-to-end encrypted, and therefore Facebook can’t run your messages through NLP. Facebook Messenger, however, isn’t (4), so Facebook can run your messages through NLP.

The natural language processor

Natural language processing is a really cool part of AI which bridges natural human languages and computers – turning text (or speech) into some kind of data that makes it easier for computers to understand.

There are several natural language processors out there, but for my experimentation, I will be using IBM Watson (5). IBM Watson has many AI services and is starting to offer tools to assist with machine learning as well. The Natural Language Understanding API has several “features”, such as emotion, entities and categories, that allow for different kinds of interpretation and can enable context-aware data analysis (but I won’t be going into this level of analysis in this blog). Using the categories API for this experiment offers pretty good accuracy as to what the current message is about.
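As a rough illustration of what "using the categories API" means in practice, here is a minimal sketch that builds a request body asking for the categories feature only. The field names (text, features, categories, limit) reflect my understanding of the Natural Language Understanding analyze request and should be checked against IBM's documentation. The helper name build_categories_request is made up for this post, and no network call, URL or authentication is shown.

```python
import json


def build_categories_request(text, limit=3):
    """Build a JSON-serialisable body that asks only for categories.

    Assumption: the service accepts a body with "text" and a "features"
    object keyed by feature name. Verify against the IBM documentation.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "features": {"categories": {"limit": limit}},
    }


body = build_categories_request("What do you fancy for lunch?")
print(json.dumps(body, indent=1))
```

Other features such as entities or emotion would presumably be requested by adding more keys under "features", which is why this experiment leaves them out to keep things simple.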

The user advertisement profile

The categories API classifies content into a hierarchy which can go up to five levels deep.

Each time a user in this experiment sends a message, the text will be run through the Natural Language Understanding API and the results stored. An example API response for “What do you fancy for lunch?” looks like this:

{
 "categories": [
  {
   "score": 0.382236,
   "label": "/food and drink/food/fast food"
  },
  {
   "score": 0.282951,
   "label": "/food and drink/food/salads"
  },
  {
   "score": 0.205618,
   "label": "/food and drink/desserts and baking"
  }
 ]
}

We can see that already, we have the general gist that this user is talking about food and drink, but any further than that it’s kind of guessing, given that none of the scores are much higher than 0.38. Think of the score as its certainty. Storing these categories and some kind of weighting for them can build up a user advertisement profile by keeping track of what is most relevant for that person. So, all I need to do to find the most relevant advertisement for a given user is to traverse the most mentioned categories I have stored about that user, and I should have a pretty good idea of what to show them.
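The "general gist" point can be made concrete by adding up the scores at each level of the hierarchy. The sketch below is only a rough heuristic of my own (the scores are not guaranteed to be probabilities that can be summed), and the helper names split_label and roll_up are invented for illustration. It uses the example response shown above.

```python
# The example response from above, as a Python dict.
RESPONSE = {
    "categories": [
        {"score": 0.382236, "label": "/food and drink/food/fast food"},
        {"score": 0.282951, "label": "/food and drink/food/salads"},
        {"score": 0.205618, "label": "/food and drink/desserts and baking"},
    ]
}


def split_label(label):
    """Turn '/a/b/c' into ['a', 'b', 'c'], one entry per hierarchy level."""
    return [part for part in label.split("/") if part]


def roll_up(categories):
    """Sum the scores for every ancestor level of each returned label."""
    totals = {}
    for category in categories:
        levels = split_label(category["label"])
        for depth in range(1, len(levels) + 1):
            prefix = "/" + "/".join(levels[:depth])
            totals[prefix] = totals.get(prefix, 0.0) + category["score"]
    return totals


totals = roll_up(RESPONSE["categories"])
# Print the strongest levels first.
for prefix, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{total:.3f}  {prefix}")
```

Rolled up like this, the top level "/food and drink" comes out at about 0.87, while no single leaf category gets above 0.38, which matches the feeling that the broad topic is clear but the detail is a guess.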

For testing, I ran each of the messages below through the categories API and recorded the results.

The user says…

What would you like for lunch?

I quite fancy some pizza.

How about that new Italian place around the corner?

Pizza Hut? No... I was thinking classier than that.

How about Pizza Express? Their pasta is good and I feel like pasta.

Yeah okay, let's go there.

Fab. 12:30 okay?

Perfect, see you then.

From the conversation above, the most relevant categories for this given user are:

  • Food and drink > food > salads
  • Food and drink > food > fast food
  • Food and drink > food > grains and pasta

Do these closely relate to pizza? Somewhat.

Salads are commonly served with pizza, and Pizza Hut has a salad bar. Some would argue pizza is fast food. As for grains and pasta, well, pasta is commonly served in Italian restaurants, which also commonly sell pizza.

Had I taken into account some of IBM Watson’s other NLP features such as entities, Pizza Hut and Pizza Express would likely have been picked up. Had I taken into account emotions, I could probably have determined whether or not a given user likes pizza.
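To make the profile idea concrete, here is a minimal sketch of how a ranking like the three categories above could be produced. It is not the code from my experiment: the AdProfile class and its weighting (simply adding up the scores per label) are assumptions made for illustration, and only the one example response shown earlier is fed in, so the output will not match the ranking from the full conversation.

```python
from collections import defaultdict

# The example response for "What do you fancy for lunch?" from earlier.
SAMPLE_RESPONSE = {
    "categories": [
        {"score": 0.382236, "label": "/food and drink/food/fast food"},
        {"score": 0.282951, "label": "/food and drink/food/salads"},
        {"score": 0.205618, "label": "/food and drink/desserts and baking"},
    ]
}


class AdProfile:
    """Hypothetical per-user store of category weights."""

    def __init__(self):
        self.weights = defaultdict(float)  # label -> summed score
        self.mentions = defaultdict(int)   # label -> times seen

    def add_response(self, response):
        """Fold one categories response into the profile."""
        for category in response.get("categories", []):
            label = category["label"]
            self.weights[label] += category["score"]
            self.mentions[label] += 1

    def top(self, n=3):
        """Return the n labels with the highest summed score."""
        ranked = sorted(self.weights.items(), key=lambda item: item[1], reverse=True)
        return [label for label, _ in ranked[:n]]


profile = AdProfile()
profile.add_response(SAMPLE_RESPONSE)  # call once per message sent
for label in profile.top(3):
    print(label.strip("/").replace("/", " > "))
```

Summing scores means a category that keeps coming up with middling certainty can outrank one that appeared once with high certainty. Whether that is the right trade-off for choosing adverts is a design choice I haven't tested.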

Conclusion

Could I build the most accurate advertisement recommending algorithm with a week on support at work? No.

Could I build an advertisement recommending algorithm that isn’t that bad at accuracy with a lot more time? Probably.

Could Facebook build an advertisement recommending algorithm with an accuracy right down to the exact product that your mind is on right now? Almost certainly.

In fact, they already have, and natural language processing plays a big part in it...

Footnotes

(1) There are many other ways that Facebook are likely using to make their targeted adverts more effective; this is just a quick example of how easily something like this could be implemented.

(2) By no means is this blog post claiming to be a “How does Facebook work?” article. Facebook likely have their own natural language processing engine and will probably run more than just messages through it to build up their profile on you.

(3) Over 1 billion daily active users, over 2 billion monthly active users: https://sproutsocial.com/insights/facebook-stats-for-marketers/

(4) By default, Facebook Messenger doesn’t use end-to-end encryption. It is, however, possible to start an end-to-end encrypted chat with someone, but this runs as a separate message thread.

(5) https://www.ibm.com/watson/