
Machine Learning: Who’s Teaching Whom?

Posted by Alan Raistrick

February, 2014

I love learning, and as a software engineer I find the learning never stops. That inclination took me to a workshop on machine learning techniques and their applications, taught by Dan Wiesenthal (CTO at Scout.ai, MS in Computer Science from Stanford) at Hackbright Academy in SF. Machine learning sounds intimidating. The first question is who is teaching whom: the engineer or the machine? The answer is surprisingly straightforward.

The instructor used a streaming movie example. He discussed how to build a website that recommends movies you might like (think Netflix or Amazon). Suppose our website knows you like The Avengers. How would we recommend related movies? The simplest approach might be to recommend movies in the same genre, such as other superhero movies. But what about a really bad superhero movie? We wouldn’t want to recommend that one. How could we make better recommendations with as little work as possible?

If you have only one user, and that user has watched only one movie, any recommendation would be purely subjective. A large company, however, could have millions of users who have watched thousands of movies. We could leverage all that data to provide a great recommendation.

Following the example, we can ask: of the people on the website who liked The Avengers, what other movies did they like? Let’s say 8 out of 10 of those people liked Iron Man, but only 1 out of 10 liked Swordfish. Based on this data, we should recommend Iron Man instead of Swordfish. This technique is called collaborative filtering, because the users are indirectly collaborating to improve each other’s recommendations.
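To make the counting concrete, here is a minimal sketch in Python. It is my own illustration rather than code from the workshop: the user data is made up to match the 8-out-of-10 and 1-out-of-10 numbers above, and the names `likes` and `recommend` are hypothetical. A real recommender would be more involved, but the core idea fits in a few lines.

```python
from collections import Counter


def recommend(likes, seed_movie, top_n=3):
    """Rank other movies by the share of seed_movie fans who also liked them.

    `likes` maps a user name to the set of movies that user liked.
    Returns a list of (movie, share) pairs, highest share first.
    """
    # Step 1: find everyone who liked the seed movie.
    fans = [user for user, movies in likes.items() if seed_movie in movies]
    if not fans:
        return []

    # Step 2: count how many of those fans liked each other movie.
    counts = Counter(
        movie
        for user in fans
        for movie in likes[user]
        if movie != seed_movie
    )

    # Step 3: turn counts into shares and sort (ties broken by title).
    scored = [(movie, count / len(fans)) for movie, count in counts.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Made-up data: ten users liked The Avengers.
likes = {f"user{i}": {"The Avengers"} for i in range(10)}
for i in range(8):
    likes[f"user{i}"].add("Iron Man")   # 8 of the 10 also liked Iron Man
likes["user9"].add("Swordfish")         # 1 of the 10 also liked Swordfish
likes["user10"] = {"Swordfish"}         # never liked The Avengers, so ignored

for movie, share in recommend(likes, "The Avengers"):
    print(f"{movie}: liked by {share:.0%} of Avengers fans")
```

Iron Man comes out on top because a larger share of Avengers fans liked it. Notice that nothing in the code knows what a superhero movie is; the ranking comes entirely from what users liked.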

The process can be automated using machine learning. No one needs to be manually tagging “superhero movies” or looking up ratings. The machine is learning and improving its automatic recommendations based on the data from users. Machine learning helps your computer find interesting insights you might have trouble finding yourself.

Many companies’ websites use collaborative filtering recommendations to engage and retain users. (We want the users to watch more movies, buy more products, and become loyal.) But it’s good for the users, too. During Dan Wiesenthal’s awesome workshop, I found that machine learning techniques range from simple to complicated, and discovered that I can already implement some of them. Machine learning is impressive because it can solve problems that are too complicated or too large to tackle with more familiar techniques. It’s learning I’ve added to my toolkit here at Kiosk.