How Netflix and YouTube Know What You’ll Watch Next: The Magic of Machine Learning
We’ve all had that moment when Netflix or YouTube suggests exactly what we want to watch next. Ever wondered how they know? It’s not magic — it’s machine learning.
Let’s dive into the story of two friends, Ashish and Tejaswi, to explore how these platforms work.
The Story: How Machine Learning Shapes Your Viewing Experience
Scene 1: A Weekend Movie Search
Ashish and Tejaswi are chilling at Ashish’s place on a Saturday night. Ashish, a tech-savvy guy, is scrolling through Netflix, and his recommendations pop up: Squid Game, Stranger Things, and Devara.
Ashish laughs. “It’s like Netflix knows me. It’s all sci-fi and dystopian stuff — just what I love!”
Tejaswi, who just finished watching a food vlog on YouTube, replies, “YouTube’s been doing the same for me. It keeps suggesting more cooking and travel videos. How do they always get it right?”
Ashish smiles. “That’s machine learning! Both platforms use recommendation systems to figure out what we’ll like based on our past choices.”
Tejaswi leans forward, curious. “Okay, tell me more. How do they actually do it?”
Scene 2: Ashish Explains Collaborative Filtering
Ashish clears his throat, ready to explain. “There’s something called Collaborative Filtering. Netflix looks at what other people with similar tastes to you have watched. If they like the same shows as you and have also watched something you haven’t, Netflix will recommend that show to you.”
Tejaswi looks impressed. “So, it’s not just about what I like, but also about what other people like who are like me?”
“Exactly!” says Ashish. “Here’s a simple example in code to show how it works.”
Collaborative Filtering Code Example:
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Sample data: users and their ratings for different shows (0 = not rated yet)
user_ratings = pd.DataFrame({
    'Stranger Things': [5, 3, 0, 0, 5],
    'Black Mirror': [4, 0, 4, 0, 4],
    'The Expanse': [5, 0, 5, 4, 5],
    'Breaking Bad': [0, 4, 0, 5, 0]
}, index=['Ashish', 'User2', 'User3', 'User4', 'User5'])

# Calculate similarity between users based on their ratings
similarity_matrix = cosine_similarity(user_ratings)

# Display similarity matrix
similarity_df = pd.DataFrame(similarity_matrix, index=user_ratings.index, columns=user_ratings.index)
print(similarity_df)

# Find which user is most similar to Ashish. Drop Ashish himself first,
# because every user has a similarity of 1.0 with themselves.
most_similar_user = similarity_df['Ashish'].drop('Ashish').idxmax()
print(f"Ashish's viewing habits are most similar to: {most_similar_user}")
Output:
          Ashish     User2     User3     User4     User5
Ashish  1.000000  0.369274  0.788170  0.384473  1.000000
User2   0.369274  1.000000  0.000000  0.624695  0.369274
User3   0.788170  0.000000  1.000000  0.487805  0.788170
User4   0.384473  0.624695  0.487805  1.000000  0.384473
User5   1.000000  0.369274  0.788170  0.384473  1.000000
Ashish's viewing habits are most similar to: User5
Ashish points to the output. “In this case, Netflix sees that I’m most similar to User5. In this tiny table our ratings match exactly, which is why the score is 1.0. With a real catalog, it would suggest shows that User5 has watched and rated highly but I haven’t seen yet.”
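For readers wondering where those similarity scores come from: cosine similarity treats each viewer's row of ratings as a vector and measures the angle between two of them. Here is a minimal sketch that works it out by hand with NumPy. The helper name `cosine` is only for illustration, and the rating lists are copied from the sample table in the code above.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Ratings in column order: Stranger Things, Black Mirror, The Expanse, Breaking Bad
ashish = [5, 4, 5, 0]
user3 = [0, 4, 5, 0]
user5 = [5, 4, 5, 0]

print(round(cosine(ashish, user3), 3))  # overlap on two shows
print(round(cosine(ashish, user5), 3))  # identical rows
```

Two identical rating rows, like `ashish` and `user5` here, score 1.0, while two viewers with no rated shows in common score 0. Everything else lands somewhere in between.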
Tejaswi grins. “Wow, that’s pretty cool. So, it’s like Netflix finds me a viewing twin?”
“Exactly!” Ashish nods. “And that’s how they keep you hooked with the perfect show suggestions.”
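Finding a viewing twin is only half the job. One common way to turn similarity scores into an actual suggestion is to predict a rating for every show the viewer has not rated yet, using a similarity-weighted average of what other viewers gave it. The sketch below assumes that approach. It is not Netflix's real method, and `predict_unseen` is a hypothetical helper name made up for this example.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Same toy ratings as in the example above; 0 means "not rated yet"
user_ratings = pd.DataFrame({
    'Stranger Things': [5, 3, 0, 0, 5],
    'Black Mirror': [4, 0, 4, 0, 4],
    'The Expanse': [5, 0, 5, 4, 5],
    'Breaking Bad': [0, 4, 0, 5, 0],
}, index=['Ashish', 'User2', 'User3', 'User4', 'User5'])

similarity_df = pd.DataFrame(
    cosine_similarity(user_ratings),
    index=user_ratings.index,
    columns=user_ratings.index,
)

def predict_unseen(user, ratings, similarity):
    """Hypothetical helper: similarity-weighted average rating for shows `user` has not rated."""
    weights = similarity[user].drop(user)      # how much each other viewer counts
    row = ratings.loc[user]
    predictions = {}
    for show in row[row == 0].index:           # shows this user has not rated
        others = ratings[show].drop(user)
        rated = others[others > 0]             # only viewers who actually rated the show
        if rated.empty:
            continue
        w = weights[rated.index]
        if w.sum() == 0:
            continue
        predictions[show] = float((w * rated).sum() / w.sum())
    return pd.Series(predictions, dtype=float).sort_values(ascending=False)

predictions = predict_unseen('Ashish', user_ratings, similarity_df)
print(predictions.round(2))
```

With this data the only show Ashish has not rated is Breaking Bad, and the two viewers who did rate it (User2 and User4) pull the prediction to roughly 4.5 out of 5, so it would be a reasonable thing to suggest.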
Scene 3: Content-Based Filtering
Tejaswi then asks, “But what about YouTube? I doubt everyone is as obsessed with food and travel videos as I am.”
Ashish explains, “That’s where Content-Based Filtering comes in. YouTube analyzes the details of the videos you’ve watched — like their titles, tags, and descriptions. It then recommends similar content. For example, if you’ve been watching Italian cooking videos, it’ll show you more videos about Italian cuisine or similar topics.”
He types another code example.
Content-Based Filtering Code Example & Output:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Sample video descriptions. Related videos share a few words
# (Italian, cooking, guide) so TF-IDF has something to match on.
video_data = pd.DataFrame({
    'Video Title': ['Italian Cooking 101', 'How to Make Pizza', 'Travel Guide to Italy', 'French Cooking for Beginners'],
    'Description': ['Learn the basics of Italian cooking', 'Step-by-step guide to making Italian pizza at home',
                    'Top places to visit in Italy', 'A beginner’s guide to French cooking']
})

# Convert descriptions into TF-IDF matrix
tfidf = TfidfVectorizer(stop_words='english')
tfidf_matrix = tfidf.fit_transform(video_data['Description'])

# Calculate similarity between videos
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

# Display similarity matrix for videos, rounded to two decimals
titles = video_data['Video Title'].tolist()
cosine_df = pd.DataFrame(cosine_sim, index=titles, columns=titles)
print(cosine_df.to_string(float_format='{:.2f}'.format))

# Recommend similar videos to 'Italian Cooking 101': drop the video itself,
# keep only videos with some word overlap, and list the best match first
scores = cosine_df['Italian Cooking 101'].drop('Italian Cooking 101')
recommended_videos = scores[scores > 0].sort_values(ascending=False)
print("Recommended videos for 'Italian Cooking 101':", recommended_videos.index.tolist())
Output:
                              Italian Cooking 101  How to Make Pizza  Travel Guide to Italy  French Cooking for Beginners
Italian Cooking 101                          1.00               0.12                   0.00                          0.19
How to Make Pizza                            0.12               1.00                   0.00                          0.12
Travel Guide to Italy                        0.00               0.00                   1.00                          0.00
French Cooking for Beginners                 0.19               0.12                   0.00                          1.00
Recommended videos for 'Italian Cooking 101': ['French Cooking for Beginners', 'How to Make Pizza']
Ashish points to the results. “If you watch a video called ‘Italian Cooking 101,’ YouTube will recommend related videos like ‘How to Make Pizza’ or ‘French Cooking for Beginners.’ It’s all about understanding the content you enjoy.”
Tejaswi nods. “That explains why my feed is full of pasta recipes and travel vlogs!”
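The example above compares one video with another, but a feed like Tejaswi's is built from everything she has watched. A simple way to sketch that idea is to average the TF-IDF vectors of the watched videos into a single "taste profile" and then rank the unwatched videos against it. Everything in this sketch is an assumption for illustration: the mini-catalogue, the watch history, and the averaging approach are made up, and YouTube's real system is far more involved.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical mini-catalogue: title -> description (made up for illustration)
catalogue = {
    'Italian Cooking 101': 'Learn the basics of Italian cooking',
    'Homemade Pasta': 'Italian pasta cooking from scratch',
    'Street Food Tour': 'Travel vlog tasting street food in Bangkok',
    'Budget Travel Tips': 'Travel tips for backpacking on a budget',
    'Car Engine Repair': 'Fixing a noisy engine in your garage',
}
watched = ['Italian Cooking 101', 'Street Food Tour']  # hypothetical watch history

titles = list(catalogue)
vectorizer = TfidfVectorizer(stop_words='english')
tfidf_matrix = vectorizer.fit_transform(list(catalogue.values())).toarray()

# Taste profile = average TF-IDF vector of everything already watched
watched_rows = [titles.index(t) for t in watched]
profile = tfidf_matrix[watched_rows].mean(axis=0).reshape(1, -1)

# Score every video against the profile, then keep unwatched ones with any overlap
scores = cosine_similarity(profile, tfidf_matrix).ravel()
ranked = sorted(
    ((title, score) for title, score in zip(titles, scores)
     if title not in watched and score > 0),
    key=lambda pair: pair[1],
    reverse=True,
)
for title, score in ranked:
    print(f'{title}: {score:.2f}')
```

With this made-up history, the pasta video ranks first because it shares both "Italian" and "cooking" with something already watched, the travel tips video follows on the strength of "travel" alone, and the car repair video drops out because it shares no words with the profile.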
Conclusion: How Machine Learning Enhances Our Experience
Ashish sums it up: “These platforms are constantly learning about what we like through our interactions. Whether it’s comparing our habits to others with similar tastes or analyzing the content we consume, machine learning ensures we always have something interesting to watch.”
Tejaswi smiles, impressed. “Now I get it! The next time YouTube shows me a travel vlog, I’ll know it’s not just a coincidence.”
Why It Matters
Every time you browse Netflix or YouTube, you’re engaging with machine learning in action. By understanding how collaborative filtering and content-based filtering work, we can appreciate the personalized experience these platforms create for us, keeping us entertained, engaged, and coming back for more.
Feel free to explore this in your own viewing experience and see how these algorithms play a role in your daily media consumption. Share your insights and let’s discuss how machine learning continues to shape the future of entertainment!