Spaces:

Mohamed284
/

Netflix_Recommendation_System

Mohamed284 committed on Mar 7

Commit

cb8e014

1 Parent(s): 6a9be58

.

Files changed (5)

Recommedation_System_Netflix.ipynb +0 -0
Recommedation_System_Netflix.wiki +144 -0
netflix_titles.csv +0 -0
recommendation_network.png +0 -0
top_10_recommendations.png +0 -0

Recommedation_System_Netflix.ipynb ADDED

The diff for this file is too large to render. See raw diff

Recommedation_System_Netflix.wiki ADDED

	@@ -0,0 +1,144 @@

+= Netflix Content Recommendation System =
+== Introduction ==
+This wiki entry describes a hybrid recommendation system for Netflix movies and TV shows. The system draws on three approaches:
+* Content-based filtering using TF-IDF vectorization
+* Collaborative filtering on a simulated user-item rating matrix
+* Node representation learning over a content-genre graph
+== Methodology ==
+=== Content-Based Filtering ===
+The content-based filtering approach utilizes TF-IDF (Term Frequency-Inverse Document Frequency) vectorization to analyze content features:
+==== TF-IDF Vectorization ====
+TF-IDF is a numerical statistic that reflects how important a word is to one document relative to the whole collection:
+* Term Frequency (TF): Measures how frequently a term appears in a document
+* Inverse Document Frequency (IDF): Downweights terms that appear in many documents
+Implementation details:
+<syntaxhighlight lang="Python">
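+from sklearn.feature_extraction.text import TfidfVectorizer
+# Drop English stop words, then fit on the combined text features of each title
+tfidf = TfidfVectorizer(stop_words='english')
+tfidf_matrix = tfidf.fit_transform(df['combined_features'])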
+</syntaxhighlight>
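+The weighting described above can be reproduced by hand on a toy corpus. The following is a minimal sketch, not the notebook's code: it assumes whitespace tokenization, no stop-word removal, and the smoothed IDF form ln((1 + n) / (1 + df)) + 1 with L2 row normalization, which is the default documented for scikit-learn's TfidfVectorizer. The function name tfidf_vectors and the three sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF rows) for whitespace-tokenized docs."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (assumed form): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    rows = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows

# Hypothetical sample documents standing in for df['combined_features']
docs = ["crime drama series", "crime thriller", "romantic comedy drama"]
vocab, rows = tfidf_vectors(docs)
weights = dict(zip(vocab, rows[0]))
```

+In the first document every term occurs once, so the term found in only one document (series) ends up with a higher weight than the terms shared with other documents (crime, drama).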
+==== Cosine Similarity ====
+Cosine similarity measures the similarity between two vectors by computing the cosine of the angle between them:
+* Range: [-1, 1] in general, where 1 means identical direction, 0 means orthogonal, and -1 means opposite; TF-IDF vectors are non-negative, so scores here fall in [0, 1]
+* Used to compare TF-IDF vectors of different content items
+<syntaxhighlight lang="Python">
+from sklearn.metrics.pairwise import cosine_similarity
+cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)
+</syntaxhighlight>
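+To make the computation concrete, here is a small NumPy sketch of pairwise cosine similarity. It is an illustration under stated assumptions rather than the library's implementation: the helper name cosine_sim_matrix and the sample vectors are hypothetical, and the vectors are non-negative to mimic TF-IDF rows.

```python
import numpy as np

def cosine_sim_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = X / norms         # scale every row to unit length
    return unit @ unit.T     # dot products of unit vectors are cosines

# Hypothetical non-negative vectors standing in for TF-IDF rows
X = np.array([[1.0, 1.0, 0.0],
              [2.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
sim = cosine_sim_matrix(X)
```

+Rows 0 and 1 point in the same direction and differ only in length, so their similarity is 1; row 2 shares no non-zero component with them, so its similarity to both is 0.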
+=== Collaborative Filtering ===
+The collaborative filtering component builds a user-item rating matrix and applies matrix factorization to it.
+==== User-Item Matrix ====
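+* Creates a matrix of user ratings for items
+* Simulates user behavior with random ratings, since the example generates the matrix rather than loading real user data
+* Leaves most entries at 0 (unrated); the resulting sparsity is handled through matrix factorization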
+<syntaxhighlight lang="Python">
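+import numpy as np
+
+def create_user_item_matrix(df, n_users=1000):
+    """Simulate integer ratings in 0..5 for n_users users over every title in df."""
+    np.random.seed(42)  # fixed seed so the simulated ratings are reproducible
+    n_items = len(df)
+    # Draw a rating for every (user, item) pair, then keep only about 20% of them;
+    # the boolean mask zeroes out the rest, and 0 is treated as "unrated"
+    user_item_matrix = np.random.randint(0, 6, size=(n_users, n_items)) * \
+                      (np.random.random((n_users, n_items)) > 0.8)
+    return user_item_matrix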
+</syntaxhighlight>
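+The sparsity of such a simulated matrix can be checked directly. The sketch below is a hypothetical variant, not the function above: it uses a local np.random.default_rng generator instead of the global seed, and the names simulate_ratings and keep_prob are assumptions. About 20% of entries survive the mask, and one in six of those is itself a drawn 0, so roughly 1/6 of all entries should be non-zero.

```python
import numpy as np

def simulate_ratings(n_users, n_items, keep_prob=0.2, seed=42):
    """Random integer ratings in 0..5, with most entries masked to 0 (unrated)."""
    rng = np.random.default_rng(seed)  # local generator, leaves global state alone
    ratings = rng.integers(0, 6, size=(n_users, n_items))  # upper bound is exclusive
    mask = rng.random((n_users, n_items)) < keep_prob      # True for kept entries
    return ratings * mask

matrix = simulate_ratings(1000, 200)
nonzero_fraction = np.count_nonzero(matrix) / matrix.size
```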
+==== Matrix Factorization ====
+* Singular Value Decomposition (SVD) for dimensionality reduction
+* Captures latent features in user-item interactions
+* Predicts missing ratings
+<syntaxhighlight lang="Python">
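+import numpy as np
+from scipy.sparse.linalg import svds
+
+def matrix_factorization(ratings, n_factors=50):
+    """Predict ratings from a rank-n_factors truncated SVD of the mean-centered matrix."""
+    # Center each user's row; the mean includes zeros, i.e. unrated entries
+    user_ratings_mean = np.mean(ratings, axis=1)
+    ratings_norm = ratings - user_ratings_mean.reshape(-1, 1)
+    # svds requires n_factors < min(ratings.shape)
+    U, sigma, Vt = svds(ratings_norm, k=n_factors)
+    # Reconstruct the low-rank approximation and add the user means back
+    predicted_ratings = np.dot(np.dot(U, np.diag(sigma)), Vt) + \
+                       user_ratings_mean.reshape(-1, 1)
+    return predicted_ratings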
+</syntaxhighlight>
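+The effect of the number of latent factors is easier to see on a matrix small enough to read. This sketch swaps in the dense np.linalg.svd for svds so that it runs with NumPy alone and allows any rank up to the full one; the function name predict_ratings and the 4x4 ratings are hypothetical.

```python
import numpy as np

def predict_ratings(ratings, n_factors):
    """Rank-n_factors reconstruction of a mean-centered ratings matrix."""
    ratings = np.asarray(ratings, dtype=float)
    user_mean = ratings.mean(axis=1, keepdims=True)
    centered = ratings - user_mean
    # Dense SVD; singular values come back sorted in descending order
    U, sigma, Vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the n_factors largest singular values and their vectors
    U_k, sigma_k, Vt_k = U[:, :n_factors], sigma[:n_factors], Vt[:n_factors, :]
    return U_k @ np.diag(sigma_k) @ Vt_k + user_mean

# Hypothetical ratings: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3
ratings = np.array([[5, 4, 0, 0],
                    [4, 5, 0, 1],
                    [0, 0, 5, 4],
                    [1, 0, 4, 5]])
rank1 = predict_ratings(ratings, n_factors=1)
rank2 = predict_ratings(ratings, n_factors=2)
full = predict_ratings(ratings, n_factors=4)
```

+Keeping every factor reproduces the input exactly, while lower ranks trade reconstruction error for a smoother estimate that fills in the zero entries.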
+=== Node Representation Learning ===
+Implements graph-based learning using Node2Vec:
+==== Content Graph Creation ====
+* Builds a graph representing content relationships
+* Nodes represent movies/shows and genres
+* Edges represent content-genre associations
+<syntaxhighlight lang="Python">
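+import networkx as nx
+
+def create_content_graph(df):
+    """Build the content graph; this excerpt adds only the genre nodes."""
+    G = nx.Graph()
+    # Pre-process genres: map each row index to its list of genre names
+    genre_dict = {}
+    for idx, row in df.iterrows():
+        # Skip rows whose 'listed_in' value is missing (NaN is not a str)
+        if isinstance(row['listed_in'], str):
+            genres = [g.strip() for g in row['listed_in'].split(',')]
+            genre_dict[idx] = genres
+            # Add unique genres as nodes
+            for genre in genres:
+                if not G.has_node(genre):
+                    G.add_node(genre, type='genre')
+    return G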
+</syntaxhighlight>
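+The excerpt above stops after the genre nodes, while the bullets also describe content nodes and content-genre edges. One way those could be represented is sketched below. It is an assumption, not the notebook's code: it uses a plain adjacency dictionary instead of networkx so that it runs without extra dependencies, it keys content nodes by row index, and the names build_content_genre_graph and sample are hypothetical.

```python
from collections import defaultdict

def build_content_genre_graph(listed_in_by_id):
    """Return an undirected adjacency map linking content ids to genre names."""
    adjacency = defaultdict(set)
    for content_id, listed_in in listed_in_by_id.items():
        if not isinstance(listed_in, str):
            continue  # missing genre field, mirrors the isinstance check above
        content_node = ("content", content_id)
        adjacency.setdefault(content_node, set())
        for genre in (g.strip() for g in listed_in.split(",")):
            if not genre:
                continue
            genre_node = ("genre", genre)
            # Undirected edge: record it in both directions
            adjacency[content_node].add(genre_node)
            adjacency[genre_node].add(content_node)
    return dict(adjacency)

# Hypothetical rows: index -> 'listed_in' value
sample = {0: "Dramas, International Movies", 1: "Dramas", 2: None}
graph = build_content_genre_graph(sample)
```

+Tagging each node with its kind keeps a title and a genre with the same name from colliding, which plays the role of the type attribute in the networkx version.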
+==== Node2Vec Algorithm ====
+* Random walk-based approach for learning node embeddings
+* Preserves network neighborhood information
+* Parameters:
+** Dimensions: 32 (embedding size)
+** Walk length: 10 (steps per walk)
+** Number of walks: 50 (walks per node)
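+The walk-generation stage behind these parameters can be sketched with the standard library. This is a simplified illustration, not the Node2Vec implementation: it draws uniform random walks, which is what node2vec's biased walks reduce to on an unweighted graph when the return and in-out parameters p and q both equal 1, and it treats walk length as the number of nodes per walk. The embedding step (training a skip-gram model on the walks to obtain the 32-dimensional vectors) is not shown. The function name and the toy graph are hypothetical.

```python
import random

def uniform_random_walks(adjacency, walk_length=10, num_walks=50, seed=42):
    """Generate num_walks walks per node, each holding up to walk_length nodes."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on every pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adjacency[walk[-1]])
                if not neighbors:
                    break  # dead end: isolated node
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks

# Hypothetical content-genre graph as an adjacency map
adjacency = {
    "Title A": {"Dramas", "Thrillers"},
    "Title B": {"Dramas"},
    "Dramas": {"Title A", "Title B"},
    "Thrillers": {"Title A"},
}
walks = uniform_random_walks(adjacency)
```

+Because the graph is bipartite, every walk alternates between titles and genres, so two titles that share a genre tend to appear close together in the same walks.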
+=== Hybrid Recommendation Function ===
+Combines multiple recommendation approaches:
+==== Weighted Combination ====
+* Content similarity: 70% weight
+* Node embeddings: 30% weight
+* Adaptive weighting based on whether a node embedding is available for the item
+<syntaxhighlight lang="Python">
+def get_hybrid_recommendations(query, cosine_sim, df, n_recommendations=10):
+    """
+    Get hybrid recommendations based on content similarity and node embeddings.
+    Args:
+        query (str): Title or description to base recommendations on
+        cosine_sim (np.ndarray): Pre-computed cosine similarity matrix
+        df (pd.DataFrame): DataFrame containing Netflix content
+        n_recommendations (int): Number of recommendations to return
+    """
+</syntaxhighlight>
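+The function body is not shown above. The 70/30 weighting could be applied along the following lines. This is a sketch under assumptions rather than the notebook's logic: a missing embedding is marked with NaN, items without one fall back to content similarity alone, and the names hybrid_scores and top_n_indices are hypothetical.

```python
import numpy as np

def hybrid_scores(content_sim, embedding_sim, content_weight=0.7, embedding_weight=0.3):
    """Blend two similarity arrays; NaN in embedding_sim marks a missing embedding."""
    content_sim = np.asarray(content_sim, dtype=float)
    embedding_sim = np.asarray(embedding_sim, dtype=float)
    has_embedding = ~np.isnan(embedding_sim)
    # Replace NaN with 0 before blending so the arithmetic stays finite
    blended = (content_weight * content_sim
               + embedding_weight * np.where(has_embedding, embedding_sim, 0.0))
    # Assumed fallback: use content similarity alone where no embedding exists
    return np.where(has_embedding, blended, content_sim)

def top_n_indices(scores, n):
    """Indices of the n highest scores, best first."""
    return np.argsort(-np.asarray(scores))[:n]

# Hypothetical similarities of three candidate titles to the query
content = [0.8, 0.5, 0.9]
embedding = [0.4, float("nan"), 0.1]
scores = hybrid_scores(content, embedding)
best = top_n_indices(scores, 2)
```

+In this example the third title has the highest content similarity, but its low embedding similarity moves it behind the first title once the two signals are blended.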
+== Results Visualization ==
+=== Recommendation Scores Bar Chart ===
+Visualization of top 10 recommendations with their similarity scores:
+[[File:top_10_recommendations.png|thumb|200px|center|Top 10 Recommendations: Bar chart visualization showing similarity scores for the most relevant content recommendations based on the hybrid recommendation system.]]
+=== Network Analysis ===
+The recommendation network analysis reveals:
+* Number of recommended items: 10
+* Number of connections: 45
+* Network density: 1.000
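+These three figures are consistent with one another: an undirected graph with N nodes has at most N(N - 1)/2 edges, and density is the fraction of those that are present. A short check follows; the function name graph_density is hypothetical.

```python
def graph_density(n_nodes, n_edges):
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    if n_nodes < 2:
        return 0.0  # density is taken as 0 for graphs with fewer than two nodes
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

density = graph_density(10, 45)
max_edges = 10 * 9 // 2
```

+With 10 items the maximum is 45 edges, so a density of 1.000 means every pair of recommended items is connected; in such a graph it is the edge thickness, not the presence of an edge, that distinguishes stronger from weaker similarities.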
+=== Recommendation Network Graph ===
+Visualization of content relationships and similarities:
+[[File:recommendation_network.png|thumb|200px|center|Recommendation Network: Graph visualization depicting content relationships, with nodes representing recommended items and edges showing content similarities between them.]]
+The network graph shows:
+* Nodes: Recommended content items
+* Node size: Recommendation score
+* Edges: Content similarities
+* Edge thickness: Similarity strength

netflix_titles.csv ADDED

The diff for this file is too large to render. See raw diff

recommendation_network.png ADDED

top_10_recommendations.png ADDED