Top-10000-Movies-Based-On-Ratings

OpenML dataset with id 43772

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102597/Top-10000-Movies-Based-On-Ratings.arff

Upload date: 24 March 2022

Dataset Characteristics

Number of features: 6 (numeric: 3, symbolic: 0, binary: 0)
Number of instances: 10,000
Number of instances with missing values: 44
Number of missing values: 44

Description

Context

People love movies because:

 They take you on a journey.
 They are an escape from reality.

As an avid movie watcher, I am always amazed at how sites like Netflix and Hotstar suggest exactly the movie I already had in the back of my mind to watch next. After a lot of research, I decided to build something similar, starting by extracting a large dataset of movies people love to watch and analyzing it.

Content

The dataset contains the following information:

Popularity: How popular the movie is.
Vote Count: Number of people who voted.
Title: Name of the movie.
Vote Average: Average rating given by the people who voted.
Overview: Brief overview of the movie (storyline).
Release Date: Date when the movie was released.
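The six fields above could be read into typed records along the following lines. This is a minimal sketch, not the dataset's official loader: the column names (`title`, `overview`, `release_date`, `popularity`, `vote_count`, `vote_average`), the ISO date format, the CSV layout, and the two sample rows are all assumptions made for illustration. The `Movie` class and `parse_movies` function are hypothetical names. Missing values are mapped to `None`, since some instances in the dataset have missing values.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Movie:
    """One row of the dataset; field names are assumed, not taken from the file."""
    title: str
    overview: Optional[str]
    release_date: Optional[date]
    popularity: Optional[float]
    vote_count: Optional[int]
    vote_average: Optional[float]


def _clean(value):
    """Map empty strings and the ARFF missing marker '?' to None."""
    value = (value or "").strip()
    return None if value in ("", "?") else value


def parse_movies(csv_text):
    """Parse CSV text with the assumed header into a list of Movie records."""
    movies = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = {key: _clean(val) for key, val in row.items()}
        movies.append(Movie(
            title=raw["title"] or "",
            overview=raw["overview"],
            # Assumes ISO dates (YYYY-MM-DD); adjust if the file differs.
            release_date=(date.fromisoformat(raw["release_date"])
                          if raw["release_date"] else None),
            popularity=float(raw["popularity"]) if raw["popularity"] else None,
            vote_count=int(raw["vote_count"]) if raw["vote_count"] else None,
            vote_average=(float(raw["vote_average"])
                          if raw["vote_average"] else None),
        ))
    return movies


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = """title,overview,release_date,popularity,vote_count,vote_average
Example Movie A,A made-up storyline.,2020-01-15,12.5,340,7.1
Example Movie B,?,2019-06-01,8.0,120,6.4
"""

movies = parse_movies(SAMPLE)
```

Keeping missing values as `None` rather than dropping rows lets later analysis decide per column how to handle them.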

Inspiration

I would love to get answers to the following questions:

What is the relationship between popularity and the vote average or vote count?
Which machine learning algorithm would be effective at finding relationships between movies?
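One possible starting point for the first question is a correlation coefficient between popularity and vote count (or vote average). The sketch below uses only the standard library and computes both Pearson (linear) and Spearman (rank-based) correlation after skipping rows with missing values. The dictionary keys and the toy rows are assumptions made for illustration and are not values from the dataset. The helper names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def paired_without_missing(rows, x_key, y_key):
    """Collect (x, y) columns from rows where both values are present."""
    pairs = [(r[x_key], r[y_key]) for r in rows
             if r.get(x_key) is not None and r.get(y_key) is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Toy rows with made-up numbers; the last row has a missing popularity.
rows = [
    {"popularity": 10.0, "vote_count": 100},
    {"popularity": 20.0, "vote_count": 400},
    {"popularity": 30.0, "vote_count": 900},
    {"popularity": 40.0, "vote_count": 1600},
    {"popularity": None, "vote_count": 50},
]

xs, ys = paired_without_missing(rows, "popularity", "vote_count")
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
```

Reporting both coefficients is a deliberate choice: popularity and vote counts in movie data tend to be heavily skewed, so a rank-based measure may be more informative than a purely linear one. Passing `"vote_average"` as `y_key` would give the corresponding comparison against ratings.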