Marginal-Revolution-Blog-Post-Data
OpenML dataset with id 43574
No author found.
Full work available at URL: https://api.openml.org/data/v1/download/22102399/Marginal-Revolution-Blog-Post-Data.arff
Upload date: 23 March 2022
Dataset Characteristics
Number of features: 22 (numeric: 17, symbolic: 0, binary: 0)
Number of instances: 12,820
Number of instances with missing values: 57
Number of missing values: 81
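The two missing-value figures above measure different things: the first counts rows that have at least one missing cell, the second counts missing cells overall. The sketch below illustrates the distinction on made-up rows. The field names and values are hypothetical and are not taken from the dataset.

```python
def missing_summary(rows):
    """Return (rows_with_missing, total_missing_cells) for a list of dicts.

    A cell counts as missing when its value is None.
    """
    rows_with_missing = 0
    total_missing = 0
    for row in rows:
        n = sum(1 for value in row.values() if value is None)
        total_missing += n
        if n > 0:
            rows_with_missing += 1
    return rows_with_missing, total_missing


# Hypothetical toy rows; the real column names are not listed in this record.
toy_rows = [
    {"author": "A", "word_count": 120, "comment_count": 4},
    {"author": None, "word_count": None, "comment_count": 9},
    {"author": "B", "word_count": 300, "comment_count": None},
]

rows_with_missing, total_missing = missing_summary(toy_rows)
print(rows_with_missing, total_missing)  # 2 rows affected, 3 cells missing
```

Because the dataset reports 81 missing values across 57 instances, at least some rows must be missing more than one value.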
This dataset contains data on blog posts from MarginalRevolution.com. For posts published from January 1, 2010 to September 17, 2016, the following attributes were gathered:
Author name
Post title
Post date
Post content (words)
Number of words in post
Number of comments on post
Dummy variables for several commonly used categories
The data was scraped using Python's Beautiful Soup package and cleaned in R. See my GitHub page (https://github.com/wnowak10/) for the Python and R code.
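The file linked above is distributed in ARFF format. As a rough illustration of how such a file is laid out, here is a minimal reader written with only the Python standard library. It is a sketch under simplifying assumptions, not the author's code: it assumes attribute names contain no spaces, string values are single-quoted, each record sits on one line, and `?` marks a missing value. The sample text and its attribute names are invented for the example.

```python
import csv
import io


def read_arff(text):
    """Parse a simple ARFF document into (attribute_names, rows).

    Handles only the basic layout: '@attribute name type' header lines,
    an '@data' marker, then comma-separated records with '?' for missing.
    All non-missing cells are returned as strings.
    """
    names = []
    data_lines = []
    in_data = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("%"):
            continue  # skip blank lines and ARFF comments
        if in_data:
            data_lines.append(stripped)
        elif stripped.lower().startswith("@attribute"):
            # The second token is the attribute name (surrounding quotes removed).
            names.append(stripped.split(None, 2)[1].strip("'\""))
        elif stripped.lower().startswith("@data"):
            in_data = True
    reader = csv.reader(io.StringIO("\n".join(data_lines)), quotechar="'")
    rows = [
        {name: (None if cell == "?" else cell) for name, cell in zip(names, record)}
        for record in reader
    ]
    return names, rows


# Invented sample; the real file's attribute names may differ.
sample = """\
@relation toy
@attribute author string
@attribute title string
@attribute word_count numeric
@attribute comment_count numeric
@data
'Author A','First post, with a comma',350,12
'Author B','Second post',?,3
"""

names, rows = read_arff(sample)
print(names)
print(rows[1])
```

For the real file, a dedicated ARFF parser is likely a safer choice than this sketch. scikit-learn's `fetch_openml` accepts a `data_id` argument, so `fetch_openml(data_id=43574)` should retrieve this dataset, though that call needs network access and has not been verified here.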