Articles-From-Buzzfeed-2020

Dataset:6036868



OpenML ID: 43775
MaRDI QID: Q6036868

OpenML dataset with id 43775

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102600/Articles-From-Buzzfeed-2020.arff

Upload date: 24 March 2022


Dataset Characteristics

Number of features: 8 (numeric: 2, symbolic: 0, binary: 0)
Number of instances: 741
Number of instances with missing values: 741
Number of missing values: 1,575
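
The missing-value figures above (every one of the 741 instances has at least one missing value, about 2.1 per instance on average) can be recomputed from the raw rows. The following is a minimal sketch, not the portal's or OpenML's own method: the function name `summarize_missing` and the toy rows are hypothetical, and it assumes a missing value appears either as `None` or as the ARFF missing-value marker `?`.

```python
from typing import Dict, Iterable, Optional, Sequence


def summarize_missing(rows: Iterable[Sequence[Optional[str]]]) -> Dict[str, int]:
    """Count instances, instances with any missing value, and total missing values.

    Assumption: a value is missing if it is None or the ARFF marker "?".
    """
    n_instances = 0
    n_instances_with_missing = 0
    n_missing_values = 0
    for row in rows:
        n_instances += 1
        missing_in_row = sum(1 for value in row if value is None or value == "?")
        if missing_in_row > 0:
            n_instances_with_missing += 1
        n_missing_values += missing_in_row
    return {
        "instances": n_instances,
        "instances_with_missing": n_instances_with_missing,
        "missing_values": n_missing_values,
    }


# Hypothetical toy rows (not taken from the dataset): id, title, author.
sample_rows = [
    ["id-1", "Title A", None],
    ["id-2", "?", None],
    ["id-3", "Title C", "Author C"],
]

summary = summarize_missing(sample_rows)
print(summary)
```

Run against the parsed rows of the ARFF file, the same three counters would be expected to match the values listed above, provided the parser represents missing entries in one of the two assumed ways.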

Context

This dataset was created by our in-house teams at PromptCloud (https://www.promptcloud.com/) and DataStock (https://datastock.shop/). We have about 5K samples in this dataset. You can download the full dataset here: https://app.datastock.shop/?site_name=Articles_From_BuzzFeed_2020. We have a 30% discount on all datasets in our data repository. Feel free to head over to DataStock (https://datastock.shop/) and avail of the discount.

Content

This dataset contains the following:

Total Records Count: 14831
Domain Name: buzzfeed.com
Date Range: 1st Jan 2020 - 30th Apr 2020
File Extension: csv
Available Fields: Uniq Id, Crawl Timestamp, Title Headline, Short Description Sub Headline, Content Body, Author, Date And Time Of Posting, Image Urls

Acknowledgements

We wouldn't be here without the help of our web scraping and data mining experts at PromptCloud and DataStock.

Inspiration

The inspiration for this dataset came from Buzzfeed itself. We thought long and hard about the informative articles that we have on Buzzfeed, so we came up with a dataset for it.