OpenML 43683
MaRDI QID: Q6036778, FDO: Q6036778, RO-Crate: Q6036778
OpenML dataset with id 43683
Author name not available
Full work available at URL: https://api.openml.org/data/v1/download/22102508/WebMD-Drug-Reviews-Dataset.arff
Upload date: 24 March 2022
Dataset Characteristics
Number of features: 11 (numeric: 4, symbolic: 0, binary: 0)
Number of instances: 362,806
Number of instances with missing values: 42
Number of missing values: 42
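The two missing-value figures above measure different things: one counts rows that contain at least one missing cell, the other counts missing cells overall. Because both are 42 here, each affected row has exactly one missing value. A minimal sketch of how both counts could be computed, assuming the rows are already loaded as dictionaries with `None` marking a missing cell; the column names are hypothetical and not taken from the dataset:

```python
from typing import Any, Iterable, Mapping, Tuple


def missing_value_counts(rows: Iterable[Mapping[str, Any]]) -> Tuple[int, int]:
    """Return (rows with at least one missing cell, total missing cells)."""
    rows_with_missing = 0
    total_missing = 0
    for row in rows:
        # Count the cells in this row that are marked as missing.
        missing_in_row = sum(1 for value in row.values() if value is None)
        total_missing += missing_in_row
        if missing_in_row:
            rows_with_missing += 1
    return rows_with_missing, total_missing


# Toy rows with hypothetical column names, for illustration only.
sample_rows = [
    {"drug": "A", "rating": 4, "review": "worked well"},
    {"drug": "B", "rating": 2, "review": None},
    {"drug": None, "rating": 5, "review": "no side effects"},
]
counts = missing_value_counts(sample_rows)
```

The two numbers diverge as soon as a single row has more than one missing cell, which is why they are reported separately.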
Context
The dataset provides user reviews on specific drugs along with related conditions, side effects, age, sex, and ratings reflecting overall patient satisfaction.
Content
The data was acquired by scraping the WebMD site. It contains around 0.36 million rows of unique reviews and is current through March 2020.
Inspiration
This dataset is intended to help answer the following questions:
I. Can a patient's condition be identified from drug reviews?
II. How can a drug's rating be predicted from patient reviews?
III. How can drug ratings, kinds of drugs, types of conditions a patient can have, and review sentiment be visualized?
RO-Crate
What is an RO-Crate?
An RO-Crate is a standardized research object package used to bundle data together with rich machine-readable metadata. Each RO-Crate contains:
- the files belonging to the dataset (e.g. CSVs, images, code, documentation)
- a ro-crate-metadata.json file describing the content, provenance, and context
- persistent identifiers and references to related research objects (e.g. software, publications)
This ensures that the dataset can be easily reused, cited, validated, and interpreted in a reproducible manner. More information can be found here.
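To illustrate the structure listed above, here is a minimal sketch that reads an RO-Crate metadata document and lists the files it describes. It assumes the RO-Crate 1.1 layout, in which `ro-crate-metadata.json` is JSON-LD with a flat `@graph`, the metadata descriptor points to the root dataset through `about`, and the root dataset lists its files under `hasPart`. The sample document and the helper name `list_crate_files` are made up for illustration; this is not the crate generated for this dataset.

```python
import json
from typing import Any, Dict, List


def list_crate_files(metadata: Dict[str, Any]) -> List[str]:
    """Return the @id of every entity listed in the root dataset's hasPart."""
    # Index all entities in the flat @graph by their identifier.
    entities = {entity["@id"]: entity for entity in metadata.get("@graph", [])}
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        raise ValueError("no metadata descriptor found in @graph")
    # The descriptor's `about` property references the root dataset.
    root = entities[descriptor["about"]["@id"]]
    parts = root.get("hasPart", [])
    if isinstance(parts, dict):  # a single part may appear without a list
        parts = [parts]
    return [part["@id"] for part in parts]


# Made-up example document (assumption, for illustration only).
sample_json = """
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork",
     "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
     "about": {"@id": "./"}},
    {"@id": "./", "@type": "Dataset", "name": "Example crate",
     "hasPart": [{"@id": "data.arff"}]},
    {"@id": "data.arff", "@type": "File"}
  ]
}
"""
crate_files = list_crate_files(json.loads(sample_json))
```

Resolving the root dataset through the descriptor's `about` property, rather than hard-coding `./`, keeps the sketch usable for crates whose root has a different identifier.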
Download
You can download an RO-Crate for this dataset here: Download RO-Crate
HINT: The RO-Crate is created dynamically, so it can take up to 30 seconds until the download starts.
This page was built for dataset: WebMD-Drug-Reviews-Dataset