Covid-19-Research-Articles-(NCBI)
Dataset:6036876
OpenML dataset with id 43783
No author found.
Full work available at URL: https://api.openml.org/data/v1/download/22102608/Covid-19-Research-Articles-(NCBI).arff
Upload date: 24 March 2022
Dataset Characteristics
Number of features: 5 (numeric: 0, symbolic: 0, binary: 0)
Number of instances: 1,198
Number of instances with missing values: 768
Number of missing values: 1,459
Context
I collected about 1,200 Covid-19 research articles from the NCBI.NLM.NIH website for use in ML algorithms and data analysis, such as sentiment analysis, time series analysis, recommender systems, and classification.
Content
link: URL to the research article
title: title of the research article
keywords: words under which the research article is categorized
dates: date the article was published online
abstract: a brief summary of the article (including methods and hypothesis)
conclusion: findings of the research
To save time, I left some columns with the string value 'null'. You can filter these values and keep whatever is most appropriate for your ML model.
I didn't include authors/contributors, as they wouldn't serve a purpose in this dataset.
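One way to handle the 'null' placeholders is to convert them to real missing values first and then keep only the rows that have the fields a given model needs. The snippet below is a minimal sketch using only the Python standard library. It assumes the rows have already been loaded as dictionaries keyed by column name and that the placeholder is the literal string 'null'. The helper names and the sample rows are hypothetical and are not taken from the dataset.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def normalize_nulls(rows: List[Row], placeholder: str = "null") -> List[Row]:
    """Return copies of the rows with placeholder strings replaced by None."""
    cleaned = []
    for row in rows:
        cleaned.append({
            key: None
            if value is None or value.strip().lower() == placeholder
            else value
            for key, value in row.items()
        })
    return cleaned


def keep_complete(rows: List[Row], required: List[str]) -> List[Row]:
    """Keep only rows where every required column has a real value."""
    return [
        row for row in rows
        if all(row.get(col) is not None for col in required)
    ]


# Hypothetical sample rows, made up for illustration only.
sample_rows = [
    {"title": "Article A", "abstract": "Some summary.", "conclusion": "null"},
    {"title": "Article B", "abstract": "null", "conclusion": "null"},
]

cleaned_rows = normalize_nulls(sample_rows)
with_abstract = keep_complete(cleaned_rows, ["abstract"])
```

Filtering per column rather than dropping every incomplete row keeps more data: an article without a conclusion can still be used for a task that only needs the abstract.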
Inspiration
I am interested in knowing the focus of those studies (by analyzing word frequencies) as well as analyzing the volume of publications over time.
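Both questions can be approached with simple counting. The sketch below tallies word frequencies across a list of texts and counts publications per month. It makes several assumptions: the date format ("%Y-%m-%d") is a guess and should be adjusted to whatever the dates column actually contains, the stopword list is a tiny illustrative one, and the sample abstracts and dates are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "of", "and", "in", "a", "to", "for", "with", "on", "is"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count lowercase word occurrences, skipping missing texts and stopwords."""
    counts = Counter()
    for text in texts:
        if not text:
            continue
        # Keep hyphenated tokens such as "covid-19" together.
        for word in re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


def publications_per_month(dates, date_format="%Y-%m-%d"):
    """Count articles per YYYY-MM, skipping values that do not parse."""
    counts = Counter()
    for raw in dates:
        if not raw:
            continue
        try:
            parsed = datetime.strptime(raw.strip(), date_format)
        except ValueError:
            continue  # e.g. the 'null' placeholder or an unexpected format
        counts[parsed.strftime("%Y-%m")] += 1
    return counts


# Hypothetical sample values, made up for illustration only.
sample_abstracts = [
    "Covid-19 vaccine trial results",
    "Vaccine safety in Covid-19 patients",
    None,
]
sample_dates = ["2020-03-15", "2020-03-20", "2020-04-01", "null"]

freqs = word_frequencies(sample_abstracts)
monthly = publications_per_month(sample_dates)
```

Calling `freqs.most_common(10)` lists the most frequent terms, and sorting `monthly.items()` gives a month-by-month series suitable for plotting.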