Covid-19-Research-Articles-(NCBI)

From MaRDI portal
Dataset:6036876



OpenML ID: 43783
MaRDI QID: Q6036876

OpenML dataset with id 43783

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102608/Covid-19-Research-Articles-(NCBI).arff

Upload date: 24 March 2022


Dataset Characteristics

Number of features: 5 (numeric: 0, symbolic: 0, binary: 0)
Number of instances: 1,198
Number of instances with missing values: 768
Number of missing values: 1,459

Context

I collected about 1200 Covid-19 research articles from the NCBI.NLM.NIH website for use in ML algorithms or data analysis, such as sentiment analysis, time series, recommender systems and/or classification.

Content

link: URL to the research article
title: research article
keywords: words under which the research article is categorized
dates: publication date online
abstract: a brief summary of the article (methods and hypothesis included)
conclusion: findings of the research

For the sake of time, I left some columns with 'null' string values. It's your choice how to filter these values and use what is most appropriate for your ML model. I didn't include authors/contributors, as they wouldn't serve a purpose in this dataset.

Inspiration

I am interested in knowing the focus of those studies (by analyzing word frequencies) as well as analyzing the volume of publications over time.
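The filtering of 'null' strings and the two analyses mentioned above (word frequencies and publication volume over time) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the uploader's own method. The sample rows are invented, the helper names (clean_nulls, word_frequencies, publications_per_month) are hypothetical, and the date format "%Y-%m-%d" is an assumption that should be checked against the actual dates column.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: column names follow the description above,
# but the values are invented purely for illustration.
rows = [
    {"title": "Vaccine efficacy in adults", "dates": "2021-03-15",
     "abstract": "We study vaccine efficacy in adults.", "conclusion": "null"},
    {"title": "Transmission in schools", "dates": "2021-03-28",
     "abstract": "null", "conclusion": "Transmission was low."},
    {"title": "Vaccine uptake survey", "dates": "2021-04-02",
     "abstract": "A survey of vaccine uptake.", "conclusion": "Uptake varied by region."},
]


def clean_nulls(rows, placeholder="null"):
    """Replace the literal placeholder string with None in every field."""
    return [
        {key: (None if value == placeholder else value) for key, value in row.items()}
        for row in rows
    ]


def word_frequencies(texts, min_length=4):
    """Count lowercase alphabetic words of at least min_length characters."""
    counts = Counter()
    for text in texts:
        if text is None:  # skip fields that were 'null' in the source
            continue
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts


def publications_per_month(date_strings, fmt="%Y-%m-%d"):
    """Count publications per calendar month; fmt is an assumed date format."""
    counts = Counter()
    for value in date_strings:
        if value is None:
            continue
        counts[datetime.strptime(value, fmt).strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


cleaned = clean_nulls(rows)
abstract_counts = word_frequencies(row["abstract"] for row in cleaned)
monthly_volume = publications_per_month(row["dates"] for row in cleaned)
```

Converting the placeholder to None first keeps the literal word "null" out of the frequency counts and lets each analysis skip missing fields explicitly. For the real data, a stop-word list would likely be more useful than the simple length threshold used here.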