H7 hemagglutinin lineage identification
DOI: 10.5281/zenodo.2653110
Zenodo: 2653110
MaRDI QID: Q6709500
FDO: Q6709500
Dataset published in the Zenodo repository.
Publication date: 28 April 2019
Copyright license: Creative Commons Attribution 4.0 International
This dataset contains the nucleotide sequence data and the R files for the k-means clustering of the H7 hemagglutinin segment of Influenza A. The files are:
H7_HA_IRDB_2019_1_14.fasta: the initial data downloaded from the Influenza Research Database.
H7_HA_IRDB_2019_1_14_aligned_muscle.fasta: the sequence data aligned with MUSCLE.
H7_HA_IRDB_2019_1_14_aligned_muscle_cleaned.fas: the aligned sequence data after problem sequences have been removed.
RNA_distances_HA_kmeans_analysis.R: the R code for carrying out the k-means clustering and diagnostics.
kmeans_clustering.pdf: the R Markdown generated document describing the R code for the k-means clustering.
clustered_HA_kmeans_RNA.csv: the results of the clustering.
usearch_clustering.csv: a summary of the USEARCH results.
Clade_Alpha.tree: the alpha clade phylogenetic tree.
Clade_Beta.tree: the beta clade phylogenetic tree.
Clade A-D trees: the phylogenetic trees for clades A to D.
H7_HA_IRDB_2019_1_14_aligned_muscle_cleaned_fasttree_coloured.tree: the overview tree of all the sequences.
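As a rough illustration of how an aligned FASTA file such as the cleaned alignment could be turned into pairwise distances before clustering, the sketch below parses aligned sequences and computes a simple p-distance (the proportion of differing sites, skipping gapped positions). This is not the dataset's R code: the choice of p-distance, the gap handling, and the helper names read_fasta, p_distance and distance_matrix are assumptions made for illustration, and the actual analysis in RNA_distances_HA_kmeans_analysis.R may use a different distance measure.

```python
from itertools import combinations


def read_fasta(lines):
    """Parse an iterable of FASTA text lines into {header: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {header: "".join(parts) for header, parts in records.items()}


def p_distance(a, b, gap="-"):
    """Proportion of differing sites, ignoring positions where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else float("nan")


def distance_matrix(records):
    """Return the sequence names and a symmetric {(name_a, name_b): distance} mapping."""
    names = list(records)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = p_distance(records[a], records[b])
        dist[(a, b)] = d
        dist[(b, a)] = d
    return names, dist


# Toy alignment standing in for the real data (hypothetical sequences).
toy = """>seq1
ATGC-A
>seq2
ATGCTA
>seq3
TTGCTT
""".splitlines()

records = read_fasta(toy)
names, dist = distance_matrix(records)

# To run on the dataset file instead (assuming it is in the working directory):
# with open("H7_HA_IRDB_2019_1_14_aligned_muscle_cleaned.fas") as fh:
#     records = read_fasta(fh)
```

A distance mapping of this kind is one possible input to a clustering step. Note that for thousands of sequences a pure-Python pairwise loop is slow, so a vectorised or library-based distance calculation would likely be preferable in practice.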