A two-tier bioinformatic pipeline to develop probes for target capture of nuclear loci with applications in Melastomataceae
DOI: 10.5281/zenodo.4411919; Zenodo: 4411919; MaRDI QID: Q6710126; FDO: Q6710126
Dataset published in the Zenodo repository.
Ryan A. Folk, Douglas Soltis, Pamela S. Soltis, Prabha Amarasinghe, Fabian Michelangeli, Nico Cellinese, Johanna Jantzen, Marcelo Reginato
Publication date: 2 January 2021
Premise of the study: Putatively single-copy nuclear (SCN) loci, identified using genomic resources of closely related species, are ideal for phylogenomic inference. However, suitable genomic resources are not available for many clades, including Melastomataceae. We introduce a versatile approach to identify SCN loci for clades with few genomic resources and use it to develop probes for target enrichment in the distantly related Memecylon and Tibouchina (Melastomataceae).
Methods: We present a two-tiered pipeline. First, we identified putatively SCN loci using MarkerMiner and transcriptomes from distantly related species in Melastomataceae. Published loci and genes of functional significance were added (384 total loci). Second, using HybPiper, we retrieved 689 homologous template sequences for these loci using genome-skimming data from within the focal clades.
Results: We sequenced 193 loci from both Memecylon and Tibouchina, with probes designed from 56 template sequences successfully targeting sequences in both clades. Probes designed from genome-skimming data within a focal clade were more successful than probes designed from other sources.
Discussion: Our pipeline successfully identified and targeted SCN loci in Memecylon and Tibouchina, enabling phylogenomic studies in both clades and potentially across Melastomataceae. This pipeline could be easily applied to other clades with few genomic resources.
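The two-tiered design in the Methods can be illustrated with a small Python sketch. It is not the authors' code and does not call MarkerMiner or HybPiper. The locus names, sequences, the `select_templates` helper, and the "keep the longest sequence per locus and clade" rule are all hypothetical, chosen only to show how a fixed set of candidate loci (tier 1) can yield more template sequences than loci once homologs are retrieved separately from each focal clade (tier 2).

```python
from collections import defaultdict


def select_templates(candidate_loci, recovered):
    """Toy tier-2 step: keep one template per (locus, clade) pair.

    candidate_loci: set of locus names produced by tier 1 (hypothetical).
    recovered: iterable of (locus, clade, sequence) tuples standing in for
               homologs retrieved from genome-skimming data (hypothetical).
    Assumption: when several sequences exist for the same locus and clade,
    the longest one is kept. The source does not state this rule.
    """
    templates = defaultdict(dict)
    for locus, clade, seq in recovered:
        if locus not in candidate_loci or not seq:
            continue  # skip loci outside the tier-1 set and empty hits
        if len(seq) > len(templates[locus].get(clade, "")):
            templates[locus][clade] = seq
    return dict(templates)


# Made-up tier-1 output and tier-2 hits, for illustration only.
candidate_loci = {"locusA", "locusB", "locusC"}
recovered = [
    ("locusA", "Memecylon", "ATGGCC"),
    ("locusA", "Tibouchina", "ATGGCA"),
    ("locusA", "Memecylon", "ATGGCCTT"),  # longer hit replaces the first
    ("locusB", "Tibouchina", "ATGA"),
    ("locusX", "Memecylon", "ATG"),       # not a candidate locus; ignored
]

templates = select_templates(candidate_loci, recovered)
n_templates = sum(len(by_clade) for by_clade in templates.values())
```

In this toy run, two of the three candidate loci receive templates and one of them receives a template from each clade, so the template count exceeds the number of loci with hits. The same effect would account for the abstract reporting more template sequences (689) than loci (384).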