Oral Squamous Cell Carcinoma - Mass Spectrometry Imaging
DOI: 10.5281/zenodo.7377802 | Zenodo: 7377802 | MaRDI QID: Q6702259 | FDO: Q6702259
Dataset published at Zenodo repository.
Grzegorz Drazek, Mykola Chekan, Grzegorz Mrukwa, Janusz Wierzgon, Monika Pietrowska, Joanna Polanska, Marta Gawin, Piotr Widłak, Magdalena Kalinowska
Publication date: 29 November 2022
Copyright license: Creative Commons Attribution 4.0 International
The dataset was first featured in Widlak, Piotr, et al. "Detection of molecular signatures of oral squamous cell carcinoma and normal epithelium - application of a novel methodology for unsupervised segmentation of imaging mass spectrometry data." Proteomics 16.11-12 (2016): 1613-1621. For details of the biochemical preparation of the tissue samples, please refer to the original publication.

The biological material was collected from five patients who underwent surgery for oral squamous cell carcinoma (OSCC). Tissue samples contained both tumor and surrounding healthy tissue. Each specimen was cut into 10 µm sections in a cryostat. During sample preparation for MS imaging, a high-resolution optical scan of each section was captured. Tissue sections were subjected to peptide imaging using a MALDI ToF mass spectrometer. Spectra were recorded within the m/z range of 800-4,000. A raster width of 100 µm was applied, and 400 shots were collected from each ablation point. The obtained dataset consisted of 45,738 raw spectra with 109,568 mass channels.

An experienced pathologist analyzed the optical scans obtained during data acquisition and annotated the tissue regions. To maximize confidence in the results obtained in this work, we focus on two tissue samples from the full dataset (8,005 and 11,869 spectra), which carry the highest-confidence labels according to the pathologist.

The spectra were preprocessed in MATLAB using standard steps. Spectra were resampled to unify the m/z axis across the dataset. The baseline was removed with the MATLAB procedure msbackadj() from the Bioinformatics Toolbox. Peaks were aligned using Fast Fourier Transform-based spectral alignment. Total ion current (TIC) normalization ensured a similar intensity level for all spectra. Finally, a Gaussian mixture model (GMM) approach was used to model the spectra.
The GMM not only locates each peak but also estimates its area, rather than the raw magnitude reported by most methods. Because peaks in MSI spectra are right-skewed, a single peak can be modeled by several neighboring GMM components; such components were identified and merged to better correspond to actual chemical compounds. The resulting dataset is characterized by 3,714 GMM components corresponding to MSI spectrum peaks.
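
The following Python sketch illustrates two of the ideas described above: TIC normalization, and why a Gaussian component's area can differ from its peak height. It is not the authors' MATLAB pipeline. The function names, the toy m/z grid, and the two component parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def tic_normalize(spectra):
    """Scale each spectrum (one per row) so that its intensities sum to 1."""
    spectra = np.asarray(spectra, dtype=float)
    tic = spectra.sum(axis=1, keepdims=True)
    # Leave all-zero spectra as zeros instead of dividing by zero.
    return np.divide(spectra, tic, out=np.zeros_like(spectra), where=tic > 0)

def gaussian_component(mz, weight, mu, sigma):
    """One mixture component; `weight` is its area, not its height."""
    norm = sigma * np.sqrt(2.0 * np.pi)
    return weight * np.exp(-0.5 * ((mz - mu) / sigma) ** 2) / norm

# Toy m/z axis spanning the recorded range, with a uniform step of 0.1.
mz = np.linspace(800.0, 4000.0, 32001)
step = mz[1] - mz[0]

# Two hypothetical components with equal height but different width.
narrow = gaussian_component(mz, weight=1.0, mu=1500.0, sigma=1.0)
wide = gaussian_component(mz, weight=3.0, mu=2500.0, sigma=3.0)

heights = (narrow.max(), wide.max())
# Riemann-sum approximation of the area under each component.
areas = (narrow.sum() * step, wide.sum() * step)

print("heights:", heights)
print("areas:  ", areas)
```

In this toy case both components reach the same maximum intensity, yet the wider one has three times the area, which is the distinction between peak area and raw magnitude drawn in the description.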