Data science applications to string theory (Q2187812)

From MaRDI portal
scientific article
    3 June 2020
    In the paper under review, the author provides a pedagogical introduction to data science techniques used to study large data sets and outlines their applications to string theory. The problem with the string landscape is that it is unfathomably big: there is a huge number of different choices for the compact component of the string's target space, and a huge number of additional data or boundary conditions, known as fluxes and branes, that are necessary to uniquely specify string theory in four dimensions. Early estimates argue that there are \(\mathcal{O}(10^{500})\) boundary data choices for any typical six-dimensional compactification space [\textit{S. K. Ashok} and \textit{M. R. Douglas}, J. High Energy Phys. 2004, No. 1, 060, 36 p. (2004; Zbl 1243.83060)]. Estimates for the entire landscape are much larger still, \(\mathcal{O}(10^{272,000})\) [\textit{W. Taylor} and \textit{Y.-N. Wang}, J. High Energy Phys. 2015, No. 12, Paper No. 164, 21 p. (2015; Zbl 1388.81367)]. In addition, finding mathematically consistent and phenomenologically viable background configurations requires solving problems that are generically NP-complete, NP-hard, or even undecidable.
    The paper under review consists of two parts. In sections 2 to 9, the author introduces concepts of data science that are relevant for string theory studies; this introduction is general and makes no reference to string theory. Sections 2 to 4 introduce neural networks (NNs), and section 5 describes genetic algorithms. Section 6 presents persistent homology as an example of topological data analysis. Section 7 covers machine learning algorithms other than NNs that can be used in unsupervised machine learning to cluster data or to detect outliers and anomalies in a data set.
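    Genetic algorithms of the kind introduced in section 5 evolve a population of candidate solutions through selection, crossover, and mutation. As a minimal illustration (not taken from the paper; the toy objective, parameter values, and all function names are illustrative choices), the following pure-Python sketch maximizes the standard "OneMax" bitstring fitness:

```python
import random

random.seed(0)

def fitness(bits):
    # toy "OneMax" objective: number of 1-bits in the string
    return sum(bits)

def mutate(bits, rate=0.05):
    # flip each bit independently with probability `rate`
    return [b ^ 1 if random.random() < rate else b for b in bits]

def crossover(a, b):
    # single-point crossover of two parents
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

def evolve(length=20, pop_size=30, generations=60):
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection: keep the fitter half
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

    Because the fitter half of the population survives each generation unchanged, the best fitness is non-decreasing; in landscape searches the same loop is run over encodings of string compactification data rather than raw bitstrings.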
    After explaining a general problem that occurs in all these algorithms, the author introduces common methods such as principal component analysis, \(K\)-means clustering, mean shift clustering, Gaussian expectation-maximization clustering, and clustering with BIRCH and with DBSCAN. Section 8 introduces reinforcement learning as a way to search for solutions in a large space of possibilities, and section 9 discusses classification and regression algorithms besides NNs that can be used in supervised machine learning: the \(k\)-nearest neighbor algorithm, decision trees and random forests, and support vector machines.
    In section 10, the author explains the hardness of the problems encountered in string theory, reviews the existing machine learning literature, and illustrates applications of the techniques explained in sections 2 to 9 to problems that arise in string theory. These include computing cohomologies of line bundles over Calabi-Yau manifolds, generating and proving conjectures based on observations made by the AI in some data sets, predicting the types of non-Higgsable gauge groups that appear in F-theory on toric, elliptically fibered Calabi-Yau fourfolds, generating superpotentials for \(4D\) \(\mathcal{N} = 1\) theories, studying the structure of string vacua, and searching through the landscape of string vacua to identify viable models. The author also illustrates the use of genetic algorithms to distinguish high-scale SUSY breaking models, and the use of convolutional neural networks on toric diagrams to predict volumes of Sasaki-Einstein manifolds. Furthermore, the author introduces the idea of using NNs to approximate the bulk metric in AdS/CFT, and discusses deep Boltzmann machines and their relation to AdS/CFT and Riemann theta functions.
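    Of the unsupervised methods listed above, \(K\)-means is the simplest: Lloyd's algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. A minimal pure-Python sketch on synthetic two-dimensional data (the data, seeding strategy, and all names are illustrative, not from the paper):

```python
import random

random.seed(1)

def kmeans(points, centroids, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    k = len(centroids)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        # update step: move each centroid to the mean of its cluster
        new = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[j]
               for j, cl in enumerate(clusters)]
        if new == centroids:
            break  # assignments stopped changing: converged
        centroids = new
    return centroids, clusters

# two well-separated Gaussian blobs in the plane
pts = ([(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(50)]
       + [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(50)])

# seed the centroids with one point from each blob (first and last in the list)
centroids, clusters = kmeans(pts, [pts[0], pts[-1]])
```

    In landscape applications the points would instead be feature vectors extracted from string vacua, with clusters hinting at families of physically similar models.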
    machine learning
    data science
    neural networks
    genetic algorithms
    string landscape
