Non-negative Matrix Factorization for Time-Resolved Raman Spectroscopy Data

From MaRDI portal
T4schmidt (talk | contribs)
m Correcting the \Sigma symbol used in the definition of SVD
T4 reidelbach (talk | contribs)
Test Re-Creation
PID (if applicable): doi:[https://dx.doi.org/10.1007/s10910-020-01201-7 10.1007/s10910-020-01201-7]
==Problem Statement==
Crystallization of Paracetamol in Ethanol
===Object of Research and Objective===
Determination of Intermediate States and their Kinetics along the Crystallization of Paracetamol in Ethanol using Raman Spectroscopy.
===Procedure===
[[File:Workflow Figure.png|thumb|Workflow Illustration]]


<b>1. Data Acquisition</b>
Time-resolved Raman Spectroscopy is used to follow the crystallization of Paracetamol in acoustically levitated Ethanol droplets. Surface, temperature (22.0 ± 1.0 °C), and relative humidity (17.5 ± 2.5 %) of the environment are controlled by a Nitrogen stream.
<b>2. Data Extraction</b>
Extract the spectroscopic measurement matrix <math> \mathbf{M} \in \mathbb{R}_+^{n \times m} </math> (<math> m </math> measurements of <math> n </math> non-negative intensities each) from the measurement file, which contains both spectroscopic data and metadata.
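The extraction step can be illustrated with a minimal sketch. It assumes the exported .txt file holds one row of <math> \mathbf{M} </math> per line with comma-separated values (the Input Data table lists a dense csv representation); the function name <code>parse_measurement_matrix</code> and the sample values are hypothetical and not part of the original workflow.

```python
import csv
import io


def parse_measurement_matrix(text):
    """Parse comma-separated text into a list of rows of floats.

    Assumption: each non-empty line holds one row of M.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        rows.append([float(value) for value in record])
    if not rows:
        raise ValueError("no data found")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("rows have different lengths")
    if any(value < 0.0 for row in rows for value in row):
        raise ValueError("negative intensity found; M must be non-negative")
    return rows


# Made-up sample standing in for the content of an exported file.
sample = "0.0,1.5,2.0\n3.0,0.5,0.0\n"
M = parse_measurement_matrix(sample)
n, m = len(M), len(M[0])  # n intensities per measurement, m measurements
print(n, m)
```

Rejecting negative entries at this point is a design choice of the sketch: it makes a violation of the non-negativity assumption visible before any factorization is attempted.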
<b>3. Data Analysis</b>
Factorize <math> \mathbf{M} </math> such that <math> \mathbf{M} = \mathbf{W} \cdot \mathbf{H} </math> (<math> \mathbf{W} \in \mathbb{R}_+^{n \times r}, \mathbf{H} \in \mathbb{R}_+^{r \times m} </math> with <math> r </math> the rank of factorization or <i>expected</i> number of components) using a novel non-negative matrix factorization (NMF) approach:
<b><u>Pre-processing</u></b>
Singular Value Decomposition (SVD) for primary factorization <math> \mathbf{M^{T}} =</math> '''U''' <math> \mathbf \Sigma \mathbf{V^{T}} </math>.
Generate <math> \mathbf{U} </math> from '''U''' with <math> \mathbf{U}[:,0] = [ 1 , 1 , ... , 1 , 1 ]^T </math> and <math> \mathbf{U}[ : , 1 : ] = </math> '''U''' <math> [ : , : r - 1 ] </math>.
Ensure orthogonality among columns of <math> \mathbf{U} </math>.
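The construction of <math> \mathbf{U} </math> can be sketched as follows. The sketch assumes the singular vectors are already available as plain lists (the SVD itself is not reproduced here) and uses Gram-Schmidt as one possible way to ensure orthogonality; the source does not state which orthogonalization is used. The names <code>dot</code> and <code>build_basis</code> and the input values are hypothetical.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_basis(singular_vectors, r):
    """Return r mutually orthogonal columns of length m.

    The first column is the constant vector [1, ..., 1]; the remaining
    r - 1 columns are the leading singular vectors with their components
    along all previously accepted columns removed (Gram-Schmidt).
    Columns are returned as a list of lists and are not normalized.
    """
    m = len(singular_vectors[0])
    candidates = [[1.0] * m] + [list(v) for v in singular_vectors[:r - 1]]
    columns = []
    for vector in candidates:
        for q in columns:
            # subtract the projection of `vector` onto the accepted column q
            scale = dot(vector, q) / dot(q, q)
            vector = [x - scale * y for x, y in zip(vector, q)]
        if math.sqrt(dot(vector, vector)) < 1e-12:
            raise ValueError("vector is linearly dependent on earlier columns")
        columns.append(vector)
    return columns


# Hypothetical input: two vectors of length m = 4 standing in for singular vectors.
vectors = [[1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 0.0, 1.0]]
U_columns = build_basis(vectors, r=3)
```

The constant column is kept unchanged so that it matches the definition <math> \mathbf{U}[:,0] = [1, \dots, 1]^T </math>; only the later columns are modified.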
<b><u>Initializing</u></b>
Apply PCCA+ to <math> \mathbf{U} </math> to obtain the transformation matrix <math> \mathbf{A} </math> to initialize <math> \mathbf{H} </math>, <math> \mathbf{W} </math>, and <math> \mathbf{P} </math>.
<math> \widetilde{\mathbf{H}} = (\mathbf{UA})^{T} </math>
<math> \widetilde{\mathbf{W}} = \mathbf{M}(\mathbf{A^{T}U^{T}})^{\#} </math> (<math>\#</math> denotes the pseudoinverse, which is also defined for singular and non-square matrices)
<math> \widetilde{\mathbf{P}} = \mathbf{A^{-1}}(\mathbf{U_{-}^{\#}U_{+}})\mathbf{A} </math> (<math> \mathbf{U_{+/-}} </math> defined as <math> \mathbf{U} </math> without the first / last row)
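One way to read these three definitions is sketched here. It assumes that the rows of <math> \mathbf{U} </math> are ordered in time, that <math> \mathbf{A} </math> is invertible, and that <math> \widetilde{\mathbf{P}} </math> acts as a transition matrix between consecutive measurements; the auxiliary matrix <math> \mathbf{X} </math> is introduced only for this illustration.
Since <math> \widetilde{\mathbf{H}} = (\mathbf{UA})^{T} = \mathbf{A^{T}U^{T}} </math>, requiring <math> \mathbf{M} \approx \widetilde{\mathbf{W}}\widetilde{\mathbf{H}} </math> and multiplying from the right by the pseudoinverse of <math> \widetilde{\mathbf{H}} </math> gives <math> \widetilde{\mathbf{W}} = \mathbf{M}(\mathbf{A^{T}U^{T}})^{\#} </math>.
Let <math> \mathbf{X} = \mathbf{U_{-}^{\#}U_{+}} </math> be the least-squares solution of <math> \mathbf{U_{-}X} \approx \mathbf{U_{+}} </math>. Demanding <math> \mathbf{U_{+}A} \approx \mathbf{U_{-}A}\widetilde{\mathbf{P}} </math> then leads to <math> \mathbf{XA} = \mathbf{A}\widetilde{\mathbf{P}} </math>, and therefore <math> \widetilde{\mathbf{P}} = \mathbf{A^{-1}XA} </math>.
The dimensions are consistent: <math> \mathbf{U} \in \mathbb{R}^{m \times r} </math>, <math> \mathbf{A}, \mathbf{X}, \widetilde{\mathbf{P}} \in \mathbb{R}^{r \times r} </math>, <math> \widetilde{\mathbf{H}} \in \mathbb{R}^{r \times m} </math>, and <math> \widetilde{\mathbf{W}} \in \mathbb{R}^{n \times r} </math>.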
<b><u>Minimizing</u></b>
The objective function <math> \Psi^2 </math>, which depends on <math> \mathbf{M} </math>, <math> \mathbf{U} </math>, and <math> \mathbf{A} </math>, enforces non-negativity (terms 1, 2, and 4), column stochasticity (term 3), and row stochasticity (term 5).
Minimizing <math> \Psi^2 </math> with respect to <math> \mathbf{A} </math> numerically adjusts <math> \widetilde{\mathbf{W}} </math> and <math> \widetilde{\mathbf{H}} </math> to the required structural properties.
<math> \Psi = \alpha ( \underset{i,j}{\min} \widetilde{W}_{i,j} ) + \beta ( \underset{i,j}{\min} \widetilde{H}_{i,j} ) + \gamma ( \underset{j}{\max} | \sum\limits_{i=1}^r \widetilde{H}_{i,j} - 1 | ) + \delta ( \underset{i,j}{\min} \widetilde{P}_{i,j} ) + \mu ( \underset{i}{\max} | \sum\limits_{j=1}^r \widetilde{P}_{i,j} - 1 | ) </math>
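The five terms of <math> \Psi </math> can be evaluated with a short sketch. It takes the matrices as plain nested lists and reads the fifth term as the largest deviation of any row sum of <math> \widetilde{\mathbf{P}} </math> from 1, an assumption based on the row-stochasticity wording above. The function name <code>psi</code>, the default weights, and the example matrices are hypothetical; the minimization over <math> \mathbf{A} </math> is not shown.

```python
def psi(W, H, P, alpha=1.0, beta=1.0, gamma=1.0, delta=1.0, mu=1.0):
    """Evaluate the weighted sum of the five terms for given W, H and P."""
    min_W = min(min(row) for row in W)  # term 1: smallest entry of W
    min_H = min(min(row) for row in H)  # term 2: smallest entry of H
    # term 3: largest deviation of a column sum of H from 1
    col_dev_H = max(abs(sum(col) - 1.0) for col in zip(*H))
    min_P = min(min(row) for row in P)  # term 4: smallest entry of P
    # term 5: largest deviation of a row sum of P from 1 (assumed reading)
    row_dev_P = max(abs(sum(row) - 1.0) for row in P)
    return (alpha * min_W + beta * min_H + gamma * col_dev_H
            + delta * min_P + mu * row_dev_P)


# Made-up matrices: W has one negative entry, the second row of P sums to 0.75.
W = [[1.0, -0.125], [0.5, 2.0]]
H = [[0.75, 0.25], [0.25, 0.75]]
P = [[0.5, 0.5], [0.25, 0.5]]
value = psi(W, H, P)
print(value, value ** 2)
```

Keeping the weights as keyword arguments mirrors the role of <math> \alpha, \beta, \gamma, \delta, \mu </math> as tunable parameters of the workflow.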
<b><u>Recovering</u></b>
Minimization returns <math> \mathbf{A_{opt}} </math>, from which <math> \mathbf{H_{rec}} </math>, <math> \mathbf{W_{rec}} </math>, and <math> \mathbf{P_{rec}} </math> can be recovered.
<math> \mathbf{H_{rec}} = (\mathbf{UA_{opt}})^{T} </math>
<math> \mathbf{W_{rec}} = \mathbf{M}(\mathbf{A_{opt}^{T}{U}^{T}}) ^{\#} </math>
<math> \mathbf{P_{rec}} = \mathbf{A_{opt}^{-1}}(\mathbf{U^{\#}_{-} U_{+}})\mathbf{A_{opt}} </math>
<b>4. Data Interpretation</b>
<math> \mathbf{W_{rec}} </math> contains the spectra of the substances involved in the crystallization process, while <math> \mathbf{H_{rec}} </math> allows an inference on the kinetics. Interpretation of both matrices leads to the identification of intermediate states and the underlying kinetics.
===Involved Disciplines===
NFDI4Chem (wikidata:[[wikidata:Q96678459|Q96678459]])
    <math> \Rightarrow </math> Raman Spectroscopy (wikidata:[[wikidata:Q862228|Q862228]])
MaRDI (wikidata:[[wikidata:Q108327788|Q108327788]])
    <math> \Rightarrow </math> Numerical Analysis (wikidata:[[wikidata:Q11216|Q11216]])
    <math> \Rightarrow </math> Mathematical Optimization (wikidata:[[wikidata:Q141495|Q141495]])
    <math> \Rightarrow </math> Linear Algebra (wikidata:[[wikidata:Q82571|Q82571]])
===Data Streams===
NFDI4Chem <math> \Rightarrow </math> MaRDI (.txt File containing measurement matrix <math> \mathbf{M} </math>)
MaRDI <math> \Rightarrow </math> NFDI4Chem (.png Files containing component spectra <math> \mathbf{W} </math> and relative concentration profiles <math> \mathbf{H} </math>)
==Model==
<math> \mathbf{M}=\mathbf{WH} </math>
A measurement matrix <math> \mathbf{M} </math> is factorized into a matrix <math> \mathbf{W} </math>, containing the spectra of the substances involved, and a matrix <math> \mathbf{H} </math>, which allows an inference on the kinetics.
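The model can be made concrete with a small numerical sketch. It multiplies made-up non-negative factors <math> \mathbf{W} </math> (<math> n = 3 </math>, <math> r = 2 </math>) and <math> \mathbf{H} </math> (<math> r = 2 </math>, <math> m = 4 </math>) and measures how well the product reproduces a given matrix; the helper names <code>matmul</code> and <code>relative_residual</code> and all values are hypothetical.

```python
import math


def matmul(W, H):
    """Product of an n x r matrix W and an r x m matrix H (nested lists)."""
    return [[sum(w * h for w, h in zip(row, col)) for col in zip(*H)]
            for row in W]


def relative_residual(M, W, H):
    """Frobenius norm of M - W H divided by the Frobenius norm of M."""
    WH = matmul(W, H)
    diff = sum((a - b) ** 2 for ra, rb in zip(M, WH) for a, b in zip(ra, rb))
    norm = sum(a ** 2 for row in M for a in row)
    return math.sqrt(diff / norm)


# Each column of W is one component spectrum, each row of H one profile;
# the columns of H sum to 1, i.e. H is column stochastic.
W = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
H = [[1.0, 0.75, 0.25, 0.0], [0.0, 0.25, 0.75, 1.0]]
M = matmul(W, H)
print(M)
print(relative_residual(M, W, H))
```

For measured data the residual will generally not vanish; it is used here only as a simple check that a pair of factors is consistent with <math> \mathbf{M} </math>.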
===Discretization===
*Time: Time-resolution of Raman Spectroscopy
*Space: -
===Variables===
{| class="wikitable"
|-
!Name !! Unit !! Symbol !! dependent (measured) / independent (controlled)
|-
|Time || s || t || independent
|-
|Wavenumber (Raman shift) || cm<sup>-1</sup> || <math> \tilde{\nu} </math> || independent
|-
|Intensity || - ||I ||dependent (measured)
|-
|Substance Matrix || - ||<math> \mathbf{W} </math> || dependent (calculated)
|-
|Kinetic Matrix || - ||<math> \mathbf{H} </math> ||dependent (calculated)
|}
<br/>
===Parameter===
{| class="wikitable"
!Name
!Unit
!Symbol
|-
|Temperature
|°C
|T
|-
|Relative Humidity
| %
|RH
|-
|Rank of Factorization
| -
|<math> r </math>
|-
|Number of Singular Values
| -
|<math> k </math>
|-
|Singular Value Tolerance
| -
|<math> tol </math>
|-
|Objective Function Parameter
| -
|<math> \alpha, \beta, \gamma, \delta, \mu </math>
|-
|Maximum Number of Iterations
| -
|<math> MAXITER </math>
|}
<br/>
==Process Information==
===Process Steps===
{| class="wikitable"
!Name
!Description
!Input
!Output
!Method
!Parameter
!Environment
!Mathematical Area
|-
|Data Acquisition
|Measurement
| -
|.icraman
|Time-resolved Raman Spectroscopy
|T, RH
|RXN1™
| -
|-
|Data Extraction
|Extract Spectroscopic Data
|.icraman
|.txt
| -
| -
|iC Raman™
| -
|-
|Data Analysis
|Determine Component Spectra &amp; Concentration Profiles
|.txt
|.png
|NMF algorithm
|<math> r, tol, k, MAXITER, \alpha, \beta, \gamma, \delta, \mu </math>
|Matlab
|Numerical Analysis, Linear Algebra, Mathematical Optimization
|-
|Data Interpretation
|Determine Intermediate States &amp; Kinetics
|.png
| -
| -
| -
| -
| -
|}
<br/>
===Applied Methods===
{| class="wikitable"
!ID
!Name
!Process Step
!Parameter
!realised/implemented by
|-
|doi:[[doi:10.1007/s10910-020-01201-7|10.1007/s10910-020-01201-7]]
|NMF Algorithm
|Data Analysis
|<math> r, tol, k, MAXITER, \alpha, \beta, \gamma, \delta, \mu </math>
|[https://www.mdpi.com/2079-3197/6/1/20/s1 Matlab Script]
|-
|wikidata:[[wikidata:Q420904|Q420904]]
|SVD
|Data Analysis - Pre-Processing
|<math> k </math>
|Matlab R2019a
|-
|wikidata:[[wikidata:Q43219517|Q43219517]]
|Pseudoinverse
|Data Analysis - Initializing
|<math> tol </math>
|Matlab R2019a
|-
|doi:[[doi:10.1016/j.laa.2004.10.026|10.1016/j.laa.2004.10.026]]
|PCCA+
|Data Analysis - Initializing
|<math> r </math>
|[https://www.mdpi.com/2079-3197/6/1/20/s1 Matlab Script]
|-
|wikidata:[[wikidata:Q1253278|Q1253278]]
|Nelder-Mead Algorithm
|Data Analysis - Minimizing
|<math> \alpha, \beta, \gamma, \delta, \mu, MAXITER </math>
|Matlab R2019a
|-
|wikidata:[[wikidata:Q43219517|Q43219517]]
|Pseudoinverse
|Data Analysis - Recovering
|<math> tol </math>
|Matlab R2019a
|}
<br/>
===Software used===
{| class="wikitable"
!ID
!Name
!Description
!Version
!Programming Language
!Dependencies
!versioned
!published
!documented
|-
| -
|iC Raman
|Data Acquisition &amp; Reaction Analysis
|4.1
| ?
|Windows
|Yes
|Yes
|[https://www.mt.com/de/de/home/products/L1_AutochemProducts/AutoChem_software/iC_Raman.html#documents Yes]
|-
|sw:[https://swmath.org/software/558 558]
|Matlab
|Programming and numeric Computing
|R2019a
|C,C++,Fortran,Java
|Windows, Mac, Linux
|Yes
|Yes
|[https://de.mathworks.com/help/matlab/ Yes]
|}
<br/>
===Experimental Devices/Instruments and Computer-Hardware===
{| class="wikitable"
!ID
!Name
!Description
!Version
!Part Nr
!Serial Nr
!Location
!Software
|-
| -
|Raman RXN1
|Spectrometer
|
|
|
|
|
|-
| -
|GenuineIntel
|Intel(R) Core(TM) i7-9700T CPU @ 2.00 GHz
|
|
|
|
|
|}
<br/>
===Input Data===
{| class="wikitable"
!ID
!Name
!Size
!Data Structure
!Format Representation
!Format Exchange
!binary/text
!proprietary
!to publish
!to archive
|-
| -
|Spectroscopic &amp; Meta Data
|small
| -
| -
|.icraman
|binary
|Yes
|Yes
|Yes
|-
| -
|Spectroscopic Data ( <math> \mathbf{M} </math>)
|small
|Matlab Array
|dense matrix (csv)
|.txt
|text
|No
|Yes
|Yes
|-
| -
|<math> r </math>
|small
|integer
|
|
|text
|No
|Yes
|Yes
|-
| -
|<math> \alpha, \beta, \gamma, \delta, \mu </math>
|small
|float
|
|
|text
|No
|Yes
|Yes
|}
<br/>
===Output Data===
{| class="wikitable"
!ID
!Name
!Size
!Data Structure
!Format Representation
!Format Exchange
!binary/text
!proprietary
!to publish
!to archive
|-
| -
|Component Spectra (<math> \mathbf{W} </math>)
|small
|Matlab Array
|plot
|.png
|binary
|No
|Yes
|Yes
|-
| -
|Concentration Profile (<math>\mathbf{H} </math>)
|small
|Matlab Array
|plot
|.png
|binary
|No
|Yes
|Yes
|}
<br/>
==Reproducibility==
===Reproducibility of the Experiments on the original Devices/Instruments/Hardware===
Yes
===Reproducibility of the Experiments on other Devices/Instruments/Hardware===
Yes
===Transferability of the Experiments to===
a) other solutes, solvents, parameters
b) other chemical reactions
=Legend=
The following abbreviations are used in the document to indicate/resolve IDs:
doi: https://dx.doi.org/
sw: https://swmath.org/software/
wikidata: https://www.wikidata.org/wiki/
[[Category: Workflow]]

Revision as of 11:03, 20 April 2023