Sample size planning for multivariate data: a Raman spectroscopy based example

in: Analytical Chemistry (2018)
Girnus, Sophie; Rösch, Petra; Popp, Jürgen; Bocklitz, Thomas W.; Ali, Nairveen
The goal of sample size planning is to determine the number of measurements needed for a statistical analysis. This is necessary to achieve robust and significant results while collecting as few measurements as possible. Sample size planning is a common procedure for univariate measurements, but no general method exists for multivariate measurements such as spectra or time traces. Sample size planning (SSP) is especially important for bio-spectroscopic data because data generation is time consuming and costly. Additionally, ethical reasons do not allow the use of unnecessary samples or the measurement of an unnecessary number of spectra. In this paper, sample size planning for Raman-spectroscopic data is achieved by utilizing learning curves. A learning curve quantifies the improvement of a classifier with increasing training set size. These curves are fitted by the inverse power law, and the parameters of this fit can be utilized to predict the necessary training set size. The sample size planning is demonstrated for a bio-spectroscopic task of differentiating 6 bacteria species (E. coli, K. terrigena, P. stutzeri, L. innocua, S. warneri, and S. cohnii) based on their Raman spectra. Thereby, we estimate the required number of Raman spectra and biological replicates to train a classification model, which consists of principal component analysis (PCA) combined with linear discriminant analysis (LDA). The presented algorithm revealed that 142 Raman spectra per species and 7 biological replicates are needed for the above-mentioned bio-spectroscopic classification task. Even though it was not demonstrated, the learning curve based sample size planning algorithm can be applied to all bio-spectroscopic classification tasks.
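
The learning-curve idea described in the abstract can be illustrated with a minimal sketch. It assumes one common parameterization of the inverse power law, err(n) = a + b * n^(-c), where n is the training set size, a is the asymptotic error, b the learning amplitude and c the decay rate; the abstract does not state the exact form or fitting procedure the authors used. The function names, the grid search over c, and the data points are hypothetical and synthetic, chosen only to show how fitted parameters translate into a required training set size.

```python
import math


def fit_inverse_power_law(sizes, errors, c_grid=None):
    """Fit err(n) = a + b * n**(-c) to learning-curve points.

    Assumed approach (not taken from the paper): for a fixed c the model is
    linear in a and b, so c is scanned over a grid and a, b are obtained by
    ordinary least squares for each candidate c.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(5, 301)]  # candidate c: 0.05 .. 3.00
    m = len(sizes)
    mean_y = sum(errors) / m
    best = None
    for c in c_grid:
        x = [n ** (-c) for n in sizes]
        mean_x = sum(x) / m
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, errors))
        b = sxy / sxx
        a = mean_y - b * mean_x
        sse = sum((a + b * xi - yi) ** 2 for xi, yi in zip(x, errors))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


def required_size(a, b, c, target_error):
    """Smallest n with a + b * n**(-c) <= target_error."""
    if b <= 0:
        raise ValueError("fitted curve does not decrease with n")
    if target_error <= a:
        raise ValueError("target error is at or below the fitted asymptote")
    return math.ceil((b / (target_error - a)) ** (1.0 / c))


# Synthetic learning curve (illustrative values, not data from the paper):
# classification error generated from a = 0.05, b = 0.8, c = 0.5.
sizes = [10, 20, 40, 80, 160]
errors = [0.05 + 0.8 * n ** -0.5 for n in sizes]

a, b, c = fit_inverse_power_law(sizes, errors)
n_needed = required_size(a, b, c, target_error=0.12)
print(f"a={a:.3f}, b={b:.3f}, c={c:.2f}, required n={n_needed}")
```

In practice each error value would be the mean over repeated resampling of a PCA-LDA classifier trained on n spectra, and a nonlinear least-squares routine could replace the grid search; both choices are left open here.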
