
Machine Learning and deep learning for the analysis of photonic data

According to R.E. Bellman (1978), artificial intelligence is the ‘[automation of] activities that we associate with human thinking, activities such as decision-making, problem solving, learning …’. Under this broad definition, the scientific field of artificial intelligence has roots reaching back to ancient Greece. Nevertheless, the first working AI programs were developed in the 1950s, and the term artificial intelligence (AI) also originates from this time. An important aspect of AI is machine learning (ML); the first ML methods were developed in the field of statistics around 1900.

But what is ‘machine learning (ML)’? ML is the science that researches, develops and uses algorithms and statistical/mathematical methods that allow computer systems to improve their performance on a specific task. Machine learning methods either explicitly or implicitly construct a statistical/mathematical model of the data based on a sample dataset called ‘training data’. Based on this underlying model, ML methods can make predictions or decisions without being explicitly programmed to do so. This is an informal version of the famous and often-quoted definition of ML by Tom M. Mitchell (1997). ML techniques are used in a wide range of applications, from spam detection in email accounts to computer vision and the analysis of spectroscopic data. Deep learning is a special class of machine learning methods that has been developed since 2006 and widely applied since 2010. These methods are composed of multiple layers of nonlinear processing units, forming a highly parameterized version of ML with a high degree of non-linearity. The difference between classical machine learning and deep learning is described in the respective sections.

As stated above, machine learning is defined by Tom M. Mitchell (1997) as follows: ‘a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E’. To make this clearer, we link it to the most prominent and frequently tested machine learning task: a computer program should classify handwritten digits (MNIST database of handwritten digits). The input of the model is an image of 28x28 pixels and the output is one of the digits 0, 1, …, 8, 9. This task (T) is visualized above and gives an instructive example of how ML methods for image recognition work. The experience E is the known ground truth for the training data, i.e. the known correspondence between the training images and the digits. The performance measure P is an error rate between the ML prediction and the ground truth. We use these ML and deep learning methods in a similar way to elucidate differences between biomedical conditions of samples that are characterized by spectral or image measurements.
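
The roles of T, E and P can be illustrated with a minimal sketch in plain Python. It is not the MNIST setup itself: the tiny 2x2 "images", the two classes and the nearest-centroid classifier are assumptions chosen only to keep the example short and self-contained.

```python
def train_centroids(images, labels):
    """Experience E: average the labelled training images per class."""
    sums, counts = {}, {}
    for img, lab in zip(images, labels):
        acc = sums.setdefault(lab, [0.0] * len(img))
        for i, v in enumerate(img):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, img):
    """Task T: assign the class whose centroid is closest (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(img, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def error_rate(centroids, images, labels):
    """Performance measure P: fraction of wrong predictions."""
    wrong = sum(predict(centroids, img) != lab for img, lab in zip(images, labels))
    return wrong / len(images)


# Hypothetical toy data: 2x2 "images" flattened to 4 pixel values in [0, 1].
# Class 0 has a bright left column, class 1 a bright right column.
train_images = [[0.9, 0.1, 0.8, 0.2], [1.0, 0.0, 0.9, 0.1],
                [0.1, 0.9, 0.2, 0.8], [0.0, 1.0, 0.1, 0.9]]
train_labels = [0, 0, 1, 1]
test_images = [[0.8, 0.2, 0.7, 0.3], [0.2, 0.7, 0.3, 0.9]]
test_labels = [0, 1]

centroids = train_centroids(train_images, train_labels)
p = error_rate(centroids, test_images, test_labels)
print(f"error rate P on held-out images: {p:.2f}")
```

Repeating the training with more labelled images and tracking P on held-out data is what Mitchell's definition describes as learning from experience.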


  • Guo, S., Mayerhöfer, T., Pahlow, S., Hübner, U., Popp, J., Bocklitz, T. Deep Learning for ‘Artefact’ Removal in Infrared Spectroscopy Analyst, accepted
  • Houhou, R.; Barman, P.; Schmitt, M.; Meyer, T.; Popp, J. & Bocklitz, T. Deep learning as phase retrieval tool for CARS spectra Opt. Express, 2020, accepted, 1-10
  • Pradhan, P.; Guo, S.; Ryabchykov, O.; Popp, J. & Bocklitz, T. W. Deep learning a boon for Biophotonics? J. Biophotonics, 2020, 13, e201960186
  • Ali, N.; Quansah, E.; Köhler, K.; Meyer, T.; Schmitt, M.; Popp, J.; Niendorf, A. & Bocklitz, T. Automatic Label-free Detection of Breast Cancer Using Nonlinear Multimodal Imaging and the Convolutional Neural Network ResNet50 Translational Biophotonics, 2019, online, e201900003
  • Bocklitz, T. Understanding of Non-linear Parametric Regression and Classification Models: a Taylor Series Based Approach Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, SciTePress, 2019, 874-88
  • Yarbakht, M.; Pradhan, P.; Köse-Vogel, N.; Bae, H.; Stengel, S.; Meyer, T.; Schmitt, M.; Stallmach, A.; Popp, J.; Bocklitz, T. W. & Bruns, T. Nonlinear Multimodal Imaging Characteristics of Early Septic Liver Injury in a Mouse Model of Peritonitis Anal. Chem., 2019, 91, 11116-11121
  • P. Pradhan; T. Meyer; M. Vieth; A. Stallmach; M. Waldner; M. Schmitt; J. Popp & T. Bocklitz Semantic segmentation of Non-Linear Multimodal images for disease grading of Inflammatory Bowel Disease - A SegNet-based application International Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, SciTePress, 2019, 396-405
  • E. Rodner; T. Bocklitz; F. von Eggeling; G. Ernst; O. Chernavskaia; J. Popp; J. Denzler & O. Guntinas-Lichius Fully convolutional networks in multimodal nonlinear microscopy images for automated detection of head and neck carcinoma Head & Neck, 2018, accepted
  • S. Guo; S. Pfeifenbring; T. Meyer; G. Ernst; F. von Eggeling; V. Maio; D. Massi; R. Cicchi; F. S. Pavone; J. Popp & T. Bocklitz Multimodal Image Analysis in Tissue Diagnostics for Skin Melanoma J. Chemom., 2018, 32, e2963
  • O. Ryabchykov; A. Ramoji; T. Bocklitz; M. Förster; S. Hagel; C. Kroegel; M. Bauer; U. Neugebauer & J. Popp Leukocyte subtypes classification by means of image processing 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), 2016, 309-316
  • O. Chernavskaia; S. Heuke; M. Vieth; O. Friedrich; S. Schürmann; R. Atreya; A. Stallmach; M. F. Neurath; M. Waldner; I. Petersen; M. Schmitt; T. Bocklitz & J. Popp Beyond endoscopic assessment in inflammatory bowel disease: real-time histology of disease activity by non-linear multimodal imaging Scientific Reports, 2016, 6, 29239
  • S. Heuke; O. Chernavskaia; T. Bocklitz; F. B. Legesse; T. Meyer; D. Akimov; O. Dirsch; G. Ernst; F. von Eggeling; I. Petersen; O. Guntinas-Lichius; M. Schmitt & J. Popp Multimodal nonlinear microscopic investigations on head and neck squamous cell carcinoma - toward surgery assisting frozen section analysis Head & Neck, 2016, 38, 1545-1552
  • T. Bocklitz; E. Kämmer; S. Stöckel; D. Cialla-May; K. Weber; R. Zell; V. Deckert & J. Popp Single virus detection by means of atomic force microscopy in combination with advanced image analysis J. Struct. Biol., 2014, 188, 30-38


  • O. Chernavskaia; T. Meyer; T. Bocklitz & J. Popp Property measurement on a biological tissue sample 2015


  • Integration of morphological and biological correlates using optical methods of toxicity analysis (MorphoTox, Free State of Thuringia, 2020-2023)
    The altered morphology of poisoned cells will be measured using different spectroscopic imaging methods, and the results of these different methods will then be analyzed by a machine learning platform. Possible applications of the platform are toxicity tests of drug candidates and individualized toxicity tests for personalized therapy decisions.
  • Digitisation of the life sciences: Paths into the future - DigLeben (TMWWDG, 2020-2025)
    In the DigLeben project, different biological-medical issues and the further development of machine learning methods are being researched, covering central areas of data analysis of high-throughput techniques in the life sciences, (meta-)genomics, metabolomics and image processing.
  • Function-determining Excited-State Dynamics in Transition-Metal Complex-Based Photodrugs in a Cellular Environment (BO4700/4-1, DFG, 2019-2022)
    Machine learning methods will be developed to investigate photoinduced properties of model substances in cellular environments based on transient absorption microscopy.
  • Endoscopic panoramic imaging and fiber optic spectroscopy in urology for multi-dimensional diagnostics (Uro-MDD, BMBF, 2017-2020)
    The project aims to investigate three-dimensional multimodal imaging of the bladder. The resulting demonstrator will allow multidimensional imaging of the bladder based on a stereo sensor integrated in the cystoscope. We investigate data fusion algorithms to combine the different measurement data and analyze them using machine learning.
  • Multimodal pattern analysis to characterize inflammation in patients with ulcerative colitis - Imaging disease activity and predicting clinical remission (BO4700/1-1, DFG, 2017-2020)
    In this project, Raman spectroscopic and multimodal image data will be used to predict clinical remission in patients with ulcerative colitis. We investigate the machine learning tools required for this task.
  • Whole Blood Imaging (BLOODi), Subproject: Automatic quantification of cell morphology of white blood cells (BLOODi, Leibniz Association, 2016-2019)
    In this project, we are investigating the possibility of using the cell morphology of white blood cells as a marker for an infection. The subproject deals with the quantification of the morphological changes using machine learning.

Classical Machine Learning

In classical machine learning (CML), the link between the images and higher-level information about this data, e.g. class labels, is established in a two-step procedure. In the first step, image features are extracted. The concrete types of image features are designed by a researcher using their knowledge and intuition about the samples and data at hand, so human intervention is needed for this feature extraction. After the features are extracted, the (multivariate) dataset is analyzed using simple classification models, such as linear discriminant analysis (LDA).
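
The second step of this procedure can be sketched as a two-class Fisher LDA in plain Python. This is only an illustration under stated assumptions: the features are taken as already extracted, each sample has exactly two features, both classes have equal priors, and all feature values below are hypothetical.

```python
def fit_lda_2d(X0, X1):
    """Two-class Fisher LDA for 2-D feature vectors (equal priors assumed)."""
    def mean(X):
        return [sum(x[k] for x in X) / len(X) for k in range(2)]

    m0, m1 = mean(X0), mean(X1)
    # Pooled within-class scatter matrix S (2x2).
    S = [[0.0, 0.0], [0.0, 0.0]]
    for X, m in ((X0, m0), (X1, m1)):
        for x in X:
            d = [x[0] - m[0], x[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    S[a][b] += d[a] * d[b]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction w = S^-1 (m1 - m0), using the 2x2 inverse.
    w = [(S[1][1] * diff[0] - S[0][1] * diff[1]) / det,
         (-S[1][0] * diff[0] + S[0][0] * diff[1]) / det]
    # Decision threshold at the midpoint between the two class means.
    mid = [(m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2]
    bias = -(w[0] * mid[0] + w[1] * mid[1])
    return w, bias


def predict_lda(model, x):
    """Return class 1 if x lies on the class-1 side of the boundary, else 0."""
    w, bias = model
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0


# Hypothetical, already extracted feature vectors for two tissue classes.
features_class0 = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25], [0.25, 0.15]]
features_class1 = [[0.80, 0.70], [0.70, 0.80], [0.85, 0.75], [0.75, 0.85]]

model = fit_lda_2d(features_class0, features_class1)
print(predict_lda(model, [0.2, 0.2]), predict_lda(model, [0.9, 0.6]))
```

The model has only three parameters (two weights and a bias), which is why the quality of the hand-designed features matters so much in this setting.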

This procedure is sketched in the figure above. A multimodal image is used as the input image and the desired output is an image with three gray levels: one indicating background pixels, while the other two gray levels represent different tissue classes. The machine learning task is therefore to perform the translation between both images, which is called semantic segmentation. After the image features are extracted, the so-called training data set, i.e. a number of images for which the true tissue segmentation is known, is used to train the classification model.

In the figure above, a classical machine learning framework is visualized for the study of Heuke et al. (2016). The multimodal images (left side) are used to calculate local statistical moments of the histogram, which characterize the texture. In a region around a central pixel, the histogram of the image values is calculated. Based on this histogram, first-order statistical moments or quantities derived from them, such as mean, standard deviation and entropy, are calculated, and this is repeated with every pixel of the image as the central pixel. In this way feature images (middle part of the figure) are generated, which are subsequently used to predict which tissue type is present at the central pixel. The results are false color images like those shown on the right of the figure above, where green represents healthy epithelium, while red indicates cancerous areas.
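
The local feature extraction described here can be sketched in plain Python as follows. The window radius, the number of histogram bins and the value range [0, vmax] are assumptions made for illustration and are not taken from the cited study.

```python
import math


def local_features(image, radius=1, bins=4, vmax=1.0):
    """Return per-pixel (mean, std, entropy) computed in a square window.

    Assumes pixel values in [0, vmax]; windows are clipped at the border.
    """
    h, w = len(image), len(image[0])
    feats = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the values in the window around the central pixel.
            vals = [image[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            n = len(vals)
            mean = sum(vals) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in vals) / n)
            # Local histogram of the window values.
            counts = [0] * bins
            for v in vals:
                counts[min(int(v / vmax * bins), bins - 1)] += 1
            # Shannon entropy (in bits) of the normalized histogram.
            entropy = sum(c / n * math.log2(n / c) for c in counts if c)
            feats[y][x] = (mean, std, entropy)
    return feats


# Hypothetical inputs: a textured 2x2 patch and a flat 3x3 patch.
checker = [[0.0, 1.0],
           [1.0, 0.0]]
flat = [[0.5] * 3 for _ in range(3)]

checker_feats = local_features(checker)
flat_feats = local_features(flat)
print(checker_feats[0][0], flat_feats[1][1])
```

Each of the three values per pixel corresponds to one feature image; a flat region yields zero standard deviation and zero entropy, while a textured region does not.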


Deep Learning 

Deep learning is a special kind of machine learning that is inspired by the way the human brain processes visual data and other kinds of input. Deep learning for semantic segmentation is visualized in the figure above. The main property of deep learning is that the feature extraction is performed implicitly by the method, together with the construction of the classification model. This is the main advantage, because no human is needed to design a suitable feature extraction. The drawback is that these models are hard to interpret and have a large number of parameters (typically on the order of 1 million). Nevertheless, these models have a unique potential to solve a large class of machine learning tasks, especially if a large amount of data is available.
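
To give a feeling for how quickly the parameter count grows, the following sketch counts the weights and biases of a small stack of convolutional layers. The layer sizes are hypothetical and are not the architecture used in the studies listed above.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per output channel of a 2D convolutional layer."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


# Hypothetical stack of 3x3 convolutions: (kernel size, input channels, output channels).
layers = [(3, 3, 64), (3, 64, 128), (3, 128, 256), (3, 256, 256)]

total = sum(conv_params(k, c_in, c_out) for k, c_in, c_out in layers)
print(f"total trainable parameters: {total}")
```

Four such layers already come close to one million parameters, compared with a handful of parameters in a simple linear model on hand-designed features.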

In the figure above, a special kind of deep learning method, a convolutional neural network, is visualized (CN24 architecture, pre-trained using the ILSVRC2012 dataset). It mimics the processing of visual input by the human brain by performing a large number of successive convolution operations. The main part of such a network therefore consists of convolutional filters that are applied one after another to generate feature images, which are then converted into class predictions. Again, a local prediction of tissue types based on multimodal images is generated using this network, which can be seen on the right of the figure. The deep learning technique can be seen as a non-linear translation tool from the multimodal images (left side) into the false color images characterizing the tissue types (right side).
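
The idea of successive convolutional filters producing feature images can be sketched in plain Python. This is not the CN24 architecture: the two filters are set by hand here (in a real network they are learned from training data), and the final threshold is only a stand-in for the learned classification step.

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the operation used in convolutional layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + j][x + i] * kernel[j][i]
                 for j in range(kh) for i in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def relu(feature_map):
    """Element-wise nonlinearity applied after each filter."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 2x4 single-channel image with a dark-to-bright vertical edge.
image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
edge_filter = [[-1.0, 1.0]]    # responds to dark-to-bright transitions
sum_filter = [[1.0], [1.0]]    # sums the response over two rows

features = relu(conv2d(image, edge_filter))     # first layer
features = relu(conv2d(features, sum_filter))   # second layer
# Stand-in for the classification step: threshold the final feature image.
labels = [[1 if v > 1.0 else 0 for v in row] for row in features]
print(features, labels)
```

Only the position of the edge is labelled as class 1, which mirrors on a very small scale how feature images are turned into a local class prediction.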
