Overview of manifold learning techniques for the investigation of disruptions on JET

Cannas, Barbara; Fanni, Alessandra; Murari, A; Pau, Alessandro; Sias, Giuliana

doi:10.1088/0741-3335/56/11/114005

Identifying a low-dimensional embedding of a high-dimensional data set allows exploration of the data structure. In this paper we test several existing manifold learning techniques for discovering such an embedding within the multidimensional operational space of a nuclear fusion tokamak. The following approaches have been investigated: linear methods, such as principal component analysis and the grand tour, and nonlinear methods, such as the self-organizing map and its probabilistic variant, generative topographic mapping. In particular, the last two methods yield a low-dimensional (typically two-dimensional) map of the high-dimensional operational space of the tokamak. These maps provide a way of visualizing the structure of the high-dimensional plasma parameter space and allow discrimination between regions with a high risk of disruption and those with a low risk. The data for this study come from plasma discharges at JET selected from the period 2005 to 2009. The self-organizing map and generative topographic mapping provide the most benefit in the visualization of very large, high-dimensional data sets. Their performance has been evaluated with several measures, with special emphasis on the position of outliers and extreme points, map composition, quantization errors and topological errors.
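
To make the two map-quality measures named in the abstract concrete, the following is a minimal sketch of a self-organizing map with its quantization error and its topological (often called topographic) error. It is not the authors' implementation and uses no JET data: the input is synthetic Gaussian noise, and the map size, learning-rate schedule, neighbourhood width and the 8-neighbour adjacency rule are all assumptions chosen for illustration. The function names `train_som`, `quantization_error` and `topographic_error` are hypothetical.

```python
import numpy as np

def train_som(data, rows=6, cols=6, n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM with online updates; returns (weights, grid)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each map unit, shape (rows*cols, 2).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen samples (an assumed choice).
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood width
        x = data[rng.integers(0, len(data))]
        # Best-matching unit: prototype closest to the sample in data space.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighbourhood measured on the map grid, not in data space.
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights, grid

def quantization_error(data, weights):
    """Mean distance between each sample and its best-matching prototype."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

def topographic_error(data, weights, grid):
    """Fraction of samples whose two closest prototypes are not adjacent on the map."""
    d = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    first, second = order[:, 0], order[:, 1]
    # Assumption: units are adjacent if they differ by at most one step in
    # each grid direction (8-neighbourhood, Chebyshev distance <= 1).
    grid_dist = np.abs(grid[first] - grid[second]).max(axis=1)
    return float(np.mean(grid_dist > 1))

# Synthetic stand-in for a 7-dimensional parameter space (not real plasma data).
data = np.random.default_rng(1).normal(size=(500, 7))
weights, grid = train_som(data)
qe = quantization_error(data, weights)
te = topographic_error(data, weights, grid)
print(f"quantization error: {qe:.3f}, topographic error: {te:.3f}")
```

A low quantization error indicates that the prototypes represent the data closely, while a low topographic error indicates that neighbouring points in the data space are mapped to neighbouring units, which is what makes the two-dimensional map usable for visual inspection.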