Accurately constraining velocity information from spectral imaging observations using machine learning techniques

Determining accurate plasma Doppler (line-of-sight) velocities from spectroscopic measurements is a challenging endeavour, especially when weak chromospheric absorption lines are often rapidly evolving and, hence, contain multiple spectral components in their constituent line profiles. Here, we present a novel method that employs machine learning techniques to identify the underlying components present within observed spectral lines, before subsequently constraining the constituent profiles through single or multiple Voigt fits. Our method allows active and quiescent components present in spectra to be identified and isolated for subsequent study. Lastly, we employ a Ca II 8542 Å spectral imaging dataset as a proof-of-concept study to benchmark the suitability of our code for extracting two-component atmospheric profiles that are commonly present in sunspot chromospheres. Minimisation tests are employed to validate the reliability of the results, achieving median reduced χ² values equal to 1.03 between the observed and synthesised umbral line profiles.


Introduction
Tomographic analysis of the complex solar atmosphere and its dynamics requires the combined use of different spectral lines (e.g. Ca II, Mg II, Lα, and Hα). Indeed, due to radiative transfer effects, these lines are formed across different heights of the solar atmosphere, with spectral information extracted at a particular percentage line depth corresponding to differing optical depths, and hence particular atmospheric heights [1]. It is therefore possible to extract solar plasma properties as a function of optical depth (and atmospheric height if radiative transfer processes are well constrained) by taking measurements at specific percentage line depths [2].
The challenge of the solar chromosphere is that non-local thermodynamic equilibrium (non-LTE) plays an important role in the propagation of photons through the relatively dense and tenuous atmosphere [3]. Furthermore, due to spatial resolution limitations of even the largest-aperture ground-based solar telescopes, and the intrinsic fine-scale structuring of the solar atmosphere (scales smaller than 100 − 150 km), often the observed spectrum within the resolution element is the superposition of several components owing to different plasma states (e.g., both quiescent and magnetised atmospheric components). The effects of a multi-component atmosphere are generally taken into account in state-of-the-art spectropolarimetric inversion codes (e.g. NICOLE; [3]). However, these techniques are extremely demanding computationally. In addition to the effects of spatial resolution, multiple components can arise from the fact that chromospheric spectral lines sample a wide range of heights; from the mid-photosphere (wings of the line), through to the upper chromosphere (core of the line), thus mixing very different physical regimes [4].
Additional spectroscopic components that are superimposed on top of the quiescent background spectral profile can arise from a wealth of solar phenomena, including dynamic events such as magnetic reconnection, propagating waveforms, and shock development, all of which have the ability to form across a range of atmospheric heights. This adds a secondary component to an otherwise quiescent atmosphere [5], which is often highly Doppler-shifted relative to the approximately stationary line core, creating a significantly broadened resultant spectral profile. Being able to segregate the multi-component atmospheric contributions from a spectral line would allow the simultaneous examination of both 'dynamic' and 'quiescent' regimes of the atmosphere much more readily.
One of the most dynamic and widely recognised processes that creates multi-component chromospheric spectral lines is the propagation of magnetoacoustic waves, and their subsequent development into shock fronts, in the umbrae of sunspots [6][7][8][9][10][11]. Such umbral oscillations propagate energy upwards from the photosphere into the chromosphere, where the steep density drop results in the rapid increase of the velocity amplitude in an attempt to conserve energy flux, which is readily captured by the temperature-sensitive Ca II 8542 Å spectral line. Once the velocity amplitude exceeds the local sound speed, the waves develop into shock fronts, a process that can occur as low as ∼250 km [12] through to the uppermost region of the chromosphere [11], resulting in the production of optically thin Ca II 8542 Å emission [13]. It is this optically thin emission that superimposes on top of the quiescent Ca II 8542 Å spectral profile, appearing in spectroscopic observations as an almost instantaneous blue-shifted (i.e., upwardly propagating) emission that slowly decays from its maximum intensity, providing the characteristic 'saw-tooth' spectral shape that is synonymous with umbral shock formation. Such spectral profile evolution is coupled to the underlying p-mode wave spectrum, providing a periodicity of approximately 3 minutes to the umbral flashes [14,15]. Attempting to fit a multi-component umbral flash spectral profile using centre-of-gravity methods [16,17] or a single Gaussian, Lorentzian, or Voigt profile is prone to error, since it may vastly under- or over-estimate the plasma characteristics of the developing shock. Hence, to fully understand the physics behind rapid atmospheric evolution (including those related to umbral flashes), both components must be considered separately; not simply modelled as a single-component atmosphere that attempts to bridge quiescent and active states [12,14].
Doppler (line-of-sight) velocities are typically acquired from spectroscopic observations using a number of techniques. They can be measured across a range of optical depths by calculating, at specific percentage line depths, the spectral line bisector shift from the central line core wavelength [2,18]. If full spectropolarimetric measurements are available, other atmospheric parameters (such as temperature, line-of-sight magnetic flux, and magnetic field inclination angles) can be estimated by inversion codes [19]. Such high-precision measurements of Stokes I/Q/U/V are difficult to obtain for deep chromospheric absorption lines, including Ca II 8542 Å, due to the relatively small rate at which photons reach the detector [20,21]. To combat the inherently weak signal, exposure times could be increased accordingly, but this is not always feasible when tracking the evolution of dynamic and rapidly evolving features in the lower solar atmosphere [22]. If only Stokes I spectral observations are available, Voigt profiles [23,24] can be fitted to the spectra. As we will discuss in Section 3, the Voigt profile includes enhanced spectral wing broadening, which provides a more suitable realisation of the radiative transfer effects omnipresent throughout the lower solar atmosphere, including Doppler and pressure broadening mechanisms [25][26][27][28]. Alternative spectral shapes, such as the Gaussian and Lorentzian profiles, are useful when trying to represent profile shapes manifesting via different radiative transfer effects that are often present in the spectra.
Previous studies [12] have attempted to fit complex Ca II 8542 Å line profiles using a linear combination of two Gaussians, one for each component of the atmosphere attempting to be modelled. Boundary conditions for each fitted profile are employed such that the first Gaussian models an absorption component representative of the quiescent atmosphere, while the second Gaussian models an emission component linked to the dynamic phenomenon under investigation (e.g., magnetoacoustic shocks). This works well when two distinct components are present in an observed line profile, yet it introduces a number of challenges when only a single absorption component is resolved. Firstly, it takes more computational time and resources to fit the two-profile combination than a single profile, since twice the number of fitting parameters need to be constrained. Secondly, the boundary conditions placed on the fitted Gaussians tend to be subjectively chosen, and therefore may subject any extracted results to a user bias. Thirdly, the most problematic issue is the tendency for the two-profile combination to overfit the observed data as the fitting algorithm naturally attempts to smooth out noise and asymmetries in the observed spectra. This results in neither of the constituent profiles, by themselves, accurately modelling the absorption components present in the spectra, and Doppler shifts extracted from any one of the two fitted components would be potentially misleading and incorrect.
A novel method to alleviate the degree of human interaction required when processing large datasets hinges on the concept of machine learning, which is becoming increasingly commonplace in the field of solar physics. Machine learning techniques, in particular neural networks, have been used in solar physics research from at least the early 1990s for applications such as predicting the maximum number of sunspots during a solar cycle [29], and also predicting the number of sunspots produced throughout the Sun's lifetime [30]. More recently, they have been used for predicting and detecting features on the Sun such as flares [31], coronal mass ejections [32], and active regions on the far side of the solar surface [33]. On small spatial scales, machine learning has been shown to play an important role in the investigation of velocity flows [34] and rapid spectropolarimetric inversions of Stokes I/Q/U/V spectra [35]. Hence, the application of machine learning and neural networks to the challenging problem of fitting complex spectral line profiles (e.g., Ca II 8542 Å) is both timely and important for the accurate study of small-scale dynamic phenomena that are ubiquitously observed to permeate the solar atmosphere.
In this paper we detail a method for fitting profiles to spectral lines that often contain multiple atmospheric components. Our method uses machine learning techniques to distinguish between spectra that have multi-component atmospheres and those that can be best represented by single spectral fits, allowing us to adjust the model and fitting method accordingly. We also provide a proof-of-concept study on a challenging Ca II 8542 Å spectral imaging dataset to show the suitability of this technique for widespread usage.

Observations
Spectral imaging observations of active region NOAA 12149 were acquired from 14:37–17:37 UT on 2014 August 30 using the Interferometric BIdimensional Spectrometer (IBIS; [36]) instrument at the National Science Foundation's Dunn Solar Telescope, New Mexico, USA. IBIS is a Fabry-Pérot instrument able to obtain high temporal, spatial, and spectral resolution imaging spectroscopy measurements of the solar atmosphere. The Ca II 8542 Å spectral line was chosen for our observations, with IBIS operating in a Stokes I imaging mode (i.e., no polarimetric measurements were acquired) in order to maximise the temporal resolution of the spectroscopic scans. The observations obtained consisted of 2103 spectral scans utilising 27 non-equidistant wavelength points sampling over a 2.4 Å window centred on the Ca II 8542 Å line core. However, for the purposes of subsequent analysis, we focus on the central 23 wavelength points over a 1.6 Å window centred on the Ca II 8542 Å line core. As shown in Figure 1, closer wavelength spacing was chosen around the line core to provide better quiescent profile fitting. The spatial sampling was 0.098″ per pixel, and the cadence of each Ca II 8542 Å scan was 5.8 s.
A contextual continuum image of active region NOAA 12149 was acquired from the Helioseismic and Magnetic Imager (HMI; [37]) onboard the Solar Dynamics Observatory (SDO; [38]). This measurement was taken at the start of the IBIS spectral imaging sequence, and once processed through standard SunPy data reduction algorithms [39,40] provided a full disk reference image with a spatial sampling equal to 0.6″ per pixel. Using the HMI contextual image to define absolute solar coordinates, the IBIS dataset was subsequently co-aligned to it using cross-correlation techniques [41]. Figure 2 shows example images from the IBIS dataset co-aligned with the SDO/HMI continuum.

Methods
Presented in this section are the details of a method which utilises a neural network to classify a spectrum based on its profile shape. A suitable spectral fitting model is then selected for the spectrum according to its neural network classification. By accurate fitting of the chosen model, a Doppler shift, and therefore a velocity, can be uncovered for each of the constituent spectral components, thus allowing quiescent and dynamic parts of the atmosphere to be isolated and studied independently. An average quiet Sun spectrum was extracted from the dataset by averaging all of the IBIS spectra contained within the rectangular region plotted in Figure 2 across all 2103 scans. The average quiet Sun spectrum incorporated the averaging of 70 660 800 individual spectra, with the resulting profile plotted in Figure 1. This location was chosen for constructing the quiet Sun spectrum because it is isolated from both the sunspot umbra (where complex Ca II 8542 Å profiles synonymous with shock formation are known to form) and the regions containing magnetic pores and bright points that often result in enhanced Ca II 8542 Å line core emission due to the formation of chromospheric plage [42][43][44]. The quiet Sun spectrum shown in Figure 1 demonstrates a deep absorption profile with no clear evidence of enhanced wing or line-core emission, and hence forms a good 'rest' profile of the solar atmosphere that can be used to benchmark spectral fluctuations resulting from dynamic Ca II 8542 Å phenomena.
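As a minimal sketch of this averaging step (the array layout and region bounds below are assumptions for illustration, not the actual IBIS data structures), the quiet Sun profile can be computed by averaging every spectrum inside the rectangular region across all scans:

```python
import numpy as np

def quiet_sun_profile(scans, y0, y1, x0, x1):
    """Average all spectra inside a rectangular quiet-Sun region over every scan.

    `scans` is a hypothetical array of shape (n_scans, ny, nx, n_wavelengths);
    the bounds (y0:y1, x0:x1) stand in for the rectangle plotted in Figure 2.
    """
    region = scans[:, y0:y1, x0:x1, :]              # (n_scans, h, w, n_wav)
    # Collapse all spatial and temporal axes, leaving one mean spectrum
    return region.reshape(-1, scans.shape[-1]).mean(axis=0)
```

For the dataset described above, 2103 scans over the chosen rectangle yield the 70 660 800 individual spectra quoted in the text.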

(a) Choosing a fitting profile
Observations of a spectral line corresponding to a specific atomic electron transition provide emission and absorption signatures over a range of wavelengths (i.e., not just an infinitely narrow feature at the wavelength associated with the transitioning electron). This increased width of the spectral line is due to a number of effects including Doppler broadening, pressure broadening, and Zeeman splitting [25][26][27][28]. These effects can be replicated closely by convolving a Gaussian function with a Lorentzian profile, producing a Voigt function, V, defined by [23],

V(x; σ, γ) = A ∫_{−∞}^{+∞} G(u; σ) L(x − u; γ) du ,    (3.1)

where A is a constant scaling for the amplitude, G(x; σ) = exp(−x²/(2σ²))/(σ√(2π)) is a Gaussian function centred at zero, and L(x; γ) = γ/(π(x² + γ²)) is a Lorentzian function centred at zero. The parameters σ and γ are the standard deviation of the Gaussian and the half-width at half-maximum of the Lorentzian, respectively. The variable of integration is u. Since the Lorentzian and Gaussian functions are centred on zero, the variable x that is passed into the Voigt function must first be shifted by subtracting the wavelength of the line core, x_0, such that the line core wavelength is at zero in the corresponding function. The x_0 parameter must be re-added to the fitted spectra in order to precisely account for Doppler-shifted plasma.
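A direct numerical implementation of this convolution, using scipy.integrate.quad, might look as follows. This is a sketch of the mathematical definition, not the authors' optimised code:

```python
import numpy as np
from scipy.integrate import quad

def gaussian(x, sigma):
    # G(x; sigma): unit-area Gaussian centred at zero
    return np.exp(-x**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

def lorentzian(x, gamma):
    # L(x; gamma): unit-area Lorentzian centred at zero, HWHM gamma
    return gamma / (np.pi * (x**2 + gamma**2))

def voigt(x, x0, amplitude, sigma, gamma):
    # A * (G convolved with L), evaluated at the core-shifted wavelength
    # x - x0 by direct numerical integration over the dummy variable u
    shifted = x - x0
    integrand = lambda u: gaussian(u, sigma) * lorentzian(shifted - u, gamma)
    value, _ = quad(integrand, -np.inf, np.inf)
    return amplitude * value
```

For a quick sanity check, the result with amplitude A = 1 agrees with scipy.special.voigt_profile, which evaluates the same convolution in closed form.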
A linear combination of Voigt profiles was chosen to model the absorption and emission components of the spectra, hence producing a 'double Voigt' model. If the chosen spectrum shows only emission or absorption profile shapes (e.g., like that shown in Figure 1), then a 'single Voigt' model is selected, thus avoiding the computational and overfitting drawbacks associated with multiple component fits when not necessary. Note that a background intensity level is not included in the generated models computed from Equation 3.1 as this will be calculated and subtracted from the spectra before fitting, as detailed in Section 3d(ii). By modelling the spectra with Voigt profiles instead of just simple Gaussian functions, we can fit the spectral lines more accurately as we are also including line wing broadening that mimics the observed spectral lines.
Single or double Voigt profiles are fitted iteratively to each spectrum until the synthetic profile is sufficiently similar to the observed line profile. During the fitting process, the employed model ('single Voigt' or 'double Voigt') is statistically evaluated numerous times to monitor the level of convergence between the synthetic and observed line profiles. It is therefore necessary for the selected model to be evaluated efficiently, thereby saving computational time. The Voigt function was implemented in Python, with the convolution being carried out by a numerical integration function, quad, provided by the scipy.integrate Python module [45,46]. An approximation of the Voigt profile [47] was also explored. However, even though this function requires less time to compute when compared to the numerical integration approach, the fitting methods exhibited very slow convergence, and hence we adopted the numerical integration technique for the remainder of our work.
We utilise a number of techniques to maximise the efficiency of our calculated Voigt functions. The integrand of the Voigt function is written in the C programming language and compiled into a shared library. This C function can then be imported into Python using the ctypes library. The scipy.integrate numerical integration function is then able to use this more efficient function as an integrand. We also adjust the absolute and relative error tolerances for the numerical integration, setting them to 0.149 and 0.000149, respectively. In practical terms, this means that the error in the raw intensity value at each wavelength point will be 0.149 for |I − I_BW| ≤ 1000 and 0.000149 × |I − I_BW| for |I − I_BW| > 1000, where I are the intensity values at each wavelength point on a profile, and I_BW is the blue wing (rest line core −1.2 Å) intensity value for the same profile.
Taking a typical full IBIS imaging spectral scan from our dataset, we computed the |I − I_BW| value at every intensity value across all 664 796 individual spectra in the scan. We then calculated the error in each of the raw intensity values, I, using these computed |I − I_BW| values. By dividing each error by I, a percentage error was determined for every intensity value. Over the entire field-of-view, the average percentage error was 0.00666%, with an error of 0.00739% over the subset of wavelength points ±0.1 Å around the stationary line core. For the umbral spectra, which have much lower I_BW values, the average percentage error was 0.0179%, with an error of 0.0180% around the stationary line core. In our particular test, the maximum percentage error across the whole scan was only 0.0488%. Given that the IBIS spectra typically have line core and wing intensity values exceeding 1730 and 2550 detector counts, respectively, and |I − I_BW| values of 757 and 149, respectively, these values correspond to absolute intensity errors of 0.149 detector counts. As such, these small percentage error tolerances are sufficient for our investigation.
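The tolerances described above map directly on to quad's epsabs and epsrel keyword arguments. As an illustration with a simple Gaussian integrand (not the Voigt integrand itself), whose exact value is √(2π):

```python
import numpy as np
from scipy.integrate import quad

# quad stops refining once the estimated error falls below
# max(epsabs, epsrel * |result|) -- the behaviour described in the text
value, abserr = quad(lambda u: np.exp(-u**2 / 2.0), -np.inf, np.inf,
                     epsabs=0.149, epsrel=0.000149)
```

Loosening these tolerances from quad's defaults trades a little accuracy (bounded as quantified above) for a substantial reduction in the number of integrand evaluations per fit.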
(b) Neural network

(i) Classifying spectra
As discussed in Section 1, fitting a double Voigt model to all spectra introduces temporal efficiency and overfitting problems for those that contain only a single absorption/emission component. In these cases, a single Voigt model would be more appropriate. Our method for fitting spectral lines employs machine learning techniques to tailor the specific model (i.e., single Voigt or double Voigt functions) used for each spectrum sampled. This way, if a spectrum only contains a single-component atmosphere consisting of an absorption dip or an emission peak, a single Voigt profile is fitted to the spectrum, with boundary constraints chosen objectively for such a line profile. On the other hand, if two components are detected, one absorption and one emission component, then a double Voigt profile is fitted, with each Voigt function appropriately constrained to fit the absorption and emission components of the observed line profile. In order to perform this automated task, we require a robust methodology to be able to reliably identify how many spectral components are present in each observed spectrum.
Artificial neural networks are used to classify the spectra based on the amount of emission relative to absorption present in the profile (see Section 3(c) for an in-depth discussion of how this is performed). Due to the diversity of spectral profiles in our dataset, it would have been very challenging to devise a fixed criterion that could be implemented into an algorithm and used to produce classifications more accurately than our neural network classifier. The neural network was trained to assign one of five classifications to each observed spectrum (see Figure 3), with the assigned classification used to adjust the model employed for final fitting. Specifically, the five classifications chosen consist of:

0 - A profile exhibiting purely absorption characteristics, with no evidence of emission or asymmetric line wings;
1 - An absorption line profile similar to classification '0', but with evidence of a less precise line profile minimum due to, e.g., flattening of the line core intensities;
2 - A profile that begins to show signs of embedded emission characteristics, yet the line core intensity is still less than the continuum intensity;
3 - A line profile similar to classification '2', but with enhanced emission signatures such that the line core intensity is now just above the continuum level; and
4 - A line profile showing heightened emission that is considerably brighter than the neighbouring continuum, hence dominating the resultant spectral profile.
This classification regime was chosen as it allows key features of the spectral shape to be identified and used to adapt the model, and it also identifies regions where the level of emission is particularly high. The neural network infrastructure is provided by the scikit-learn Python module [48][49][50]. From this package we use the multi-layer perceptron (MLP) classifier [51] along with the L-BFGS-B solver [52], a version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) solver adapted for limited-memory and bound-constrained problems.
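A sketch of this classifier configuration using scikit-learn is shown below. The training data here are random stand-ins for the manually labelled spectra described in Section 3(c); only the network architecture and solver choices follow the text:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in data: 100 "spectra" of 33 points each, with
# labels 0-4 (the real labels are assigned manually, Section 3(c))
rng = np.random.default_rng(0)
X = rng.random((100, 33))
y = np.arange(100) % 5

clf = MLPClassifier(hidden_layer_sizes=(40,),  # one hidden layer, 40 neurons
                    activation='relu',         # ReLU on the hidden layer
                    solver='lbfgs',            # limited-memory BFGS optimiser
                    alpha=1e-4,                # L2 regularisation strength
                    max_iter=500,
                    random_state=0)
clf.fit(X, y)
labels = clf.predict(X)   # one of the five classifications per spectrum
```

scikit-learn's 'lbfgs' solver wraps the bound-constrained L-BFGS-B routine referenced above, and the softmax output layer for multiclass problems is applied automatically.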

(ii) Mathematical processes
The observed spectrum is first interpolated on to a constant wavelength grid, resulting in a profile consisting of 33 data points. Our neural network implementation, with the multi-layer perceptron (MLP) classifier, takes the interpolated spectrum as a one-dimensional input layer with 33 neurons. The neural network then standardises its input by rescaling the input vector to range from 0 to 1. As a general overview, each neuron of the input layer is subsequently connected to each of the forty neurons on a single hidden layer. Finally, each neuron of the hidden layer is connected to all of the classifications on the five neuron output layer. The classification with the largest probability is assigned to the specific spectrum.
More specifically, each neuron-neuron connection between two layers has a weight, w_ij, where i is the index of the neuron in the current layer, and j is the index of the neuron in the previous layer. All of the neurons of the hidden and output layers have a bias, b. The input is propagated through the neural network along the connections between neurons of adjacent layers, with the applied weights and biases determining the level of connectivity between the subsequent layers. The result at each neuron, y_i, is calculated using the equation,

y_i = g( Σ_j w_ij x_j + b_i ),

where x_j are the outputs from the previous layer, w_ij are their associated weights, and g represents the activation function. The activation function for the neurons of the hidden layer was chosen to be the rectified linear unit function (ReLU; [53]), which is defined as f(x) = max(0, x), meaning that the neuron only 'activates' and passes a non-zero value on to the next layer if its result is positive. The output layer employed the softmax activation function, which is defined as softmax(x)_i = exp(x_i) / Σ_{k=1}^{5} exp(x_k), where i ∈ {1, 2, 3, 4, 5} indexes the five possible classifications that the neural network can assign.
This equation can be extended to apply to the output of a whole layer. Firstly, the outputs from the previous layer, x_j, can be represented as a column vector, x ∈ R^n, where n is the number of neurons in the previous layer. Secondly, the weights, w_ij, can be represented as a matrix, W ∈ R^{m×n}, where m is the number of neurons in the current layer. Importantly, the matrix of weights between the input layer and the single hidden layer is given by W^(01), and the vector of input values is given by x^(0). Similarly, W^(12) represents the matrix of weights between the hidden layer and the output layer, with x^(1) representing the output of the hidden layer. The biases, b ∈ R^m, for the neurons of the hidden layer are given by b^(1), while those of the output layer are represented by b^(2). The output of the neural network is therefore y ∈ R^5. As a result, a spectrum can be classified by evaluating the following set of equations in order,

x^(1) = f( W^(01) x^(0) + b^(1) ),
y = softmax( W^(12) x^(1) + b^(2) ),

where f : R → R is the ReLU activation function applied element-wise and softmax : R^5 → R^5 is the softmax activation function. Subsequently, the spectrum is assigned the classification associated with the largest output neuron value. Many spectra can be classified simultaneously by increasing the dimensions of the matrices to form tensors, which can be operated on very efficiently using graphics processing units (GPUs; [54]). The architecture of the neural network described in this study can classify a full IBIS imaging spectral scan, consisting of 664 796 individual spectra contained within the circular aperture of the instrument (see Figure 2), in approximately 48 seconds using a single 2.10 GHz Intel Xeon CPU core. As a result of this reasonable timeframe, porting to GPUs was not investigated in the present study.
However, with larger field-of-view sizes and more rapid cadences from upcoming next-generation instrumentation [55], the resulting increase in data volume may require attention to be turned towards GPUs to provide the necessary accelerated performance to implement this technique in near real-time on next-generation datasets.
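The forward pass described above can be sketched in a few lines of NumPy. The weights and biases here are random placeholders; in practice they are obtained from training (Section 3(c)):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())          # subtract the max for numerical stability
    return e / e.sum()

def classify(x0, W01, b1, W12, b2):
    """Forward pass for the (33 -> 40 -> 5) network described above.

    x0 : (33,) rescaled spectrum; W01 : (40, 33); b1 : (40,);
    W12 : (5, 40); b2 : (5,). Returns the winning classification index.
    """
    x1 = relu(W01 @ x0 + b1)         # hidden-layer output, x^(1)
    y = softmax(W12 @ x1 + b2)       # five class probabilities, y
    return int(np.argmax(y))
```

Stacking many input spectra into a matrix turns the two matrix-vector products into matrix-matrix products, which is the batching strategy alluded to for GPU execution.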
(c) Training procedure
The weights and biases introduced in Section 3(b) are initially unknown. For the neural network to be able to accurately classify spectra, the optimal weights and biases must be calculated, which is achieved by fitting the parameters of the neural network to a set of manually labelled training spectra.
The current accuracy of the neural network is quantified by a loss function, calculated using the manually assigned classification and the classification currently being assigned by the neural network. The weights and biases are optimised by minimising the value of the loss function. In this method we use the log-loss function,

loss = −(1/N_S) Σ_{(y, y′) ∈ S} [ y ln(y′) + (1 − y) ln(1 − y′) ],

where S is the set of ground truth labels (y), defined as either '0' or '1', and currently predicted probabilities (y′), spanning a range between 0 and 1, for each classification of the spectra in the training set. The parameter N_S is the number of (y, y′) pairs in the set S.
To find the predicted probabilities for each classification, the algorithm passes the training set of spectra through the neural network, configured with an initially random set of weights and biases, before calculating the subsequent log-loss function. Finally, the L2 regularisation term,

(α/2) ||W||₂²,

where ||W||₂² is the sum of the squared elements of the weight matrices and α is the parameter that controls the amount of regularisation, is added to the loss function. The training algorithm iteratively tries a number of different α terms, before selecting the specific value that produces the most accurate classifications after training.
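A sketch of this regularised loss is given below. The α/2 scaling of the penalty is an assumption based on the common convention (scikit-learn applies a comparable scaling internally, additionally divided by the number of samples):

```python
import numpy as np

def regularised_log_loss(y_true, y_pred, weight_matrices, alpha):
    """Binary-indicator log-loss over the (y, y') pairs plus an L2 penalty.

    y_true : 0/1 ground-truth indicators; y_pred : predicted probabilities;
    weight_matrices : list of weight arrays W; alpha : regularisation strength.
    NOTE: the (alpha / 2) scaling is an assumed convention, not taken verbatim
    from the text.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 1e-12, 1.0 - 1e-12)
    n = y_true.size
    loss = -np.sum(y_true * np.log(y_pred)
                   + (1.0 - y_true) * np.log(1.0 - y_pred)) / n
    penalty = (alpha / 2.0) * sum(np.sum(W**2) for W in weight_matrices)
    return loss + penalty
```

With α = 0 this reduces to the plain log-loss above; increasing α penalises large weights and discourages overfitting to the 100 training spectra.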
The weights and biases that minimise the loss function are found using the L-BFGS-B quasi-Newton minimisation algorithm [52], a variant of L-BFGS. This efficient algorithm works by finding an approximation of the Hessian matrix, which is a matrix of second derivatives of the loss function with respect to the weights and biases. Importantly, the Hessian matrix forms part of the quadratic term in the second-order Taylor expansion of the loss function when evaluated for a one-dimensional vector composed of the current weights and biases. The inverse of the Hessian matrix can also be found efficiently, which is then used to determine the step direction the vector of weights and biases should take for minimisation of the loss function.
There are two separate sets of manually labelled ground truth data, one used for training the neural network and another used for testing its performance. By using a different set of spectra to test the performance of the neural network, we are able to detect if the neural network is overfitting, which would result in the test set being assigned less accurate classifications. Various statistics are determined, including a calculation of the percentage of spectra the neural network correctly assigned the ground truth classification.
The MLP classifier in the scikit-learn module also has a number of parameters that must be specified before the training is carried out. These include the numbers of hidden layers and neurons employed, the α parameter used for L2 regularisation, the activation function used for the hidden layers (ReLU), and the algorithm used to optimise the weights and biases. A neural network consisting of a single hidden layer with forty neurons was chosen for our study.
The L-BFGS, stochastic gradient descent (SGD), and Adam [56] optimisation algorithms were trialled. Upon successive trainings, the L-BFGS method produced trained neural networks with the most consistent accuracy score, and hence this approach was adopted for the current study. The accuracy score is defined as the fraction of the testing set of spectra that the neural network successfully assigned its manually labelled classification. Our method uses the GridSearchCV routine provided by scikit-learn to carry out the search over candidate α values described above.
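As an illustrative sketch, scikit-learn's GridSearchCV can carry out this α selection via cross-validation. The spectra, labels, and candidate α grid below are stand-ins for illustration:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in spectra (60 profiles of 33 points) and labels 0-4
rng = np.random.default_rng(1)
X = rng.random((60, 33))
y = np.arange(60) % 5

search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(40,), solver='lbfgs',
                  max_iter=300, random_state=0),
    param_grid={'alpha': [1e-5, 1e-4, 1e-3, 1e-2]},  # candidate α values
    cv=3,                                            # 3-fold cross-validation
    scoring='accuracy')
search.fit(X, y)
best_alpha = search.best_params_['alpha']
```

Each candidate α is scored on held-out folds, so the selected value reflects generalisation accuracy rather than training-set fit.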

(i) Training with the IBIS dataset
To train and test the neural network, 200 spectra from the IBIS Ca II 8542 Å dataset introduced in Section 2 were chosen at random from within the umbra. The spectra in this set are manually labelled with a classification, providing ground truth data for training and testing purposes. The first half of this 200 spectra dataset is used to train the neural network, while the other half is used to test the success of the trained neural network. A subset of these spectra are separated into their manually labelled classifications and plotted in Figure 3 for visual clarity. As described in Section 3b(i), these spectra are classified based on the ratio of emission to absorption, with the extreme ends of the scale being those spectra showing no evidence of emission (classified as '0'), and those spectra with obvious emission much brighter than the neighbouring continuum (classified as '4').
Following the methodology outlined in Section 3(c), the success of the trained neural network can be classified in terms of 'precision' and 'recall' statistics. Here, the precision is defined as the percentage of spectra that were correctly assigned a classification, c, from all the spectra that were also assigned the same classification by the neural network. Recall is defined as the percentage of spectra that were correctly assigned a classification, c, out of all the spectra that should have been assigned this specific classification according to the ground truth data. For each of the classifications (0 → 4) in ascending order, the precisions obtained were 77%, 74%, 71%, 100%, and 94%. Similarly, in terms of recall, the percentages obtained were 94%, 78%, 80%, 62%, and 100%. These result in overall averages, weighted by the number of ground truth spectra per classification, of 83% and 81% for the precision and recall parameters, respectively. Importantly, 95% of the spectra incorrectly classified by the neural network had a deviation of only one classification number. These performance measures suggest that the neural network is sufficiently accurate for our purposes. Running the neural network classification algorithm on the umbral pixels extracted from a single IBIS spectral imaging scan displays the expected complexities associated with this type of solar feature. In particular, the lower right panel of Figure 3 displays the classifications that the trained neural network returned, revealing a multitude of purely absorption (i.e., quiescent) and emission (i.e., active) spectra.
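The precision and recall statistics defined above can be computed directly from the ground truth and predicted labels; a minimal sketch (the function name is ours):

```python
import numpy as np

def precision_recall(y_true, y_pred, n_classes=5):
    """Per-class precision and recall (in percent) from ground-truth and
    predicted classifications."""
    precision, recall = [], []
    for c in range(n_classes):
        predicted_c = np.sum(y_pred == c)   # assigned class c by the network
        actual_c = np.sum(y_true == c)      # labelled class c in ground truth
        correct_c = np.sum((y_pred == c) & (y_true == c))
        precision.append(100.0 * correct_c / predicted_c if predicted_c else 0.0)
        recall.append(100.0 * correct_c / actual_c if actual_c else 0.0)
    return precision, recall
```

Weighting each per-class score by the number of ground truth spectra in that class then gives the overall averages quoted above.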

(d) Fitting method
The following preprocessing steps and fitting methods are applied to each spectrum of the IBIS dataset independently. Based on the neural network processing described in Section 3c(i), each spectrum is assigned a classification. Spectra that are assigned a classification of either '0' or '1' are fitted with a simple, single Voigt profile as described in Section 3(a). Furthermore, spectra assigned a classification of either '2', '3', or '4' are modelled with double Voigt profiles. This results in spectra with a dominant absorption component being fitted with a single quiescent atmosphere model, whereas spectra with increasing emission components are modelled with multi-component fits that are more representative of the dynamic atmosphere. Doppler velocity information can then be extracted for both the quiescent and active atmospheres using the parameters inferred from the fitted models.
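The classification-to-model mapping described above can be expressed as a small dispatch helper; a minimal sketch (the function and model names are illustrative):

```python
def model_for_classification(classification):
    """Map a neural-network classification ('0'-'4') to the fitting model:
    classes 0 and 1 receive a single (absorption-only) Voigt fit with 4 free
    parameters, while classes 2-4 receive a double Voigt fit with 8."""
    if classification in (0, 1):
        return "single_voigt", 4
    return "double_voigt", 8
```

The returned parameter count is also what enters the degrees-of-freedom calculation used later for the goodness-of-fit statistics.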
As will be discussed in Section 3d(iii), we will further distinguish between classifications '0' and 1 → 4. However, classifications 2 → 4 are processed in an identical manner. While we found no additional benefits to distinguishing between these different classifications when fitting, we have kept these as distinct classifications to demonstrate the adaptability of our method and to assist with relating the spatial distribution of classifications to any dynamic features found in the Doppler velocity measurements.

(i) Preprocessing techniques
As shown in Figure 1, the spectral images acquired using IBIS consist of intensity measurements obtained across 27 non-equidistant wavelength points. However, in our analysis we focus on the central 23 wavelength points, providing a 1.6 Å wavelength range across the Ca II line, allowing Doppler velocities spanning ≈ ±30 km/s to be studied. Each spectrum is interpolated onto an equidistant wavelength grid employing a wavelength spacing of 0.05 Å. This produces a spectrum comprising 33 equidistant wavelength samples.
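The interpolation step can be sketched with numpy.interp; the non-equidistant wavelength grid and line shape below are invented for illustration and do not reproduce the actual IBIS sampling:

```python
import numpy as np

# Hypothetical non-equidistant grid of 23 central wavelength points spanning
# 1.6 Å around the Ca II 8542 Å line core (denser sampling near the core).
core = 8542.12
wav_obs = core + np.concatenate([
    np.linspace(-0.8, -0.2, 9),
    np.linspace(-0.15, 0.15, 5),
    np.linspace(0.2, 0.8, 9),
])
# Illustrative absorption profile sampled on that grid.
intens_obs = 1.0 - 0.6 * np.exp(-((wav_obs - core) / 0.3) ** 2)

# Equidistant grid: 0.05 Å spacing across the same 1.6 Å range -> 33 samples.
wav_new = core + np.arange(-0.8, 0.8 + 0.025, 0.05)
intens_new = np.interp(wav_new, wav_obs, intens_obs)
```

Every spectrum in a scan can be regridded this way before background removal and fitting.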
The next stage is to ensure the line wings are correctly calibrated. IBIS includes a prefilter with a FWHM of 4.6 Å centred at 8542.5 Å in order to isolate the 8542 Å signal from the other resonant wavelengths produced by the Fabry-Pérot interferometer [26]. As a result, each profile is initially corrected by removing the prefilter transmission response from the measured spectrum.

(ii) Background removal
In the fitted models, a background profile is not included, and as a result the continuum intensity is expected to be zero. The motivation for not including a background profile is to reduce the number of free parameters, and thus make the model less computationally challenging to fit. Hence, a constant background intensity is calculated for each spectrum by performing a boxcar moving average of the intensities measured 1.2 Å into the blue wing from the Ca II 8542 Å line core over a time frame of ±2 minutes relative to the current scan; this background is subtracted from the spectrum before fitting with a suitable model. When processing a number of spectra over a large field-of-view, the calculation of the backgrounds and their subsequent subtraction can be vectorised, resulting in a significant computational speed improvement compared to processing each spectrum individually. Due to the breadth of the Ca II 8542 Å line, in conjunction with the relatively narrow wavelength range of the IBIS prefilter, it is not possible to calculate the true continuum intensity for our dataset. As such, we approximate the continuum intensity, Ic, using a boxcar moving average placed at the furthest blue-wing position of the IBIS spectral imaging scans (i.e. line core −1.2 Å).
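A vectorised version of this background estimate might look as follows; the array layout, window size, and cadence correspondence are assumptions for illustration:

```python
import numpy as np

def subtract_background(cube, scan_idx, idx_blue, half_window=4):
    """Subtract a constant background per pixel from one spectral scan.

    cube      : time series of spectra, shape (n_times, ny, nx, n_wav)
    scan_idx  : index of the scan being processed
    idx_blue  : wavelength index 1.2 Å into the blue wing of the line core
    half_window : number of scans either side of scan_idx, assumed here to
                  correspond to roughly +/- 2 minutes at the scan cadence

    The background is a boxcar average, over time, of the blue-wing
    intensity, computed for every spatial pixel at once.
    """
    t0 = max(0, scan_idx - half_window)
    t1 = min(cube.shape[0], scan_idx + half_window + 1)
    background = cube[t0:t1, :, :, idx_blue].mean(axis=0)   # shape (ny, nx)
    return cube[scan_idx] - background[..., np.newaxis]
```

Because the subtraction broadcasts over the whole field-of-view, no per-spectrum Python loop is required.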

(iii) Weighted fit
In order to fit the chosen model to the spectra more accurately, we applied weights to different parts of the spectrum. Weights are given by assigning error bars to the spectrum, which the fitting algorithm uses when calculating goodness-of-fit parameters. As the measurement of Doppler velocities is of paramount importance, especially for quiescent spectra classified as '0' where the Doppler shifts may be subtle, we prioritise fitting around the spectral line core. As a result, we add larger uncertainties to the wings of the spectral line when compared to the line core, which can be visualised in the upper panel of Figure 4. On the other hand, for spectral classifications 1 → 4, we found that the accuracy of the fit was improved by increasing the associated uncertainties at wavelengths immediately surrounding the wavelength of the stationary line core (see the lower panel of Figure 4). This had the effect of lowering the priority for fitting the exact shape of the peak, which was either difficult to isolate (e.g., for classifications '1' and '2' shown in Figure 3), or incentivised the optimisation algorithms to fit more closely the steepening gradients of the spectral line wings (e.g., for classifications '3' and '4' shown in Figure 3) that are representative of the dynamic activity present in the spectra.
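The weighting scheme can be sketched by constructing a per-wavelength uncertainty array for the fitting routine; the 0.2 Å core window and the de-weighting factor of 5 below are illustrative choices, not values taken from this work:

```python
import numpy as np

def build_uncertainties(wavelengths, core_wav, classification,
                        base_sigma=0.01):
    """Assign per-wavelength uncertainties (larger sigma = lower weight).

    For class-'0' spectra the wings are de-weighted so the fit prioritises
    the line core; for classes 1-4 the points nearest the stationary core
    are de-weighted instead, pushing the fit towards the wing gradients.
    """
    sigma = np.full_like(wavelengths, base_sigma, dtype=float)
    near_core = np.abs(wavelengths - core_wav) < 0.2   # illustrative window
    if classification == 0:
        sigma[~near_core] *= 5.0   # de-weight the wings
    else:
        sigma[near_core] *= 5.0    # de-weight the core region
    return sigma
```

The resulting array is passed to the optimiser (e.g. as the `sigma` argument of scipy's curve_fit) so the residuals are weighted accordingly.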

(iv) Least squares optimisation
The chosen model (i.e., either a single or double Voigt model) is fitted to the spectra using the curve_fit function provided by the scipy.optimize Python module [45,46]. With this function we use the Trust Region Reflective algorithm for least squares optimisation, similar to that proposed by Branch et al. [57]; this allows bounds to be specified for the parameters intrinsic to the model, which assists with convergence of the fits.
The amplitude is bounded such that a single Voigt model must fit a negative (i.e., absorption) amplitude, while a double Voigt model must fit one negative amplitude (representing the quiescent atmosphere) and one positive amplitude (representing the active atmospheric component). This helps to prevent overfitting issues described in Section 1.
The wavelength at the centre of the absorption Voigt profile is constrained such that it does not deviate too far from the stationary line core wavelength. The central wavelength of the absorption Voigt function is allowed to vary within a 0.15 Å window above and below the stationary line core wavelength. The stationary line core wavelength is found by applying this method, without bounds, to the average quiet Sun spectrum introduced in Section 2. This constrains the method to only allow Doppler velocities in the absorption component that are within ±5.27 km/s. This helps ensure that the model represents what is happening physically in the spectra, and does not return an unphysical Doppler shift.
Bounds are also placed on the γ and σ parameters for the Lorentzian and Gaussian shapes that are convolved to form the resultant Voigt profile. The bounds chosen for the γ and σ components are 10⁻⁶ < γ, σ < 1, which allows a wide range of spectral broadening to be accounted for. An initial guess is also supplied to the fitting function, where the central wavelength of each constituent Voigt profile is initially set to the stationary line core wavelength. Any Voigt profiles modelling absorption are given a negative initial amplitude that is typical for the spectral dataset, while any Voigt profiles representing emission are given a typical positive initial amplitude. All γ and σ initial guesses are taken to be 0.1 and 0.2, respectively. Figure 5 shows examples of the fitting methods applied to IBIS spectra. In the left panel, the method is applied to a spectrum containing only an absorption component, which is therefore fitted with a single Voigt model. A spectrum exhibiting a complex mix of absorption and emission is fitted with a double Voigt model in the right panel. Both of these fitted profiles show excellent agreement with the observed profiles, in particular around the line core where the fitting was prioritised.
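A minimal single-Voigt fit in the spirit of the above, using scipy.optimize.curve_fit with the Trust Region Reflective method and the stated bounds; the synthetic spectrum, amplitude scale, and initial guesses are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import voigt_profile

def single_voigt(wav, amp, centre, sigma, gamma):
    """Background-subtracted model: continuum at zero, so a negative
    amplitude represents absorption."""
    return amp * voigt_profile(wav - centre, sigma, gamma)

# Wavelengths relative to the stationary line core: 33 points, 0.05 Å steps.
wav = np.arange(-0.8, 0.8 + 0.025, 0.05)
truth = (-0.5, 0.05, 0.15, 0.08)           # amp, centre, sigma, gamma
spec = single_voigt(wav, *truth)           # noiseless synthetic spectrum

# Bounds: negative amplitude only, centre within +/-0.15 Å of the stationary
# core, and 1e-6 < sigma, gamma < 1, as described in the text.
popt, _ = curve_fit(
    single_voigt, wav, spec,
    p0=(-0.3, 0.0, 0.2, 0.1),
    bounds=([-np.inf, -0.15, 1e-6, 1e-6], [0.0, 0.15, 1.0, 1.0]),
    method="trf",
)
```

A double Voigt model extends this with a second (positive-amplitude) component sharing the same bounded structure.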

(e) Calculating velocities
Once the quiescent and active components have been isolated, they can be studied independently to ascertain their respective Doppler shifts. The wavelength of the line core, λ_observed, for each atmospheric component can be extracted from the parameters defining the central wavelengths of the best-fitting constituent Voigt profiles. The 'at rest' line core wavelength, λ_stationary, is calculated by applying the algorithm to a spatially and temporally averaged spectrum extracted from a region containing quiet Sun, as documented in Section 2 (see, e.g., the rectangular quiet Sun region depicted in Figure 2 and its average profile shown in Figure 1). Doppler velocities, v, can then be calculated by comparing the line core wavelengths of each of the atmospheric components to the stationary wavelength via v (km/s) = [(λ_observed − λ_stationary) / λ_stationary] × 300 000, where 300 000 km/s approximates the speed of light.
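The velocity conversion is a one-line calculation (with 300 000 km/s approximating the speed of light):

```python
def doppler_velocity(lam_observed, lam_stationary, c_kms=300000.0):
    """Doppler velocity in km/s from the fitted and stationary line core
    wavelengths (both in the same units, e.g. Å). Positive values denote
    redshifts (motion away from the observer)."""
    return (lam_observed - lam_stationary) / lam_stationary * c_kms
```

Applied to the ±0.15 Å window permitted for the absorption component, this reproduces the ±5.27 km/s velocity limit quoted earlier.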

Proof of concept testing with IBIS data
A proof of concept test was performed using the IBIS dataset described in Section 2. The spectra across the full field-of-view for a single spectral imaging scan (totalling 664 796 individual spectra) were first classified by the neural network, with the resulting classifications shown in Figure 6. Following the methods outlined in Section 3, Doppler velocities were computed for each fitted absorption component, with the resulting velocity maps shown in the left panel of Figure 7. If a spectral profile required multiple profile fits (i.e., using a double Voigt model), then the resulting Doppler velocities inferred from the emission Voigt fit are shown in the right panel of Figure 7. In particular, it can be seen in the right panel of Figure 7 that many of the derived Doppler velocities associated with dynamic phenomena appear to rapidly change between neighbouring pixels, an effect that has been documented in previous velocity studies of the solar chromosphere [58][59][60]. These rapid velocity excursions appear to be closely linked to instances when the Ca II 8542 Å line goes into emission. As a result, we believe that such discontinuities may be caused by dynamic changes in the opacity of the plasma, resulting in shifts of the response function of the line caused by the source function no longer monotonically decreasing throughout the chromosphere [61,62], and hence are not numeric artefacts. As such, we must stress that smooth velocity fields should not necessarily be expected following the application of our techniques, especially when examining the challenging effects of radiative transfer in the solar chromosphere. A goodness-of-fit value was estimated at each pixel of the field-of-view using a modified χ² relation of the form χ² = (s/ν) Σ_λc [(I_λ^fitted − I_λ^observed) / σ_λ]², where s is a scaling factor, ν is the estimated degrees of freedom, σ_λ are the uncertainties assigned to each wavelength point during the weighted fitting, and I_λ^fitted and I_λ^observed are the intensity values of the fitted and observed spectra, respectively.
The wavelengths over which this calculation is performed, λc, include the wavelength closest to the stationary line core, in addition to the 12 wavelength points either side. Thus, the central 25 wavelength points (out of 33 points in total) were used to compute the modified χ² statistic, as this allowed for a better measure of the goodness-of-fit around the line core. Therefore, we introduce the scaling factor, s = 33/25, to account for this subset of wavelength points. The degrees of freedom, ν, were calculated for each spectrum by subtracting the number of fitted parameters (4 for single Voigt profiles and 8 for double Voigt profiles) from the total number of wavelength points, 33. The χ² values for all the fitted spectra in the field-of-view are summarised in Figure 8. Considering the entire IBIS spectral imaging scan, the median χ² value is χ² = 2.36. When only the modified χ² values for the umbral locations (i.e., spectra within the white umbral contours highlighted in Figure 2) are used, the median value is χ² = 1.03. The calculated median χ² values are close to one, suggesting that the fitting method is able to accurately constrain the observed spectral line profiles. The accuracy is particularly good when considering umbral pixels, since not only is the median χ² value particularly close to one, as can be seen in Figure 8, but the tail of the χ² distribution for umbral spectra drops off very rapidly with increasing χ² values.
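Assuming the residuals are normalised by the per-wavelength uncertainties assigned during the weighted fit, the modified χ² statistic can be sketched as:

```python
import numpy as np

def modified_chi_squared(i_fitted, i_observed, sigma, n_params,
                         core_idx, half_width=12):
    """Modified reduced chi^2 over the wavelength points centred on the
    stationary line core (core_idx) plus half_width points either side
    (25 of 33 points in the IBIS case), scaled by s = 33/25 and
    normalised by nu = 33 - n_params. sigma holds the per-wavelength
    uncertainties assigned during the weighted fit (an assumption here).
    """
    n_wav = len(i_observed)                  # 33 for the IBIS spectra
    s = n_wav / (2 * half_width + 1)         # scaling factor, 33/25
    nu = n_wav - n_params                    # 4 (single) or 8 (double) Voigt
    lo, hi = core_idx - half_width, core_idx + half_width + 1
    resid = (i_fitted[lo:hi] - i_observed[lo:hi]) / sigma[lo:hi]
    return (s / nu) * np.sum(resid ** 2)
```

A perfect fit returns zero, while values near one indicate residuals consistent with the assigned uncertainties.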
The most computationally intensive aspect of the proof of concept testing was the fitting of suitable Voigt models to the spectra, with the double Voigt model taking more time than the single Voigt model. To fit a full spectral imaging scan totalling 664 796 individual spectra, with 30% being modelled using a double Voigt profile (see Figure 6), took 123 minutes running across all 16 cores of a 2.10 GHz Intel Xeon processor. As a result, processing all 2103 spectral scans from the current IBIS dataset on a single CPU would likely take on the order of 175 days. However, the techniques presented can be further parallelised by employing multiple CPUs, hence bringing the entire processing time down to the order of 1 week or less. Furthermore, as discussed in Section 3b(ii), the current algorithms may be ported across to GPUs, providing the ability to accelerate processing performance by an order-of-magnitude or more.

Discussion
Although our primary objective for this method is to accurately constrain velocity information within an umbral region, the method can be applied more generally to any region of a spectral imaging dataset where a two-component atmosphere may be present. As shown in our proof of concept test (Section 4), our methods can be applied to any solar region, including both dynamic locations (umbrae, penumbrae, regions of magnetism, etc.) and those demonstrating more quiescent behaviour (e.g., quiet Sun that is permeated by granulation). Although the set of labelled ground truth spectra that are used to train and test the neural network are chosen from within the umbral region, the range of spectral profile shapes encountered within the umbra (i.e., spanning pure absorption through to pure emission characteristics) are representative of many different spectra found outside of the umbral region.
The precision and recall scores of 83% and 81%, respectively, as introduced in Section 3c(i), suggest that the neural network is able to classify spectra with a reasonable degree of accuracy. Since our method currently treats classifications '2', '3', and '4' identically, the performance statistics can be recalculated assuming these cases all have the same classification. This results in increased precision and recall scores of 91% and 90%, respectively, suggesting that the performance of the neural network is particularly well-suited for our methods. Similarly, if we adjust the neural network to distinguish between 'emission' (classifications '2', '3', and '4') and 'no emission' (classifications '0' and '1'), the precision and recall scores for these groupings are 96% and 95%, respectively. With a larger set of ground truth data, perhaps including other highly dynamical solar phenomena that often exhibit enhanced line-wing asymmetries (e.g., penumbral jets, spicules, magnetic reconnection; [63][64][65]), the precision and recall scores of the neural network could be improved yet further, potentially approaching 100%. The precision and recall values computed here are consistent with other trained astrophysical neural networks adopted into mainstream data processing [66,67].
The classification methods presented here (i.e., excluding the line fitting and velocity processing) are also useful for estimating the degree of quiescence in a particular dataset [68]. Such processing is carried out by applying only the neural network classification procedure to the data and monitoring the relative occurrence of each of the five classifications, a process that can be accomplished within a few minutes.
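Estimating the degree of quiescence then reduces to counting the occurrence of each classification; for example, with a hypothetical set of network outputs:

```python
import numpy as np

# Hypothetical classifications returned by the neural network for a scan.
classifications = np.array([0, 0, 1, 2, 2, 2, 4])

# Occurrence counts and relative fractions of each of the five classes.
counts = np.bincount(classifications, minlength=5)
fractions = counts / counts.sum()
```

A high fraction in classes '0' and '1' would indicate a predominantly quiescent field-of-view.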
In the future, our methods could be adapted to find velocity measurements for chromospheric jets that often demonstrate plasma motions in the range of ∼ 20 − 40 km/s [69]. Another potential use case is the study of Ellerman bombs, which have been observed in the Ca II 8542 Å line core as significant blue-shifted emission [70,71]. This blue-shifted emission is present in other chromospheric lines, including Hα and He I 10830 Å, but not in any photospheric lines [71]. As described above, it may be necessary to further train the neural network using a larger set of ground truth data, including examples of enhanced line-wing asymmetries that are synonymous with such dynamical solar phenomena.
Our method could also be extended to model a three (or more) component atmosphere by including additional Voigt (or similar) profiles in the model and modifying the criteria that determine how the assigned classification adjusts the fitting method. For such a technique to produce accurate velocities, a much higher number of wavelength samples would be required, such that the components are more clearly resolved and are therefore less blended with the surrounding plasma. With a higher number of wavelength samples, other features, such as the presence of double-peaked self-reversal structures in the emission components, could also be resolved. To facilitate such future code development, attention will likely need to be turned to the next generation of spectral imaging Fabry-Pérot instruments, in addition to slit- and fibre-based spectropolarimeters. These revolutionary instruments, including the Visible Tunable Filter (VTF; [72]) and the Diffraction Limited Near Infrared Spectropolarimeter (DL-NIRSP), will soon be commissioned at the National Science Foundation's Daniel K. Inouye Solar Telescope (DKIST; [74]).
The 'active' and 'quiescent' components present in Stokes I observations can be isolated using our method (see, e.g., Figure 7). Future work to investigate passing these isolated components through the Non-LTE Inversion Code using the Lorien Engine (NICOLE; [3]) or the CAlcium Inversion using a Spectral ARchive (CAISAR; [73]) inversion codes would be of particular interest. Often these inversion codes provide plasma parameter outputs based on a static atmosphere. Hence, by treating the two atmospheric components separately, ambiguities in the inverted plasma parameters could be minimised, since each component would be better constrained by its own single, high-quality spectral fit. This is a timely endeavour, especially with new, high spectral precision observations on the horizon from new telescope facilities, including DKIST.
The code is fully available for the community to download and utilise. Details of how to access it are provided in Appendix A.

Conclusion
Novel methods for accurately constraining velocity information from spectral imaging observations have been presented. Using machine learning techniques, our methods automatically adapt the spectral models used to fit the input spectra. Importantly, our methods will only fit multi-component models if multiple signatures are observed in the input spectra, hence saving time and preventing overfitting of the data.
By modelling each atmospheric component with its own independent Voigt function, the constituent components of the atmosphere can be isolated, both spatially and temporally. Such techniques have a diverse range of use cases, including the applicability to upcoming DKIST observations, as well as refining the inputs for modern inversion routines, since it enables each atmospheric component to be studied independently.
A proof of concept test applying this method to a challenging Ca II 8542 Å spectral imaging dataset was presented. In this, we demonstrated both the accuracy of the method and how its techniques can be applied more generally. Importantly, the algorithms presented are available to the global community through regularly updated download links.
Data Accessibility. The data used in this paper are from the observing campaign entitled 'Nanoflare Activity in the Lower Solar Atmosphere' (NSO-SP proposal T1020; principal investigator: DBJ), which employed the ground-based Dunn Solar Telescope, USA, during August 2014. The Dunn Solar Telescope at Sacramento Peak/NM was operated by the National Solar Observatory (NSO). NSO is operated by the Association of Universities for Research in Astronomy (AURA), Inc., under cooperative agreement with the National Science Foundation (NSF). Additional supporting observations were obtained from the publicly available