Exploring time series models for landslide prediction: a literature review

Data Preparation

Data Frequency

The time series data can be collected at various frequencies, such as minutes, hours, days, or months. This choice depends on the data size, power consumption optimization of the monitoring system, and the required accuracy. The reviewed studies show that two primary frequencies are utilized: monthly time steps and daily time steps.

Monthly Time Steps: Around 59% of the literature uses monthly time steps for landslide prediction. This frequency provides a relatively smooth time series pattern, where even basic models can achieve adequate performance.

Daily Time Steps: Approximately 41% of the studies use daily time steps. These series are noisier and more irregular, necessitating advanced modeling techniques to capture the complex dynamics.

Physically, the temporal prediction of landslides primarily relies on the rainfall pattern. The monthly and daily time steps exhibit significantly different rainfall patterns, as illustrated in Figure 6b. This disparity highlights the need to investigate how data frequency affects model performance, balancing data collection, sensor power consumption, and prediction accuracy.
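As a minimal illustration of working at the two frequencies (not taken from any reviewed study), the sketch below resamples a raw monitoring series to daily and monthly steps with pandas; the file name, column names, and aggregation rules are assumptions.

```python
import pandas as pd

# Hypothetical raw monitoring file with a datetime column; names are illustrative.
raw = pd.read_csv("monitoring.csv", parse_dates=["timestamp"], index_col="timestamp")

# Daily aggregation: rainfall accumulates within the day, while displacement is a
# state variable, so the last reading of the day is kept.
daily = raw.resample("D").agg({"rainfall_mm": "sum", "displacement_mm": "last"})

# Monthly aggregation (month-start frequency) yields the smoother, lower-cost series.
monthly = raw.resample("MS").agg({"rainfall_mm": "sum", "displacement_mm": "last"})
```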

Splitting Ratio

When employing data-driven models (rather than physically based threshold models), the training process involves splitting the dataset into training, validation, and testing sets. This split should preserve the temporal ordering of the data, unlike time-independent applications that use random folding techniques.

The standard holdout strategy is recommended, where the validation and test sets are at the end of the time series. This approach ensures that the model is evaluated on future, unseen data.

Regarding the training-to-testing ratio, the literature shows a preference for an 80% training ratio, followed by 90%, 85%, and 70%. This choice is influenced by the temporal length of the available time series, which is often limited to 48 to 357 steps. To ensure that the testing set represents at least the last year of the data, a minimum of 12 steps is typically used for the testing set.
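A minimal sketch of the chronological holdout split described above; the 80/10/10 fractions and the synthetic 120-step series are illustrative choices, not values prescribed by the reviewed studies.

```python
import numpy as np

def chronological_split(series: np.ndarray, train_frac: float = 0.8, val_frac: float = 0.1):
    """Split a time series into train/validation/test sets without shuffling,
    so the model is always evaluated on data that lies after the training period."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Example: a 120-step monthly series keeps the last 12 steps (one year) for testing.
data = np.arange(120, dtype=float)
train, val, test = chronological_split(data)
print(len(train), len(val), len(test))  # 96 12 12
```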

Decomposition

The landslide response is often a complex phenomenon, and various attempts have been made to simplify the analytical process. Many studies decompose the landslide displacement time series into residual, trend, and seasonal or periodic components, as shown in Figure 8.

Approximately 63% of the reviewed studies employ decomposition techniques, such as complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), ensemble empirical mode decomposition (EEMD), variational mode decomposition (VMD), double moving average (DMA), density-based spatial clustering of applications with noise (DBSCAN), support vector classification (SVC), continuous wavelet analysis, and differencing.

These decomposition methods assume that the trend term depends on the creep behavior and is not affected by external triggering, while the seasonal term is the only component triggered by seasonal factors. However, this assumption has not been thoroughly validated, and further research is needed to explore the validity of this approach.
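As a simple stand-in for the CEEMDAN, EEMD, or VMD pipelines used in the literature, the sketch below applies a classical additive decomposition with statsmodels to separate trend, seasonal, and residual components; the file name, column name, and 12-step period (monthly data) are assumptions.

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Load a displacement series indexed by date (names are illustrative).
displacement = pd.read_csv("displacement.csv", parse_dates=["date"],
                           index_col="date")["displacement_mm"]

# Classical additive decomposition; period=12 assumes monthly time steps.
result = seasonal_decompose(displacement, model="additive", period=12)
trend = result.trend        # component usually attributed to long-term creep
seasonal = result.seasonal  # component usually attributed to seasonal triggering
residual = result.resid     # remaining irregular component
```

Each component can then be predicted separately and the forecasts summed, which is the pattern most decomposition-based studies follow.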

Window Size and Antecedent Period

Rainfall-induced landslides exhibit a time lag between the triggering and the slope response, primarily due to the infiltration and surface-runoff mechanisms. Considering this antecedent period, or lagged sequence, can improve model performance, as shown in Figure 10.

The reviewed studies utilize a lagged period ranging from 12 to 60 time steps, depending on the slope’s hydraulic conductivity and other mechanical and hydrological characteristics. However, the literature has not adequately addressed the impact of the window size (sequence length) and how the antecedent value influences the prediction model’s performance.
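The sketch below shows one common way to build such lagged input sequences for a supervised model; the window length, feature count, and synthetic arrays are illustrative.

```python
import numpy as np

def make_windows(features: np.ndarray, target: np.ndarray, window: int = 12):
    """Build lagged input sequences of length `window` (e.g. 12 monthly steps)
    so the model sees the antecedent period when predicting the next step.
    features: (n_steps, n_features) -> X: (n_samples, window, n_features)
    target:   (n_steps,)            -> y: (n_samples,)"""
    X, y = [], []
    for i in range(window, len(target)):
        X.append(features[i - window:i])
        y.append(target[i])
    return np.asarray(X), np.asarray(y)

# Example with two triggering features (e.g. rainfall and reservoir level).
feats = np.random.rand(100, 2)
disp = np.random.rand(100)
X, y = make_windows(feats, disp, window=12)
print(X.shape, y.shape)  # (88, 12, 2) (88,)
```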

Feature Selection

Landslides are influenced by various factors, including creep features (geology, geomorphology, soil, hydraulic, and land use) and triggering features (rainfall, earthquakes, human activities, reservoir fluctuation, etc.). The reviewed studies generally focus on predicting surface displacement, considering rainfall and reservoir level fluctuations as the primary external triggering features, along with historical displacement values and their derivatives (velocity, increment, change, and evolution state).

Several statistical methods have been used for feature selection, such as gray relational analysis (GRA), the partial autocorrelation function (PACF), the maximal information coefficient (MIC), kernel SHAP, Pearson correlation, adjusted R2, the Akaike information criterion (AIC), and the least absolute shrinkage and selection operator (LASSO).

However, these models often neglect the temporal dependencies in landslide responses, leading to the selection of unrelated features. Integrating knowledge-based methods and sensitivity analysis is recommended to consider temporal dependencies and select the best-related features.

Statistical Correlations

Pearson’s correlation coefficient is a common statistical model used in feature selection due to its simplicity and practical nature. However, the maximal information coefficient (MIC) has been found to outperform Pearson’s correlation, as it can extract both linear and non-linear correlations, as well as complex relationships.

Additionally, the effective antecedent period can be investigated through a sensitivity analysis based on statistical correlations, such as the Pearson method, computed over a range of candidate lags to determine the lag time that most improves model performance.
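A minimal sketch of such a lag sensitivity analysis, sweeping Pearson's r over candidate lags; the synthetic data and the 60-step upper bound are illustrative assumptions.

```python
import numpy as np

def lagged_pearson(rainfall: np.ndarray, displacement_rate: np.ndarray, max_lag: int = 60):
    """Pearson's r between the displacement rate and rainfall shifted back by
    0..max_lag steps; the lag with the largest |r| is a candidate effective lag."""
    correlations = {}
    for lag in range(max_lag + 1):
        if lag == 0:
            r = np.corrcoef(rainfall, displacement_rate)[0, 1]
        else:
            r = np.corrcoef(rainfall[:-lag], displacement_rate[lag:])[0, 1]
        correlations[lag] = r
    return correlations

# Synthetic example purely for illustration: the response lags rainfall by ~5 steps.
rain = np.random.rand(500)
rate = np.roll(rain, 5) + 0.1 * np.random.rand(500)
corr = lagged_pearson(rain, rate, max_lag=20)
best_lag = max(corr, key=lambda k: abs(corr[k]))
```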

Model Selection

Static models, such as artificial neural networks (ANN), support vector machines (SVM), random forest (RF), and statistical models like autoregressive integrated moving average (ARIMA), struggle to account for the temporal correlation between input features and external triggering. In contrast, dynamic and deep learning models, particularly the long short-term memory (LSTM) model, can extract the non-linear correlation between the triggering and the landslide response, outperforming traditional methods.
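For reference, a minimal Keras LSTM sketch for one-step-ahead displacement prediction; the layer sizes, window length, and feature count are assumptions rather than the configuration of any reviewed study.

```python
import tensorflow as tf

window, n_features = 12, 3  # e.g. rainfall, reservoir level, past displacement

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, n_features)),
    tf.keras.layers.LSTM(64),                    # extracts temporal dependencies
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),                    # next-step displacement
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss=tf.keras.losses.Huber())
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=200, batch_size=32)
```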

Optimizations

Hyperparameter Tuning

Optimizing the model’s hyperparameters is crucial for achieving the best performance. Various techniques have been employed, including grid search, random search, Bayesian optimization, particle swarm optimization (PSO), genetic algorithms (GA), successive halving (SH), and the sparrow search algorithm (SSA).

Grid search is a comprehensive but time-consuming approach, while random search is faster but may not always yield the optimal hyperparameter combination. Techniques like PSO, GA, SH, and SSA offer advantages in terms of convergence speed, global search capability, and efficient resource allocation.

The main concept behind these optimization methods is to search from coarse to fine scales, accurately capturing the best model structure while also converging faster.
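As one simple instance of this idea, the sketch below runs a random search over a small hyperparameter space; the search ranges and the train_and_score() helper are hypothetical placeholders.

```python
import random

search_space = {
    "units": [32, 64, 128],
    "window": [12, 24, 48],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.0, 0.2, 0.4],
}

def train_and_score(params):
    """Hypothetical stand-in: train a model with these hyperparameters and
    return its validation RMSE. Replaced by a dummy score for illustration."""
    return random.random()

def sample(space):
    return {name: random.choice(values) for name, values in space.items()}

best_score, best_params = float("inf"), None
for _ in range(20):  # 20 random trials
    params = sample(search_space)
    score = train_and_score(params)
    if score < best_score:
        best_score, best_params = score, params
```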

Training Optimization

Selecting the appropriate number of iterations, learning rate, and monitoring metrics is essential for the training process. Too few iterations can cause bias issues (underfitting), while too many can lead to variance issues (overfitting). Similarly, a small learning rate increases computational time, while a large learning rate can overshoot the minimum and prevent convergence.

The Adam optimizer has been found to offer the best prediction accuracy, outperforming traditional gradient descent methods. The Adam algorithm dynamically adapts the learning rates for each parameter, improving convergence speed and efficient exploration of the solution space.

Loss Functions

The Huber loss function combines the mean square error (MSE) for small residuals with the mean absolute error (MAE) for large residuals, providing a balanced optimization approach that is less sensitive to outliers. Monitoring and quantifying the training and validation loss can help detect and overcome underfitting and overfitting.
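A minimal NumPy implementation of the Huber loss for reference; the delta threshold is a tunable assumption.

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic (MSE-like) for residuals below delta, linear (MAE-like) above it,
    so large outliers do not dominate the gradient."""
    residual = np.abs(y_true - y_pred)
    quadratic = 0.5 * residual ** 2
    linear = delta * residual - 0.5 * delta ** 2
    return np.where(residual <= delta, quadratic, linear).mean()
```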

Normalization and Regularization

The quantitative variation in triggering and landslide responses can make the training process challenging. Normalizing and scaling the data is crucial to ensure the model converges more easily.

Techniques like min-max normalization, mean normalization, and Z-score normalization have been employed. Additionally, regularization methods, such as L2 regularization or dropout layers, can be used to shrink the training weights and reduce overfitting while preserving predictive performance.
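The sketch below shows min-max and Z-score scaling fitted on the training split only and then applied to the remaining data, which avoids leaking test statistics into training; the function names are illustrative.

```python
import numpy as np

def minmax_scale(train: np.ndarray, other: np.ndarray):
    """Scale to [0, 1] using statistics from the training split only."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    return (train - lo) / (hi - lo), (other - lo) / (hi - lo)

def zscore_scale(train: np.ndarray, other: np.ndarray):
    """Standardize with the training-set mean and standard deviation."""
    mu, sigma = train.mean(axis=0), train.std(axis=0)
    return (train - mu) / sigma, (other - mu) / sigma
```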

Activation Functions

Activation functions introduce non-linearity, transforming each layer's weighted inputs into outputs that allow the network to map the input features to the labeled output. Various activations, such as linear, ReLU, Leaky ReLU, tanh, and sigmoid, have been used in the reviewed studies.

ReLU and Leaky ReLU are preferred for hidden layer activations due to their faster convergence, although ReLU is susceptible to the “dying ReLU” problem. Leaky ReLU addresses this issue by allowing a slight, non-zero gradient for negative inputs.
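For clarity, minimal NumPy definitions of the two functions; the 0.01 negative slope is a common default rather than a value taken from the reviewed studies.

```python
import numpy as np

def relu(x):
    """ReLU: zero for negative inputs, so a unit can 'die' if it only ever
    receives negative pre-activations."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: keeps a small slope (alpha) for negative inputs so the
    gradient never vanishes completely."""
    return np.where(x > 0, x, alpha * x)
```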

Model Evaluation

Evaluation Metrics

The most commonly used metrics in the reviewed studies are RMSE, MAE, and MAPE, which evaluate the model’s performance during the training, validation, and testing stages. R2 and absolute error are less frequently used.

Two evaluation techniques are employed: the unweighted method, which assigns the same error weight to all data points, and the weighted method, which assigns different error weights to creep points and mutation (abrupt step-change) points.
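A minimal sketch of the three common metrics with an optional weight vector, so the same code covers both the unweighted and weighted evaluation schemes; the example weighting rule is hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred, w=None):
    return np.sqrt(np.average((y_true - y_pred) ** 2, weights=w))

def mae(y_true, y_pred, w=None):
    return np.average(np.abs(y_true - y_pred), weights=w)

def mape(y_true, y_pred, w=None):
    return np.average(np.abs((y_true - y_pred) / y_true), weights=w) * 100.0

# Unweighted evaluation: pass w=None. Weighted evaluation: up-weight abrupt
# (mutation) points relative to creep points, e.g. (hypothetical scheme):
# weights = np.where(is_mutation_point, 5.0, 1.0)
```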

Predictions

The reviewed studies predominantly focus on predicting surface displacement, particularly for reservoir landslides. Hydrological response predictions, such as volumetric water content, matric suctions, and groundwater level variation, are rarely considered.

Other landslide types, such as deep-seated landslides and rock slope failures, have received less attention, and rainfall-induced shallow landslides are notably absent from the literature.

Uncertainties in Predictions

Predictions are inherently associated with uncertainties arising from model assumptions. However, the majority of the reviewed studies utilize single predictions, forecasting a single time step ahead without quantifying the corresponding uncertainties.

Only a few studies have explored interval predictions, which provide upper and lower boundaries for the forecasted values. Xing et al. (2019) presented a detailed derivation of interval predictions under the assumption that the prediction error is a zero-mean Gaussian random variable independent of the input variables.
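Under the same zero-mean Gaussian, input-independent error assumption, a prediction interval can be sketched as below; estimating the error variance from validation residuals is an assumption of this sketch, not a detail taken from Xing et al. (2019).

```python
import numpy as np
from scipy import stats

def gaussian_interval(y_pred: np.ndarray, residuals: np.ndarray, confidence: float = 0.95):
    """Symmetric prediction interval assuming i.i.d. zero-mean Gaussian errors
    independent of the inputs. `residuals` are held-out (validation) errors
    used to estimate the error standard deviation."""
    sigma = residuals.std(ddof=1)
    z = stats.norm.ppf(0.5 * (1.0 + confidence))  # ~1.96 for a 95% interval
    return y_pred - z * sigma, y_pred + z * sigma
```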

Gaps and Future Recommendations

The review of the literature on time series models for landslide prediction reveals several key gaps and opportunities for future research:

  1. Assessment of Subsurface Characteristics: Prediction models primarily rely on surface measurements, neglecting the assessment of subsurface mechanical and hydrological characteristics. Integrating diverse data sources to better understand the underlying landslide mechanisms can improve predictive accuracy.

  2. Spatiotemporal Dynamics in Prediction Methods: Single-point prediction methods often overlook the spatiotemporal dynamics of landslides. Exploring the integration of information from various monitored data sources can enhance the comprehensive understanding of landslide behavior.

  3. Impact of Data Frequency: The influence of data frequency on prediction accuracy is frequently disregarded. Conducting comprehensive analyses to understand the impact of data frequency on accuracy and operational costs can lead to more effective monitoring and prediction strategies.

  4. Effect of Time Series Decomposition: The impact of time series decomposition on prediction accuracy remains largely unexplored. Comparative analyses between decomposed and non-decomposed methodologies can provide valuable insights into the effectiveness of this approach.

  5. Temporal Correlations in Feature Selection: Statistical techniques commonly used for feature selection often overlook the temporal correlations in the data. Leveraging deep learning and knowledge-based approaches to capture and incorporate temporal dependencies can improve feature selection robustness and accuracy.

  6. Weighted Evaluation Methodologies: Many studies assign equal weight to all datasets, potentially resulting in misleading conclusions, particularly for lengthy datasets with numerous non-critical points. Adopting weighted evaluation methodologies can help accurately capture the critical points in landslide applications, prioritizing their detection to mitigate risks effectively.

By addressing these gaps and implementing the recommended future directions, researchers and practitioners can enhance the accuracy and efficacy of time series models for landslide prediction, contributing to improved disaster mitigation efforts.

Conclusion

This comprehensive review of time series models for landslide prediction provides valuable insights into the current state of research and identifies key areas for future exploration. The analysis of diverse studies reveals the significance of data frequency, temporal ordering in data splitting, time series decomposition, and the importance of considering the antecedent period in model development.

The review also highlights the superiority of dynamic and deep learning models, particularly the LSTM, in capturing the non-linear correlation between triggering and landslide response. However, the effectiveness of these models is contingent upon meticulous data preparation, feature selection, and model optimization strategies.

By addressing the identified gaps, such as the need to integrate diverse data sources, explore the impact of data frequency, and adopt weighted evaluation methodologies, researchers can further enhance the accuracy and reliability of time series models for landslide prediction. Implementing these recommendations can contribute to improved disaster mitigation efforts and better protect vulnerable populations and infrastructure from the devastating impacts of landslides.
