While general statistical and machine learning utility metrics provide a baseline assessment, they often fail to capture the intricate dynamic properties inherent in time-series data. Time-series datasets are characterized by temporal dependencies, trends, seasonality, and potentially complex autocorrelation structures. Evaluating synthetic time-series data requires specialized techniques that explicitly assess whether these dynamic characteristics have been preserved during the generation process. Without this specialized focus, synthetic time-series data might appear statistically similar in its marginal distributions but fail miserably when used for forecasting or simulating dynamic systems.
This section details methods for evaluating the quality of synthetic time-series data, focusing on its ability to replicate the temporal dynamics of the original data.
One of the defining characteristics of time-series data is autocorrelation, the correlation of the series with lagged versions of itself. A good synthetic time-series generator should replicate this structure.
The primary tools for visualizing autocorrelation are the ACF and PACF plots.
A standard evaluation practice involves generating ACF and PACF plots for both the real and synthetic datasets and comparing them visually. Significant discrepancies indicate that the synthetic data fails to capture the short-term and long-term temporal dependencies present in the real data.
Implementation: Libraries like statsmodels in Python provide functions (plot_acf, plot_pacf) to easily generate these plots.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt # Or use plotly for interactive plots
# Assuming 'real_series' and 'synthetic_series' are pandas Series
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
sm.graphics.tsa.plot_acf(real_series, lags=40, ax=axes[0], title='Real Data ACF')
sm.graphics.tsa.plot_acf(synthetic_series, lags=40, ax=axes[1], title='Synthetic Data ACF')
plt.tight_layout()
plt.show()
# Repeat for PACF plots
# sm.graphics.tsa.plot_pacf(...)
Visual comparison is useful but subjective. For a quantitative assessment, calculate the ACF values up to a certain lag $K$ for both the real and synthetic data, then compute a distance metric between these vectors, such as the Mean Absolute Error (MAE):

$$\text{MAE}_{\text{ACF}} = \frac{1}{K} \sum_{k=1}^{K} \left| \text{ACF}_R(k) - \text{ACF}_S(k) \right|$$

where $\text{ACF}_R(k)$ and $\text{ACF}_S(k)$ are the autocorrelations of the real and synthetic series at lag $k$. A lower MAE indicates better preservation of the autocorrelation structure. The same approach applies to the PACF.
Comparison of Autocorrelation Function (ACF) values up to lag 20 for example real and synthetic time series. Close alignment suggests good preservation of temporal dependencies.
For multivariate time-series (multiple variables measured over time), evaluating the relationships between series is just as important as evaluating individual series autocorrelation.
The CCF measures the correlation between one time series $x_t$ and lagged values of another time series $y_{t-k}$. Comparing the CCF plots (or quantitatively comparing CCF values) between the real multivariate dataset and the synthetic one reveals whether the lead-lag relationships between variables have been maintained.
Implementation: statsmodels.tsa.stattools.ccf can compute the cross-correlation. Visualizing these requires plotting the CCF for each pair of variables.
Many time series exhibit trends (long-term increase or decrease) and seasonality (patterns repeating over a fixed period, e.g., daily, weekly, yearly).
Decomposition methods, like Seasonal-Trend decomposition using Loess (STL), separate a time series into three components: trend, seasonal, and residual.
Implementation: statsmodels.tsa.seasonal.STL provides an implementation of STL decomposition.
from statsmodels.tsa.seasonal import STL
# Assuming 'real_ts' and 'synthetic_ts' are pandas Series with DatetimeIndex
# And 'period' is the known seasonal period (e.g., 7 for daily data with weekly seasonality)
stl_real = STL(real_ts, period=period).fit()
stl_synthetic = STL(synthetic_ts, period=period).fit()
# Plotting components
fig, axes = plt.subplots(3, 2, figsize=(12, 8), sharex=True)
axes[0, 0].plot(stl_real.trend, color='#1c7ed6')
axes[0, 0].set_title('Real Trend')
axes[0, 1].plot(stl_synthetic.trend, color='#fd7e14')
axes[0, 1].set_title('Synthetic Trend')
axes[1, 0].plot(stl_real.seasonal, color='#1c7ed6')
axes[1, 0].set_title('Real Seasonality')
axes[1, 1].plot(stl_synthetic.seasonal, color='#fd7e14')
axes[1, 1].set_title('Synthetic Seasonality')
axes[2, 0].plot(stl_real.resid, color='#1c7ed6', alpha=0.7)
axes[2, 0].set_title('Real Residuals')
axes[2, 1].plot(stl_synthetic.resid, color='#fd7e14', alpha=0.7)
axes[2, 1].set_title('Synthetic Residuals')
plt.tight_layout()
plt.show()
Sometimes, the characteristics of a time series are best understood in the frequency domain. The Power Spectral Density (PSD) describes how the variance (power) of the series is distributed across different frequencies. This is particularly relevant for data with oscillatory patterns not strictly tied to standard seasonal periods.
Implementation: scipy.signal.welch is commonly used for PSD estimation.
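As a sketch, the two PSD estimates can be compared on a log scale, which matches how PSDs are usually plotted. The distance function below is illustrative and assumes both series have the same length and sampling rate, so the estimated frequency grids coincide:

```python
import numpy as np
from scipy.signal import welch

def psd_distance(real_series, synthetic_series, fs=1.0):
    """Mean absolute log10-power difference between the Welch PSD estimates
    of two equal-length series (lower is better)."""
    _, p_real = welch(real_series, fs=fs, nperseg=min(256, len(real_series)))
    _, p_synth = welch(synthetic_series, fs=fs, nperseg=min(256, len(synthetic_series)))
    # Small constant guards against log of zero power.
    return np.mean(np.abs(np.log10(p_real + 1e-12) - np.log10(p_synth + 1e-12)))
```

Large values of this distance typically show up in the plot as one curve missing a spectral peak (an oscillation) that the other contains.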
Log-log plot comparing estimated Power Spectral Density (PSD) for example real and synthetic time series. Similarity suggests the synthetic data captures the frequency domain characteristics.
Similar to the general Machine Learning Utility evaluation (Chapter 3), we can assess the synthetic data's usefulness for the specific downstream task of forecasting.
A common protocol is Train on Synthetic, Test on Real (TSTR): train a forecasting model on the synthetic series, train an identical model on the real training series, and evaluate both on a held-out portion of the real data, alongside a naive baseline (e.g., predicting the previous observed value). If the model trained on synthetic data performs comparably to the one trained on real data (and significantly better than the naive forecast), it indicates high utility for forecasting tasks.
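A minimal TSTR sketch using a simple one-step-ahead autoregressive model built from lag features (the helper names and the choice of a linear model are illustrative; any forecasting model could be substituted):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

def make_lag_matrix(series, n_lags):
    """Build (X, y) pairs where each row of X holds the previous n_lags values
    and y holds the next value."""
    values = np.asarray(series, dtype=float)
    X = np.column_stack([values[i:len(values) - n_lags + i] for i in range(n_lags)])
    y = values[n_lags:]
    return X, y

def tstr_forecast_mae(train_series, real_test_series, n_lags=7):
    """Train a one-step-ahead autoregressive model on train_series and
    report its MAE on real_test_series."""
    X_train, y_train = make_lag_matrix(train_series, n_lags)
    X_test, y_test = make_lag_matrix(real_test_series, n_lags)
    model = LinearRegression().fit(X_train, y_train)
    return mean_absolute_error(y_test, model.predict(X_test))
```

Calling tstr_forecast_mae once with the synthetic series as training data and once with the real training split, both scored on the same real test split, gives the comparison described above.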
Instead of comparing entire series, you can compute various statistical features for each individual time series (if you have a dataset of many time series, e.g., sales for different stores) and compare the distribution of these features between the real and synthetic datasets.
Examples of features include the mean, variance, and skewness of each series, autocorrelation at key lags, the strength of trend and seasonality, and measures such as spectral entropy.
Once you have these features calculated for every series in both the real and synthetic sets, you can apply the distributional comparison techniques from Chapter 2 (e.g., KS-tests, Wasserstein distance, visual comparison via histograms or density plots) to the distributions of these features. This provides a different perspective on whether the synthetic data captures the variety and characteristics of the original time series collection.
Diagram illustrating the process of comparing distributions of statistical features computed from sets of real and synthetic time series.
Evaluating synthetic time-series data requires going beyond static distributional checks. By analyzing autocorrelation, cross-correlation, trend, seasonality, spectral properties, and forecasting utility, you gain a much more complete understanding of whether the generated data truly captures the essential dynamic nature of the original time series. A combination of visual inspections (like ACF/PACF plots) and quantitative metrics provides the most reliable assessment.
© 2025 ApX Machine Learning