While general statistical and machine learning utility metrics provide a baseline assessment, they often fail to capture the intricate dynamic properties inherent in time-series data. Time-series datasets are characterized by temporal dependencies, trends, seasonality, and potentially complex autocorrelation structures. Evaluating synthetic time-series data requires specialized techniques that explicitly assess whether these dynamic characteristics have been preserved during the generation process. Without this specialized focus, synthetic time-series data might appear statistically similar in its marginal distributions but fail miserably when used for forecasting or simulating dynamic systems.
This section details methods for evaluating the quality of synthetic time-series data, focusing on its ability to replicate the temporal dynamics of the original data.
One of the defining characteristics of time-series data is autocorrelation, the correlation of the series with lagged versions of itself. A good synthetic time-series generator should replicate this structure.
The primary tools for visualizing autocorrelation are the ACF and PACF plots.
A standard evaluation practice involves generating ACF and PACF plots for both the real and synthetic datasets and comparing them visually. Significant discrepancies indicate that the synthetic data fails to capture the short-term and long-term temporal dependencies present in the real data.
Implementation: Libraries like statsmodels in Python provide functions (plot_acf, plot_pacf) to easily generate these plots.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt # Or use plotly for interactive plots
# Assuming 'real_series' and 'synthetic_series' are pandas Series
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
sm.graphics.tsa.plot_acf(real_series, lags=40, ax=axes[0], title='Real Data ACF')
sm.graphics.tsa.plot_acf(synthetic_series, lags=40, ax=axes[1], title='Synthetic Data ACF')
plt.tight_layout()
plt.show()
# Repeat for PACF plots
# sm.graphics.tsa.plot_pacf(...)
Visual comparison is useful but subjective. For a quantitative assessment, calculate the ACF values up to a certain lag $K$ for both the real and synthetic data, then compute a distance metric between these vectors, such as the Mean Absolute Error (MAE):

$$\text{MAE}_{\text{ACF}} = \frac{1}{K} \sum_{k=1}^{K} \left| \text{ACF}_R(k) - \text{ACF}_S(k) \right|$$

where $\text{ACF}_R(k)$ and $\text{ACF}_S(k)$ are the autocorrelations of the real and synthetic series at lag $k$. A lower MAE indicates better preservation of the autocorrelation structure. The same approach applies to the PACF.
Comparison of Autocorrelation Function (ACF) values up to lag 20 for example real and synthetic time series. Close alignment suggests good preservation of temporal dependencies.
For multivariate time-series (multiple variables measured over time), evaluating the relationships between series is just as important as evaluating individual series autocorrelation.
The CCF measures the correlation between one time series $x_t$ and lagged values of another time series $y_{t-k}$. Comparing the CCF plots (or quantitatively comparing CCF values) between the real multivariate dataset and the synthetic one reveals whether the lead-lag relationships between variables have been maintained.
Implementation: statsmodels.tsa.stattools.ccf can compute the cross-correlation. Visualizing these requires plotting the CCF for each pair of variables.
Many time series exhibit trends (long-term increase or decrease) and seasonality (patterns repeating over a fixed period, e.g., daily, weekly, yearly).
Decomposition methods, like Seasonal-Trend decomposition using Loess (STL), separate a time series into three components: trend, seasonal, and residual.
Implementation: statsmodels.tsa.seasonal.STL provides an implementation of STL decomposition.
from statsmodels.tsa.seasonal import STL
# Assuming 'real_ts' and 'synthetic_ts' are pandas Series with DatetimeIndex
# And 'period' is the known seasonal period (e.g., 7 for daily data with weekly seasonality)
stl_real = STL(real_ts, period=period).fit()
stl_synthetic = STL(synthetic_ts, period=period).fit()
# Plotting components
fig, axes = plt.subplots(3, 2, figsize=(12, 8), sharex=True)
axes[0, 0].plot(stl_real.trend, color='#1c7ed6')
axes[0, 0].set_title('Real Trend')
axes[0, 1].plot(stl_synthetic.trend, color='#fd7e14')
axes[0, 1].set_title('Synthetic Trend')
axes[1, 0].plot(stl_real.seasonal, color='#1c7ed6')
axes[1, 0].set_title('Real Seasonality')
axes[1, 1].plot(stl_synthetic.seasonal, color='#fd7e14')
axes[1, 1].set_title('Synthetic Seasonality')
axes[2, 0].plot(stl_real.resid, color='#1c7ed6', alpha=0.7)
axes[2, 0].set_title('Real Residuals')
axes[2, 1].plot(stl_synthetic.resid, color='#fd7e14', alpha=0.7)
axes[2, 1].set_title('Synthetic Residuals')
plt.tight_layout()
plt.show()
Sometimes, the characteristics of a time series are best understood in the frequency domain. The Power Spectral Density (PSD) describes how the variance (power) of the series is distributed across different frequencies. This is particularly relevant for data with oscillatory patterns not strictly tied to standard seasonal periods.
Implementation: scipy.signal.welch is commonly used for PSD estimation.
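As a sketch, the two PSD estimates can be compared on a log scale, which matches how PSDs are usually plotted. The distance function below is illustrative and assumes both series have the same length and sampling rate, so the estimated frequency grids coincide:

```python
import numpy as np
from scipy.signal import welch

def psd_distance(real_series, synthetic_series, fs=1.0):
    """Mean absolute log10-power difference between the Welch PSD estimates
    of two equal-length series (lower is better)."""
    _, p_real = welch(real_series, fs=fs, nperseg=min(256, len(real_series)))
    _, p_synth = welch(synthetic_series, fs=fs, nperseg=min(256, len(synthetic_series)))
    # Small constant guards against log of zero power.
    return np.mean(np.abs(np.log10(p_real + 1e-12) - np.log10(p_synth + 1e-12)))
```

Large values of this distance typically show up in the plot as one curve missing a spectral peak (an oscillation) that the other contains.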
Log-log plot comparing estimated Power Spectral Density (PSD) for example real and synthetic time series. Similarity suggests the synthetic data captures the frequency domain characteristics.
Similar to the general Machine Learning Utility evaluation (Chapter 3), we can assess the synthetic data's usefulness for the specific downstream task of forecasting.
A common protocol is Train on Synthetic, Test on Real (TSTR): train a forecasting model on the synthetic series, train an identical model on the real training series, and evaluate both on a held-out portion of the real data, alongside a naive baseline (e.g., predicting the previous observed value). If the model trained on synthetic data performs comparably to the one trained on real data (and significantly better than the naive forecast), it indicates high utility for forecasting tasks.
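A minimal TSTR sketch using a simple one-step-ahead autoregressive model built from lag features (the helper names and the choice of a linear model are illustrative; any forecasting model could be substituted):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

def make_lag_matrix(series, n_lags):
    """Build (X, y) pairs where each row of X holds the previous n_lags values
    and y holds the next value."""
    values = np.asarray(series, dtype=float)
    X = np.column_stack([values[i:len(values) - n_lags + i] for i in range(n_lags)])
    y = values[n_lags:]
    return X, y

def tstr_forecast_mae(train_series, real_test_series, n_lags=7):
    """Train a one-step-ahead autoregressive model on train_series and
    report its MAE on real_test_series."""
    X_train, y_train = make_lag_matrix(train_series, n_lags)
    X_test, y_test = make_lag_matrix(real_test_series, n_lags)
    model = LinearRegression().fit(X_train, y_train)
    return mean_absolute_error(y_test, model.predict(X_test))
```

Calling tstr_forecast_mae once with the synthetic series as training data and once with the real training split, both scored on the same real test split, gives the comparison described above.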
Instead of comparing entire series, you can compute various statistical features for each individual time series (if you have a dataset of many time series, e.g., sales for different stores) and compare the distribution of these features between the real and synthetic datasets.
Examples of features include the mean, variance, and skewness of each series, autocorrelation at key lags, the strength of trend and seasonality, and measures such as spectral entropy.
Once you have these features calculated for every series in both the real and synthetic sets, you can apply the distributional comparison techniques from Chapter 2 (e.g., KS-tests, Wasserstein distance, visual comparison via histograms or density plots) to the distributions of these features. This provides a different perspective on whether the synthetic data captures the variety and characteristics of the original time series collection.
Diagram illustrating the process of comparing distributions of statistical features computed from sets of real and synthetic time series.
Evaluating synthetic time-series data requires going beyond static distributional checks. By analyzing autocorrelation, cross-correlation, trend, seasonality, spectral properties, and forecasting utility, you gain a much more complete understanding of whether the generated data truly captures the essential dynamic nature of the original time series. A combination of visual inspections (like ACF/PACF plots) and quantitative metrics provides the most reliable assessment.
© 2025 ApX Machine Learning