Yearly seasonality with only 12 monthly points leads to different results due to version dependencies

Summary

Fitting yearly seasonality to only 12 monthly data points leaves Prophet's model under-identified, so the results depend on the installed library versions. Different versions of Prophet and cmdstanpy converge to different trend-seasonality decompositions, which makes forecasts unstable. Prophet gives no warning when yearly seasonality is explicitly enabled on this little data, which is a risk in production.

Root Cause

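  • Under-identification: 12 monthly points (one annual cycle) are insufficient to uniquely determine both trend and yearly seasonality. Prophet's default yearly seasonality uses a Fourier order of 10, which means 20 seasonal coefficients on top of the trend parameters, so there are more unknowns than observations and the data alone cannot select a single solution.
  • Version dependencies: Prophet fits by MAP optimization by default. On such a weakly constrained problem the reported optimum can be sensitive to details such as initialization and optimizer behavior, which may differ between Prophet and cmdstanpy releases, leading to divergent results.
  • Lack of safeguards: Prophet raises no warning or error when yearly_seasonality=True is set explicitly on only about one year of data.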
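
The counting argument behind the under-identification point can be checked without fitting anything. The sketch below rebuilds sine/cosine yearly features for the 12 monthly dates and compares the number of seasonal coefficients with the number of observations. The helper name yearly_fourier_features is hypothetical, and the order of 10 and the 365.25-day period are assumed to match Prophet's defaults; this illustrates the idea and is not Prophet's own code.

```python
import numpy as np

def yearly_fourier_features(days, order=10, period=365.25):
    """Return a (len(days), 2 * order) matrix of sine/cosine yearly features."""
    days = np.asarray(days, dtype=float)
    columns = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * days / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

# First day of each month in 2025, expressed as days since 1970-01-01.
months = np.arange("2025-01", "2026-01", dtype="datetime64[M]")
days = months.astype("datetime64[D]").astype("int64")

X = yearly_fourier_features(days)
n_obs, n_coef = X.shape
rank = np.linalg.matrix_rank(X)

# The rank can never exceed the number of rows, so at most 12 of the
# 20 seasonal coefficients are pinned down by the data alone.
print(f"observations={n_obs}, seasonal coefficients={n_coef}, rank={rank}")
```

Whatever the data cannot pin down is decided by the priors and by where the optimizer happens to stop, which is where version differences can enter.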

Why This Happens in Real Systems

  • Data limitations: Real-world datasets often have minimal historical data, especially for new products or metrics.
  • Model complexity: Prophet’s flexible modeling approach (trend + seasonality) becomes unstable with insufficient data.
  • Version inconsistencies: Dependency updates in production environments can silently alter forecasting behavior.
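
One way to make a silent dependency change visible is to record the installed versions next to every forecast. The following is a minimal sketch using only the standard library; the function name installed_versions is hypothetical, and the package names are assumed to be the distribution names used at install time.

```python
from importlib import metadata

def installed_versions(packages=("prophet", "cmdstanpy")):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Store this next to every forecast so a changed result can be traced
# back to a dependency change.
print(installed_versions())
```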

Real-World Impact

  • Unreliable forecasts: Predictions become version-dependent, leading to inconsistent results across environments.
  • Negative extrapolation: Models may produce unrealistic negative forecasts despite strictly positive training data.
  • Production risks: Unstable forecasts can impact decision-making and system reliability.
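
The negative-extrapolation case is cheap to detect after prediction. The sketch below flags negative forecast values only when every training value was positive; the function name implausible_negatives is hypothetical and the forecast values are made up for illustration.

```python
def implausible_negatives(y_train, y_hat):
    """Return indices of negative forecasts when all training values are positive."""
    if min(y_train) <= 0:
        return []  # non-positive values occur in the data itself; nothing to flag
    return [i for i, value in enumerate(y_hat) if value < 0]

y_train = [361.06, 33880.23, 29431.62]   # first three training values from the example
y_hat = [1200.0, -50000.0, 40000.0]      # made-up forecast values for illustration
print(implausible_negatives(y_train, y_hat))  # [1]
```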

Example or Code

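import pandas as pd
from prophet import Prophet

# Twelve monthly observations: exactly one annual cycle, all strictly positive.
y = [361.06, 33880.23, 29431.62, 17337.68, 208032.5, 515776.5,
     848975.0, 837513.2, 1237904.0, 2246456.0, 1982927.0, 2421611.0]
df = pd.DataFrame({"ds": pd.date_range("2025-01-01", periods=12, freq="MS"), "y": y})

# Force yearly seasonality on; weekly and daily components are disabled.
m = Prophet(yearly_seasonality=True, weekly_seasonality=False, daily_seasonality=False)
m.fit(df)

# Forecast three months past the end of the training data.
future = m.make_future_dataframe(periods=3, freq="MS")
fcst = m.predict(future)

# Inspect how the fit splits the signal into trend and yearly components.
print(fcst[["ds", "trend", "yearly", "yhat"]])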

How Senior Engineers Fix It

  • More history: Use at least two full annual cycles before estimating yearly seasonality.
  • Regularization: Reduce model complexity by disabling yearly seasonality, lowering its Fourier order (for example yearly_seasonality=2), or lowering seasonality_prior_scale from its default of 10.
  • Version pinning: Lock Prophet and cmdstanpy versions in production so that every environment runs the same code.
  • Custom safeguards: Add a check that warns about or blocks yearly seasonality when the history is too short.
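
A custom safeguard can be as small as a span check on the timestamps. This is a sketch under stated assumptions: the function name yearly_seasonality_supported is hypothetical, and the 730-day threshold is a choice that corresponds to two annual cycles, not a value taken from the original report.

```python
from datetime import date

def yearly_seasonality_supported(timestamps, min_span_days=730):
    """True if the first and last points are at least `min_span_days` apart."""
    timestamps = list(timestamps)
    if len(timestamps) < 2:
        return False
    span = max(timestamps) - min(timestamps)
    return span.days >= min_span_days

# Same dates as the example: the first of each month in 2025.
one_year = [date(2025, month, 1) for month in range(1, 13)]
print(yearly_seasonality_supported(one_year))  # False: the span is 334 days
```

The function also accepts a pandas datetime column such as df["ds"], so the result could be passed as the yearly_seasonality argument or used to raise an error before fitting.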

Why Juniors Miss It

  • Lack of awareness: Juniors may not recognize the risks of under-identification in time-series models.
  • Overriding defaults: Setting yearly_seasonality=True bypasses Prophet's default "auto" mode, which enables yearly seasonality only when the history spans at least two years.
  • Insufficient testing: Failing to validate model stability across different versions or environments.
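
Stability across versions can be tested by comparing a new forecast against one stored from the pinned environment. The sketch below is one possible shape for such a check; the function name forecast_drift, the 5% relative tolerance, and all numbers are assumptions made for illustration.

```python
import math

def forecast_drift(baseline, current, rel_tol=0.05):
    """Return (index, baseline, current) for points that differ by more than rel_tol."""
    if len(baseline) != len(current):
        raise ValueError("forecasts must cover the same horizon")
    return [
        (i, b, c)
        for i, (b, c) in enumerate(zip(baseline, current))
        if not math.isclose(b, c, rel_tol=rel_tol)
    ]

# Made-up numbers: yhat stored from the pinned environment vs. a candidate upgrade.
baseline = [2500000.0, 2650000.0, 2800000.0]
candidate = [2510000.0, 1900000.0, -120000.0]
print(forecast_drift(baseline, candidate))
```

Running a check like this in CI before a dependency upgrade turns a silent behavior change into a failing test.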
