Summary
Enabling yearly seasonality on only 12 monthly data points (a single annual cycle) leaves Prophet's model under-identified. Different versions of Prophet and cmdstanpy then converge to different trend-seasonality decompositions, so the forecast changes with the installed versions. No warning or safeguard exists for this scenario, which makes it a risk in production.
Root Cause
- Under-identification: 12 monthly points (one annual cycle) are insufficient to uniquely determine both trend and yearly seasonality.
- Version dependencies: Different versions of Prophet and cmdstanpy handle under-identified models differently, leading to divergent results.
- Lack of safeguards: No warnings or checks for enabling yearly seasonality with limited data.
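To make the under-identification concrete, the parameter count can be sketched as follows. This assumes Prophet's additive form (holiday effects omitted) and its default yearly Fourier order of 10; the trend parameters come on top and are not counted here.

```latex
y(t) = g(t) + s(t) + \varepsilon_t, \qquad
s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)

% Yearly seasonality: P = 365.25 days, N = 10 by default
\#\{a_n, b_n\} = 2N = 20 \;>\; 12 = \#\{\text{monthly observations}\}
```

With more seasonal coefficients than observations, the split between g(t) and s(t) is driven largely by the priors and the optimizer's path rather than by the data, which is consistent with results shifting between versions.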
Why This Happens in Real Systems
- Data limitations: Real-world datasets often have minimal historical data, especially for new products or metrics.
- Model complexity: Prophet’s flexible modeling approach (trend + seasonality) becomes unstable with insufficient data.
- Version inconsistencies: Dependency updates in production environments can silently alter forecasting behavior.
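One way to make silent dependency changes visible is to record the installed versions next to every forecast and compare them with what was expected. The sketch below is a minimal illustration using only the standard library; `snapshot_versions` and `version_drift` are hypothetical helper names, not part of Prophet or cmdstanpy.

```python
from importlib import metadata


def snapshot_versions(packages=("prophet", "cmdstanpy"), lookup=metadata.version):
    """Return {package: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = lookup(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is absent from this environment
    return versions


def version_drift(expected, actual):
    """Return {package: (expected, actual)} for every version mismatch."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

Storing the output of `snapshot_versions()` with each forecast run, and failing the job when `version_drift` returns a non-empty dict, turns a silent behavior change into an explicit one.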
Real-World Impact
- Unreliable forecasts: Predictions become version-dependent, leading to inconsistent results across environments.
- Negative extrapolation: Models may produce unrealistic negative forecasts despite strictly positive training data.
- Production risks: Unstable forecasts can impact decision-making and system reliability.
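The negative-extrapolation symptom is cheap to detect after the fact. The following is a minimal sketch of such a check, written against plain lists so it does not depend on any library; `negative_forecasts` is a hypothetical helper name.

```python
def negative_forecasts(history, forecast):
    """Return (index, value) pairs for negative forecast values.

    The check only applies when every training value is strictly positive;
    otherwise negative forecasts may be legitimate and nothing is flagged.
    """
    if not history or min(history) <= 0:
        return []
    return [(i, value) for i, value in enumerate(forecast) if value < 0]
```

With the variable names from the example in this document, it could be called as `negative_forecasts(df["y"].tolist(), fcst["yhat"].tolist())`; a non-empty result suggests the decomposition should not be trusted.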
Example or Code
import pandas as pd
from prophet import Prophet
y = [361.06, 33880.23, 29431.62, 17337.68, 208032.5, 515776.5,
848975.0, 837513.2, 1237904.0, 2246456.0, 1982927.0, 2421611.0]
df = pd.DataFrame({"ds": pd.date_range("2025-01-01", periods=12, freq="MS"), "y": y})
m = Prophet(yearly_seasonality=True, weekly_seasonality=False, daily_seasonality=False)
m.fit(df)
future = m.make_future_dataframe(periods=3, freq="MS")
fcst = m.predict(future)
print(fcst[["ds", "trend", "yearly", "yhat"]])
How Senior Engineers Fix It
- More history: Collect at least two full annual cycles before estimating yearly seasonality.
- Regularization: Reduce model complexity by disabling yearly seasonality or by using a smaller seasonality_prior_scale to shrink it.
- Version pinning: Lock Prophet and cmdstanpy versions in production to ensure consistency.
- Custom safeguards: Implement checks to warn or block yearly seasonality with insufficient data.
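A custom safeguard can be as small as a function that decides whether yearly seasonality may be enabled for a given history. The sketch below is one possible shape, not an existing Prophet feature: `safe_yearly_seasonality` is a hypothetical name, and the 730-day threshold is an assumption standing in for "two full annual cycles".

```python
import warnings


def safe_yearly_seasonality(dates, requested=True, min_span_days=730):
    """Return the yearly_seasonality value that is safe to pass to the model.

    `dates` is a list of date-like objects that support subtraction and
    expose `.days` on the difference (datetime.date, pandas.Timestamp).
    `min_span_days` is an assumed threshold for two annual cycles.
    """
    if not requested:
        return False
    span_days = (max(dates) - min(dates)).days if dates else 0
    if span_days < min_span_days:
        warnings.warn(
            f"History spans {span_days} days (< {min_span_days}); "
            "disabling yearly seasonality to avoid an under-identified fit."
        )
        return False
    return True
```

With the example data in this document, the model could then be built as `Prophet(yearly_seasonality=safe_yearly_seasonality(df["ds"].tolist()), weekly_seasonality=False, daily_seasonality=False)`. Note that 24 month-start timestamps span only 23 months, so under this threshold they still fail the check.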
Why Juniors Miss It
- Lack of awareness: Juniors may not recognize the risks of under-identification in time-series models.
- Over-reliance on a single flag: Setting yearly_seasonality=True without understanding that it forces a yearly component no matter how little history is available.
- Insufficient testing: Failing to validate model stability across different versions or environments.
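The missing stability test can be approximated by comparing a forecast against a baseline saved from a known-good environment. The sketch below works on plain lists of yhat values; the function names and the 5% tolerance are hypothetical choices for illustration, not recommended values.

```python
def max_relative_diff(baseline, candidate):
    """Largest element-wise relative difference between two forecasts."""
    if len(baseline) != len(candidate):
        raise ValueError("forecasts must have the same length")
    worst = 0.0
    for b, c in zip(baseline, candidate):
        scale = max(abs(b), abs(c), 1e-12)  # guard against division by zero
        worst = max(worst, abs(b - c) / scale)
    return worst


def forecasts_agree(baseline, candidate, tolerance=0.05):
    """True if the two forecasts differ by at most `tolerance` (assumed 5%)."""
    return max_relative_diff(baseline, candidate) <= tolerance
```

Running this comparison in CI after every dependency upgrade, with the baseline produced under the previous versions, would surface a changed decomposition before it reaches production.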