Introduction
The EUR/USD exchange rate is one of the most actively traded and economically significant financial instruments in the global markets. This study uses daily closing prices from LSEG/Datastream, covering December 31, 1999 to April 21, 2025. Daily frequency was chosen to balance model sensitivity and statistical robustness. The objective is to evaluate and compare volatility forecasting models — historical, EWMA, GARCH, and EGARCH — based on their empirical performance and predictive power.
Every chart below is recomputed in the browser from the same daily series; the statistics and GARCH parameters reproduce the original study to within rounding.
1. Return Analysis
Descriptive statistics
Mean return ($\mu \approx 2.0 \times 10^{-5}$): very close to zero, consistent with the efficient market hypothesis, with no persistent directional bias.
Standard deviation ($\sigma \approx 0.0058$): the average daily volatility. Annualized ($\sigma\sqrt{252}$) it yields approximately 9.27%, moderate for FX and lower than typical equity volatility.
Skewness ($\approx 0.07$): nearly symmetric, with a very slight positive tilt.
Kurtosis ($\approx 4.89$): excess kurtosis relative to a normal distribution (kurtosis $= 3$), implying fat tails and a higher probability of extreme returns.
| Series | Mean | Std. dev. | Skewness | Kurtosis |
|---|---|---|---|---|
| Log returns | 0.0000202318 | 0.005843 | 0.067542 | 4.89043 |
| i.i.d. | −0.0001158355 | 0.005841 | 0.013426 | 2.921075 |
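The statistics in this table can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the price list is hypothetical, and the choice of population moments for skewness and kurtosis is an assumption, since the exact estimator convention is not stated.

```python
import math

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def describe(returns):
    """Mean, sample std (n - 1), skewness and raw kurtosis (normal = 3)."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),
        "skew": m3 / m2 ** 1.5,
        "kurt": m4 / m2 ** 2,
    }

def annualize(daily_std, trading_days=252):
    """Scale a daily standard deviation by the square root of time."""
    return daily_std * math.sqrt(trading_days)

# Hypothetical closing prices, for illustration only (not the study's data).
prices = [1.0800, 1.0825, 1.0790, 1.0812, 1.0850, 1.0841]
stats = describe(log_returns(prices))

# The reported daily standard deviation, scaled to a yearly figure.
reported_annual_vol = annualize(0.005843)
```

Applied to the reported daily standard deviation of 0.005843, `annualize` gives roughly 0.093, in line with the annualized figure quoted above.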
Testing the mean return
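To test whether the small positive mean is significant, we run a t-test against $H_0: \mu = 0$:
$$t = \frac{\bar{r}}{s / \sqrt{n}}$$
where $\bar{r}$ is the sample mean, $s$ the sample standard deviation, and $n$ the number of daily returns.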
This returns a t-statistic of 0.281 and a p-value of 0.778. We fail to reject the null: there is no statistically significant evidence that the mean daily EUR/USD return differs from zero.
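As a rough check on these numbers, the sketch below computes a one-sample t-statistic and converts it to a two-sided p-value. It assumes the normal approximation to Student's t, which is reasonable for several thousand daily observations; the function names are illustrative and not taken from the study.

```python
import math

def t_statistic(returns):
    """One-sample t-statistic for H0: mean = 0."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var / n)

def two_sided_p_normal(t):
    """Two-sided p-value using the normal approximation to Student's t."""
    return math.erfc(abs(t) / math.sqrt(2))

# p-value implied by the reported t-statistic of 0.281.
p_reported = two_sided_p_normal(0.281)
```

With the reported t-statistic, `p_reported` comes out at about 0.78, in line with the p-value quoted above.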
Testing the i.i.d. assumption
Many classical models assume returns are independent and identically distributed (i.i.d.), often normally. The raw returns look like noise; the absolute returns clearly do not.
We use the Autocorrelation Function (ACF) to assess independence. The Ljung–Box Q-test formally tests for autocorrelation up to lag $m$ ($H_0$: none):
$$Q = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^2}{n-k}$$
Raw returns ($r_t$). The first-order autocorrelation shows no meaningful linear dependence; almost every lag sits inside the 95% confidence bands.
Absolute returns ($|r_t|$). The first-order autocorrelation is positive and highly significant. The ACF decays slowly, with many lags far beyond the bands. This is solid evidence of volatility clustering: high-volatility periods follow high-volatility periods.
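The contrast between raw and absolute returns can be reproduced on synthetic data. The sketch below computes the sample ACF and the Ljung–Box statistic $Q = n(n+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(n-k)$ by hand. The simulated series and its parameters are hypothetical stand-ins for EUR/USD returns, and no p-value is computed, since that needs a chi-squared distribution function outside the standard library.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic up to max_lag."""
    n = len(x)
    rho = acf(x, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))

def simulate_clustered_returns(n, seed=0):
    """Synthetic GARCH(1,1)-style returns; parameters are illustrative only."""
    rng = random.Random(seed)
    omega, alpha, beta = 0.05, 0.10, 0.85
    var = omega / (1 - alpha - beta)
    out = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0, 1)
        out.append(r)
        var = omega + alpha * r * r + beta * var
    return out

returns = simulate_clustered_returns(3000)
q_raw = ljung_box_q(returns, 10)
q_abs = ljung_box_q([abs(r) for r in returns], 10)
band = 1.96 / math.sqrt(len(returns))  # approximate 95% band for a single lag
```

On a series with volatility clustering, `q_abs` should be far larger than `q_raw`, mirroring the pattern described above.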
Conclusion on i.i.d.
- The returns are not normally distributed (significant leptokurtosis / fat tails).
- The returns show clear volatility clustering.
Therefore EUR/USD daily log returns are not i.i.d. — independence is violated by clustering, and identical distribution by non-constant variance and non-normality. Standard risk measures (VaR, ES) based on normality will likely underestimate true risk.
2. Equally Weighted Moving Average & EWMA
Equally weighted moving average
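A first time-varying estimate uses moving averages of past squared returns. For a 22-day window, the variance at time $t$ is the sample variance of the last $n = 22$ returns:
$$\hat{\sigma}_t^2 = \frac{1}{n-1}\sum_{i=0}^{n-1}\left(r_{t-i} - \bar{r}_t\right)^2$$
A 95% confidence interval for the variance, assuming approximate normality within the window and using the chi-squared distribution with $n - 1$ degrees of freedom:
$$\left[\frac{(n-1)\,\hat{\sigma}_t^2}{\chi^2_{0.975,\,n-1}},\ \frac{(n-1)\,\hat{\sigma}_t^2}{\chi^2_{0.025,\,n-1}}\right]$$
where $\chi^2_{p,\,n-1}$ denotes the $p$-quantile of that distribution.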
The estimate spikes during stress (2008, the 2020 COVID shock) and falls in calm periods; the band widens exactly when volatility is most uncertain. Window length is a responsiveness-vs-noise trade-off:
The 22-day estimate is the most responsive but noisiest; the 252-day is smooth but slow; the 90-day is a middle ground. Short windows suit current-risk detection; long windows suit stable, long-term planning.
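A minimal sketch of the rolling estimate and its chi-squared interval follows. The two quantiles are hard-coded from standard chi-squared tables for 21 degrees of freedom (a 22-day window), so the interval helper only applies to that window length. Function names are illustrative.

```python
import math

# Chi-squared quantiles for 21 degrees of freedom, from standard tables.
CHI2_21_LOWER = 10.283   # 2.5% quantile
CHI2_21_UPPER = 35.479   # 97.5% quantile

def rolling_variance(returns, window=22):
    """Sample variance (n - 1 divisor) over each trailing window."""
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        mean = sum(chunk) / window
        out.append(sum((r - mean) ** 2 for r in chunk) / (window - 1))
    return out

def variance_ci_22(sample_var):
    """95% interval for the true variance given a 22-day sample variance."""
    df = 21
    return (df * sample_var / CHI2_21_UPPER, df * sample_var / CHI2_21_LOWER)

def annualized_vol(daily_var, trading_days=252):
    """Convert a daily variance into an annualized volatility."""
    return math.sqrt(daily_var * trading_days)
```

Both interval bounds scale with the sample variance, which is why the band widens during stress periods.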
Exponentially weighted moving average (EWMA)
EWMA weights recent returns more heavily via a recursion combining yesterday’s variance and yesterday’s squared return:
$$\sigma_t^2 = \lambda\,\sigma_{t-1}^2 + (1-\lambda)\,r_{t-1}^2$$
The daily estimate is $\sigma_t = \sqrt{\sigma_t^2}$, annualized as $\sigma_t\sqrt{252}$. The decay $\lambda$ controls memory: with $\lambda = 0.90$ the most recent return gets 10% weight; with $\lambda = 0.94$ (RiskMetrics), only 6%.
Lower $\lambda$ reacts faster and runs noisier; higher $\lambda$ is smoother; the 252-day average is a slow long-term view. All three capture the same big events (2008, 2020, 2022) but react very differently. EWMA is simple and fast and avoids the jumps of equal-weighted windows, but it is sensitive to $\lambda$, ignores mean reversion, and produces a flat forecast.
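The EWMA recursion, in which the new variance is the decay times the previous variance plus one minus the decay times the latest squared return, fits in a few lines. In this sketch, seeding the recursion with the first squared return is an assumption; other seeds, such as a sample variance, are equally common.

```python
def ewma_variance(returns, lam=0.94, initial_var=None):
    """EWMA variance path: var = lam * var + (1 - lam) * r^2 after each return.

    Element i is the variance estimate once returns[i] has been observed,
    i.e. the estimate used for the following day.
    """
    var = initial_var if initial_var is not None else returns[0] ** 2
    path = []
    for r in returns:
        var = lam * var + (1 - lam) * r * r
        path.append(var)
    return path

# Starting from zero variance, a single unit shock receives weight 1 - lam.
weight_094 = ewma_variance([1.0], lam=0.94, initial_var=0.0)[0]
weight_090 = ewma_variance([1.0], lam=0.90, initial_var=0.0)[0]
```

The two weights come out at about 0.06 and 0.10, matching the 6% and 10% figures quoted for the two decay values.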
3. GARCH Models
GARCH treats volatility as clustering rather than constant: today’s variance is driven by the size of recent shocks (past squared residuals) and the most recent variance. It has a conditional mean equation, a conditional variance equation, and an error-distribution assumption.
GARCH(1,1)
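Mean equation (returns fluctuate around a constant):
$$r_t = \mu + \varepsilon_t$$
Variance equation (today’s variance depends on yesterday’s shock and yesterday’s variance):
$$\sigma_t^2 = \omega + \alpha\,\varepsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$$
Error equation (zero-mean shocks scaled by the conditional volatility):
$$\varepsilon_t = \sigma_t z_t$$
where $z_t$ is i.i.d. with zero mean and unit variance.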
Fitting by maximum likelihood on the daily returns:
| Parameter | Estimate | Reading |
|---|---|---|
| $\mu$ | ≈ 0 | mean return, not significantly different from zero |
| $\omega$ | 0.00108 | baseline variance when markets are calm |
| $\alpha$ | 0.0349 | reaction to yesterday’s shock |
| $\beta$ | 0.9622 | persistence of volatility |
| $\alpha + \beta$ | 0.9972 | near-unit persistence: shocks fade very slowly |
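Long-term volatility is the level the process reverts to. Annualized, it is
$$\sigma_{LR} = \sqrt{252 \cdot \frac{\omega}{1 - \alpha - \beta}} \approx 9.7\%$$
which is very close to the ~9.28% from the simple methods. The Ljung–Box test on the standardized residuals is consistent with a good fit (no remaining autocorrelation at any tested lag):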
| Lag | lb-stat | lb p-value |
|---|---|---|
| 1 | 1.351 | 0.245 |
| 10 | 11.819 | 0.297 |
| 21 | 18.508 | 0.617 |
The estimate captures the known stress periods (2008–09, 2011–12, 2020, 2022) with sharp rises and slow declines, a visual signature of high persistence ($\alpha + \beta \approx 0.997$) with mean reversion toward ~9.7%.
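The variance recursion and the long-run level can be sketched directly from the reported estimates (0.00108 for the constant, 0.0349 for the shock term, 0.9622 for the lagged variance). This sketch assumes the model was fitted to returns expressed in percent, which is the scaling under which those estimates reproduce a long-run volatility near 9.7%. Starting the filter at the unconditional variance is also an assumption.

```python
import math

# Reported estimates; returns are assumed to be in percent.
OMEGA, ALPHA, BETA = 0.00108, 0.0349, 0.9622

def long_run_variance(omega, alpha, beta):
    """Unconditional daily variance the process reverts to."""
    return omega / (1 - alpha - beta)

def garch_filter(residuals, omega, alpha, beta):
    """Conditional variance for each day, given the residuals before it."""
    var = long_run_variance(omega, alpha, beta)  # assumed starting value
    path = [var]
    for e in residuals[:-1]:
        var = omega + alpha * e * e + beta * var
        path.append(var)
    return path

# Annualized long-run volatility in percent.
long_run_vol = math.sqrt(252 * long_run_variance(OMEGA, ALPHA, BETA))
```

A large residual lifts the next day's variance above the long-run level, while a run of zero residuals lets it decay only slowly, which is the sharp-rise, slow-decline pattern described above.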
GARCH(1,1) with AR(1) mean
Allowing the mean to depend on the previous return,
$$r_t = \mu + \phi\,r_{t-1} + \varepsilon_t$$
the extra term is not useful: the estimated $\phi$ is small and insignificant. The variance parameters are essentially unchanged (long-run volatility ≈ 9.71%). Both models produce nearly identical volatility estimates; the constant-mean and AR(1) paths are visually indistinguishable.
Forecasting volatility with GARCH
GARCH forecasts revert slowly to the long-run level. Forecasting 250 days from the last data point: the path starts near the latest volatility and drifts down toward ~9.7% — slowly, because of the high persistence.
Versus the alternatives: a moving average only looks backward; EWMA forecasts a flat line (no mean reversion); GARCH starts where we are and reverts — a more realistic forecast. For a one-month option the near-term forecast (~11.5%) is a good vol input; for six months, the average of the daily-forecast path (~10.5–11%) is better.
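The mean-reverting forecast has a closed form: the expected variance $h$ days ahead equals the long-run variance plus the persistence raised to the power $h-1$ times the gap between tomorrow's variance and the long-run variance. The sketch below uses the reported GARCH(1,1) estimates and takes the ~11.5% near-term volatility quoted above as its starting point. It assumes returns in percent, and it averages variances before taking the square root; averaging the daily volatilities directly is the other plausible reading of "average of the path".

```python
import math

def garch_forecast_path(next_var, omega, alpha, beta, horizon):
    """Expected daily variances for days 1..horizon ahead."""
    long_run = omega / (1 - alpha - beta)
    persistence = alpha + beta
    return [long_run + persistence ** (h - 1) * (next_var - long_run)
            for h in range(1, horizon + 1)]

def annualized(var, trading_days=252):
    """Annualized volatility from a daily variance."""
    return math.sqrt(var * trading_days)

omega, alpha, beta = 0.00108, 0.0349, 0.9622  # reported estimates
start_var = 11.5 ** 2 / 252                   # 11.5% annualized, in daily variance
path = garch_forecast_path(start_var, omega, alpha, beta, 250)
vol_path = [annualized(v) for v in path]

# Average volatility over roughly six months (126 trading days).
avg_vol_6m = annualized(sum(path[:126]) / 126)
```

Because the persistence is so close to one, `vol_path` declines slowly and is still above the long-run level after 250 days.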
EGARCH
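EGARCH models the log-variance, which avoids positivity constraints, and distinguishes good from bad news (the leverage effect). In its standard form:
$$\ln\sigma_t^2 = \omega + \alpha\left(|z_{t-1}| - \mathbb{E}|z_{t-1}|\right) + \gamma\,z_{t-1} + \beta\,\ln\sigma_{t-1}^2$$
The asymmetry parameter $\gamma$ is negative and significant, so bad news raises EUR/USD volatility more than good news, but overall fit is slightly worse than plain GARCH: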
| Criterion | GARCH | AR-GARCH | EGARCH |
|---|---|---|---|
| Log-likelihood | −5278.119268 | −5273.500078 | −5287.804381 |
| AIC | 10564.238536 | 10557.000156 | 10585.608763 |
| BIC | 10591.418442 | 10590.974281 | 10619.583645 |
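The information criteria in the table follow mechanically from the log-likelihoods: AIC is twice the parameter count minus twice the log-likelihood, and BIC replaces the factor two on the parameter count with the log of the sample size. In the sketch below the parameter count (4) and the sample size (6601) are not stated in the text; they are inferred as the values that reproduce the GARCH column and should be read as assumptions.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_likelihood

# GARCH(1,1) column; k = 4 and n = 6601 are inferred, not reported.
garch_aic = aic(-5278.119268, 4)
garch_bic = bic(-5278.119268, 4, 6601)
```

Lower values indicate a better trade-off between fit and complexity, and BIC penalizes each extra parameter more heavily than AIC at this sample size.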
If simplicity and overall fit are prioritized, choose GARCH(1,1); if the leverage effect matters for economic realism, prefer EGARCH(1,1).
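To see what a negative asymmetry coefficient does, the sketch below applies one step of a standard EGARCH(1,1) log-variance update to a negative and a positive shock of equal size. All parameter values are hypothetical and chosen only to show the sign effect, and the expected absolute shock assumes standard normal innovations.

```python
import math

def egarch_next_log_var(log_var, z, omega, alpha, gamma, beta):
    """One-step EGARCH(1,1) update of the log-variance."""
    expected_abs_z = math.sqrt(2 / math.pi)  # E|z| for a standard normal shock
    return omega + alpha * (abs(z) - expected_abs_z) + gamma * z + beta * log_var

# Hypothetical parameters, for illustration only (not the fitted values).
params = dict(omega=-0.01, alpha=0.08, gamma=-0.03, beta=0.99)
log_var = math.log(0.35)

after_bad = math.exp(egarch_next_log_var(log_var, -2.0, **params))
after_good = math.exp(egarch_next_log_var(log_var, +2.0, **params))
```

With a negative `gamma`, `after_bad` exceeds `after_good`, and because the update works on the log-variance, both stay positive without any parameter constraints.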
4. Implied Volatility
So far we forecast volatility from history. An alternative reads it out of option prices: implied volatility (IV) is the volatility that makes the Black–Scholes price match the market price — the market’s own expectation over the option’s life. In practice IV varies with strike and maturity, so volatility is not constant even though Black–Scholes assumes it is. (Covrig & Low, 2003, show that quoted OTC ATM vols give a cleaner, often unbiased measure than vols backed out of exchange-traded prices.)
Information content and efficiency of IV
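Using the Mincer–Zarnowitz framework, estimated by OLS with Newey–West standard errors:
$$RV_{t+1} = \alpha + \beta\,IV_t + \varepsilon_{t+1}$$
where $RV_{t+1}$ is next month's realized volatility and $IV_t$ is today's implied volatility. The slope $\beta$ is positive (higher IV today → higher realized vol next month) but significant only at the 10% level. Testing unbiasedness ($H_0$: $\alpha = 0$, $\beta = 1$) via a Wald test rejects decisively: in this dataset, implied volatility is statistically biased.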
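The regression itself is a one-variable OLS of realized on implied volatility. The sketch below covers only the point estimates and the R-squared; the Newey–West standard errors and the Wald test used in the study are not implemented here, and the sample data are hypothetical.

```python
def ols_simple(x, y):
    """Intercept, slope and R-squared of y regressed on a single x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(e * e for e in resid)
    ss_tot = sum((b - my) ** 2 for b in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical implied and realized vols (percent), for illustration only.
implied = [8.0, 9.5, 11.0, 7.5, 10.0, 12.5]
realized = [7.1, 9.9, 9.4, 8.0, 8.6, 11.8]
a_hat, b_hat, r2 = ols_simple(implied, realized)
```

An unbiased forecast would give an intercept near zero and a slope near one; the joint test of those two restrictions is what the Wald test formalizes.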
Forecasting accuracy
Comparing forecasts on MAE and MSE (MSE punishes large errors more):
| Model | MAE | MSE |
|---|---|---|
| Implied Volatility | 5.123% | 0.630% |
| Historical Vol | 5.340% | 0.800% |
| EWMA | higher | 0.500% |
| GARCH | higher | 0.523% |
| EGARCH | higher | 0.529% |
On MAE, IV is most accurate; on MSE, EWMA wins (GARCH/EGARCH close behind). Best model depends on the objective: smallest average error → IV; avoiding large errors → EWMA.
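A small example shows how the two criteria can rank forecasts differently. The numbers below are hypothetical: forecast A is off by a moderate amount every period, while forecast B is exact except for one large miss.

```python
def mae(forecast, actual):
    """Mean absolute error."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

def mse(forecast, actual):
    """Mean squared error."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)

# Hypothetical realized vols and two competing forecasts (percent).
actual = [10.0, 10.0, 10.0, 10.0]
forecast_a = [12.0, 12.0, 12.0, 12.0]   # steadily off by 2
forecast_b = [10.0, 10.0, 10.0, 16.0]   # exact, then one miss of 6

mae_a, mse_a = mae(forecast_a, actual), mse(forecast_a, actual)
mae_b, mse_b = mae(forecast_b, actual), mse(forecast_b, actual)
```

Forecast B wins on MAE (1.5 against 2.0) but loses on MSE (9.0 against 4.0), the same kind of split seen between IV and EWMA in the table.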
5. Range-Based Models
The methods above use close-to-close returns, ignoring intraday movement. Range-based estimators use the high/low (and open/close) prices to extract more information.
Charts for this section aren’t reproduced here — the range estimators need daily High/Low/Open data, which isn’t in the close + implied-vol dataset driving the charts above.
Parkinson’s historical volatility
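The Parkinson estimator (1980) needs only the daily high $H_t$ and low $L_t$. The single-day variance from the range is:
$$\hat{\sigma}_{P,t}^2 = \frac{1}{4\ln 2}\left(\ln\frac{H_t}{L_t}\right)^2$$
and the rolling estimate over $N$ days averages and annualizes:
$$\hat{\sigma}_{P} = \sqrt{\frac{252}{N}\sum_{i=0}^{N-1}\frac{1}{4\ln 2}\left(\ln\frac{H_{t-i}}{L_{t-i}}\right)^2}$$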
Over 2005–2025 with a 20-day window, Parkinson and standard close-to-close volatility track each other very closely; Parkinson averages slightly higher (8.52% vs 8.25%, average difference 0.27%), with the maximum gap ≈ 5.43% (annualized) on Jan 14, 2009. The intraday range carries some information missed by closing prices, but the practical difference here is minor.
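A minimal sketch of the Parkinson estimator follows, using the standard formula: the squared log high/low range divided by four times the natural log of two, averaged over a rolling window and annualized with 252 trading days. Function names and the input layout (parallel lists of highs and lows) are assumptions.

```python
import math

def parkinson_daily_var(high, low):
    """Single-day Parkinson variance from the high/low range."""
    return math.log(high / low) ** 2 / (4 * math.log(2))

def parkinson_vol(highs, lows, window=20, trading_days=252):
    """Rolling annualized Parkinson volatility over trailing windows."""
    daily = [parkinson_daily_var(h, l) for h, l in zip(highs, lows)]
    out = []
    for end in range(window, len(daily) + 1):
        avg = sum(daily[end - window:end]) / window
        out.append(math.sqrt(trading_days * avg))
    return out
```

A day with no range contributes zero variance, so the estimator relies entirely on intraday movement and ignores where the price closes.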
Garman–Klass
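The Garman–Klass (GK) estimator (1980) uses all four OHLC prices, combining the high-low range with the close-to-open change:
$$\hat{\sigma}_{GK,t}^2 = \frac{1}{2}\left(\ln\frac{H_t}{L_t}\right)^2 - (2\ln 2 - 1)\left(\ln\frac{C_t}{O_t}\right)^2$$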
For EUR/USD, gaps are uncommon (it trades 24/5), so the OHLC advantage is limited. GK averages slightly above the standard measure (8.60% vs 8.25%, average difference 0.35%). Across all three (standard, Parkinson, Garman–Klass), the dynamics are highly similar; the range-based methods read marginally higher on average, but none has the dynamic, forward-looking modeling of the GARCH family.
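The Garman–Klass estimator can be sketched the same way, using its standard single-day form: half the squared log high/low range minus $(2\ln 2 - 1)$ times the squared log close/open change. Function names and the OHLC input layout are assumptions.

```python
import math

def garman_klass_daily_var(open_, high, low, close):
    """Single-day Garman-Klass variance from OHLC prices."""
    range_term = 0.5 * math.log(high / low) ** 2
    drift_term = (2 * math.log(2) - 1) * math.log(close / open_) ** 2
    return range_term - drift_term

def garman_klass_vol(opens, highs, lows, closes, window=20, trading_days=252):
    """Rolling annualized Garman-Klass volatility over trailing windows."""
    daily = [garman_klass_daily_var(o, h, l, c)
             for o, h, l, c in zip(opens, highs, lows, closes)]
    out = []
    for end in range(window, len(daily) + 1):
        avg = sum(daily[end - window:end]) / window
        out.append(math.sqrt(trading_days * avg))
    return out
```

When the open equals the close, the second term vanishes and the estimate reduces to half the squared log range.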
In the original study, methods and estimation were implemented in Python (SciPy for statistical tests; standard econometrics tooling for the GARCH/EGARCH fits). The charts on this page are recomputed independently in JavaScript from the LSEG/Datastream daily EUR/USD series (1999–2025); descriptive statistics and GARCH parameters match the original to within rounding.