Volatility Forecasting: EUR/USD

Introduction

The EUR/USD exchange rate is one of the most actively traded and economically significant financial instruments in the global markets. This study uses daily closing prices from LSEG/Datastream, covering December 31, 1999 to April 21, 2025. Daily frequency was chosen to balance model sensitivity and statistical robustness. The objective is to evaluate and compare volatility forecasting models — historical, EWMA, GARCH, and EGARCH — based on their empirical performance and predictive power.

Every chart below is recomputed in the browser from the same daily series; the statistics and GARCH parameters reproduce the original study to within rounding.

1. Return Analysis


EUR/USD daily closing price, 1999–2025.

Descriptive statistics

Mean return ( $\mu = 0.00002023$ ): very close to zero, consistent with the efficient market hypothesis — no persistent directional bias.

Standard deviation ( $\sigma = 0.005843$ ): the average daily volatility. Annualized ( $\sigma \times \sqrt{252}$ ) it yields approximately 9.27%, moderate for FX and lower than typical equity volatility.

Skewness ( $0.0675$ ): nearly symmetric, with a very slight positive tilt.

Kurtosis ( $4.8904$ ): excess kurtosis relative to a normal distribution (kurtosis $= 3$ ), implying fat tails and a higher probability of extreme returns.

	Mean	$\sigma$	Skewness	Kurtosis
Log returns	0.0000202318	0.005843	0.067542	4.89043
i.i.d.	−0.0001158355	0.005841	0.013426	2.921075
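
As a rough illustration of how the figures in this table are obtained, the sketch below computes log returns and the four moments in plain Python. The function names and the short price list are made up for the example, and the study's own code is not shown in the text. Kurtosis is the raw (non-excess) version, so a normal distribution scores 3; moment conventions vary slightly between libraries.

```python
import math

def log_returns(prices):
    """Daily log returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def describe(returns):
    """Mean, sample standard deviation, skewness and raw kurtosis."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    m2 = sum(d ** 2 for d in dev) / n   # population central moments
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),   # n - 1 divisor for the std
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,             # equals 3 for a normal distribution
    }

def annualize(daily_std, trading_days=252):
    """Scale a daily standard deviation by the square root of time."""
    return daily_std * math.sqrt(trading_days)

# Hypothetical closing prices, for illustration only (not the study's data).
prices = [1.0800, 1.0835, 1.0790, 1.0812, 1.0875, 1.0850]
stats = describe(log_returns(prices))
print(stats)
print(f"annualized volatility: {annualize(stats['std']):.2%}")
```

Applying `annualize` to the reported daily standard deviation of 0.005843 gives the roughly 9.3% annual figure quoted above.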

Testing the mean return

To test whether the small positive mean is significant, we run a t-test against $H_0: \mu = 0$ :

$t = \frac{\hat{\mu} - 0}{\hat{\sigma}/\sqrt{N}}$

This returns a t-statistic of 0.281 and a p-value of 0.778. We fail to reject the null: there is no statistically significant evidence that the mean daily EUR/USD return differs from zero.
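
The arithmetic behind this test can be sketched from the summary statistics alone. The sample size is not stated in the text, so `n = 6600` below is an assumed round figure for about 25 years of trading days. The p-value uses the standard normal in place of Student's t, an approximation that is only reasonable because the sample is large.

```python
import math

def mean_t_test(mean, std, n):
    """t-statistic for H0: mu = 0, with a two-sided large-sample p-value."""
    t = mean / (std / math.sqrt(n))
    # Normal approximation to the t distribution (fine for n in the thousands).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Mean and standard deviation are the reported values; n is an assumption.
t, p = mean_t_test(mean=0.00002023, std=0.005843, n=6600)
print(f"t = {t:.3f}, p = {p:.3f}")
```

With these inputs the result lands very close to the reported t-statistic and p-value, which suggests the assumed sample size is in the right range.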

Testing the i.i.d. assumption

Many classical models assume returns are independent and identically distributed (i.i.d.), often normally. The raw returns look like noise; the absolute returns clearly do not.


Daily log returns (%). Visually stationary, centred on zero.


Absolute log returns (%). Note the clustered bursts of activity — early signs of volatility clustering.

We use the Autocorrelation Function (ACF) to assess independence. The Ljung–Box Q-test formally tests for autocorrelation up to lag $h$ ( $H_0$ : none):

$Q = N(N+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{\,2}}{N-k} \sim \chi^2(h)$
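
A minimal sketch of the sample autocorrelation and the Q statistic defined above, in plain Python. The data are synthetic returns with alternating calm and turbulent regimes, generated only to show the contrast between raw and absolute returns; they are not the EUR/USD series. The critical value is the tabulated 95% point of $\chi^2(10)$.

```python
import random

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom

def ljung_box_q(x, h):
    """Ljung-Box Q statistic over lags 1..h."""
    n = len(x)
    return n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, h + 1))

# Synthetic returns: volatility switches between 0.3% and 0.9% every 250 days.
random.seed(1)
returns = [random.gauss(0, 0.003 if (t // 250) % 2 == 0 else 0.009) for t in range(2000)]
abs_returns = [abs(r) for r in returns]

CHI2_95_DF10 = 18.307  # 95% critical value of chi-squared with 10 df (table value)
for name, series in (("raw", returns), ("absolute", abs_returns)):
    q = ljung_box_q(series, 10)
    print(f"{name:9s} Q(10) = {q:8.1f}  reject H0: {q > CHI2_95_DF10}")
```

On data like this the absolute returns produce a Q statistic far above the critical value, which is the same pattern the ACF charts show for EUR/USD.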

Raw returns ( $r_t$ ). First-order autocorrelation $\hat{\rho}_1 = -0.012$ . No meaningful linear dependence — almost every lag sits inside the 95% confidence bands.


ACF of raw returns, lags 1–100. Essentially white noise: values fall within the ±95% bands.

Absolute returns ( $|r_t|$ ). First-order autocorrelation $\hat{\rho}_1 = 0.12$ , positive and highly significant. The ACF decays slowly with many lags far beyond the bands — solid evidence of volatility clustering: high-volatility periods follow high-volatility periods.


ACF of absolute returns. Slow, persistent decay well outside the bands = volatility clustering.

Conclusion on i.i.d.

The returns are not normally distributed (significant leptokurtosis / fat tails).
The returns show clear volatility clustering.

Therefore EUR/USD daily log returns are not i.i.d. normal: volatility clustering violates independence, the time-varying variance violates identical distribution, and the fat tails violate normality. Standard risk measures (VaR, ES) based on normality will likely underestimate true risk.

2. Equally Weighted Moving Average & EWMA

Equally weighted moving average

A first time-varying estimate uses moving averages of past squared returns. For a 22-day window, the variance at time $t$ is the sample variance of the last $w = 22$ returns:

$\hat{\sigma}_t^{\,2} = \frac{1}{w-1} \sum_{k=0}^{w-1} \left( r_{t-k} - \bar{r}_t \right)^2$

where $\bar{r}_t$ is the mean return over the same window. A 95% confidence interval for the variance, assuming approximate normality within the window and using the chi-squared distribution with $w-1 = 21$ degrees of freedom:

$\left( \frac{(w-1)\hat{\sigma}_t^{\,2}}{\chi^2_{\alpha/2,\;w-1}}, \; \frac{(w-1)\hat{\sigma}_t^{\,2}}{\chi^2_{1-\alpha/2,\;w-1}} \right)$

with $\alpha = 0.05$ and $\chi^2_{p,\;w-1}$ the critical value that leaves probability $p$ in the upper tail.
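
The interval can be sketched in a few lines. The version below scales the sample variance by its $w - 1 = 21$ degrees of freedom and divides by tabulated chi-squared quantiles, which are hard-coded so that the example needs no statistics library. Function names and the synthetic returns are illustrative assumptions.

```python
import math
import random

W = 22                   # window length used in the text
CHI2_21_LOWER = 10.283   # 2.5% quantile of chi-squared with 21 df (table value)
CHI2_21_UPPER = 35.479   # 97.5% quantile of chi-squared with 21 df (table value)

def window_variance(window):
    """Sample variance with an n - 1 divisor."""
    n = len(window)
    mean = sum(window) / n
    return sum((r - mean) ** 2 for r in window) / (n - 1)

def variance_ci(s2):
    """95% confidence interval for the variance of a 22-observation window."""
    return (W - 1) * s2 / CHI2_21_UPPER, (W - 1) * s2 / CHI2_21_LOWER

def rolling_vol_band(returns, trading_days=252):
    """(lower, estimate, upper) annualized volatility for each full window."""
    rows = []
    for end in range(W, len(returns) + 1):
        s2 = window_variance(returns[end - W:end])
        lo, hi = variance_ci(s2)
        rows.append(tuple(math.sqrt(v * trading_days) for v in (lo, s2, hi)))
    return rows

# Synthetic daily returns, for illustration only.
random.seed(0)
demo = [random.gauss(0, 0.006) for _ in range(60)]
lo, mid, hi = rolling_vol_band(demo)[-1]
print(f"{lo:.2%} < {mid:.2%} < {hi:.2%}")
```

In volatility terms the band runs from about 0.77 to 1.43 times the point estimate, so its absolute width grows in proportion to the estimate itself.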


22-day annualized volatility with its 95% confidence band — the band widens in turbulent periods.

The estimate spikes during stress (2008, the 2020 COVID shock) and falls in calm periods; the band widens exactly when volatility is most uncertain. Window length is a responsiveness-vs-noise trade-off:


Rolling annualized volatility across 22 / 90 / 252-day windows.

The 22-day estimate is the most responsive but noisiest; the 252-day is smooth but slow; the 90-day is a middle ground. Short windows suit current-risk detection; long windows suit stable, long-term planning.

Exponentially weighted moving average (EWMA)

EWMA weights recent returns more heavily via a recursion combining yesterday’s variance and yesterday’s squared return:

$\hat{\sigma}_t^{\,2} = (1-\lambda)\, r_{t-1}^{\,2} + \lambda\, \hat{\sigma}_{t-1}^{\,2}$

The daily estimate is $\hat{\sigma}_t = \sqrt{\hat{\sigma}_t^{\,2}}$ , annualized as $\hat{\sigma}_t \times \sqrt{252}$ . The decay $\lambda$ controls memory: with $\lambda = 0.90$ the most recent return gets 10% weight; with $\lambda = 0.94$ (RiskMetrics), only 6%.
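
The recursion is short enough to write out directly. In this sketch the starting variance defaults to the first squared return, which is an assumption (the text does not say how the recursion is seeded), and the return series is made up to include one large shock.

```python
import math

def ewma_variance(returns, lam=0.94, seed_var=None):
    """EWMA variance path; path[t] uses returns up to day t - 1 only."""
    var = returns[0] ** 2 if seed_var is None else seed_var
    path = [var]
    for r in returns[:-1]:
        var = (1 - lam) * r * r + lam * var
        path.append(var)
    return path

def annualized_vol(daily_var, trading_days=252):
    """Annualized volatility from a daily variance."""
    return math.sqrt(daily_var * trading_days)

def half_life(lam):
    """Days until the weight (1 - lam) * lam**j on a past return halves."""
    return math.log(0.5) / math.log(lam)

# Hypothetical daily log returns with a shock on the third day.
returns = [0.001, -0.002, 0.012, -0.001, 0.0005, 0.002]
for lam in (0.90, 0.94):
    vols = [round(annualized_vol(v) * 100, 2) for v in ewma_variance(returns, lam)]
    print(f"lambda={lam}: vol path (%) {vols}, half-life {half_life(lam):.1f} days")
```

The half-life makes the memory difference concrete: about 6.6 days for $\lambda = 0.90$ against about 11.2 days for $\lambda = 0.94$.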


EWMA volatility at λ = 0.90 and λ = 0.94, against the slow 252-day equal-weighted average.

Lower $\lambda$ reacts faster and runs noisier; higher $\lambda$ is smoother; the 252-day average is a slow long-term view. All three capture the same big events (2008, 2020, 2022) but react very differently. EWMA is simple and fast and avoids the jumps of equal-weighted windows, but it is sensitive to the choice of $\lambda$ and has no mean reversion, so its forecast is flat at every horizon.

3. GARCH Models

GARCH models volatility as time-varying and clustered rather than constant: today’s variance is driven by the size of recent shocks (past squared residuals) and the most recent variance. A GARCH specification has three parts: a conditional mean equation, a conditional variance equation, and an error-distribution assumption.

GARCH(1,1)

Mean — returns fluctuate around a constant:

$r_t = \mu + \varepsilon_t$

Variance — today’s variance depends on yesterday’s shock and yesterday’s variance:

$\sigma_t^{\,2} = \omega + \alpha\, \varepsilon_{t-1}^{\,2} + \beta\, \sigma_{t-1}^{\,2}$

Errors — zero-mean shocks scaled by the conditional volatility:

$\varepsilon_t \mid \mathcal{I}_{t-1} \sim \mathcal{N}\!\left(0, \sigma_t^{\,2}\right)$

Fitting by maximum likelihood on the daily returns:

Parameter	Estimate	Reading
$\mu$	≈ 0	mean return, not significantly different from zero
$\omega$	0.00108	constant term; together with $\alpha + \beta$ it sets the long-run variance
$\alpha$	0.0349	reaction to yesterday’s shock
$\beta$	0.9622	persistence of volatility
$\alpha + \beta$	0.9972	near-unit persistence — shocks fade very slowly
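
Maximum likelihood here means choosing the parameters that minimize the negative Gaussian log-likelihood implied by the three equations above. The sketch below only evaluates that objective for given parameters; an actual fit would hand it to a numerical optimizer. Seeding the recursion with the sample variance of the residuals is an assumption, and the short return series is made up (in percent).

```python
import math

def garch11_filter(returns, mu, omega, alpha, beta):
    """Residuals and conditional variances from the GARCH(1,1) recursion."""
    eps = [r - mu for r in returns]
    var = sum(e * e for e in eps) / len(eps)   # assumed start value: sample variance
    variances = [var]
    for e in eps[:-1]:
        var = omega + alpha * e * e + beta * var
        variances.append(var)
    return eps, variances

def neg_log_likelihood(returns, mu, omega, alpha, beta):
    """Negative Gaussian log-likelihood of the returns under GARCH(1,1)."""
    eps, variances = garch11_filter(returns, mu, omega, alpha, beta)
    return 0.5 * sum(
        math.log(2 * math.pi * v) + e * e / v for e, v in zip(eps, variances)
    )

# Hypothetical percent returns; the parameters are the estimates from the table.
returns_pct = [0.12, -0.45, 0.30, 0.85, -0.10, 0.05, -0.60, 0.20]
print(neg_log_likelihood(returns_pct, mu=0.0, omega=0.00108, alpha=0.0349, beta=0.9622))
```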

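Long-term volatility is the level the process reverts to. The magnitude of $\omega$ suggests the model was fitted on returns expressed in percent, so the variance below is in percent squared: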

$\bar{\sigma}^2 = \frac{\omega}{1-(\alpha+\beta)} \;\Rightarrow\; \hat{\sigma}_{\text{ann}} = \sqrt{\bar{\sigma}^2}\times\sqrt{252} \approx 9.7\%$

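very close to the roughly 9.3% from the simple methods. The Ljung–Box test on the standardized residuals finds no significant autocorrelation at any tested lag (all p-values are well above 0.05), which is consistent with an adequate fit: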

Lag	lb-stat	lb p-value
1	1.351	0.245
10	11.819	0.297
21	18.508	0.617


GARCH(1,1) conditional annualized volatility, fitted by MLE, with the long-run level.

The estimate captures the known stress periods (2008–09, 2011–12, 2020, 2022) with sharp rises and slow declines — a visual signature of high persistence ( $\beta \approx 0.96$ ) with mean reversion toward ~9.7%.

GARCH(1,1) with AR(1) mean

Allowing the mean to depend on the previous return,

$r_t = \mu + \phi\, r_{t-1} + \varepsilon_t$

the extra term is not useful: $\phi \approx -0.017$ , small and insignificant. The variance parameters are essentially unchanged ( $\alpha+\beta = 0.9972$ , long-run ≈ 9.71%). Both models produce nearly identical volatility estimates — the constant-mean and AR(1) paths are visually indistinguishable.

Forecasting volatility with GARCH

GARCH forecasts revert slowly to the long-run level. Forecasting 250 days from the last data point: the path starts near the latest volatility and drifts down toward ~9.7% — slowly, because of the high persistence.


250-day-ahead GARCH(1,1) volatility forecast (annualized %), reverting to the long-run level.

Versus the alternatives: a moving average only looks backward; EWMA forecasts a flat line (no mean reversion); GARCH starts where we are and reverts — a more realistic forecast. For a one-month option the near-term forecast (~11.5%) is a good vol input; for six months, the average of the daily-forecast path (~10.5–11%) is better.
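
The mean-reverting path follows from iterating the variance equation in expectation: $\sigma_{T+h}^{\,2} = \bar{\sigma}^2 + (\alpha+\beta)^{h-1}\left(\sigma_{T+1}^{\,2} - \bar{\sigma}^2\right)$. The sketch below uses the rounded estimates from the parameter table and treats variances as percent squared, which is an assumption based on the size of $\omega$. The starting level is backed out of the ~11.5% near-term figure, and averaging variances (rather than volatilities) over the horizon is one possible reading of "average of the path", so the output will not match the text exactly.

```python
import math

def garch_forecast_path(next_var, omega, alpha, beta, horizon):
    """Expected daily variance for h = 1..horizon days ahead."""
    long_run = omega / (1 - alpha - beta)
    persistence = alpha + beta
    return [
        long_run + persistence ** (h - 1) * (next_var - long_run)
        for h in range(1, horizon + 1)
    ]

def annualized_vol_pct(daily_var_pct2, trading_days=252):
    """Annualized volatility in percent from a daily variance in percent squared."""
    return math.sqrt(daily_var_pct2 * trading_days)

def average_vol(path, days):
    """Volatility over the first `days` of the path: root of the mean variance."""
    return annualized_vol_pct(sum(path[:days]) / days)

omega, alpha, beta = 0.00108, 0.0349, 0.9622   # rounded estimates from the table
start_var = 11.5 ** 2 / 252                    # daily variance matching ~11.5% annualized
path = garch_forecast_path(start_var, omega, alpha, beta, horizon=250)

print(f"long-run level: {annualized_vol_pct(omega / (1 - alpha - beta)):.2f}%")
print(f"1-month input:  {average_vol(path, 21):.2f}%")
print(f"6-month input:  {average_vol(path, 126):.2f}%")
```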

EGARCH

EGARCH models the log-variance — avoiding positivity constraints — and distinguishes good from bad news (the leverage effect):

$\ln \sigma_t^{\,2} = \omega + \alpha \left( |z_{t-1}| - \mathbb{E}|z_{t-1}| \right) + \gamma\, z_{t-1} + \beta \ln \sigma_{t-1}^{\,2}$
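
A one-step version of this recursion shows where the asymmetry comes from. In the sketch below only $\gamma$ is the value reported in the text; `omega`, `alpha`, `beta` and the starting log-variance are hypothetical, since the article does not list the other EGARCH estimates. $\mathbb{E}|z|$ is taken as $\sqrt{2/\pi}$, which holds for standard normal shocks.

```python
import math

E_ABS_Z = math.sqrt(2 / math.pi)   # E|z| for a standard normal shock

def egarch_next_log_var(log_var, z, omega, alpha, gamma, beta):
    """One step of the EGARCH(1,1) log-variance recursion."""
    return omega + alpha * (abs(z) - E_ABS_Z) + gamma * z + beta * log_var

# gamma is the reported estimate; the other values are hypothetical.
params = dict(omega=0.0, alpha=0.08, gamma=-0.0118, beta=0.99)
log_var = math.log(0.37)           # hypothetical current daily variance (percent squared)

after_neg = egarch_next_log_var(log_var, z=-2.0, **params)
after_pos = egarch_next_log_var(log_var, z=+2.0, **params)
print(f"log-variance after z = -2: {after_neg:.4f}")
print(f"log-variance after z = +2: {after_pos:.4f}")
print(f"extra response to the negative shock: {after_neg - after_pos:.4f}")
```

The gap between the two responses equals $-2\gamma|z|$, so with a negative $\gamma$ a negative shock always leaves the log-variance higher than a positive shock of the same size. Exponentiating the log-variance keeps the variance positive for any parameter values.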

The asymmetry parameter $\gamma = -0.0118$ is negative and significant: negative return shocks (a falling EUR/USD) raise volatility more than positive shocks of the same size. Overall fit is nonetheless slightly worse than plain GARCH:

	GARCH	AR-GARCH	EGARCH
Log-likelihood	−5278.119268	−5273.500078	−5287.804381
AIC	10564.238536	10557.000156	10585.608763
BIC	10591.418442	10590.974281	10619.583645
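
The criteria in this table follow from the log-likelihoods via $\text{AIC} = 2k - 2\ln L$ and $\text{BIC} = k \ln N - 2\ln L$. The parameter counts below ($k = 4$ for GARCH, $k = 5$ for the other two) are not stated in the text; they are inferred because they reproduce the reported AIC values. The helper `implied_n` is a hypothetical convenience that inverts the BIC formula to recover the sample size.

```python
import math

def aic(loglik, k):
    """Akaike information criterion."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion."""
    return k * math.log(n) - 2 * loglik

def implied_n(bic_value, loglik, k):
    """Sample size implied by a reported BIC and log-likelihood."""
    return math.exp((bic_value + 2 * loglik) / k)

# (log-likelihood, inferred parameter count, reported BIC) per model.
models = {
    "GARCH": (-5278.119268, 4, 10591.418442),
    "AR-GARCH": (-5273.500078, 5, 10590.974281),
    "EGARCH": (-5287.804381, 5, 10619.583645),
}
for name, (ll, k, reported_bic) in models.items():
    n = implied_n(reported_bic, ll, k)
    print(f"{name:9s} AIC = {aic(ll, k):.3f}  implied N = {n:.0f}")
```

Lower values are better on both criteria, and the implied sample size comes out at roughly 6,600 daily observations for all three models.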

If simplicity and overall fit are prioritized, choose GARCH(1,1); if the leverage effect matters for economic realism, prefer EGARCH(1,1).

4. Implied Volatility

So far we forecast volatility from history. An alternative reads it out of option prices: implied volatility (IV) is the volatility that makes the Black–Scholes price match the market price — the market’s own expectation over the option’s life. In practice IV varies with strike and maturity, so volatility is not constant even though Black–Scholes assumes it is. (Covrig & Low, 2003, show that quoted OTC ATM vols give a cleaner, often unbiased measure than vols backed out of exchange-traded prices.)

Information content and efficiency of IV

Using the Mincer–Zarnowitz framework, estimated by OLS with Newey–West standard errors:

$RV_{t+1} = \alpha + \beta\, IV_t + \epsilon_{t+1}$

The coefficient $\beta = 0.0678$ is positive (higher IV today → higher realized vol next month) but significant only at 10%, with $R^2 = 0.027$ . Testing unbiasedness ( $H_0: \alpha = 0,\ \beta = 1$ ) via a Wald test rejects decisively ( $F = 295.69$ , $\chi^2 = 591.37$ , both $p \approx 0$ ): in this dataset, implied volatility is statistically biased.
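
The point estimates in a Mincer–Zarnowitz regression are ordinary least squares, which can be sketched without any library. The example below covers only $\alpha$, $\beta$ and $R^2$; Newey–West standard errors and the Wald test are not implemented. The implied and realized volatility series are synthetic, built with a deliberate bias so that the estimates move away from the unbiased benchmark of $\alpha = 0$, $\beta = 1$.

```python
import random

def ols_line(x, y):
    """Intercept, slope and R^2 of y = a + b * x by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot

# Synthetic data (percent vol): realized vol sits below implied vol on average.
random.seed(7)
iv = [random.uniform(6, 14) for _ in range(300)]
rv = [0.9 * v - 1.0 + random.gauss(0, 1.5) for v in iv]

alpha_hat, beta_hat, r2 = ols_line(iv, rv)
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}, R^2 = {r2:.3f}")
```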


1-month implied volatility vs subsequent 21-day realized volatility. IV sits structurally above realized — the volatility risk premium.

Forecasting accuracy

Comparing forecasts on MAE and MSE (MSE punishes large errors more):

Model	MAE	MSE
Implied Volatility	5.123%	0.630%
Historical Vol	5.340%	0.800%
EWMA	higher	0.500%
GARCH	higher	0.523%
EGARCH	higher	0.529%

On MAE, IV is most accurate; on MSE, EWMA wins (GARCH/EGARCH close behind). Best model depends on the objective: smallest average error → IV; avoiding large errors → EWMA.
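
A small made-up example shows how the two metrics can rank forecasts differently. Forecast A is always off by a little; forecast B is usually exact but misses badly once. The numbers are illustrative and unrelated to the table above.

```python
def mae(forecast, realized):
    """Mean absolute error."""
    return sum(abs(f - r) for f, r in zip(forecast, realized)) / len(realized)

def mse(forecast, realized):
    """Mean squared error."""
    return sum((f - r) ** 2 for f, r in zip(forecast, realized)) / len(realized)

realized = [10.0, 10.0, 10.0, 10.0]      # hypothetical realized vol (%)
forecast_a = [11.0, 11.0, 11.0, 11.0]    # small error every period
forecast_b = [10.0, 10.0, 10.0, 13.0]    # exact three times, one large miss

for name, f in (("A", forecast_a), ("B", forecast_b)):
    print(f"{name}: MAE = {mae(f, realized):.2f}, MSE = {mse(f, realized):.2f}")
```

B has the lower MAE (0.75 against 1.00) while A has the lower MSE (1.00 against 2.25), because squaring gives the single large miss more weight.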

5. Range-Based Models

The methods above use close-to-close returns, ignoring intraday movement. Range-based estimators use the high/low (and open/close) prices to extract more information.

Charts for this section aren’t reproduced here — the range estimators need daily High/Low/Open data, which isn’t in the close + implied-vol dataset driving the charts above.

Parkinson’s historical volatility

The Parkinson estimator (1980) needs only the daily high $H$ and low $L$ . The single-day variance from the range is:

$\sigma_{k,\text{Park}}^{\,2} = \frac{1}{4\ln 2} \left[ \ln\!\left( \frac{H_k}{L_k} \right) \right]^2$

and the rolling estimate over $w$ days averages the daily values and annualizes with $N_{\text{days}}$ trading days per year:

$\hat{\sigma}_{t,\text{Park}} = \sqrt{ N_{\text{days}} \cdot \frac{1}{w} \sum_{k=t-w+1}^{t} \sigma_{k,\text{Park}}^{\,2} }$

Over 2005–2025 with a 20-day window, Parkinson and standard close-to-close volatility track each other very closely; Parkinson averages slightly higher (8.52% vs 8.25%, average difference 0.27%), with the maximum gap ≈ 5.43% (annualized) on Jan 14, 2009. The intraday range carries some information missed by closing prices, but the practical difference here is minor.
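
A minimal sketch of the Parkinson calculation, assuming 252 trading days per year as elsewhere in this study and the 20-day window mentioned above. The high and low series are made up, since the High/Low data are not part of the dataset behind this page.

```python
import math

def parkinson_daily_var(high, low):
    """Single-day Parkinson variance from the high-low range."""
    return math.log(high / low) ** 2 / (4 * math.log(2))

def parkinson_rolling_vol(highs, lows, w=20, trading_days=252):
    """Annualized Parkinson volatility for each full w-day window."""
    daily = [parkinson_daily_var(h, l) for h, l in zip(highs, lows)]
    return [
        math.sqrt(trading_days * sum(daily[end - w:end]) / w)
        for end in range(w, len(daily) + 1)
    ]

# Hypothetical highs and lows: the daily range widens in the last five days.
lows = [1.0800] * 25
highs = [1.0860] * 20 + [1.0950] * 5
vols = parkinson_rolling_vol(highs, lows)
print([f"{v:.2%}" for v in vols])
```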

Garman–Klass

The Garman–Klass (GK) estimator (1980) uses all four OHLC prices, combining the high-low range with the open-to-close change:

$\sigma_{\text{GK}}^{\,2} = \frac{1}{2}\left[\ln\!\frac{H}{L}\right]^2 - (2\ln 2 - 1)\left[\ln\!\frac{C}{O}\right]^2$

For EUR/USD, gaps are uncommon (it trades 24/5), so the OHLC advantage is limited. GK averages slightly above the standard measure (8.60% vs 8.25%, average difference 0.35%). Across all three (standard, Parkinson, Garman–Klass), the dynamics are highly similar; the range-based methods read marginally higher on average, but none has the dynamic, forward-looking modeling of the GARCH family.
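
The single-day Garman–Klass variance is equally short to compute. The sketch below uses hypothetical OHLC values for two days with the same range: one that closes where it opened and one that trends from the low to the high.

```python
import math

def garman_klass_daily_var(open_, high, low, close):
    """Single-day Garman-Klass variance from OHLC prices."""
    range_term = 0.5 * math.log(high / low) ** 2
    drift_term = (2 * math.log(2) - 1) * math.log(close / open_) ** 2
    return range_term - drift_term

def annualized_vol(daily_var, trading_days=252):
    """Annualized volatility from a daily variance."""
    return math.sqrt(daily_var * trading_days)

# Hypothetical days with an identical high-low range.
round_trip = garman_klass_daily_var(1.0830, 1.0860, 1.0800, 1.0830)
trend_day = garman_klass_daily_var(1.0800, 1.0860, 1.0800, 1.0860)
print(f"round-trip day: {annualized_vol(round_trip):.2%}")
print(f"trend day:      {annualized_vol(trend_day):.2%}")
```

The trend day reads lower because the open-to-close term removes the part of the range that is explained by the day's net move.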

The original study carried out the methods and estimation in Python (SciPy for statistical tests; standard econometrics tooling for the GARCH/EGARCH fits). The charts on this page are recomputed independently in JavaScript from the LSEG/Datastream daily EUR/USD series (1999–2025); descriptive statistics and GARCH parameters match the original to within rounding.