Let’s begin with the ARCH model.

ARCH Model

The ARCH model, i.e. the Autoregressive Conditional Heteroskedasticity model, was introduced by Engle (1982).

We assume here \(u_t\) is the return.

$$ u_t=\frac{P_t-P_{t-1}}{P_{t-1}} $$

$$u_t\sim N(0,\sigma_t^2)$$

The data-generating process (DGP) has an AR form, as the name ARCH suggests: the volatility is autoregressively generated by the lagged squared returns \(u^2_{t-i}\).

$$\sigma_t^2=\delta_0+\sum_{i=1}^{p} \delta_i u_{t-i}^2$$

, where \(p\) is the number of lags and the \(\delta_i\) are parameters. The DGP of this model shows that the volatility of the return is heteroskedastic: it is correlated with the return's own squared lags.

For example, an ARCH(1) model is,

$$ \sigma_t^2=\delta_0+\delta_1 u^2_{t-1} $$
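As a quick sketch, we can simulate an ARCH(1) process and check that its sample variance matches the unconditional variance \(\delta_0/(1-\delta_1)\) derived below. The parameter values are illustrative assumptions, not estimates from data.

```python
import numpy as np

# Simulate an ARCH(1) process and check that the sample variance matches the
# unconditional variance delta0 / (1 - delta1). The parameter values are
# illustrative assumptions, not estimates from data.
rng = np.random.default_rng(0)
delta0, delta1, T = 0.1, 0.5, 10_000

u = np.zeros(T)
sigma2 = np.zeros(T)
sigma2[0] = delta0 / (1 - delta1)        # start from the unconditional variance
u[0] = rng.normal(0.0, np.sqrt(sigma2[0]))
for t in range(1, T):
    sigma2[t] = delta0 + delta1 * u[t - 1] ** 2
    u[t] = rng.normal(0.0, np.sqrt(sigma2[t]))

print(u.var())   # should be close to 0.1 / (1 - 0.5) = 0.2
```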

  • Stationarity

Note that we need our time series to be stationary for better forecasting, so the unconditional variance is constant, \(Var(u_t)=\sigma^2\). Taking the unconditional variance of the ARCH(1) process,

$$ Var(u_t)=\delta_0+\delta_1 Var(u_{t-1}) $$

$$ \sigma^2=\frac{\delta_0}{1-\delta_1} $$

As the variance has to be positive, we need \(\delta_0 > 0\) and \(0 \le \delta_1<1\).

  • Estimation

For this kind of time series, the OLS assumptions are violated, because the series exhibits autoregressive conditional heteroskedasticity.

Instead, Maximum Likelihood Estimation (MLE) is a better estimation method, since it assumes a probability distribution for the variables.

MLE iterates to find the parameters \(\delta\) that maximise the likelihood function.
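As a sketch of how this works, the conditional log-likelihood of an ARCH(1) model can be maximised numerically. The simulated series, the true parameters (0.1, 0.5), and the starting values are illustrative assumptions; in practice a dedicated package (e.g. the `arch` Python package) would normally be used.

```python
import numpy as np
from scipy.optimize import minimize

def arch1_neg_loglik(params, u):
    """Negative conditional log-likelihood of an ARCH(1) model."""
    delta0, delta1 = params
    sigma2 = delta0 + delta1 * u[:-1] ** 2            # sigma_t^2 for t = 2..T
    ll = -0.5 * (np.log(2 * np.pi) + np.log(sigma2) + u[1:] ** 2 / sigma2)
    return -ll.sum()

# Simulate an ARCH(1) series with true (delta0, delta1) = (0.1, 0.5).
rng = np.random.default_rng(1)
u = np.zeros(5_000)
for t in range(1, 5_000):
    sigma2_t = 0.1 + 0.5 * u[t - 1] ** 2
    u[t] = rng.normal(0.0, np.sqrt(sigma2_t))

# Maximise the likelihood (minimise the negative log-likelihood) under the
# stationarity constraints delta0 > 0, 0 <= delta1 < 1.
res = minimize(arch1_neg_loglik, x0=[0.05, 0.2], args=(u,),
               bounds=[(1e-6, None), (0.0, 0.999)])
print(res.x)   # should be close to the true (delta0, delta1) = (0.1, 0.5)
```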


GARCH Model

The ‘G’ in the GARCH model means ‘generalised’: the GARCH model adds a set of lagged-variance terms, \(\sum_j \gamma_j \sigma^2_{t-j} \). Thus, the DGP of the GARCH(p,q) model is as follows,

$$u_t\sim N(0,\sigma_t^2)$$

$$ \sigma_t^2=\delta_0 + \sum_{i=1}^{p} \delta_i u^2_{t-i} +\sum^q_{j=1} \gamma_j \sigma^2_{t-j} $$
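A minimal simulation sketch of a GARCH(1,1) process. The parameter values are illustrative and satisfy \(\delta_1+\gamma_1<1\), under which the process is stationary with unconditional variance \(\delta_0/(1-\delta_1-\gamma_1)\).

```python
import numpy as np

# Simulate a GARCH(1,1) process. The parameter values are illustrative and
# satisfy delta1 + gamma1 < 1, so the process is stationary with
# unconditional variance delta0 / (1 - delta1 - gamma1).
rng = np.random.default_rng(0)
delta0, delta1, gamma1, T = 0.05, 0.1, 0.85, 20_000

u = np.zeros(T)
sigma2 = np.full(T, delta0 / (1 - delta1 - gamma1))   # start from unconditional variance
for t in range(1, T):
    sigma2[t] = delta0 + delta1 * u[t - 1] ** 2 + gamma1 * sigma2[t - 1]
    u[t] = rng.normal(0.0, np.sqrt(sigma2[t]))

print(u.var())   # should be close to 0.05 / (1 - 0.1 - 0.85) = 1.0
```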


ARMA-GARCH Model

A further application is the ARMA-GARCH model, in which the GARCH model is applied to mimic the movement of the error terms of an ARMA model.

We initially assume an ARMA(p,q) model,

$$ y_t=\beta_0 +\sum^p_{i=1} \beta_i y_{t-i} + \sum^{q}_{j=1} \theta_j u_{t-j} +u_t$$

Then, we assume the error term here, \(u_t \sim GARCH(m,n)\).

$$ u_t \sim N(0,\sigma_t^2)$$

$$ \sigma_t^2 = \delta_0 +\sum^m_{i=1} \delta_i u_{t-i}^2 +\sum_{j=1}^n \gamma_j \sigma_{t-j}^2 $$


Engle, R.F., 1982. Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50(4), pp.987–1007.

Something More about the Solow Model

The Solow model as commonly used today always has a depreciation term, and thus the law of motion becomes \(\dot{K}=I-\delta K\).

The mainstream model also makes different assumptions about the production function; in particular, technological progress is generally added: 1. \(Y=AF(K,L)\), in which technology is exogenous and is called Hicks-neutral; 2. \(Y=F(K,AL)\), which represents efficient workers and is labour-augmenting, or Harrod-neutral; 3. \(Y=F(AK,L)\), in which technological progress is capital-augmenting.

Applying, for example, labour-augmenting technology with \( \frac{\dot{A}}{A}=g\), we can solve the Solow model as follows,


$$ \frac{\dot{k}}{k}= \frac{\dot{K}}{K}- \frac{\dot{A}}{A}- \frac{\dot{L}}{L} $$

$$ \frac{\dot{k}}{k}= \frac{sY-\delta K}{K}- \frac{\dot{A}}{A}- \frac{\dot{L}}{L} $$

$$ \frac{\dot{k}}{k}= \frac{sY}{K}-\delta-g- n $$

$$ \dot{k}=sy-(\delta+g+n)k $$

, where \(y=\frac{Y}{AL}\) and \(k=\frac{K}{AL}\) represent output and capital per effective worker. Therefore, if \(\dot{k}=0\), then \(sy=(\delta+g+n)k\).

The stable point of \(k\) is \(k^*\), at which \(sf(k^*)=(\delta+n+g)k^*\).

We usually use the Cobb-Douglas function to represent the production function, because it satisfies constant returns to scale (CRTS), the increasing and diminishing marginal product assumptions, and the Inada conditions (\(\lim_{k\rightarrow0}f'(k)=\infty; \lim_{k\rightarrow \infty}f'(k)=0\); Inada, 1963).
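With Cobb-Douglas production \(y=k^{\alpha}\), the steady state condition \(sf(k^*)=(\delta+n+g)k^*\) can be solved in closed form. A small numerical check, with illustrative parameter values:

```python
# Solve for the steady state k* under Cobb-Douglas production, y = k**alpha,
# where s*f(k*) = (delta + n + g)*k*. Parameter values are illustrative.
alpha, s, delta, g, n = 0.3, 0.25, 0.05, 0.02, 0.01

k_star = (s / (delta + g + n)) ** (1 / (1 - alpha))
print(k_star)

# Check: at k*, saving per effective worker equals break-even investment.
lhs = s * k_star ** alpha
rhs = (delta + g + n) * k_star
print(abs(lhs - rhs))   # essentially zero
```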

In the following, we analyse the model in terms of effective workers.

Balanced Growth Path

All the following is assuming the economy is at the steady state or stable point.

For \( \frac{\dot{K}}{K} \),

$$ k=\frac{K}{AL} $$

Taking logarithms,

$$ ln(k)=ln(K)-ln(A)-ln(L) $$

Differentiating with respect to time and setting \(\dot{k}=0\) (based on our previous derivation of the steady state condition),

$$ \frac{\dot{K}}{K} = \frac{\dot{A}}{A} + \frac{\dot{L}}{L} $$

$$ \frac{\dot{K}}{K} = g+n $$

For \( \frac{\dot{Y}}{Y} \), the derivation is similar to the original Solow model. Recall \(Y=F(K,AL)\).


Differentiate w.r.t. \(t\),

$$ \frac{\dot{Y}}{Y}=\frac{ \dot{K}F'_1+\dot{A}LF'_2+ A\dot{L}F'_2 }{F(K,AL)} $$

Applying Euler's Theorem to the denominator (see math tools),

P.S. differentiating \(tY=F(tK,tAL)\) w.r.t. \(t\) gives \(Y=F'_1 K+F'_2 AL\).

$$ \frac{\dot{Y}}{Y}=\frac{ \dot{K}F'_1+\dot{A}LF'_2+ A\dot{L}F'_2 }{ F'_1 K+F'_2 AL } $$

Dividing both numerator and denominator by \(KAL\),

$$ \frac{\dot{Y}}{Y}=\frac{ \frac{\dot{K}}{KAL}F'_1+\frac{\dot{A}L}{KAL}F'_2+\frac{ A\dot{L}}{KAL}F'_2 }{ \frac{F'_1 K}{KAL}+\frac{F'_2 AL}{KAL} } $$

$$ \frac{\dot{Y}}{Y}=\frac{ \frac{\dot{K}}{K}\frac{F'_1}{AL}+\frac{\dot{A}}{A}\frac{F'_2}{K}+\frac{ \dot{L}}{L}\frac{F'_2}{K} }{ \frac{F'_1 }{AL}+\frac{F'_2 }{K} }= \frac{ \frac{\dot{K}}{K}\frac{F'_1}{AL}+(\frac{\dot{A}}{A}+\frac{\dot{L}}{L})\frac{F'_2}{K} }{ \frac{F'_1 }{AL}+\frac{F'_2 }{K} } $$

$$ \frac{\dot{Y}}{Y}= (n+g)\frac{ \frac{F'_1}{AL}+\frac{F'_2}{K} }{ \frac{F'_1 }{AL}+\frac{F'_2 }{K} } = \frac{\dot{K}}{K}\cdot\frac{ \frac{F'_1}{AL}+\frac{F'_2}{K} }{ \frac{F'_1 }{AL}+\frac{F'_2 }{K} } $$

$$ \frac{\dot{Y}}{Y}=n+g = \frac{\dot{K}}{K} $$

For \( \frac{\dot{y}}{y} \) (as \(y=\frac{Y}{AL}\)),


$$ \frac{\dot{y}}{y}= \frac{\dot{Y}}{Y}- \frac{\dot{A}}{A}- \frac{\dot{L}}{L} =(n+g)-n-g $$

$$ \frac{\dot{y}}{y} =0$$

Similarly, for per capita terms,

For \( \frac{\dot{K/L}}{K/L} \), per capita capital,

$$\frac{\dot{K/L}}{K/L}=\frac{ \frac{\dot{K}L-K\dot{L}}{L^2} }{K/L}$$

$$\frac{\dot{K/L}}{K/L}=\frac{\dot{K}L-K\dot{L}}{KL}= \frac{\dot{K}}{K}- \frac{\dot{L}}{L} $$

$$ \frac{\dot{K/L}}{K/L} =(g+n)-n=g $$

For \( \frac{\dot{Y/L}}{Y/L} \) (per capita output), we apply the same transformation as for \(K/L\),

$$ \frac{\dot{Y/L}}{Y/L}= \frac{\dot{Y}}{Y}- \frac{\dot{L}}{L} =g $$

In summary, the BGP is a situation in which each variable of the model is growing at a constant rate. On the balanced growth path, the growth rate of output per worker is determined solely by the rate of growth of technology.
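The balanced growth path can also be checked numerically: simulate the model with labour-augmenting technology and verify that, once the economy is near the BGP, \(Y\) grows at \(n+g\). The parameter values and the Euler discretisation below are illustrative assumptions.

```python
import numpy as np

# Simulate the Solow model with labour-augmenting technology,
# Y = K^alpha (AL)^(1-alpha), and check that near the balanced growth path
# Y grows at n + g. Parameter values are illustrative assumptions.
alpha, s, delta, g, n = 0.3, 0.25, 0.05, 0.02, 0.01
dt = 0.01

K, A, L = 1.0, 1.0, 1.0
for _ in range(50_000):            # 500 units of time: long enough to converge
    Y = K ** alpha * (A * L) ** (1 - alpha)
    K += (s * Y - delta * K) * dt
    A *= 1 + g * dt
    L *= 1 + n * dt

Y0 = K ** alpha * (A * L) ** (1 - alpha)
for _ in range(100):               # one more unit of time on the BGP
    Y = K ** alpha * (A * L) ** (1 - alpha)
    K += (s * Y - delta * K) * dt
    A *= 1 + g * dt
    L *= 1 + n * dt
Y1 = K ** alpha * (A * L) ** (1 - alpha)

print(np.log(Y1 / Y0))   # should be close to n + g = 0.03
```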

P.S. Technology Independent of Labour And Capital

Applying, for example, the Type 1 case (\(Y=AF(K,L)\)) with \( \frac{\dot{A}}{A}=g\), we can solve the Solow model as follows.

We do not use capital per effective worker here, because labour is not technology-augmented by assumption. Instead, we simply use capital per capita, \(k=\frac{K}{L}\). We can easily get the relationship,

$$ \frac{\dot{k}}{k}= \frac{\dot{K}}{K}- \frac{\dot{L}}{L} $$

By setting \(\dot{k}=0\), we find \( \frac{\dot{K}}{K}=\frac{\dot{L}}{L}=n \), which is the same as in Solow's original work.

However, the difference arises when we deal with output. As output is now \(Y=AF(K,L)\), the change in output (the numerator) is,

$$ \dot{Y}=\dot{A}F(K,L)+AF'_1\dot{K}+AF'_2\dot{L} $$

We expand output itself (the denominator) by Euler's Theorem, \(Y=AF'_1K+AF'_2L\) (\(A\) is now outside the production function), and then calculate the percentage change in output,

$$ \frac{\dot{Y}}{Y}=\frac{ \dot{A}F(K,L)+AF'_1\dot{K}+AF'_2\dot{L} }{ AF'_1K+AF'_2L } $$

Dividing both numerator and denominator by \(AKL\),

$$ \frac{\dot{Y}}{Y}=\frac{ \dot{A}F(K,L) }{ AF(K,L) }+\frac{\frac{F'_1}{L}\frac{\dot{K}}{K}+\frac{F'_2}{K}\frac{\dot{L}}{L} }{ \frac{F'_1}{L}+\frac{F'_2}{K} } $$

$$ \frac{\dot{Y}}{Y}= \frac{\dot{A}}{A}+ \frac{\dot{L}}{L}=g+n $$

Saving Rates

We now consider how changes in the saving rate affect these variables.

The determinants of the saving rate include, for example, uncertainty or a decrease in expected income, and the required pension rate.

See the following figures,

An increase in the saving rate shifts the investment curve upward. \(\dot{K}=I-\delta K\) tells us that \(\dot{K}\) jumps up initially, and, by the shape of the production function, the gap diminishes until the new stable point \(k^*_{new}\) is reached.

As \(\dot{k}\) is the derivative of \(k\) w.r.t. \(t\), we can easily get the time path of \(k\) as follows,

Another important factor is the growth rate of output per capita,

Also \(ln(Y/L)\),

For this one, we can show that the slope of \(ln(Y/L)\) is \(\frac{d}{dt}ln(Y/L)=\frac{d}{dt}[ln(Y)-ln(L)]=(g+n)-n=g\), so it grows at the constant rate \(g\) before \(t_0\). After \(t_0\), the jump in the growth rate of \(Y\) makes the slope of \(ln(Y/L)\) steeper, but the slope returns to \(g\) once the new steady state is reached, and \(Y/L\) keeps growing at rate \(g\) in the long run.

The Speed of Convergence

Way 1

We follow our Solow model with labour-augmented technology. The time path of capital per effective worker is,

$$ \dot{k}=sy-(\delta+n+g)k$$


At the steady state, \(\dot{k}=0\), so \( sy=(\delta+n+g)k \). We then plug in the Cobb-Douglas production function, and denoting \(y=\frac{Y}{AL}=\frac{K^{\alpha}(AL)^{1-\alpha}}{AL}=k^{\alpha}\), we can find \(k^*\),

$$ k^*=(\frac{s}{\delta+g+n})^{\frac{1}{1-\alpha}} $$

And get the path of k,

$$ \frac{\dot{k}}{k}=sk^{\alpha-1}-(\delta+g+n) :=G(k)$$

To find the speed of convergence, we focus on the time path of \(k\) around \(k^*\), approximating it by a first-order Taylor expansion around \(k^*\),

$$ G(k)\approx G(k^*)+G'(k^*)(k-k^*) $$

As \(G(k^*)=0\) by the steady state condition,

$$ G(k)\approx (\alpha-1)s {k^*}^{\alpha-1}\frac{k-k^*}{k^*} $$

We plug the steady state \(k^*\) back into the above equation and get,

$$ G(k)=-(1-\alpha)(\delta+g+n)\frac{k-k^*}{k^*} $$

Therefore, we have found the mathematical expression for the convergence speed, \( (1-\alpha)(\delta+g+n) \). It measures how quickly \(k\) changes when \(k\) deviates from \(k^*\). We also find that the growth rate \( G(k)=\frac{\dot{k}}{k} \) depends on both the convergence speed and \( \frac{k-k^*}{k^*} \), which is how far \(k\) deviates from its steady state level.

Taking instead a Taylor expansion of \(G\) in \(ln(k)\) around \(k^*\), we get,

$$ G(k)=-(1-\alpha)(\delta+g+n)(ln(k)-ln(k^*)) :=g_k$$

Then, to find the convergence speed of output, we apply \(y=k^{\alpha}\) and take logarithms, \(ln(y)=\alpha ln(k)\). Differentiating w.r.t. \(t\),

$$ \frac{\dot{y}}{y} =\alpha\frac{\dot{k}}{k} $$

$$ g_y:=\frac{\dot{y}}{y} =\alpha( -(1-\alpha)(\delta+g+n)(ln(k)-ln(k^*))) \\= -(1-\alpha)(\delta+g+n)(ln(y)-ln(y^*)) $$

So we get \(g_y=\alpha g_k\), and \(\beta= (1-\alpha)(\delta+g+n) \) is the speed of convergence. It measures how quickly \(y\) increases when \(y<y^*\). The growth rate of y depends on the speed of convergence, \(\beta\), and the log-difference between \(y\) and \(y^*\).
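This result is easy to sanity-check numerically: integrate \(\dot{k}=sk^{\alpha}-(\delta+g+n)k\) from a point below \(k^*\) and measure how fast the log-gap shrinks. The parameter values below are illustrative assumptions.

```python
import numpy as np

# Numerically check the convergence speed beta = (1 - alpha)(delta + g + n):
# integrate k_dot = s*k**alpha - (delta + g + n)*k from below k* and measure
# how fast the log-gap to k* shrinks. Parameter values are illustrative.
alpha, s, delta, g, n = 0.3, 0.25, 0.05, 0.02, 0.01
beta = (1 - alpha) * (delta + g + n)               # theoretical speed of convergence

k_star = (s / (delta + g + n)) ** (1 / (1 - alpha))
k, dt = 0.9 * k_star, 0.01                         # start 10% below the steady state
gap0 = np.log(k_star) - np.log(k)
for _ in range(100):                               # Euler steps over one unit of time
    k += (s * k ** alpha - (delta + g + n) * k) * dt
gap1 = np.log(k_star) - np.log(k)

# The log-gap should shrink at roughly rate beta per unit of time:
print(np.log(gap0 / gap1), beta)
```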

Way 2

We take first order Taylor approximation to \(f(k)=\dot{k}\) around \(k=k^*\).

$$ \dot{k} \approx \dot{k}|_{k=k^*}+\frac{\partial \dot{k}}{\partial k}|_{k=k^*}(k-k^*) $$

By definition of steady state condition, the first term of RHS is zero. So,

$$ \dot{k}\approx -\lambda \cdot (k-k^*) $$

We denote \(\lambda:=-\frac{\partial \dot{k}}{\partial k}\big|_{k=k^*}\) as the speed of convergence. As \(\dot{k}=sy-(\delta+g+n)k=sk^{\alpha}- (\delta+g+n)k\), we have,

$$ \lambda=(\delta+g+n)-s\alpha {k^*}^{\alpha-1} $$

Plugging \(k^*\) in, we get,

$$ \lambda=(1-\alpha)(\delta+g+n) $$

To see why we call \(\lambda\) the speed of convergence, solve the differential equation \( \dot{k}\approx -\lambda \cdot (k-k^*) \) over the interval from 0 to \(t\).

$$ \dot{k}=\frac{\partial k}{\partial t} \approx -\lambda \cdot (k-k^*) $$

$$ \frac{1}{k-k^*} dk=-\lambda dt $$

$$\int_{k(0)}^{k(t)} \frac{1}{k-k^*} dk=\int_{0}^t -\lambda dt $$

$$ [ln(k-k^*)]\big|^{k(t)}_{k(0)}=-\lambda t $$

$$ ln(k(t)-k^*)=-\lambda t+ ln(k(0)-k^*) $$


$$ k(t)=k^*+e^{-\lambda t}[ k(0)-k^* ] $$

Or in other form,

$$ ln(\frac{k(t)-k^*}{k(0)-k^*})=-\lambda t $$
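One convenient implication of \(k(t)-k^*=e^{-\lambda t}(k(0)-k^*)\) is a half-life formula: the time needed to close half of the initial gap solves \(e^{-\lambda t}=\frac{1}{2}\), i.e. \(t=ln(2)/\lambda\). A quick calculation with illustrative parameter values:

```python
import math

# Half-life of convergence implied by k(t) - k* = e^(-lambda t) (k(0) - k*):
# the time to close half the initial gap is ln(2) / lambda.
# Parameter values are illustrative, not calibrated to any economy.
alpha, delta, g, n = 0.3, 0.05, 0.02, 0.01
lam = (1 - alpha) * (delta + g + n)     # lambda = 0.7 * 0.08 = 0.056 per unit of time
half_life = math.log(2) / lam
print(half_life)   # about 12.4
```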

Solow Residuals

Recall our labour-augmented production function, \(Y(t)=F(K(t),A(t)L(t))\).

$$ \dot{Y}=\frac{\partial Y}{\partial t}=\frac{\partial Y(t)}{\partial K(t)}\dot{K}(t)+ \frac{\partial Y(t)}{\partial L(t)}\dot{L}(t)+ \frac{\partial Y(t)}{\partial A(t)}\dot{A}(t) $$

Then, we apply the following replacement equations to the expression above,

$$ \frac{\partial Y(t)}{\partial L(t)}=\frac{\partial Y(t)}{\partial A(t)L(t)}\cdot A(t) $$

$$ \frac{\partial Y(t)}{\partial A(t)}=\frac{\partial Y(t)}{\partial A(t)L(t)}\cdot L(t) $$

Then we get,

$$ \frac{ \dot{Y}}{Y}=\frac{\partial Y(t)}{\partial K(t)}\frac{K(t)}{Y(t)}\frac{\dot{K}(t)}{K(t)}+ \frac{\partial Y(t)}{\partial L(t)}\frac{L(t)}{Y(t)}\frac{\dot{L}(t)}{L(t)}+ \frac{\partial Y(t)}{\partial A(t)}\frac{A(t)}{Y(t)}\frac{\dot{A}(t)}{A(t)}\\=\epsilon_{Y,K}(t)\frac{\dot{K}(t)}{K(t)}+ \epsilon_{Y,L}(t)\frac{\dot{L}(t)}{L(t)}+R(t) $$

, where we denote \(R(t)\) as the Solow Residual.

$$ R(t) = \frac{\partial Y(t)}{\partial A(t)}\frac{A(t)}{Y(t)}\frac{\dot{A}(t)}{A(t)} $$

The Solow Residual represents the part of output growth left unexplained by the growth of capital and labour.
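In practice, the residual is computed by growth accounting: subtract the share-weighted input growth rates from output growth. A minimal sketch, assuming Cobb-Douglas (so the elasticities are \(\alpha\) and \(1-\alpha\)) and using made-up growth rates:

```python
# Growth accounting: back out the Solow residual by subtracting share-weighted
# input growth from output growth, assuming Cobb-Douglas so the output
# elasticities are alpha and 1 - alpha. The growth rates are made-up numbers.
alpha = 0.3                       # capital share (elasticity of Y w.r.t. K)
gY, gK, gL = 0.04, 0.05, 0.01     # observed growth rates of Y, K, and L

residual = gY - alpha * gK - (1 - alpha) * gL
print(residual)   # 0.04 - 0.015 - 0.007 = 0.018
```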

Golden Rule Saving Rate (Phelps)

To be continued.


Kalman Filter


In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, who was one of the primary developers of its theory.


During my study in Cambridge, Professor Oliver Linton introduced the Kalman Filter in Time Series analysis, but I did not get it at that time. So, here is a revisit.

My Thinking of Kalman Filter

The Kalman Filter is an algorithm that produces optimal estimates from uncertain observations (e.g. time series data: we only observe the sample, and never know the true distribution of the data, or the true value that would be observed if there were no errors).

Consider this case: I need to know my weight, but the bodyweight scale cannot give me the true value. How can I know my true weight?

Assume the bodyweight scale has an error of 2, and my own estimate has an error of 1. In other words, the scale is 1/3 accurate, and my own estimate is 2/3 accurate. Then, the optimal weight should be,

$$ Optimal Result = \frac{1}{3}\times Measurement + \frac{2}{3}\times Estimate $$

, where \( Measurement\) means the measurement value, and \(Estimate\) means the estimated value. We conduct the following transformation.

$$ Optimal Result = \frac{1}{3}\times Measurement +Estimate- \frac{1}{3}\times Estimate $$

$$ Optimal Result = Estimate+\frac{1}{3}\times Measurement - \frac{1}{3}\times Estimate $$

$$ Optimal Result = Estimate+\frac{1}{3}\times (Measurement - Estimate) $$

Therefore, we can generalise,

$$ Optimal Result = Estimate+\frac{p}{p+r}\times (Measurement - Estimate) $$

, where \(p\) is the estimation error and \(r\) is the measurement error.

For example, if the estimation error is zero, then the fraction is equal to zero. Thus, the optimal result is just the estimate.

Applying Time Series Data

$$ Optimal Result_n=\frac{1}{n}\times (meas_1+meas_2+meas_3+…+meas_{n}) $$

$$ Optimal Result_n=\frac{1}{n}\times (meas_1+meas_2+meas_3+…+meas_{n-1})\\ +\frac{1}{n}\times meas_n $$

$$ Optimal Result_n=\frac{n-1}{n}\times \frac{1}{n-1}\times (meas_1+…+meas_{n-1})\\ +\frac{1}{n}\times meas_n $$

Iterating the first term, because \( \frac{1}{n-1}\times (meas_1+…+meas_{n-1}) = Optimal Result_{n-1} \),

$$ Optimal Result_n=\frac{n-1}{n}\times Optimal Result_{n-1}\\ +\frac{1}{n}\times meas_n $$

$$ Optimal Result_n=Optimal Result_{n-1}\\ -\frac{1}{n}\times Optimal Result_{n-1} +\frac{1}{n}\times meas_n $$

$$ OResult_n=OResult_{n-1}+\frac{1}{n}\times (meas_n-OResult_{n-1}) $$
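The recursion above is easy to verify in code: updating the previous estimate by \(1/n\) of the innovation reproduces the plain average of all measurements. The sample readings are made-up numbers.

```python
# The recursive running average: each new measurement updates the previous
# estimate by 1/n of the innovation, which reproduces the plain average.
def running_average(measurements):
    estimate = 0.0
    for n, meas in enumerate(measurements, start=1):
        estimate = estimate + (1 / n) * (meas - estimate)
    return estimate

print(running_average([50.1, 49.8, 50.3, 50.0]))   # equals the plain mean, 50.05
```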

Kalman Filter Equation

$$ \hat{x}_{n,n}=\hat{x}_{n,n-1}+K_n(z_n-\hat{x}_{n,n-1}) $$

$$ K_n=\frac{p_{n,n-1}}{p_{n,n-1}+r_n} $$

, where \(p_{n,n-1}\) is Uncertainty in Estimate, \(r_n\) is Uncertainty in Measurement, \(\hat{x}_{n,n}\) is the Optimal Estimate at \(n\), and \(z_n\) is the Measurement Value at \(n\).

The Optimal Estimate is updated by the estimate uncertainty through a Covariance Update Equation,

$$ p_{n,n}=(1-K_n)p_{n,n-1} $$

In a more intuitive way (1),

$$ OEstimate_n=OEstimate_{n-1}+K_n (meas_n-OEstimate_{n-1})$$

$$ K_n=\frac{OEstimateError_{n-1}}{OEstimateError_{n-1}+MeasureError_n}$$

$$OEstimateError_{n-1}=(1-K_{n-1})\times OEstimateError_{n-2}$$
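These three equations are enough for a working one-dimensional filter. A sketch on the weight example, where the true weight (70) and the noise variances are made-up numbers:

```python
import numpy as np

# A one-dimensional Kalman filter for a constant state, following the three
# equations above: gain, state update, covariance update. The true weight
# and the noise variances are made-up numbers for illustration.
rng = np.random.default_rng(0)
true_weight, r = 70.0, 4.0                               # r: measurement uncertainty
z = true_weight + rng.normal(0.0, np.sqrt(r), size=50)   # noisy scale readings

x, p = 60.0, 100.0                       # initial estimate and its uncertainty
for z_n in z:
    K = p / (p + r)                      # Kalman gain
    x = x + K * (z_n - x)                # state update
    p = (1 - K) * p                      # covariance update

print(x)   # close to the true weight of 70
```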



A Senior Study

Estimation Equation:

$$ \hat{x}_k^-=A\hat{x}_{k-1}+Bu_k $$

$$ P_k^-=AP_{k-1}A^T+Q$$

Update Equation (same as the one I just introduced in (1)):

$$K_k=\frac{P_k^- C^T}{CP_k^-C^T+R}$$

$$ \hat{x}_k=\hat{x}_k^-+K_k(y_k-C\hat{x}_k^-) $$

$$ P_k=(1-K_kC)P_k^-$$

Intuitively, I need \( \hat{x}_{k-1}\) (the optimal estimate of last week's weight) to calculate the optimal estimate of this week's weight, \(\hat{x}_k\). First, I predict this week's weight, \(\hat{x}_k^-\), and measure this week's weight, \(y_k\). Then, I combine them to get the optimal estimate of this week's weight.
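A scalar sketch of this full predict-update cycle, applied to the weight story: the true weight drifts slowly (\(A=1\), no control input), and we measure it directly (\(C=1\)). The noise variances \(Q\) and \(R\) are made-up numbers.

```python
import numpy as np

# Scalar predict-update Kalman filter following the equations above.
# Model: x_k = A x_{k-1} + w_k, measured as y_k = C x_k + v_k.
# A = C = 1, no control input; Q and R are made-up noise variances.
rng = np.random.default_rng(0)
A, C, Q, R = 1.0, 1.0, 0.01, 4.0

# Simulate a slowly drifting true weight and noisy measurements of it.
x_true = 70.0 + np.cumsum(rng.normal(0.0, np.sqrt(Q), 100))
y = C * x_true + rng.normal(0.0, np.sqrt(R), 100)

x_hat, P = 60.0, 100.0                   # initial estimate and its uncertainty
for y_k in y:
    # Predict (estimation equations)
    x_pred = A * x_hat                   # B u_k = 0: no control input
    P_pred = A * P * A + Q
    # Update
    K = P_pred * C / (C * P_pred * C + R)
    x_hat = x_pred + K * (y_k - C * x_pred)
    P = (1 - K * C) * P_pred

print(abs(x_hat - x_true[-1]))   # typically much smaller than the measurement noise
```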


Applications of the Kalman Filter can be found in further reading. I will also continue this in my future study.