Two Approaches for Forecasting Exchange Rates

In the first approach, analysts focus on flows of exports and imports to establish what the net trade flows are and how large they are relative to the economy and to other, potentially larger, financing and investment flows. The approach also considers differences between domestic and foreign inflation rates, which relate to the concept of purchasing power parity (PPP). Under PPP, the expected percentage change in the exchange rate should equal the difference between inflation rates. The approach also considers the sustainability of current account imbalances, reflecting the difference between national saving and investment.
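As a small worked illustration of the PPP statement above (the symbols $S_{d/f}$, $\pi_d$, $\pi_f$ and the numbers are assumptions introduced here, not taken from the reading), suppose domestic inflation is 5% and foreign inflation is 2%:

$$ \%\Delta S^e_{d/f} \approx \pi_d - \pi_f = 5\% - 2\% = 3\% $$

Here $S_{d/f}$ is the price of one unit of foreign currency in domestic currency, so under this approximation the domestic currency would be expected to depreciate by roughly 3%.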

In the second approach, the analysis focuses on capital flows and the degree of capital mobility. It assumes that capital seeks the highest risk-adjusted return. The expected changes in the exchange rate will reflect the differences in the respective countries’ assets’ characteristics, such as relative short-term interest rates and term, credit, equity and liquidity premiums. The approach also considers hot money flows and the fact that exchange rates provide an across-the-board mechanism for adjusting the relative sizes of each country’s portfolio of assets.

Source: CFA reading materials

Least Squares Method – Intro to Kalman Filter

Consider a Linear Equation,

$$ y_i = \sum_{j=1}^n C_{i,j} x_j +v_i,\quad i=1,2,\dots$$

, where $C_{i,j}$ are scalars and $v_i\in \mathbb{R}$ is the measurement noise. The noise itself is unknown, but we assume it has certain statistical properties: $v_i$ and $v_j$ are independent for $i\neq j$, and each $v_i$ has zero mean and variance $\sigma_i^2$.

$$\mathbb{E}(v_i)=0$$

$$\mathbb{E}(v_i^2) = \sigma_i^2$$


We can rewrite $y_i = \sum_{j=1}^n C_{i,j} x_j +v_i$ as,

$$ \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_s\end{pmatrix} = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22}& \cdots & C_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ C_{s1} & C_{s2} & \cdots & C_{sn}\end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n\end{pmatrix} + \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_s\end{pmatrix} $$

, in a matrix form,

$$ \vec{y} = C \vec{x} + \vec{v} $$

, but I would write in a short form,

$$ y= C x +v$$

We obtain the least squares estimator from the following optimisation problem (the objective is a squared L2 norm),

$$ \min_x || y-Cx ||_2^2 $$
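The optimisation problem above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy is available; the problem size, the "true" vector `x_true` and the noise level are made-up values, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem size: s measurements, n unknowns.
s, n = 200, 3
x_true = np.array([1.0, -2.0, 0.5])   # illustrative "true" x (assumption)
C = rng.normal(size=(s, n))           # measurement matrix C
v = 0.1 * rng.normal(size=s)          # zero-mean measurement noise
y = C @ x_true + v                    # y = C x + v

# Minimise ||y - C x||_2^2 with NumPy's least squares solver.
x_hat, *_ = np.linalg.lstsq(C, y, rcond=None)

# The same estimate from the normal equations C^T C x = C^T y.
x_normal = np.linalg.solve(C.T @ C, C.T @ y)

print(x_hat)
print(x_normal)
```

Both routes should give the same estimate up to rounding; `lstsq` is usually preferred because it avoids forming $C^T C$ explicitly.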

Recursive Least Squares Method

The classic least squares estimator may not work well when the data arrive over time. The Recursive Least Squares Method handles this discrete-time setting. At the discrete-time instant $k$, the measurement $y_k \in \mathbb{R}^l$ is given by,

$$y_k = C_k x + v_k$$

, where $C_k \in \mathbb{R}^{l\times n}$, and $v_k \in \mathbb{R}^l$ is the measurement noise vector. We assume that the covariance of the measurement noise is given by,

$$ \mathbb{E}[v_k v_k^T] = R_k$$

, and

$$\mathbb{E}[v_k]=0$$

The recursive least squares method takes the following form,

$$\hat{x}_k = \hat{x}_{k-1} + K_k (y_k - C_k \hat{x}_{k-1})$$

, where $\hat{x}_k$ and $\hat{x}_{k-1}$ are the estimates of the vector $x$ at the discrete-time instants $k$ and $k-1$, and $K_k \in \mathbb{R}^{n\times l}$ is the gain matrix that we need to determine.

The above equation updates the estimate of $x$ at the time instant $k$ using the estimate $\hat{x}_{k-1}$ from the previous time instant $k-1$, the measurement $y_k$ obtained at the time instant $k$, and the gain matrix $K_k$ computed at the time instant $k$.

Notation

$\hat{x}$ is the estimate.

$$ \hat{x}_k = \begin{pmatrix} \hat{x}_{1,k} \\ \hat{x}_{2,k} \\ \vdots \\ \hat{x}_{n,k} \end{pmatrix} $$

, which corresponds to the true vector $x$.

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

The estimation error is $\epsilon_{i,k} = x_i - \hat{x}_{i,k}, \quad i=1,2,\dots,n$.

$$\epsilon_k = \begin{pmatrix} \epsilon_{1,k} \\ \epsilon_{2,k} \\ \vdots \\ \epsilon_{n,k} \end{pmatrix} = x - \hat{x}_k = \begin{pmatrix} x_1-\hat{x}_{1,k} \\ x_2 - \hat{x}_{2,k} \\ \vdots \\ x_n-\hat{x}_{n,k} \end{pmatrix} $$

The gain $K_k$ is computed by minimising the sum of variances of the estimation errors,

$$ W_k = \mathbb{E}(\epsilon_{1,k}^2) + \mathbb{E}(\epsilon_{2,k}^2) + \cdots + \mathbb{E}(\epsilon_{n,k}^2) $$

Next, let's show that the cost function can be written as follows ($tr(\cdot)$ is the trace of a matrix),

$$ W_k = tr(P_k) $$

, where $P_k$ is the estimation error covariance matrix defined by

$$ P_k = \mathbb{E}(\epsilon_k \epsilon_k^T )$$

That is,

$$ K_k = \arg\min_{K_k} W_k, \quad W_k = tr\bigg( \mathbb{E}(\epsilon_k \epsilon_k^T ) \bigg)$$

Why is that?

$$\epsilon_k \epsilon_k^T = \begin{pmatrix} \epsilon_{1,k} \\ \epsilon_{2,k} \\ \vdots \\ \epsilon_{n,k} \end{pmatrix} \begin{pmatrix} \epsilon_{1,k} & \epsilon_{2,k} & \cdots & \epsilon_{n,k} \end{pmatrix}$$

$$ = \begin{pmatrix} \epsilon_{1,k}^2 & \cdots & \epsilon_{1,k}\epsilon_{n,k} \\ \vdots & \epsilon_{i,k}^2 & \vdots \\ \epsilon_{n,k}\epsilon_{1,k} & \cdots & \epsilon_{n,k}^2\end{pmatrix} $$

So,

$$ P_k = \mathbb{E}[\epsilon_k \epsilon_k^T] $$

$$tr(P_k) = \mathbb{E}(\epsilon_{1,k}^2) + \mathbb{E}(\epsilon_{2,k}^2) + \cdots + \mathbb{E}(\epsilon_{n,k}^2)$$


Optimisation

$$ K_k = \arg\min_{K_k} W_k, \quad W_k = tr\bigg( \mathbb{E}(\epsilon_k \epsilon_k^T ) \bigg) = tr(P_k)$$

Let’s derive the optimisation problem.

$$\epsilon_k = x-\hat{x}_k$$

$$ =x-\hat{x}_{k-1} - K_k(y_k - C_k \hat{x}_{k-1}) $$

$$ = x- \hat{x}_{k-1} - K_k (C_k x + v_k - C_k \hat{x}_{k-1}) $$

$$ = (I - K_k C_k)(x-\hat{x}_{k-1}) - K_k v_k $$

$$ =(I-K_k C_k )\epsilon_{k-1} - K_k v_k $$

Here we used $y_k = C_k x + v_k$ and $\hat{x}_k = \hat{x}_{k-1} + K_k (y_k - C_k \hat{x}_{k-1})$.

So, $\epsilon_k \epsilon_k^T$ would be,

$$\epsilon_k \epsilon_k^T = \bigg((I-K_k C_k )\epsilon_{k-1} - K_k v_k\bigg)\bigg((I-K_k C_k )\epsilon_{k-1} - K_k v_k\bigg)^T$$

$P_k = \mathbb{E}(\epsilon_k \epsilon_k^T)$, and $P_{k-1} = \mathbb{E}(\epsilon_{k-1} \epsilon_{k-1}^T)$.

$\mathbb{E}(\epsilon_{k-1} v_k^T) = \mathbb{E}(\epsilon_{k-1}) \mathbb{E}(v_k^T) =0$, because $\epsilon_{k-1}$ depends only on the measurements up to time $k-1$ and is therefore independent of $v_k$ (the noise is assumed independent across time), and $\mathbb{E}(v_k)=0$. However, $\mathbb{E}(v_k v_k^T) = R_k$. Substituting these into $P_k$, we get,

$$P_k = (I - K_k C_k)P_{k-1}(I - K_k C_k)^T + K_k R_k K_k^T$$

$$ P_k = P_{k-1} - P_{k-1} C_k^T K_k^T - K_k C_k P_{k-1} + K_k C_k P_{k-1}C_k^T K_k^T + K_k R_k K_k^T $$

$$W_k = tr(P_k)= tr(P_{k-1}) - tr(P_{k-1} C_k^T K_k^T) - tr(K_k C_k P_{k-1}) + tr(K_k C_k P_{k-1}C_k^T K_k^T) + tr(K_k R_k K_k^T) $$

We take the first-order condition (F.O.C.) to solve for $K_k = \arg\min_{K_k} W_k$, with $W_k = tr(P_k)$, by setting $\frac{\partial W_k}{\partial K_k} = 0$. See the Matrix Cookbook for how to take derivatives of traces w.r.t. $K_k$.

$$\frac{\partial W_k}{\partial K_k} = -2P_{k-1} C_k^T + 2K_k C_k P_{k-1} C_k^T + 2K_k R_k = 0$$

We solve for K_k,

$$ K_k = P_{k-1} C_k^T (R_k + C_k P_{k-1} C_k^T)^{-1}$$

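, where we let $L_k = R_k + C_k P_{k-1} C_k^T$. $L_k$ is symmetric, so $L_k = L_k^T$ and $L_k^{-1} = (L_k^{-1})^T$.

$$ K_k = P_{k-1} C_k^T L_k^{-1} $$

Plug $K_k = P_{k-1} C_k^T L_k^{-1}$ back into $P_k$. Since $K_k L_k K_k^T = P_{k-1} C_k^T K_k^T$, the last two terms of the expanded $P_k$ cancel against $- P_{k-1} C_k^T K_k^T$, leaving

$$ P_k = P_{k-1} - K_kC_k P_{k-1} = (I-K_k C_k)P_{k-1} $$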


Summary

In the end, the Recursive Least Squares Method can be summarised by the following three equations.

  • 1. Update the Gain Matrix.

$$ K_k = P_{k-1} C_k^T (R_k + C_k P_{k-1} C_k^T)^{-1}$$

  • 2. Update the Estimate.

$$\hat{x}_k = \hat{x}_{k-1} + K_k (y_k - C_k \hat{x}_{k-1})$$

  • 3. Propagate the estimation error covariance matrix.

$$ P_k = (I-K_k C_k)P_{k-1} $$
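The three update equations can be sketched in a few lines of Python. This is only an illustration under assumed values: the function name `rls_update`, the initial guess, the large initial covariance and the simulated measurements are all hypothetical choices, not part of the derivation.

```python
import numpy as np

def rls_update(x_prev, P_prev, y_k, C_k, R_k):
    """One recursive least squares step (gain, estimate, covariance)."""
    L_k = R_k + C_k @ P_prev @ C_k.T                 # L_k = R_k + C_k P_{k-1} C_k^T
    K_k = P_prev @ C_k.T @ np.linalg.inv(L_k)        # 1. gain matrix
    x_k = x_prev + K_k @ (y_k - C_k @ x_prev)        # 2. estimate update
    P_k = (np.eye(len(x_prev)) - K_k @ C_k) @ P_prev # 3. covariance propagation
    return x_k, P_k

rng = np.random.default_rng(1)
n, l = 2, 1                           # n unknowns, l measurements per step
x_true = np.array([2.0, -1.0])        # illustrative "true" x (assumption)
R = np.array([[0.01]])                # measurement noise covariance R_k

x_hat = np.zeros(n)                   # initial guess (assumption)
P = 100.0 * np.eye(n)                 # large initial uncertainty (assumption)

for k in range(500):
    C_k = rng.normal(size=(l, n))
    v_k = rng.normal(scale=0.1, size=l)   # noise with variance 0.01
    y_k = C_k @ x_true + v_k
    x_hat, P = rls_update(x_hat, P, y_k, C_k, R)

print(x_hat)          # should be close to x_true
print(np.trace(P))    # W_k = tr(P_k) shrinks as measurements accumulate
```

In practice one might solve a linear system instead of calling `np.linalg.inv`; the explicit inverse is kept here only to mirror the formula for $K_k$.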


Sigmoid & Logistic

The sigmoid function is widely used for binary classification, in both machine learning algorithms and econometrics.

Why does the sigmoid function take this form?

Firstly, let’s introduce the odds.

Odds provide a measure of the likelihood of a particular outcome. They are calculated as the ratio of the number of events that produce that outcome to the number that do not.

Odds also have a simple relation with probability: the odds of an outcome are the ratio of the probability that the outcome occurs to the probability that the outcome does not occur. In mathematical terms, $p$ is the probability of the outcome occurring, and $1-p$ is the probability of it not occurring.

$$ odds = \frac{p}{1-p} $$

Odds and Probability

Let's look for some insight into the relation between probability and odds. Probability is tied to outcomes: each outcome is assigned its own probability, $Pr(Y)$, where $Y$ is the outcome and $Pr(\cdot)$ is the probability function that maps outcomes to their probabilities.

What about the odds? The odds are a ratio computed from the probability, as the formula says.

Implication: compared with the probability, the odds say more about whether the two classes of a binary classification are balanced, rather than about the probability distribution itself.

Example

Rolling a six-sided die. The probability of rolling a 6 is $1/6$, but the odds are $1/5$.

Formula

$$ odds = \frac{Pr(Y)}{1-Pr(Y)} $$

, where $Y$ is the outcome.

Logit

As the probability $Pr(Y)$ always lies in $[0,1]$, the odds must be non-negative, $odds \in [0,\infty)$. We may want to apply a monotonic transformation to map that range onto the whole real line. We apply the logarithm.

$$ logit := log(odds) =log\bigg( \frac{Pr(Y)}{1-Pr(Y)} \bigg) $$

This log-odds function is the logit. The sigmoid (logistic) function is its inverse, which we recover below.

As the transformation is monotonic, the logit keeps the ordering of the odds and carries the same implication: it represents the balance between the two binary outcomes.

Then, we bring in the explanatory variables $X$. Here, we assume a linear form, so that the log-odds $g$ are a linear function of $X$, $g(X) = X\beta$, and the probability $p$ of the outcome becomes a function of $X$.

$$g(X) = log\bigg( \frac{p}{1-p} \bigg) = X\beta $$

$$ e^g = \frac{p}{1-p} $$

$$ p = \frac{e^g}{e^g+1}=\frac{1}{1+e^{-g}}$$

$$ p = \frac{1}{1+e^{-X\beta}}$$

We finally get our logistic sigmoid function as above.
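The chain from probability to odds to log-odds and back can be checked with a short sketch. The helper names `odds`, `logit` and `sigmoid` are chosen here for illustration, and only the standard library is used.

```python
import math

def odds(p):
    """Odds of an outcome with probability p (requires 0 <= p < 1)."""
    return p / (1 - p)

def logit(p):
    """Log-odds: maps a probability in (0, 1) to the real line."""
    return math.log(odds(p))

def sigmoid(g):
    """Inverse of the logit: maps log-odds g back to a probability."""
    return 1 / (1 + math.exp(-g))

p_six = 1 / 6                      # probability of rolling a 6
print(odds(p_six))                 # odds of rolling a 6, i.e. 1/5 up to rounding
print(sigmoid(logit(0.3)))         # recovers 0.3 up to rounding
print(sigmoid(0.0))                # log-odds of 0 means even odds, p = 0.5
```

In a logistic regression the argument `g` would be the linear index $X\beta$, so `sigmoid` returns the modelled probability.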

Dirac Delta Function

The Dirac Delta Function can be used to simplify differential equations. There are three main properties of the Dirac Delta Function. It can be thought of as the limit of a family of unit-area functions $\delta_\tau$ whose width $\tau$ shrinks to zero,

$$\delta (x-x') =\lim_{\tau\to0}\delta_\tau (x-x')$$

such that,

$$ \delta (x-x') = \begin{cases} \infty & x= x' \\ 0 & x\neq x' \end{cases} $$

$$\int_{-\infty}^{\infty} \delta (x-x')\ dx =1$$

Three Properties:

  • Property 1:

$$\delta(x-x')=0 \quad \quad ,x\neq x' $$

  • Property 2:

$$ \int_{x'-\epsilon}^{x'+\epsilon} \delta (x-x')dx =1\quad \quad ,\epsilon >0 $$

  • Property 3:

$$\int_{x'-\epsilon}^{x'+\epsilon} f(x)\ \delta (x-x')dx = f(x')$$

At $x=x'$ the Dirac Delta function is sometimes thought of as having an “infinite” value. So, the Dirac Delta function is a function that is zero everywhere except at one point, and at that point it can be thought of as either undefined or as having an “infinite” value.
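Properties 2 and 3 can be illustrated numerically by replacing the delta function with a narrow unit-area bump. The sketch below uses a Gaussian of width `tau` as that bump, which is an assumption made here (any unit-area family that concentrates at $x'$ would do), and picks $f(x)=\cos(x)$ and $x'=0.7$ purely as examples.

```python
import numpy as np

def delta_tau(x, x0, tau):
    """Gaussian of width tau centred at x0; it integrates to 1 for any tau."""
    return np.exp(-0.5 * ((x - x0) / tau) ** 2) / (tau * np.sqrt(2 * np.pi))

x0 = 0.7                                        # the point x' (illustrative)
x = np.linspace(x0 - 1.0, x0 + 1.0, 200_001)    # grid on [x' - 1, x' + 1]
dx = x[1] - x[0]

results = {}
for tau in (0.1, 0.01, 0.001):
    bump = delta_tau(x, x0, tau)
    area = np.sum(bump) * dx                    # Property 2: should be about 1
    sift = np.sum(np.cos(x) * bump) * dx        # Property 3: should approach cos(x')
    results[tau] = (area, sift)
    print(tau, area, sift)

print(np.cos(x0))
```

As `tau` shrinks, the second number moves towards $f(x') = \cos(0.7)$ while the area stays at one.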

Girsanov’s Theorem

Statement

We can change the probability measure so that a random variable follows a desired distribution under the new measure.

  • Radon-Nikodym Derivative:

$$Z(\omega) = \frac{\tilde{P}(\omega)}{P(\omega)}$$

  • $\tilde{P}(\omega)$ is the risk-neutral probability measure.
  • ${P}(\omega)$ is the actual probability measure.
  • Properties:
    • $Z(\omega)>0$
    • $\mathbb{E}(Z)=1$
    • As $\tilde{P}(\omega) = Z(\omega) P(\omega)$, if $Z(\omega)>1$ then $\tilde{P}(\omega)>P(\omega)$, and vice versa.

We can calculate that,

$$ \underbrace{\tilde{\mathbb{E}}(X)}_{\text{Expectation under Risk-neutral Probability Measure}} = \underbrace{\mathbb{E}(ZX)}_{\text{Expectation under Actual Probability Measure}} $$

Proof & Example

Under $(\Omega,\mathcal{F},P)$, with events $A\in \mathcal{F}$, let $X$ be a random variable with $X\sim N(0,1)$, so $\mathbb{E}(X)=0$ and $\mathbb{Var}(X)=1$.

$Y=X+\theta$, $\mathbb{E}(Y)=\theta$, and $\mathbb{Var}(Y)=1$.

$X$ here is standard normal under the actual probability measure.

However, $Y$ here is not standard normal under the current probability measure $P(\cdot)$, because $\mathbb{E}(Y)\neq0$.

What do we do?

We change the probability measure from $P(\cdot)$ to $\tilde{P}(\cdot)$ so that $Y$ is standard normal under the new probability measure!

We set the Radon-Nikodym Derivative,

$$Z(\omega) = exp\{ -\theta\ X(\omega) - \frac{1}{2}\theta^2 \}$$

Now, we can define the probability measure $\tilde{P}$ and evaluate it on the event $A=\{ \omega: Y(\omega)\leq b \}$,

$$\tilde{P}(A) = \int_A Z(\omega)\ dP(\omega)$$

such that $Y=X+\theta$ is standard normal under the new probability measure $\tilde{P}$.

$$\tilde{P}(A) = \tilde{P}(Y(\omega) \leq b)$$

$$ = \int_{\{ Y(\omega)\leq b \} } exp\{ -\theta\ X(\omega) - \frac{1}{2}\theta^2 \} \ dP(\omega)$$

, then change the integration range from the set $A$ to $\Omega$ by multiplying by the indicator of $A$.

$$ = \int_{\Omega }\mathbb{1}_{\{ Y(\omega)\leq b \}}\ exp\{ -\theta\ X(\omega) - \frac{1}{2}\theta^2 \} \ dP(\omega)$$

, change from $dP$ to an integral over the values $x$ of $X$, using the standard normal density of $X$ under $P$ and $Y\leq b \Leftrightarrow X\leq b-\theta$,

$$ = \int_{-\infty }^{\infty }\mathbb{1}_{\{ x \leq b-\theta \}}\ exp\{ -\theta x - \frac{1}{2}\theta^2 \} \ \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}x^2} \ dx$$

$$ =\frac{1}{\sqrt{2\pi}} \int_{-\infty }^{b-\theta}\ exp\{ -\theta x - \frac{1}{2}\theta^2- \frac{1}{2}x^2 \} \ dx$$

$$ =\frac{1}{\sqrt{2\pi}} \int_{-\infty }^{b-\theta}\ exp\Bigg\{ -\frac{1}{2}\big(\theta+ x\big)^2\Bigg\} \ dx$$

, as $y=x+\theta$ and $dy = dx$, we now change $dx$ to $dy$,

$$ =\frac{1}{\sqrt{2\pi}} \int_{-\infty }^{b}\ exp\big\{ -\frac{1}{2}y^2\big\} \ dy$$

, which is the standard normal distribution function evaluated at $b$, so $Y$ is standard normal under $\tilde{P}$.
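The change of measure can be sanity-checked by simulation: sample $X$ under $P$, weight by $Z$, and compare $\mathbb{E}(Z\,\mathbb{1}_{\{Y\leq b\}})$ with the standard normal CDF at $b$. The values of `theta` and `b`, the sample size and the helper `std_normal_cdf` are illustrative assumptions.

```python
import math
import numpy as np

def std_normal_cdf(b):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(b / math.sqrt(2.0)))

rng = np.random.default_rng(42)
theta, b = 0.5, 0.3                         # illustrative values (assumption)

X = rng.standard_normal(200_000)            # X ~ N(0,1) under P
Y = X + theta                               # Y ~ N(theta,1) under P
Z = np.exp(-theta * X - 0.5 * theta ** 2)   # Radon-Nikodym derivative

mean_Z = Z.mean()                           # E(Z), should be near 1
p_tilde = np.mean(Z * (Y <= b))             # P~(Y <= b) = E(Z 1{Y <= b})
p_plain = np.mean(Y <= b)                   # P(Y <= b) under the original measure

print(mean_Z)
print(p_tilde, std_normal_cdf(b))           # weighted estimate vs Phi(b)
print(p_plain, std_normal_cdf(b - theta))   # unweighted estimate vs Phi(b - theta)
```

The weighted estimate matches $\Phi(b)$ up to Monte Carlo error, while the unweighted one matches $\Phi(b-\theta)$, which is the distribution of $Y$ under the original measure.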

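USD Circulation, Credit, and Silicon Valley Bank

QE

The US dollar is the world's common currency and is very stable, so in normal times the US could simply print money slowly and expand the supply gradually. Then the pandemic arrived. To stimulate the economic cycle within the US, the US started printing money on a large scale, with the rest of the world bearing part of the cost; the US is powerful enough to do so. In the short run the price level does not change, so (1) the excess dollars can be used to buy (import) other countries' assets, and (2) they boost domestic consumption and drive a healthy economic cycle.

QT

As the horizon lengthens, over the medium to long term the excess money printing causes the dollar to depreciate. Typically, each time the dollar has depreciated to a certain extent, the US raises interest rates so that dollars flow back. After a rate hike, yields are higher, so investors can buy dollars to deposit in banks, or convert into dollars to buy other US assets. This raises demand for USD, reduces the dollars circulating abroad, and pushes the dollar to appreciate. The US then pays the interest, and fiscal expenditure increases (the coordination between the Treasury and the central bank, and a study of the US government balance sheet, are still to be done).

QE+QT Circulate

Once enough dollars have been absorbed, the USD has appreciated, and the amount of USD in the market has reached an appropriate (quasi-equilibrium) level, the QE and QT process can be repeated.

In sum, QE combined with QT is equivalent to the following: investors (other countries) deposit money in a bank (the US); the bank (the US) pays the investors interest, and then uses the investors' money to buy the investors' (other countries') assets. In short, the US pays interest (money) to buy the world's goods and services. But the so-called interest is just a piece of paper, or credit paper.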

Some Facts in Reality

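  • Through exports, China has earned a large current account surplus (trade surplus); that is, China has earned a lot of USD by exporting goods. China's role as the world's factory means that when importers buy Chinese goods, demand for CNY rises and CNY appreciates in relative terms. China used to handle its CA surplus by buying government bonds, but in recent years it has gradually increased its holdings of real assets and of gold. (Against the backdrop of global QE in 2020-2021, gold reserves back the credit of a country's currency. Holding USD also lends credibility to other countries' currencies, but that function has been weakening given the excessive US QE.) The advantage of holding real assets is that they are not paper currency: they hold their value better, can be converted into productive capacity, and are more stable when risks materialise.
  • During US QE, China was also in a nationwide pandemic quarantine, so its exports fell. As a result, the excess USD issued into the global market could not buy enough goods (supply from China fell sharply, and other countries also exported less). The excess from US QE could not be absorbed by the market, and US inflation rose sharply. To avoid hyperinflation, the US raised interest rates to withdraw USD.
  • The EU, Canada, the UK and others (JP is different), to avoid having their assets harvested by the US as the USD appreciates and depreciates, have broadly chosen to raise and cut rates at a pace similar to the US, so as to avoid large outflows of domestic assets. China has foreign exchange controls, so assets find it hard to flow out. => The excess USD is hard to absorb.
  • The Russia-Ukraine conflict and various sanctions have left countries such as Russia and Iran unable to make transfers, among other problems. Circulation of these countries' currencies has become difficult, which lowers the credit of those currencies. (At this point these countries' demand for USD increases.)
  • The petrodollar link has weakened slightly against the backdrop of new energy, but it has stabilised.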
  • etc

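In short, in the near term the excess USD is hard to absorb, so the US can only keep raising rates sharply. High interest and high yields serve to attract money from other countries and to withdraw the over-issued currency in order to avoid inflation.

Silicon Valley Bank

US QE and QT are, after all, macro-level actions, but they have changed the pattern of yields. The most visible feature is the inverted yield curve: long-term rates are below short-term rates. At the micro level, this means money that consumers deposit with banks earns less interest at long maturities than at short ones, so more consumers shorten their investment horizon. Bank runs emerge.

Signs of a collapse in USD credit gradually appear.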

Dutch Disease

In Dutch Disease, certain sectors face enormous export demand, which drives up demand for that country's currency, so the currency appreciates. However, the remaining sectors, which do not enjoy such export demand, must also live with the stronger currency. Export demand for goods and services in those sectors therefore falls even further.

The Impact of Balance of Payments Flows

As noted earlier, the parity conditions may be appropriate for assessing fair value for currencies over long horizons, but they are of little use as a real-time gauge of value. There have been many attempts to find a better framework for determining a currency’s short-run or long-run equilibrium value. Let’s now examine the influence of trade and capital flows.

A country’s balance of payments consists of its (1) current account as well as its (2) capital and (3) financial account. The official balance of payments accounts make a distinction between the “capital account” and the “financial account” based on the nature of the assets involved. For simplicity, we will use the term “capital account” here to reflect all investment/financing flows. Loosely speaking, the current account reflects flows in the real economy, which refers to that part of the economy engaged in the actual production of goods and services (as opposed to the financial sector). The capital account reflects financial flows. Decisions about trade flows (the current account) and investment/financing flows (the capital account) are typically made by different entities with different perspectives and motivations. Their decisions are brought into alignment by changes in market prices and/or quantities. One of the key prices—perhaps the key price—in this process is the exchange rate.

Countries that import more than they export will have a negative current account balance and are said to have current account deficits. Those with more exports than imports will have a current account surplus. A country’s current account balance must be matched by an equal and opposite balance in the capital account. Thus, countries with current account deficits must attract funds from abroad in order to pay for the imports (i.e., they must have a capital account surplus).

When discussing the effect of the balance of payments components on a country’s exchange rate, one must distinguish between short-term and intermediate-term influences on the one hand and longer-term influences on the other. Over the long term, countries that run persistent current account deficits (net borrowers) often see their currencies depreciate because they finance their acquisition of imports through the continued use of debt. Similarly, countries that run persistent current account surpluses (net lenders) often see their currencies appreciate over time.

However, investment/financing decisions are usually the dominant factor in determining exchange rate movements, at least in the short to intermediate term. There are four main reasons for this:

  • Prices of real goods and services tend to adjust much more slowly than exchange rates and other asset prices.
  • Production of real goods and services takes time, and demand decisions are subject to substantial inertia. In contrast, liquid financial markets allow virtually instantaneous redirection of financial flows.
  • Current spending/production decisions reflect only purchases/sales of current production, while investment/financing decisions reflect not only the financing of current expenditures but also the reallocation of existing portfolios.
  • Expected exchange rate movements can induce very large short-term capital flows. This tends to make the actual exchange rate very sensitive to the currency views held by owners/managers of liquid assets.

Current Account Imbalances and the Determination of Exchange Rates

Current account trends influence the path of exchange rates over time through several mechanisms:

  • The flow supply/demand channel
  • The portfolio balance channel
  • The debt sustainability channel

Let’s briefly discuss each of these mechanisms next.

The Flow Supply/Demand Channel

The flow supply/demand channel is based on a fairly simple model that focuses on the fact that purchases and sales of internationally traded goods and services require the exchange of domestic and foreign currencies in order to arrange payment for those goods and services. For example, if a country sold more goods and services than it purchased (i.e., the country was running a current account surplus), then the demand for its currency should rise, and vice versa. Such shifts in currency demand should exert upward pressure on the value of the surplus nation’s currency and downward pressure on the value of the deficit nation’s currency.

Hence, countries with persistent current account surpluses should see their currencies appreciate over time, and countries with persistent current account deficits should see their currencies depreciate over time. A logical question, then, would be whether such trends can go on indefinitely. At some point, domestic currency strength should contribute to deterioration in the trade competitiveness of the surplus nation, while domestic currency weakness should contribute to an improvement in the trade competitiveness of the deficit nation. Thus, the exchange rate responses to these surpluses and deficits should eventually help eliminate—in the medium to long run—the source of the initial imbalances.

The amount by which exchange rates must adjust to restore current accounts to balanced positions depends on a number of factors:

  • The initial gap between imports and exports
  • The response of import and export prices to changes in the exchange rate
  • The response of import and export demand to changes in import and export prices

If a country imports significantly more than it exports, export growth would need to far outstrip import growth in percentage terms in order to narrow the current account deficit. A large initial deficit may require a substantial depreciation of the currency to bring about a meaningful correction of the trade imbalance.

A depreciation of a deficit country’s currency should result in an increase in import prices in domestic currency terms and a decrease in export prices in foreign currency terms. However, empirical studies often find limited pass-through effects of exchange rate changes on traded goods prices. For example, many studies have found that for every 1% decline in a currency’s value, import prices rise by only 0.5%—and in some cases by even less—because foreign producers tend to lower their profit margins in an effort to preserve market share. In light of the limited pass-through of exchange rate changes into traded goods prices, the exchange rate adjustment required to narrow a trade imbalance may be far larger than would otherwise be the case.
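As a rough worked illustration of the pass-through figure quoted above (the symbol $\theta$ for the pass-through rate and the linear approximation are assumptions made here), take $\theta = 0.5$:

$$ \%\Delta P_{import} \approx \theta \cdot \%\Delta S = 0.5 \times 10\% = 5\% $$

so a 10% depreciation would raise import prices by only about 5%. Under the same approximation, raising import prices by 10% would take a depreciation of about $10\% / 0.5 = 20\%$, which is why limited pass-through implies larger exchange rate adjustments.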

Many studies have found that the response of import and export demand to changes in traded goods prices is often quite sluggish, and as a result, relatively long lags, lasting several years, can occur between (1) the onset of exchange rate changes, (2) the ultimate adjustment in traded goods prices, and (3) the eventual impact of those price changes on import demand, export demand, and the underlying current account imbalance.

The Portfolio Balance Channel

The second mechanism through which current account trends influence exchange rates is the so-called portfolio balance channel. Current account imbalances shift financial wealth from deficit nations to surplus nations. Countries with trade deficits will finance their trade with increased borrowing. This behaviour may lead to shifts in global asset preferences, which in turn could influence the path of exchange rates. For example, nations running large current account surpluses versus the United States might find that their holdings of US dollar–denominated assets exceed the amount they desire to hold in a portfolio context. Actions they might take to reduce their dollar holdings to desired levels could then have a profound negative impact on the dollar’s value.

“Shifts in global asset preferences” means that investors alter the composition of the asset allocation in their portfolios.

The Debt Sustainability Channel

The third mechanism through which current account imbalances can affect exchange rates is the so-called debt sustainability channel. According to this mechanism, there should be some upper limit on the ability of countries to run persistently large current account deficits. If a country runs a large and persistent current account deficit over time, eventually it will experience an untenable rise in debt owed to foreign investors. If such investors believe that the deficit country’s external debt is rising to unsustainable levels, they are likely to reason that a major depreciation of the deficit country’s currency will be required at some point to ensure that the current account deficit narrows significantly and that the external debt stabilises at a level deemed sustainable.

The existence of persistent current account imbalances will tend to alter the market’s notion of what exchange rate level represents the true, long-run equilibrium value. For deficit nations, ever-rising net external debt levels as a percentage of GDP should give rise to steady (but not necessarily smooth) downward revisions in market expectations of the currency’s long-run equilibrium value. For surplus countries, ever-rising net external asset levels as a percentage of GDP should give rise to steady upward revisions of the currency’s long-run equilibrium value. Hence, one would expect currency values to move broadly in line with trends in debt and/or asset accumulation.

Reference

CFA Readings

Value at Risk & Expected Shortfalls

Value at Risk – VaR

VaR is a probability statement about the potential change in the value of a portfolio.

Notation

$$Prob(x\leq VaR(X))= 1-c$$

$$ Prob\bigg(z \leq \frac{VaR(X)-\mu}{\sigma}\bigg)=1-c $$

  • $c$ – confidence level, e.g. $c=99\%$. Then $1-c = 1\%$.
  • $\mu$ and $\sigma$ are for $X$.
    • For example, if $X$ is the yearly return and daily returns are i.i.d., then $\mu_{252days}=252\cdot\mu_{1day}$, and $\sigma_{252days}=\sqrt{252}\cdot\sigma_{1day}$.
  • $x$ here is the return.
    • VaR focuses on the tail risk. If $x$ stands for return, then the tail risk is in the left tail, $z_{1-c}$.
  • If $x$ is the loss, the tail risk is in the right tail, $z_c$.

$$VaR(X) = \mu + \sigma\cdot \Phi^{-1}(1-c)$$

$$VaR(X) = \mu + \sigma\cdot z_{1-c}$$

  • E.g.

If $c=99\%$, then $1-c=1\%$, so $z_{1-c}=z_{0.01} \approx -2.33$

$$VaR(X) = \mu - 2.33\cdot \sigma$$
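The VaR formula for normally distributed returns can be evaluated with the standard library. The daily mean and volatility and the portfolio value below are hypothetical numbers chosen for illustration, and the 252-day scaling assumes i.i.d. daily returns.

```python
import math
from statistics import NormalDist

c = 0.99                              # confidence level
z = NormalDist().inv_cdf(1 - c)       # z_{1-c}, roughly -2.33

mu_1d, sigma_1d = 0.0005, 0.01        # hypothetical daily mean and volatility
mu_1y = 252 * mu_1d                   # scale the mean by the number of days
sigma_1y = math.sqrt(252) * sigma_1d  # scale volatility by the square root of days

var_1d = mu_1d + sigma_1d * z         # 1-day VaR in return terms (left tail)
var_1y = mu_1y + sigma_1y * z         # 1-year VaR in return terms

portfolio_value = 1_000_000           # hypothetical portfolio of USD 1 million
var_1d_usd = portfolio_value * var_1d # monetary 1-day VaR (negative means a loss)

print(z)
print(var_1d, var_1y)
print(var_1d_usd)
```

Because returns are used rather than losses, the VaR comes out negative; reporting it as a positive loss amount is a matter of sign convention.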

P.S.

The unit of VaR is the amount of loss, so it should be a monetary amount. For example, if the total value of the portfolio is USD 1 million, then $VaR = \$1m \cdot (\mu - 2.33\cdot \sigma)$.

Loss Distribution

Here, let $X$ be the loss. If we know the distribution of the portfolio return $R$, $R\sim N(\mu, \sigma^2)$, then what is the distribution of $X$?

$$X \sim N(-\mu, \sigma^2)$$

Right! Loss is just the negative return. Also, the volatility would not be affected by plus / minus.

Expected Shortfall (ES)

Expected Shortfall states the expected loss during time $T$ conditional on the loss being greater than the $c^{th}$ percentile of the loss distribution.

Notation

$$ ES_c (X) = \mathbb{E}\bigg[ X|X\leq VaR_c(X) \bigg] $$

  • Note that $X$ is a r.v. and $x$ stands for return here, while the only free variable in $ES_c(X)$ is $c$, the confidence level, rather than $X$.
  • $c$ is the confidence level, e.g. $c = 99\%$.
  • If $x$ stands for return, then the VaR is in the left tail, $z_{1-c}$.

$$ ES_c (X) = \mathbb{E}\bigg[ X|X\geq VaR_c(X) \bigg] $$

  • If $x$ stands for loss (which is the negative of return), then the VaR is in the right tail, $z_{c}$.

Derivation

Notation Form

Consider $x$ as the return, so $ES_c (X) = \mathbb{E}\bigg[ X|X\leq VaR_c(X) \bigg]$ and $VaR_c(X)= \mu + z_{1-c}\sigma$, where $c$ is the confidence level, e.g. $c=99\%$. Let $f$ and $F$ be the density and the CDF of $N(\mu,\sigma^2)$, and $\phi$ and $\Phi$ those of the standard normal.

$$ES_c(X) = \frac{\int_{-\infty}^{VaR} xf(x)dx }{\int_{-\infty}^{VaR} f(x)dx } =\frac{\int_{-\infty}^{VaR} x f(x)dx }{ F(VaR) - F(-\infty)} $$

$$= \frac{1}{ F(VaR) }\int_{-\infty}^{VaR}x \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} dx $$

Replace $z = \frac{x-\mu}{\sigma}$, then $x = \mu + z \sigma$ and $dx = \sigma dz$. The upper limit $VaR$ becomes $z_{1-c}$, and $F(VaR) = \Phi(z_{1-c})$.

$$ = \frac{1}{\Phi(z_{1-c})} \int_{-\infty}^{z_{1-c}}(\mu + z\sigma) \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{z^2}{2}}\sigma dz $$

$$ = \frac{\mu}{\Phi(z_{1-c})} \int_{-\infty}^{z_{1-c}}\frac{1}{\sqrt{2\pi }} e^{-\frac{z^2}{2}} dz + \frac{\sigma}{\Phi(z_{1-c})}\int_{-\infty}^{z_{1-c}} z \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} dz $$

$$ = \frac{\mu}{\Phi(z_{1-c})} \Phi(z_{1-c}) - \frac{\sigma}{\Phi(z_{1-c})} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_{1-c}} e^{-\frac{z^2}{2}} d\Big(-\frac{z^2}{2}\Big) $$

$$ = \mu - \frac{\sigma}{\Phi(z_{1-c})} \frac{1}{\sqrt{2\pi }} e^{-\frac{z^2}{2}} \Big|_{-\infty}^{z_{1-c}} $$

$$ = \mu - \frac{\sigma}{\Phi(z_{1-c})} \frac{1}{\sqrt{2\pi }} e^{-\frac{z_{1-c}^2}{2}}= \mu - \frac{\sigma}{\Phi(z_{1-c})} \phi(z_{1-c})$$

Recall $z_{1-c} = \Phi^{-1}(1-c)$, so $\phi(z_{1-c}) = \phi\big( \Phi^{-1}(1-c) \big)$ and $\Phi(z_{1-c}) = \Phi\big( \Phi^{-1}(1-c) \big) = 1-c$.

Thus,

$$ ES_c(X) =\mu - \frac{\sigma}{\Phi(z_{1-c})} \phi(z_{1-c})=\mu -\sigma \frac{\phi\big( \Phi^{-1}(1-c) \big)}{1-c}$$

VaR Form

We ‘sum up’ (integrate) the VaR over confidence levels from $c$ to $1$ and normalise by the tail probability $1-c$.

$$ES_c(X) = \frac{1}{1-c} \int_c^1 VaR_u(X)du$$

$$ ES_c(X) = \frac{1}{1-c} \int_c^1 \bigg( \mu + \sigma\cdot \Phi^{-1}(1-u) \bigg) du $$

$$ =\mu + \frac{\sigma}{1-c} \int^1_c \Phi^{-1}(1-u) du $$

We let $u = \Phi(z)$, where $\Phi$ is the standard normal CDF. Then,

  • $du =d(\Phi(z)) =\phi(z) dz$.
  • $u\in (c,1)$, so $z = \Phi^{-1}(u)\in (z_c \ , \infty)$

Thus,

$$ ES_c(X) =\mu + \frac{\sigma}{1-c} \int^{\infty}_{z_c} \Phi^{-1}\big(1-\Phi(z)\big)\phi(z) dz $$

As $1-\Phi(z) = \Phi(-z)$,

$$ ES_c(X) =\mu + \frac{\sigma}{1-c} \int^{\infty}_{z_c} \Phi^{-1}(\Phi(-z))\phi(z) dz = \mu - \frac{\sigma}{1-c} \int^{\infty}_{z_c} z\phi(z) dz $$

$ \int_{z_c}^{\infty} z \phi(z)dz = \int_{z_c}^{\infty} z \frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}dz = -\frac{1}{\sqrt{2\pi}} \int_{z_c}^{\infty} d\big(e^{-\frac{z^2}{2}}\big)$

$=\frac{1}{\sqrt{2\pi}}e^{-\frac{z_c^2}{2}}=\phi(z_c)=\phi\big(\Phi^{-1}(c)\big)$. Substituting this back into $ES_c(X)$,

$$ES_c(X) = \mu - \sigma\frac{ \phi\big(\Phi^{-1}(c)\big)}{1-c}$$

By the symmetry of $\phi$, $\phi\big(\Phi^{-1}(c)\big) = \phi\big(\Phi^{-1}(1-c)\big)$, so this agrees with the result of the notation form.
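The two forms can be compared numerically. The sketch below evaluates the closed-form expression and, separately, averages $VaR_u$ over $u\in(c,1)$ with a midpoint rule; the values of `mu`, `sigma` and `c` and the number of grid points are illustrative assumptions.

```python
from statistics import NormalDist

std = NormalDist()                     # standard normal: cdf, pdf, inv_cdf
mu, sigma, c = 0.05, 0.2, 0.99         # hypothetical return distribution and level

z_tail = std.inv_cdf(1 - c)            # z_{1-c}
var_c = mu + sigma * z_tail            # VaR of the return (left tail)

# Closed form: ES = mu - sigma * phi(Phi^{-1}(1-c)) / (1-c)
es_closed = mu - sigma * std.pdf(z_tail) / (1 - c)

# VaR form: ES = 1/(1-c) * integral of VaR_u over u in (c, 1), midpoint rule.
n = 20_000
du = (1 - c) / n
total = 0.0
for i in range(n):
    u = c + (i + 0.5) * du             # midpoint of the i-th sub-interval
    total += mu + sigma * std.inv_cdf(1 - u)
es_numeric = total * du / (1 - c)

print(var_c)
print(es_closed, es_numeric)
```

The expected shortfall lies further in the tail than the VaR at the same confidence level, since it averages over all outcomes beyond the VaR.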

Modern Portfolio Theory

  • $x$ – vector of weights
  • $R$ – vector of all assets’ returns
  • $\mu = \mathbb{E}(R)$ – mean return of all assets
  • $\Sigma = \mathbb{E}\bigg[ (R-\mu)(R-\mu)^T \bigg]$ – var-cov matrix of all assets

So,

  • $\mu_x = x^T \mu$ – becomes a scalar now
  • $\sigma_x^2 = x^T \Sigma x$ – collapses to a scalar

Optimisation

  • Maximise Expected Return s.t. volatility constraint.

$$ \max_{x} \mu_x \quad s.t. \quad \sigma_x \leq \sigma^* $$

  • Minimise Volatility s.t. return constraint.

$$ \min_{x} \sigma_x \quad s.t. \quad \mu_x \geq \mu^* $$

Portfolio Risk Measures

By definition, the loss of a portfolio is the negative of its return, $L(x) = -R(x)$.

The loss distribution is the same normal distribution with the x-axis reversed.

  • Volatility of Loss: $\sigma(L(x)) = \sigma_x$; the minus sign does not matter for the s.d.
  • Standard Deviation-based risk measure: $\mathbb{E}(L(x)) + cz_{c}\sigma(L(x))$; the x-axis is reversed, so $z_{1-c}$ for return becomes $z_c$ for loss.
  • VaR: $VaR_{\alpha}(x)=\inf\bigg\{ \ell:Prob\big[ L(x)\leq \ell \big] \geq \alpha \bigg\}$
  • Expected Shortfall: $ES_{\alpha}(x) = \frac{1}{1-\alpha} \int_{\alpha}^1 VaR_u(x) du$. In another form, $ES_{\alpha}(x)=\mathbb{E}\bigg( L(x)| L(x)\geq VaR_{\alpha}(x) \bigg)$

As $R \sim N(\mu, \Sigma)$,

  • for our portfolio with weights $x$, mean $= \mu_x = x^T\mu$, and $\sigma_x = \sqrt{x^T \Sigma x}$.
  • for the loss, mean $= -\mu_x$, and $\sigma_x = \sqrt{x^T \Sigma x}$.
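For normally distributed returns these risk measures reduce to a few lines of linear algebra. The weights, mean vector and covariance matrix below are made-up numbers for illustration, and the VaR and ES formulas are the normal-case expressions applied to the loss $L(x) \sim N(-\mu_x, \sigma_x^2)$.

```python
import numpy as np
from statistics import NormalDist

x = np.array([0.5, 0.3, 0.2])                 # hypothetical portfolio weights
mu = np.array([0.05, 0.07, 0.03])             # hypothetical mean returns
Sigma = np.array([[0.040, 0.006, 0.000],      # hypothetical var-cov matrix
                  [0.006, 0.090, 0.003],
                  [0.000, 0.003, 0.010]])

mu_x = x @ mu                                 # portfolio mean return, x^T mu
sigma_x = np.sqrt(x @ Sigma @ x)              # portfolio volatility, sqrt(x^T Sigma x)

alpha = 0.99
std = NormalDist()
z_alpha = std.inv_cdf(alpha)                  # right-tail quantile for the loss

# Loss L(x) ~ N(-mu_x, sigma_x^2): VaR and ES of the loss at level alpha.
var_loss = -mu_x + z_alpha * sigma_x
es_loss = -mu_x + sigma_x * std.pdf(z_alpha) / (1 - alpha)

print(mu_x, sigma_x)
print(var_loss, es_loss)
```

With these inputs the expected shortfall exceeds the VaR, as it must for a right-tail loss measure.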

Taylor Series and Transition Density Functions

See my Github repo for full details.

https://github.com/eightsmile/cqf

1. Trinomial Random Walk

2. Transition Probability Density Function

The transition probability density function, $p(y,t;y',t')$, is defined by,

$$ Prob(a<y'<b \ at\ time \ t' \ | \ y \ at \ time\ t) = \int_a^b p(y,t;y',t')dy'$$

In words this is “the probability that the random variable y′ lies between a and b at time t′ in the future, given that it started out with value y at time t.”

Think of y and t as being current values with y′ and t′ being future values. The transition probability density function can be used to answer the question,

“What is the probability of the variable y′ being in a specified range at time t′ in the future given that it started out with value y at time t?”

Our goal is to find the transition p.d.f. To do so, we first relate $p(y,t;y',t')$ to its value one time step earlier, $p(y,t;y',t'-\delta t)$.

3. From the Trinomial model to the Transition Probability Density function

The variable y can either rise, fall or take the same value after a time step δt. These movements have certain probabilities associated with them.

We are going to assume that the probabilities of a rise and of a fall are the same, $\alpha<\frac{1}{2}$, so the probability of staying at the same value is $1-2\alpha$. (But, of course, this can be generalized. Why would we want to generalize this?)

3.1 The Forward Equation

Here $(y,t)$ is the current state, which is held fixed, while $(y',t')$ are the variables at the future time.

The probability of being at y’ at time t’ is related to the probabilities of being at the previous three values and moving in the right direction:

$$ p(y,t;y’,t’) = \alpha \ p(y,t;y’+\delta y,t’-\delta t) + \ (1-2\alpha) \ p(y,t;y’,t’-\delta t) + \alpha \ p(y,t;y’-\delta y,t’-\delta t) $$

With $(y,t)$ given, this relates the density at $(y',t')$ to the densities at $(y'\pm \delta y,\, t'-\delta t)$ and $(y',\, t'-\delta t)$, that is, at the three neighbouring values one time step earlier.

Remember, our goal is to find $p(\cdot)$, so we now try to solve the above equation.
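Before solving the equation analytically, the forward equation can be iterated directly on a lattice. The following is a minimal sketch, assuming the walk starts with all probability at a single site. The function name `forward_step` and the parameter values are hypothetical.

```python
def forward_step(p, alpha):
    """Apply one time step of the trinomial forward equation.

    p[k] is the probability of sitting at lattice site k. The returned list
    is one site longer on each side, so no probability leaves the grid.
    """
    padded = [0.0, 0.0] + list(p) + [0.0, 0.0]
    new = []
    for k in range(1, len(padded) - 1):
        new.append(alpha * padded[k + 1]            # fell from y' + dy
                   + (1 - 2 * alpha) * padded[k]    # stayed at y'
                   + alpha * padded[k - 1])         # rose from y' - dy
    return new


# Hypothetical parameters for illustration.
alpha, n_steps = 0.25, 50
p = [1.0]                       # all mass at the starting value y
for _ in range(n_steps):
    p = forward_step(p, alpha)

centre = n_steps                # index of the starting site after n_steps paddings
mass = sum(p)
variance = sum(((k - centre) ** 2) * pk for k, pk in enumerate(p))  # in units of dy^2
print(mass, variance)
```

Total probability is conserved at every step, and the variance grows by $2\alpha\,\delta y^2$ per time step, which is the first hint that $\delta y^2$ and $\delta t$ must scale together.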

3.2 Taylor Series Expansion

We expand each term of the equation.

$$ p(y,t;y’,t’) = \alpha \ p(y,t;y’+\delta y,t’-\delta t) + \ (1-2\alpha) \ p(y,t;y’,t’-\delta t) + \alpha \ p(y,t;y’-\delta y,t’-\delta t) $$

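Why do we do that? The equation links p at several different points, which makes it hard to solve directly. Expanding every term about the single point $(y',t')$ turns it into a differential equation in $y'$ and $t'$ only.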

$$ p(y,t;y'+\delta y,t'-\delta t)\approx \ p(y,t;y',t') - \delta t \frac{\partial p}{\partial t'} +\delta y \frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2 \frac{\partial^2 p}{\partial y'^2} + O(\frac{\partial^2 p}{\partial t'^2}) $$

$$ p(y,t;y',t'-\delta t)\approx \ p(y,t;y',t') - \delta t \frac{\partial p}{\partial t'} + O(\frac{\partial^2 p}{\partial t'^2}) $$

$$ p(y,t;y'-\delta y,t'-\delta t)\approx \ p(y,t;y',t') - \delta t \frac{\partial p}{\partial t'} -\delta y \frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2 \frac{\partial^2 p}{\partial y'^2} + O(\frac{\partial^2 p}{\partial t'^2}) $$

Plug them back into that equation; after cancelling the repeated terms we are left with,

$$ \frac{\partial p}{\partial t’} =\alpha \frac{\delta y^2}{\delta t} \frac{\partial^2 p}{\partial y’^2} + O(\frac{\partial^2 p}{\partial t’^2})$$
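The cancellation behind the equation above can be written out as a worked step. This is a sketch that writes $p$ for $p(y,t;y',t')$ and leaves out the higher-order terms. Weighting the three expansions by $\alpha$, $1-2\alpha$ and $\alpha$ and adding them gives

$$ \alpha\bigg(p - \delta t \frac{\partial p}{\partial t'} + \delta y \frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2 \frac{\partial^2 p}{\partial y'^2}\bigg) + (1-2\alpha)\bigg(p - \delta t \frac{\partial p}{\partial t'}\bigg) + \alpha\bigg(p - \delta t \frac{\partial p}{\partial t'} - \delta y \frac{\partial p}{\partial y'} + \frac{1}{2}\delta y^2 \frac{\partial^2 p}{\partial y'^2}\bigg) = p - \delta t \frac{\partial p}{\partial t'} + \alpha\, \delta y^2 \frac{\partial^2 p}{\partial y'^2} $$

The weights sum to one, so the $p$ and $\delta t$ terms keep a coefficient of one, while the first-derivative terms in $\delta y$ cancel by symmetry. Setting this equal to the left-hand side $p$, cancelling $p$, and dividing by $\delta t$ gives the result.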

We drop the higher-order terms collected in $O(\frac{\partial^2 p}{\partial t'^2})$, which become negligible as $\delta t \to 0$.

$$ \frac{\partial p}{\partial t’} =\alpha \frac{\delta y^2}{\delta t} \frac{\partial^2 p}{\partial y’^2} $$

On the RHS, we first focus on $\alpha \frac{\delta y^2}{\delta t}$. The numerator and denominator must be of the same order for this term to stay finite and non-zero as $\delta t \to 0$; that is, $\delta y \sim O(\sqrt{\delta t})$.

We thus let c^2 = \alpha \frac{\delta y^2}{\delta t}

$$ \frac{\partial p}{\partial t’} =c^2 \frac{\partial^2 p}{\partial y’^2} $$

The above equation is known as the Fokker–Planck or forward Kolmogorov equation.

Now we have a partial differential equation. Solving it gives the form of p.

3.3 The Backward Equation

The backward equation works similarly.

$$ p(y,t;y’,t’) = \alpha \ p(y+\delta y,t+\delta t;y’,t’) + \ (1-2\alpha) \ p(y,t+\delta t;y’,t’) + \alpha \ p(y-\delta y,t+\delta t;y’,t’) $$

The dimension-reduced result is below; it is called the backward Kolmogorov equation. Here the Taylor expansion is about the starting point, so the derivatives are with respect to $y$ and $t$.

$$ \frac{\partial p}{\partial t} + c^2 \frac{\partial^2 p}{\partial y^2} =0 $$

4. Solve the Forward Kolmogorov Equation

We now solve for p, by looking for a similarity solution.

$$ \frac{\partial p}{\partial t’} =c^2 \frac{\partial^2 p}{\partial y’^2} $$

This equation has an infinite number of solutions. It has different solutions for different initial conditions and different boundary conditions. We need only a special solution here. The detailed process of finding that solution is as follows.

4.1 Assume a Solution Form

$$ p=t’^a f(\frac{y’}{t’^b}) = t’^a f(\xi)$$

$$ \xi = \frac{y’}{t’^b}$$

, where a and b are constants to be determined.

Again, don’t ask why it is in this form, because it is a special solution!

4.2 Derivation

$$\frac{\partial p}{\partial y’}=t’^{a-b}\frac{df}{d\xi}$$

$$\frac{\partial^2 p}{\partial y’^2}=t’^{a-2b}\frac{d^2f}{d\xi^2}$$

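$$\frac{\partial p}{\partial t'}=at'^{a-1}f(\xi)+t'^{a}\frac{df}{d\xi}\frac{\partial \xi}{\partial t'}=at'^{a-1}f(\xi)-by't'^{a-b-1}\frac{df}{d\xi}$$

, since $\frac{\partial \xi}{\partial t'} = -b\,y'\,t'^{-b-1}$.

Substitute back into the forward Kolmogorov equation (remember $y' = t'^b \xi$) and divide through by $t'^{a-1}$ to get,

$$ af(\xi) - b\xi \frac{df}{d\xi} = c^2 t'^{-2b+1} \frac{d^2f}{d\xi^2}$$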

4.3 Choose b

As we need the RHS to be independent of $t'$, we choose $b=\frac{1}{2}$ so that $t'$ has a power of 0. Why do we do that? Because we aim to reduce the partial differential equation to an ordinary differential equation, in which the only variable is $\xi$ and $t'$ disappears.

By assuming the special form of p, and letting b= 1/2, our forward Kolmogorov becomes,

$$ af(\xi) - \frac{1}{2}\xi \frac{df}{d\xi} = c^2 \frac{d^2f}{d\xi^2}$$

$$ p=t’^a f(\frac{y’}{\sqrt{t’}}) = t’^a f(\xi)$$

$$ \xi = \frac{y’}{\sqrt{t’}}$$

4.4 Choose a

$$p=t’^a f(\frac{y’}{\sqrt{t’}}) $$

We know that p is the transition p.d.f., so its integral must equal 1. At any fixed time $t'$ the density is over the position $y'$, so we integrate p with respect to $y'$ only.

$$\int_{\mathbb{R}}p\ dy’ = \int_{\mathbb{R}} t’^a f(\frac{y’}{\sqrt{t’}})\ dy’ = 1$$

Substituting $x = \frac{y'}{\sqrt{t'}}$, so that $dy' = \sqrt{t'}\,dx$,

$$ \int_{\mathbb{R}} t’^{a+1/2} f(x)\ dx =t’^{a+1/2} \int_{\mathbb{R}} f(x)\ dx= 1 $$

This must equal 1 for every $t'$, which is only possible if the power of $t'$ vanishes: $a + \frac{1}{2} = 0$. Thus, $a = -\frac{1}{2}$.

Also, we get \int_{\mathbb{R}} f(x)\ dx= 1.

4.5 Integrate! Solve it!

By assuming the special form of p, and letting a=-1/2, b=1/2, we get,

$$ -\frac{1}{2}f(\xi) - \frac{1}{2}\xi \frac{df}{d\xi} = c^2 \frac{d^2f}{d\xi^2}$$

$$ p=\frac{1}{\sqrt{t’}} f(\frac{y’}{\sqrt{t’}}) = \frac{1}{\sqrt{t’}}f(\xi)$$

$$ \xi = \frac{y’}{\sqrt{t’}}$$

The forward Kolmogorov equation becomes,

$$ -\frac{1}{2}\bigg(f(\xi) + \xi \frac{df}{d\xi} \bigg)= c^2 \frac{d^2f}{d\xi^2}$$

$$ -\frac{1}{2}\bigg( \frac{d \big(\xi f(\xi)\big)}{d \xi} \bigg)= c^2 \frac{d^2f}{d\xi^2}$$

, as $f(\xi) + \xi \frac{df}{d\xi} = \frac{d \big(\xi f(\xi)\big)}{d \xi}$.

Integrate 1st Time

$$ -\frac{1}{2}\xi f(\xi)= c^2 \frac{df}{d\xi} + constant$$

There’s an arbitrary constant of integration that could go in here, but for the answer we want it is zero. We need only a special solution, so we can set that constant to zero.

So, the equation can be rewritten as,

$$ -\frac{1}{2c^2}\xi d\xi = \frac{1}{f(\xi)}df $$

Integrate 2nd Time

$$ ln\ f(\xi) = -\frac{\xi^2}{4c^2} + C$$

Take exponential, f(\xi) = e^C e^{-\frac{\xi^2}{4c^2}} = A e^{-\frac{\xi^2}{4c^2}} .

Find A

The Last Step here is to find the exact value of A. A is chosen such that the integral of f is one.

$$\int_{\mathbb{R}}f(\xi)\ d\xi =1$$

$$ \int_{\mathbb{R}}A e^{-\frac{\xi^2}{4c^2}} \ d\xi = 2cA\int_{\mathbb{R}} e^{-\frac{\xi^2}{4c^2}} \ d\big(\frac{\xi}{2c}\big) =1 $$

$$ 2cA \sqrt{\pi} = 1 $$

, so we get A = \frac{1}{2c\sqrt{\pi}}

Plug $f(\xi)$, a, b, A back into $p = t'^a f(\xi)$.

$$ p(y’)=\frac{1}{2c\sqrt{\pi \ t’}}e^{-\frac{\xi^2}{4c^2}} =\frac{1}{2c\sqrt{\pi \ t’}}e^{-\frac{y’^2}{4c^2t’}} $$

$p(\cdot)$ now has the form of a normal density. Compare it with,

$$N(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

So, we may say $\mu_{y'}=0$ and $\sigma^2_{y'}=2c^2t'$. Or, $y' \sim N(0, 2c^2t')$.
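As a sanity check on the closed form above, one can verify numerically that it satisfies $\frac{\partial p}{\partial t'} = c^2 \frac{\partial^2 p}{\partial y'^2}$ and integrates to one. The sketch below uses central finite differences and a midpoint sum. The function names, the step size `h`, and the test point are hypothetical choices for illustration.

```python
from math import exp, pi, sqrt


def density(y, t, c):
    """Closed-form solution p(y') = exp(-y'^2 / (4 c^2 t')) / (2 c sqrt(pi t'))."""
    return exp(-y * y / (4 * c * c * t)) / (2 * c * sqrt(pi * t))


def pde_residual(y, t, c, h=1e-3):
    """dp/dt' - c^2 d2p/dy'^2, approximated by central finite differences."""
    dp_dt = (density(y, t + h, c) - density(y, t - h, c)) / (2 * h)
    d2p_dy2 = (density(y + h, t, c) - 2 * density(y, t, c)
               + density(y - h, t, c)) / (h * h)
    return dp_dt - c * c * d2p_dy2


def total_mass(t, c, n=4000):
    """Midpoint-rule integral of the density over +/- 10 standard deviations."""
    sd = c * sqrt(2 * t)                 # sigma^2 = 2 c^2 t'
    lo, hi = -10 * sd, 10 * sd
    h = (hi - lo) / n
    return h * sum(density(lo + (i + 0.5) * h, t, c) for i in range(n))


# Hypothetical test point: c = 0.5, t' = 2, y' = 0.7.
residual = pde_residual(0.7, 2.0, 0.5)
mass = total_mass(2.0, 0.5)
print(residual, mass)
```

The residual is not exactly zero because finite differences carry a truncation error of order `h**2`, but it should be tiny compared with the size of the individual derivative terms.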

5. Summary

$$p(y’)=\frac{1}{2c\sqrt{\pi \ t’}}e^{-\frac{\xi^2}{4c^2}} =\frac{1}{2c\sqrt{\pi \ t’}}e^{-\frac{y’^2}{4c^2t’}} $$

Finally, we have solved for the transition probability density function $p(\cdot)$. Starting from the forward or backward form of the trinomial model, we found a partial differential equation. Then, assuming a special form of $p(\cdot)$ by the similarity method, we solved it.

The meaning is that the $p(y')$ above is one transition probability density function that satisfies the trinomial random walk.

Also, we find that $p(\cdot)$ is a normal density.
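To see the link between the trinomial walk and the normal density concretely, the walk can be run for many steps and compared with $N(0, 2c^2t')$, where $c^2 = \alpha \frac{\delta y^2}{\delta t}$. The following is a minimal sketch, assuming the walk starts at $y=0$ at time 0. All parameter values and names are hypothetical.

```python
from math import exp, pi, sqrt


def trinomial_distribution(n_steps, alpha):
    """Lattice probabilities after n_steps of the walk, starting from one site."""
    p = [1.0]
    for _ in range(n_steps):
        padded = [0.0, 0.0] + p + [0.0, 0.0]
        p = [alpha * padded[k + 1] + (1 - 2 * alpha) * padded[k] + alpha * padded[k - 1]
             for k in range(1, len(padded) - 1)]
    return p


def gaussian_density(y, t, c):
    """The similarity solution: normal density with mean 0 and variance 2 c^2 t'."""
    return exp(-y * y / (4 * c * c * t)) / (2 * c * sqrt(pi * t))


# Hypothetical parameters, with dy of order sqrt(dt).
alpha, dy, dt, n_steps = 0.25, 0.1, 0.01, 400
c = sqrt(alpha * dy * dy / dt)          # c^2 = alpha * dy^2 / dt
t_end = n_steps * dt

probs = trinomial_distribution(n_steps, alpha)

# Lattice probability divided by dy approximates the density at y' = (k - n_steps) * dy.
max_err = max(abs(probs[k] / dy - gaussian_density((k - n_steps) * dy, t_end, c))
              for k in range(len(probs)))
print(max_err)
```

The maximum gap between the two is small compared with the peak of the density and shrinks as the number of steps grows, which is consistent with the normal density being the continuous-time limit of the walk.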