These notes are not only a review for my quant preparation, but also, hopefully, learning notes for a junior PhD student and for my babe.
First and most importantly, I need to state what statistics is and why we should learn it. The following is based only on my own understanding. My understanding is pretty limited (I only have a master's degree, so I am definitely not an expert in statistics) and subjective, so please send me your suggestions, and even your criticism. I would be glad to hear your ideas.
In my understanding, statistics is a tool, characterised by mathematics, to explain the world. What bullshit I am talking.
To be serious: I would say that statistics is a process of estimating the population from samples.
Studying the whole population is always costly and largely impractical. For example, to test something at the individual level, we would have to collect data from every person. A population census can only be done at the national level and is conducted by the government. Even then, the census cannot be performed on a year-by-year basis, and there are always measurement errors. Thus, a more cost-effective way is to estimate the population using data from a small set of people who are randomly selected.
Another example is the weather forecast, which is similar to doing a time series analysis or panel data analysis. The forecast is likely to be biased, because things change unpredictably and irregularly. So we may say that it is even impossible to find the full data to estimate the population (the factors related to weather, in this case). Thus, a simpler way might be to collect different factors and historical data, such as temperature, because we may assume that temperature changes are consistent over a short period of time.
However, there are gaps between the population and the sample. How can we bridge those gaps? The answer is statistics. Statistics provides mathematically proven methods that, under assumptions, make the sample capture the population better.
Let's begin our study.
If two events $A$ and $B$ are independent, then $P(A \cap B) = P(A)\,P(B)$.
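The product rule for independent events, P(A and B) = P(A) P(B), can be checked by exact enumeration. The sketch below is only an illustration: the two-dice setup and the events A and B are my own assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice (illustrative setup).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on (die1, die2)."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def a(o):
    """Event A (hypothetical): the first die is even."""
    return o[0] % 2 == 0

def b(o):
    """Event B (hypothetical): the second die shows 5 or 6."""
    return o[1] >= 5

p_a = prob(a)
p_b = prob(b)
p_ab = prob(lambda o: a(o) and b(o))

print(p_a, p_b, p_ab)       # the two marginals and the joint probability
print(p_ab == p_a * p_b)    # True exactly when A and B are independent
```

Exact fractions are used instead of simulation so that the equality holds without rounding noise.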
Random Variables:
Observations:
The p.d.f. $f(x)$ captures the probability that a r.v. $X$ falls within a range: $P(a \le X \le b) = \int_a^b f(x)\,dx$.
Replace the integral with a summation for a discrete r.v.
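P.S. For a continuous r.v., the probability of any single exact value is zero: $P(X = x) = 0$.
For example, the probability of selecting exactly the number "3" from the continuous range 1 to 10 is zero.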
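To make the "area under the p.d.f." idea concrete, here is a minimal sketch that assumes a continuous uniform r.v. on [1, 10] (my choice of distribution for illustration). The helper names `uniform_pdf` and `prob_between` are hypothetical.

```python
def uniform_pdf(x, low=1.0, high=10.0):
    """Density of a continuous uniform r.v. on [low, high] (assumed example)."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def prob_between(a, b, pdf, steps=10_000):
    """Approximate P(a <= X <= b) with the midpoint rule applied to the p.d.f."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

p_interval = prob_between(2.0, 5.0, uniform_pdf)  # exact value is 3/9
p_point = prob_between(3.0, 3.0, uniform_pdf)     # zero-width interval at x = 3

print(p_interval)
print(p_point)
```

The interval has positive probability, while the single point x = 3 has zero width and therefore zero area.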
The first moment, $E[X] = \mu$, is the mean.
The
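The second central moment about the mean $\mu$, $E[(X-\mu)^2] = \sigma^2$, is the variance.
The third central moment, $E[(X-\mu)^3]$, gives the skewness (asymmetry) once divided by $\sigma^3$.
The fourth central moment, $E[(X-\mu)^4]$, gives the kurtosis (tail heaviness) once divided by $\sigma^4$.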
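As a rough illustration of the moments listed above, the sketch below computes them for a small made-up data set, treating the data as the whole population (so every average divides by n). The data values and the helper `central_moment` are assumptions for this example.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean)**k over the data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values

mean = sum(data) / len(data)                      # first moment
variance = central_moment(data, 2)                # second central moment
std = variance ** 0.5
skewness = central_moment(data, 3) / std ** 3     # standardised third moment
kurtosis = central_moment(data, 4) / std ** 4     # standardised fourth moment

print(mean, variance, skewness, kurtosis)
```

A positive skewness here reflects the longer right tail of the made-up data.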
The meaning of distributions, and the properties (mean & var).
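For a standard normal dist, $Z \sim N(0, 1)$: the mean is 0 and the variance is 1.
One / two / three standard deviation regions: about 68%, 95% and 99.7% of the probability lies within 1, 2 and 3 standard deviations of the mean.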
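The one, two and three standard deviation regions of the standard normal can be computed from its c.d.f. This is a small sketch using `NormalDist` from Python's standard `statistics` module; the variable names are my own.

```python
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution

# Probability mass within k standard deviations of the mean, for k = 1, 2, 3.
regions = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in regions.items():
    print(f"within {k} sd: {p:.4f}")
```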
i.i.d. - independent and identically distributed
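Suppose $X_1, \dots, X_n$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$. Then, for large $n$, the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
would behave like a normal distribution.
Key facts: $E[\bar{X}] = \mu$ and $Var(\bar{X}) = \sigma^2 / n$.
Therefore, we would get the distribution of $\bar{X}$: approximately $N(\mu, \sigma^2 / n)$.
By standardising it, $\frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ is approximately $N(0, 1)$.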
Also, for
The more observations there are, the closer the distribution is to normal. Also, a smaller standard deviation means the estimate varies less and is more accurate.
It is important because it provides a way to use repeated observations to estimate the whole population, which is impossible to observe in full.
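A quick simulation can make the central limit idea tangible. The sketch below assumes i.i.d. draws from an Exponential(1) distribution (mean 1, standard deviation 1), which is clearly not normal; the sample size, number of repetitions and seed are arbitrary choices of mine.

```python
import random
from statistics import mean, stdev

random.seed(0)          # fixed seed so the sketch is reproducible

n = 50                  # observations per sample (illustrative choice)
reps = 5000             # number of repeated samples (illustrative choice)
mu, sigma = 1.0, 1.0    # mean and sd of an Exponential(1) r.v.

# Each entry is the mean of n i.i.d. draws from a skewed distribution.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

mean_of_means = mean(sample_means)   # should be close to mu
sd_of_means = stdev(sample_means)    # should be close to sigma / sqrt(n)

# Standardise each sample mean and count how many land within +/- 1.96,
# which would be about 95% for an exactly normal distribution.
se = sigma / n ** 0.5
share_within_196 = sum(abs((m - mu) / se) <= 1.96 for m in sample_means) / reps

print(mean_of_means, sd_of_means, se, share_within_196)
```

Increasing `n` should shrink `sd_of_means` roughly like 1/sqrt(n), which matches the remark that more observations give a tighter, more normal-looking distribution.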
Recall that our aim in using statistics is to learn about the true population. We may assume the true population follows a distribution, and that distribution has some parameters. What we are doing now is using the sample data (which is feasible to collect) to infer the population parameters.
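Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
Population Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$
Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample Variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$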
Plugging data into the sample estimators gives the estimates, and those estimates are then used to infer the population parameters.
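The step from sample estimates to population parameters can be sketched with Python's standard `statistics` module. Everything about the data here is an assumption for illustration: a simulated "population" of 20,000 normal values and a random sample of 500 of them.

```python
import random
from statistics import mean, pvariance, variance

random.seed(1)  # fixed seed so the sketch is reproducible

# A simulated population (in practice the full population is not observable).
population = [random.gauss(170.0, 10.0) for _ in range(20_000)]
pop_mean = mean(population)
pop_var = pvariance(population)      # population variance: divides by N

# A random sample, which is what we can actually collect.
sample = random.sample(population, 500)
sample_mean = mean(sample)           # estimate of the population mean
sample_var = variance(sample)        # sample variance: divides by n - 1

print(pop_mean, sample_mean)
print(pop_var, sample_var)
```

Note that `pvariance` and `variance` differ only in the denominator, N versus n - 1.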
Remember that the sample is only part of the population; we collect data from the sample because it is more accessible and feasible to get. Still, we need our sample data to be representative of the population, or in other words, to give us some foresight about the whole population. Therefore, we use a different notation for sample statistics.
An important aspect is that we need our sample to represent the population well. There are several criteria for this.
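If $E[\hat{\theta}] = \theta$, then $\hat{\theta}$ is an unbiased estimator of the parameter $\theta$.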
The unbiased estimator of the population variance is the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$.
Why is the denominator "n-1"?
A full answer would take a long discussion. We can simply understand "-1" as the adjustment of the
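One way to see why "n-1" matters is to simulate many small samples and average the two candidate variance estimators. This is only a sketch: the normal data, true variance of 4, sample size of 5 and seed are all assumptions chosen for illustration.

```python
import random

random.seed(2)      # fixed seed so the sketch is reproducible

true_var = 4.0      # variance of the simulated population (assumed)
n = 5               # small sample size, where the bias is easy to see
reps = 20_000       # number of simulated samples

biased, unbiased = [], []
for _ in range(reps):
    xs = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
    x_bar = sum(xs) / n
    ss = sum((x - x_bar) ** 2 for x in xs)   # sum of squared deviations
    biased.append(ss / n)                    # divides by n
    unbiased.append(ss / (n - 1))            # divides by n - 1

avg_biased = sum(biased) / reps      # tends to true_var * (n - 1) / n
avg_unbiased = sum(unbiased) / reps  # tends to true_var

print(avg_biased, avg_unbiased)
```

Dividing by n systematically underestimates the variance because the deviations are measured from the sample mean, which is itself fitted to the same data.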
In sum,
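If there is an estimator $\hat{\theta}_n$ such that, as $n \to \infty$, $\hat{\theta}_n$ converges in probability to $\theta$, then the estimator is consistent.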
For example, although
The discussion in this section has flaws and is awaiting an update.
By assuming a probability distribution of the r.v.
To illustrate the problem, we need to find the parameters
The value of the parameters
Assume r.v.
We will find that the MLE estimators are the same as the OLS estimator in the following section.
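As a minimal sketch of maximum likelihood, assume the r.v. is normally distributed (this distributional choice is my assumption for the example). For a normal sample the MLE has a closed form: the sample mean, and the variance with denominator n. The code checks that nearby parameter values give a lower log-likelihood; the function name `log_likelihood` is hypothetical.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is reproducible
data = [random.gauss(2.0, 1.5) for _ in range(200)]  # simulated observations

def log_likelihood(mu, var, xs):
    """Log-likelihood of i.i.d. N(mu, var) observations xs."""
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in xs) / (2 * var))

# Closed-form MLE for the normal model.
mu_hat = sum(data) / len(data)
var_hat = sum((x - mu_hat) ** 2 for x in data) / len(data)  # divides by n

best = log_likelihood(mu_hat, var_hat, data)
shifted_mean = log_likelihood(mu_hat + 0.1, var_hat, data)
larger_var = log_likelihood(mu_hat, var_hat * 1.2, data)
smaller_var = log_likelihood(mu_hat, var_hat * 0.8, data)

print(mu_hat, var_hat)
print(best >= max(shifted_mean, larger_var, smaller_var))
```

Note that the MLE of the variance divides by n, so it is the biased version of the variance estimator.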
Assume a linear model, fitted so that the sum of squared errors (equivalently, the mean squared error) is minimised.
, where
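To illustrate least squares, the sketch below assumes a simple one-regressor model y = b0 + b1 * x + error (my assumption about the form of the linear model) and uses the textbook closed-form solution. The data values are made up.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]     # made-up regressor values
y = [3.1, 4.9, 7.2, 8.8, 11.0]    # made-up responses

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for a single regressor.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1 = s_xy / s_xx               # slope
beta0 = y_bar - beta1 * x_bar     # intercept

def sse(b0, b1):
    """Sum of squared errors of the line y = b0 + b1 * x on the data."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

print(beta0, beta1, sse(beta0, beta1))
```

Any other line, for example one with a slightly shifted intercept or slope, gives a larger value of `sse`, which is what "least squares" means.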
By Fanyu Zhao