Review
- Correlation
- Covariance
- Correlation and causation
- Per capita cheese consumption is highly correlated with the “number of people who died by becoming tangled in their bedsheets”
- Popularity of the first name Annabelle is correlated with “UFO sightings in Maryland”
Correlation doesn’t imply causation
- Anscombe’s quartet
- Anscombe created four data sets with nearly identical descriptive statistics (including the linear correlation coefficient)
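- Different coefficients measure different things: Pearson's measures the degree of linear association, rank-based coefficients measure the degree of monotone association, and Chatterjee's coefficient is a further alternative.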
- The Pearson Correlation Coefficient has limits
- The four data sets look very different from one another, which we can only see because we visualized the data
This illustrates that the same correlation coefficient can hide very different underlying data.
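As a small check of the point above, the sketch below computes the Pearson coefficient for the four data sets of Anscombe's quartet as they are commonly published. The helper `pearson` and the dictionary `quartet` are names chosen for this illustration only. All four coefficients should come out at about 0.816, even though plots of the four sets look nothing alike.

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Data sets I to III share the same x values; data set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(f"data set {name}: r = {r:.3f}")
```

Plotting each pair as a scatter plot is what reveals the differences (a line with noise, a curve, a line with one outlier, a vertical cluster with one leverage point). The coefficient alone cannot.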
- Capital $X$ is a random variable; lowercase $x$ denotes its realizations, i.e. actual numerical values of $X$.
[Drawing: Random-Variables-and-Expectations-2024-12-12-05.12.39.excalidraw]
The expected value of the sample mean stays the same whatever the sample size: it equals the population mean.
The same holds for the variance.
The larger the sample size, the smaller the variance of the sample mean $\bar{X}$, meaning that the value of the estimator will be closer to the true population mean (the value it is trying to estimate).
As the sample size grows, theoretically until it equals the size of the population, the variance shrinks to zero, and our estimate is then exactly the population mean.
- When the sample size increases ($n \to \infty$), the distribution of the sample mean concentrates around the population mean.
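The shrinking variance of the sample mean can be checked with a short simulation. This is a minimal sketch that assumes independent draws from a normal population with mean 0 and variance 1, in which case the variance of the sample mean is $\sigma^2 / n$. The function name `variance_of_sample_mean`, the sample sizes and the number of replications are choices made for this illustration.

```python
import random
import statistics

def variance_of_sample_mean(n, reps, rng, mu=0.0, sigma=1.0):
    """Draw `reps` samples of size `n` and return the variance of their means."""
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pvariance(means)

rng = random.Random(0)  # fixed seed so the run is reproducible
sizes = [4, 16, 64]
results = {n: variance_of_sample_mean(n, 2000, rng) for n in sizes}

for n in sizes:
    # Under the stated assumptions the theoretical value is sigma^2 / n = 1 / n.
    print(f"n = {n:3d}: simulated variance = {results[n]:.4f}, theory = {1 / n:.4f}")
```

Each fourfold increase in the sample size should cut the variance of the mean to roughly a quarter, which is the concentration around the population mean described above.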
Keywords:
- Sampling
- Estimation
- Estimator
- Estimate
What are the desirable properties of these estimators?
Properties
Consistency
Sufficient condition:
- The estimator is unbiased and its variance tends to $0$ when the sample size $n$ becomes large.
- A biased estimator may still be consistent (so unbiasedness is not necessary) if the bias disappears as the sample size increases.[^1]
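The two conditions can also be stated as
- $\lim_{n \to \infty} E[\hat{\theta}_n] = \theta$
- $\lim_{n \to \infty} \mathrm{Var}(\hat{\theta}_n) = 0$
- Then, $\hat{\theta}_n$ is a consistent estimator of $\theta$.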
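A standard example of a biased but consistent estimator is the variance estimator that divides by $n$ instead of $n - 1$: its bias is $-\sigma^2 / n$, which vanishes as $n$ grows. The sketch below estimates that bias by simulation, assuming normal data with variance 1. The names `biased_variance` and `estimated_bias` are hypothetical helpers for this illustration.

```python
import random
import statistics

def biased_variance(sample):
    """Variance estimator that divides by n (not n - 1), hence biased."""
    m = statistics.fmean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def estimated_bias(n, reps, rng, sigma=1.0):
    """Average of the estimator over `reps` samples, minus the true variance."""
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        estimates.append(biased_variance(sample))
    return statistics.fmean(estimates) - sigma ** 2

rng = random.Random(1)  # fixed seed so the run is reproducible
bias_by_n = {n: estimated_bias(n, 3000, rng) for n in (5, 40, 320)}

for n, bias in bias_by_n.items():
    # Theoretical bias under these assumptions: -sigma^2 / n = -1 / n.
    print(f"n = {n:3d}: simulated bias = {bias:+.4f}, theory = {-1 / n:+.4f}")
```

The simulated bias should shrink towards zero as $n$ increases, in line with the statement that a biased estimator can still be consistent.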
Plim definition:
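For reference, the usual textbook definition of the probability limit can be written as follows. The symbols $\hat{\theta}_n$ (the estimator computed from a sample of size $n$), $\theta$ (the target parameter) and $\varepsilon$ are notation assumed here, since the original formula is not shown.

```latex
\operatorname*{plim}_{n \to \infty} \hat{\theta}_n = \theta
\quad \Longleftrightarrow \quad
\lim_{n \to \infty} P\left( \left| \hat{\theta}_n - \theta \right| > \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0 .
```

In words: the probability that the estimator misses the target by more than any fixed margin goes to zero as the sample grows.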
Plim properties
and 5 more…
If we have several unbiased estimators, how do we choose among them? We simply select the one with the smallest variance (highest precision).
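As an illustration of choosing between two unbiased estimators, the sketch below compares the sample mean and the sample median as estimators of the centre of a normal population. Both are unbiased for symmetric distributions, so the comparison comes down to variance. The normal population, the sample size of 25 and the function name `estimator_variances` are assumptions made for this example.

```python
import random
import statistics

def estimator_variances(n, reps, rng):
    """Return (variance of sample means, variance of sample medians)."""
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)

rng = random.Random(2)  # fixed seed so the run is reproducible
var_mean, var_median = estimator_variances(25, 4000, rng)

print(f"variance of the sample mean:   {var_mean:.4f}")
print(f"variance of the sample median: {var_median:.4f}")
```

For normal data the mean should show the smaller variance, so it would be the preferred estimator under this criterion. For heavy-tailed data the ranking can flip, which is why the comparison depends on the assumed population.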
If we have an unbiased estimator with a larger variance and a biased estimator with a smaller variance, how do we choose?
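We can turn to a loss function, the mean squared error (MSE): $\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^2$.
[Drawing: Random-Variables-and-Expectations-2024-12-12-05.37.40.excalidraw]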
The MSE is a summary measure that combines the bias and the variance. It tells us whether a bias-variance tradeoff is worth making: accepting some bias pays off if it leads to a smaller MSE. But again, “best” is subjective…
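A concrete case of this tradeoff is estimating the variance of normal data: dividing the sum of squares by $n - 1$ gives an unbiased estimator, while dividing by $n$ gives a biased one with smaller variance. The sketch below simulates both and reports bias, variance and MSE, assuming a normal population with variance 1 and samples of size 10. The helper `summarize` and all variable names are chosen for this illustration.

```python
import random
import statistics

def summarize(estimates, true_value):
    """Return (bias, variance, MSE) of a list of simulated estimates."""
    bias = statistics.fmean(estimates) - true_value
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return bias, variance, mse

rng = random.Random(3)  # fixed seed so the run is reproducible
n, reps, sigma2 = 10, 20000, 1.0

unbiased, biased = [], []
for _ in range(reps):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    m = statistics.fmean(sample)
    ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations
    unbiased.append(ss / (n - 1))
    biased.append(ss / n)

bias_u, var_u, mse_u = summarize(unbiased, sigma2)
bias_b, var_b, mse_b = summarize(biased, sigma2)

print(f"divide by n-1: bias = {bias_u:+.4f}, variance = {var_u:.4f}, MSE = {mse_u:.4f}")
print(f"divide by n:   bias = {bias_b:+.4f}, variance = {var_b:.4f}, MSE = {mse_b:.4f}")
```

In both rows the MSE equals the variance plus the squared bias. Under these assumptions the biased estimator should end up with the smaller MSE, so by this loss function the bias is worth accepting.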
- Try thinking of variance in terms of precision: the smaller the variance, the higher the precision.
Footnotes
[^1]: There are certain theories that are only applicable to large samples.