poplacarbon.blogg.se

Example 2: Purity of methamphetamine (crystal) in Breaking Bad.

Example 3: Uncertainty in oil production forecast.

Notes: Population variance $\sigma^2$ vs.

Pythonic Tip: Difference between Numpy variance.

Notes: Distribution of various statistics.

Pythonic Tip: Computing confidence interval of mean with SciPy.

Notes: Comparing means of more than two samples with ANOVA.

4.2.1 Independent (unpaired) samples, equal.

Pythonic Tip: Computing student's t-interval.

4.2.2 Independent (unpaired) samples, unequal variance - Welch's t-interval.

Pythonic Tip: Computing Welch's t-interval.

4.2.3 Dependent (paired) samples - Paired t-interval.

Pythonic Tip: Computing paired t-interval.

Pythonic Tip: Computing confidence interval of variance with SciPy.

4.4 Confidence interval of other statistics: Bootstrap.

5 Confidence interval of non-normal distribution.

5.1 Problems of non-normal distributions and central tendency.

5.2 Robustness of confidence intervals to non-normality.

Notes: Be cautious with hypothesis testing for normality.

5.3 Transform to normal distribution: Box-Cox.

Pythonic Tip: Box-Cox transform with SciPy and Scikit-Learn.

5.4 Non-parametric alternative: Bootstrap

# Assume that sample porosity follows a normal distribution
# You collect 12 rock core samples for each depth

A confidence interval is the uncertainty in a summary statistic, represented as a range. In other words, it is a range of values that we are fairly sure the true value lies in. For example: I am 95% confident that the population mean falls between 8.76 and 15.88 $\rightarrow$ (12.32 $\pm$ 3.56).

A confidence interval tells you how confident you can be that the results from a poll or survey reflect what you would expect to find if it were possible to survey the entire population. It is difficult to obtain measurement data for an entire data set (population) due to limited resources and time. Your best shot is to survey a small fraction (samples) of the entire data set, and pray that your sample data represents the population reasonably well.

Sample data may not be a good representation of a population for numerous reasons (e.g., bias), and as a result, uncertainty is always introduced in any estimate derived from sample data. Because of this uncertainty, any statistical estimate needs to be delivered as a range, not as a point estimate.

How well a sample statistic estimates an underlying population parameter is always an issue. A confidence interval addresses this issue by providing a range of values that is likely to contain the population parameter of interest.
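As a rough illustration of reporting an estimate as a range instead of a single number, the sketch below computes a 95% interval for a sample mean in the same "point estimate $\pm$ margin" form. It is a minimal sketch under stated assumptions, not the article's own code: the function name `mean_confidence_interval` and the simulated sample are hypothetical, and the critical value comes from the standard normal distribution (a large-sample approximation) so that only the Python standard library is needed. For small samples, a t critical value (for example from `scipy.stats.t.ppf`) would give a somewhat wider interval.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, margin) for a normal-approximation CI of the mean.

    Hypothetical helper for illustration; assumes a reasonably large sample.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * std_err


random.seed(0)
# Simulated data (an assumption): 200 draws from a population
# with mean 12 and standard deviation 5.
sample = [random.gauss(12, 5) for _ in range(200)]

mean, margin = mean_confidence_interval(sample)
lower, upper = mean - margin, mean + margin
print(f"95% CI: {lower:.2f} to {upper:.2f} ({mean:.2f} +/- {margin:.2f})")
```

Raising `confidence` (say, to 0.99) widens the margin, which is the trade-off the range expresses: more confidence that the interval contains the population mean costs a less precise statement.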