#!/usr/bin/env python
# coding: utf-8

# # Chapter 8: Hypothesis testing

# # Exercises

# ## Exercise 1 (10 points)
#
# If $1000$ measurements are grouped into $25$ bins and a curve that is the sum of a constant background and a Gaussian is then fitted to them, how many degrees of freedom are there in the fit?

# ## Exercise 2 (20 points)
#
# You have a set of $20$ data points and fit four different models to it: linear, quadratic, and cubic polynomials, as well as a sinusoidal function. These fits give $\chi^2$ values of $36$, $20$, $17$, and $29$, respectively.
#
# Calculate the appropriate p-value for each model fit.
# Which model do you choose (if any) as the best fit and why? What have you assumed? Is selection based on $\chi^2$ appropriate and sufficient?

# ## Exercise 3 (20 points)
#
# Ten temperatures are measured (in Kelvin), each with a precision of $0.2\,\mathrm{K}$: $10.2$, $10.4$, $9.8$, $10.5$, $9.9$, $9.8$, $10.3$, $10.1$, $10.3$, $9.9$.
#
# 1. It is suggested that they are all measurements of the same thing, the dispersion being due only to the measurement errors. Is this true? To answer this, define an appropriate hypothesis and statistic, and perform an appropriate hypothesis test to calculate the p-value of your hypothesis.
#
# 2. How do your analysis and result change if it is suggested that they are instead all measurements of the same true value of $10.1\,\mathrm{K}$?

# ## Exercise 4 (20 points)
#
# The exam scores of $16$ students in one group have a mean of $107$ and a standard deviation of $10$, while the scores of a second group of $14$ students have a mean of $112$ and a standard deviation of $8$. What is the probability that the two groups are different?
#
# _Hint_: first pose the question as an appropriate hypothesis test. What is the relevant statistic?

# ## Exercise 5 (30 points)
#
# Look at the `iswr_vitcap.data` dataset. We want to compare the "vital capacity" for the two groups (labelled 1 and 3 in the data file).
# We will use a t-test to decide whether the two groups differ significantly.
# We adopt the null hypothesis that the difference in the means is zero.
# Calculate the p-value of this null hypothesis, and the 99% confidence interval on the difference of the means, assuming
#
# (a) that the standard deviations of the two groups are the same, and
#
# (b) that they are different.
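
# As a starting point for the two cases (a) and (b), the following is a minimal sketch of a pooled-variance and a Welch two-sample t-test. It assumes two small samples whose values are hypothetical placeholders and are not taken from `iswr_vitcap.data`. The function names (`t_pdf`, `t_two_sided_p`, `t_critical`, `two_sample_t`) are illustrative. To stay within the standard library, the p-value is obtained by numerically integrating the Student's t density; `scipy.stats.ttest_ind` with its `equal_var` flag should give the same statistic and p-value.

```python
import math
import statistics


def t_pdf(x, nu):
    """Student's t probability density with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_two_sided_p(t, nu, steps=2000):
    """Two-sided p-value P(|T| >= |t|), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central = 2 * total * h / 3  # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central)


def t_critical(nu, level):
    """Value c with P(|T| <= c) = level, found by bisection.

    The bracket [0, 100] is an assumption that covers common cases.
    """
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1.0 - t_two_sided_p(mid, nu) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def two_sample_t(x, y, equal_var, level=0.99):
    """Return (t, dof, p-value, confidence interval) for mean(x) - mean(y)."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # n - 1 denominators
    if equal_var:
        # Case (a): pooled variance, n1 + n2 - 2 degrees of freedom.
        nu = n1 + n2 - 2
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / nu
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # Case (b): Welch's test with Welch-Satterthwaite degrees of freedom.
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        nu = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
    diff = statistics.mean(x) - statistics.mean(y)
    t = diff / se
    half_width = t_critical(nu, level) * se
    return t, nu, t_two_sided_p(t, nu), (diff - half_width, diff + half_width)


# Hypothetical placeholder samples, NOT values from the data file.
group_a = [4.2, 3.9, 4.6, 4.1, 4.8, 4.4]
group_b = [4.9, 5.2, 4.7, 5.5, 5.0]

for label, equal_var in (("pooled", True), ("Welch", False)):
    t, nu, p, (low, high) = two_sample_t(group_a, group_b, equal_var)
    print(f"{label}: t={t:.3f}, dof={nu:.2f}, p={p:.4f}, "
          f"99% CI=({low:.3f}, {high:.3f})")
```

# For the exercise itself, the two placeholder lists would be replaced by the vital capacity values of groups 1 and 3 once the file has been read in. The layout of the file is not assumed here.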