Biostatistics: Exercise 1

# Descriptive statistics and inference

The data sets are available for download in a .zip file.

1. Data set `precip` records the average amount of precipitation (rainfall) in inches for each of 70 United States (and Puerto Rico) cities.
• Plot the data set. Does it come from a normal distribution? Use `qqnorm` and `qqline` for a refined graphical analysis of normality. What is the theory behind `qqnorm`?
• Use `hist` and `density` to estimate the density function of the underlying probability distribution.
2. Experiment with data set `galaxies`. How many modes do you think there are in the underlying density?
3. Data set `sleep` shows the effect of two soporific drugs (increase in hours of sleep) on groups consisting of 10 patients each. Visualize the data set, and test the hypothesis that the increase is the same in both groups.
4. Data set `chickwts` holds the results of an experiment conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chickens. Use box-and-whisker plots to analyze the relation between chicken weight after six weeks and feed type. Use the Kruskal-Wallis test to analyze whether there is a significant relation between weight and feed type.
5. Data set `vitcap` contains 24 observations of workers in the cadmium industry. Column `exposure` indicates the exposure, with levels “> 10”, which signifies an exposure of more than 10 years, and “none”, which signifies no exposure. Column `age` gives the age in years, and column `vital.capacity` gives the vital capacity (a measure of lung volume) in liters. Visualize the data set. Use at least three different test procedures to compare the vital capacity for the two exposure groups and compare the results. Calculate a 99% confidence interval for the difference. The result of this comparison may be misleading. Why?
6. The following table gives the measurements of the Hamilton depression scale factor in 9 patients with mixed anxiety and depression, taken at the first (x) and second (y) visit after initiation of a therapy (administration of a tranquilizer).

```
Patient   1       2       3       4      5      6      7      8      9
x         1.83    0.50    1.62    2.48   1.68   1.88   1.55   3.06   1.30
y         0.878   0.647   0.598   2.05   1.06   1.29   1.06   3.14   1.29
```

Do the data indicate a significant effect of the therapy in reducing the scores?
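
As a possible starting point for this last exercise, the sketch below enters the two rows of measurements above as vectors in R, taking the first row of values as x (first visit) and the second as y (second visit), and runs a paired one-sided test. The choice of the Wilcoxon signed-rank test and of the alternative "x is greater than y" (scores fall after therapy) are assumptions made for illustration; the exercise does not prescribe a particular procedure, and a paired t-test would be another option if the differences look roughly normal.

```r
# Measurements at the first (x) and second (y) visit, copied from the table.
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

# Within-patient differences; positive values mean the score went down.
d <- x - y
print(summary(d))

# Assumption: the measurements are paired by patient, and the alternative
# of interest is one-sided (first-visit scores tend to exceed second-visit
# scores). The Wilcoxon signed-rank test avoids assuming normality.
res <- wilcox.test(x, y, paired = TRUE, alternative = "greater")
print(res)
```

Replacing `wilcox.test` with `t.test` (same `paired` and `alternative` arguments) gives the paired t-test for comparison; a quick `qqnorm(d)` helps judge which of the two is more appropriate here.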