Question 1 (from Lab #6)

In the following table, fill in the column “Which Measure of Centre” with the appropriate procedure for finding the centre of the distribution (Mean, Median or Mode). In the column “Value of Measure of Centre”, write the value you get when you use R to execute that procedure. Execute the commands, but don’t print them in the output.

Variable   Which Measure of Centre   Value of Measure of Centre
sex        Mode                      Female
d1         Mode                      Secure or Comfortable
d9         Median (Mean)             2 (2.2)
e11        Mode (Median)             3/Good (3/Good)
ft1        Mode (Median)             3/Some (3/Some)
g6e        Mode (Median)             5/Completely Preventable (5/Completely Preventable)
k3a        Median (Mean)             2 (2.19)
age        Mean (Median)             53.6 (54)
agex       Mode (Median)             6/65+ (4/45-54)
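
For reference, here is a minimal sketch of how the three measures might be computed in R. Base R has no built-in function for the statistical mode (`mode()` reports an object's storage type instead), so the `stat_mode()` helper below is a hypothetical one written for illustration. The `toy` vector is made-up stand-in data, not a variable from the Alberta survey.

```r
# Made-up ordinal responses (1 = lowest ... 5 = highest); NOT the Alberta data.
toy <- c(3, 2, 3, 5, 4, 3, 1, 2, 3, 4, NA)

# Hypothetical helper: most frequent non-missing value, returned as a
# character string. With ties, the smallest of the tied values is returned.
stat_mode <- function(x) {
  counts <- table(x)                 # table() drops NA by default
  names(counts)[which.max(counts)]
}

stat_mode(toy)               # mode: usable at any level of measurement
median(toy, na.rm = TRUE)    # median: needs at least ordinal data
mean(toy, na.rm = TRUE)      # mean: needs interval or ratio data
```

With the survey loaded, the same calls would presumably be applied to columns such as `alberta$sex` or `alberta$age`, after recoding any missing-data codes to `NA`.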

Question 1 (from Lab #7)

Produce a histogram and numerical summary for the age variable from the Alberta survey. Are the respondents normally distributed? What evidence do you have to support your claim?

alberta$age <- car::recode(alberta$age, "99=NA")
histNorm(alberta$age, nclass=15)

The histogram shows a slight left skew: the left tail is more prominent than it would be under normality, so the ages are not exactly normally distributed.
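
The question also asks for a numerical summary. A minimal base-R sketch of one way to produce it alongside a histogram with a normal curve overlaid is given here. It does not reproduce `histNorm()`, and the `age` vector is made-up stand-in data (with 99 as the missing-data code), not the Alberta survey.

```r
# Made-up ages with 99 as the missing-data code; NOT the Alberta data.
age <- c(21, 34, 40, 45, 49, 52, 55, 57, 60, 62, 64, 66, 68, 71, 99)
age[age == 99] <- NA            # same effect as the 99 = NA recode

# Numerical summary: quartiles, mean, and the number of NAs.
summary(age)
m <- mean(age, na.rm = TRUE)
s <- sd(age, na.rm = TRUE)

# Density-scaled histogram with the matching normal curve drawn on top.
hist(age, breaks = 15, freq = FALSE, main = "Age", xlab = "Age")
curve(dnorm(x, mean = m, sd = s), add = TRUE, lty = 2)

# A mean below the median is one informal sign of left skew.
m < median(age, na.rm = TRUE)
```

Comparing the mean with the median gives a quick numerical check to set beside the visual impression from the histogram.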

Question 2 (from Lab #7)

Make a new variable (Z.age) that holds the standard scores for the age variable. Type the following code into R (with the Alberta data loaded):

alberta$Z.age <- scale(alberta$age)
alberta[which(alberta$respnum %in% c(61, 33)), c("respnum", "age", "Z.age")]
## # A tibble: 2 x 3
##   respnum age        Z.age
##     <dbl> <dbl+lbl>  <dbl>
## 1      33 45        -0.455
## 2      61 49        -0.210

This will show you the z-scores of respondents 61 and 33. If you had to guess, which range would best characterize the proportion of observations below respondent 61 in the distribution, and why: around 25% or less below, between 25% and 50% below, between 50% and 75% below, or 75% or more below?

Between 25% and 50% below. Respondent 61's z-score (-0.210) is less than zero, which would put fewer than 50% of observations below it if the distribution were normal. It is also bigger than -1, and under normality only about 16% of observations lie to the left of -1. Since -0.210 is much closer to 0 than to -1, we can be confident that more than 25% of the distribution is to the left of -0.210.
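
The reasoning above can be checked against the standard normal distribution with `pnorm()` and `qnorm()`. This is only a sketch of the normal-theory benchmark: the variable names `z_61` and `p_below` are introduced here for illustration, and the actual age distribution is slightly skewed, so the empirical proportion may differ somewhat.

```r
z_61 <- -0.210   # z-score of respondent 61, from the output above

pnorm(-1)        # share below z = -1 under normality, about 0.159
qnorm(0.25)      # z-score of the 25th percentile, about -0.674
p_below <- pnorm(z_61)
p_below          # share below respondent 61 under normality, about 0.417

# The guess "between 25% and 50% below" holds under normality.
p_below > 0.25 && p_below < 0.50
```

With the data loaded, the empirical counterpart would be something like `mean(alberta$Z.age < -0.210, na.rm = TRUE)`.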