Multivariate Statistical Methods


Source: Manly, Bryan F.J. Multivariate Statistical Methods: A Primer, Third Edition, CRC Press,
07/2004.
Tutorial Solutions – Week 4 (HT)
Question 1:
When you have several dependent variables and several samples/groups, the four statistics
that may be used to identify differences between group means are Pillai's trace, Wilks'
lambda, Roy's largest root, and the Lawley-Hotelling trace. Briefly describe and compare them.
Solution:
All four statistics have approximate F equivalents, and all compare some form of within-group,
between-group, or total variation measured by sums of squares (SS).
Wilks' lambda compares the within-group variation to the total variation (within plus between),
based on SS. A small Wilks' lambda indicates that the within-group variation is relatively small,
i.e. a significant difference between the group means.
Roy's largest root considers the linear combination of the variables that maximises the ratio of
the between-sample SS to the within-sample SS. This ratio is lambda, and the largest such ratio
is the maximum latent root (eigenvalue). A large value indicates a significant difference between
the group means.
Pillai's trace is also based on these eigenvalues; a large value indicates a significant
difference between the group means.
The Lawley-Hotelling trace is likewise based on the eigenvalues; a large value indicates a
significant difference between the group means.
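All four statistics are functions of the eigenvalues of W^-1 B, where B is the between-group
and W the within-group sums-of-squares-and-cross-products (SSCP) matrix. A minimal R sketch,
assuming a response matrix Y and a grouping factor g (these names are not part of the original
solution):

fit <- manova(Y ~ g)                      # g assumed to be a factor
W   <- crossprod(residuals(fit))          # within-group SSCP
Tot <- crossprod(scale(Y, scale = FALSE)) # total SSCP about the overall means
B   <- Tot - W                            # between-group SSCP
ev  <- Re(eigen(solve(W) %*% B)$values)   # the eigenvalues (lambda_i)
wilks  <- prod(1 / (1 + ev))   # small value => groups differ
pillai <- sum(ev / (1 + ev))   # large value => groups differ
lawley <- sum(ev)              # Lawley-Hotelling trace; large => groups differ
roy    <- max(ev)              # Roy's largest root; large => groups differ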
Question 2:
How important are the assumptions of multivariate normality (MVN) and equal covariance
matrices to Hotelling's T2 and the four statistics commonly used in MANOVA?
Solution:
Chapter 4.2: Hotelling's T2 is the multivariate analogue of the two-sample t-test. Some
deviation from MVN is not too important, and moderate differences between the population
covariance matrices are acceptable.
All four MANOVA statistics assume MVN and equal covariance matrices, with Pillai's trace the
most robust to deviations from these assumptions. All four tests are also fairly robust to
unequal (unbalanced) sample sizes.
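The equal-covariance assumption can be checked formally with Box's M test, bearing in mind
that Box's M is itself sensitive to non-normality. A hedged sketch, assuming the heplots
package is installed (it is not used elsewhere in this tutorial) and using the Y matrix and
Group variable defined in Question 5 below:

library(heplots)             # provides boxM()
boxM(Y, as.factor(Group))    # small p-value suggests unequal covariance matrices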
Question 3:
When testing for normality, is it possible to have non-significant univariate results and
significant multivariate results? Why?
Solution:
Chapter 4.4: Yes. The accumulation of smallish deviations from normality across many variables
can produce a significant deviation from MVN overall, even when no single variable shows a
significant departure from univariate normality.
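A base-R illustration of the two levels of checking, assuming a response matrix Y (in a grouped
design these checks would usually be applied to the within-group residuals rather than to Y
itself):

# univariate checks: Shapiro-Wilk test on each variable separately
apply(Y, 2, function(x) shapiro.test(x)$p.value)
# multivariate check: under MVN the squared Mahalanobis distances follow,
# approximately, a chi-squared distribution with p = ncol(Y) degrees of freedom
d2 <- mahalanobis(Y, colMeans(Y), cov(Y))
qqplot(qchisq(ppoints(nrow(Y)), df = ncol(Y)), d2,
       xlab = "Chi-squared quantiles", ylab = "ordered Mahalanobis distances")
abline(0, 1)   # points well off this line indicate departure from MVN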
Question 4:
How does multiple testing affect Type I error rates? Explain how the Bonferroni correction
for multiple testing works.
Solution:
Chapter 4.4: Multiple testing increases the chance of a Type I error, i.e. rejecting H0 when
the samples are not really from different populations. A single multivariate test such as
Hotelling's T2 therefore has an advantage over a series of univariate tests. The Bonferroni
correction controls the overall (family-wise) Type I error rate: with m tests, each individual
test is carried out at significance level alpha/m (equivalently, each p-value is multiplied by
m), so the chance of at least one false rejection across all m tests is kept at or below alpha.
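A small numerical illustration using base R (the p-values here are hypothetical, chosen only to
show how the adjustment works):

p <- c(0.030, 0.004, 0.200, 0.011)   # unadjusted p-values from m = 4 tests
p.adjust(p, method = "bonferroni")   # each p multiplied by 4 (capped at 1):
                                     # 0.120 0.016 0.800 0.044
# equivalently, compare the unadjusted p-values with alpha/m = 0.05/4 = 0.0125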
Question 5:
Complete the exercise at the end of Chapter 4 of Manly. The data file ‘mandiblefull.dat’ is
available on the Study Desk. Some R code to get you started is provided in ‘mandible
MANOVA.R’
The variable names and codes are as follows:
X1 – length of mandible
X2 – breadth of mandible below 1st molar
X3 – breadth of articular condyle
X4 – height of mandible below 1st molar
X5 – length of 1st molar
X6 – breadth of 1st molar
X7 – length of 1st to 3rd molar inclusive (1st to 2nd for cuon)
X8 – length from 1st to 4th premolar inclusive
X9 – breadth of lower canine
Sex – Male (1), Female (2), Unknown (0)
Group – Thai (modern) dogs (1), golden jackals (2), cuons (3), Indian wolves (4),
Thai (prehistoric) dogs (5).
Solution:
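The transcript below assumes the data have already been read in and the columns attached. A
minimal sketch (the file is assumed to be whitespace-delimited with a header row matching the
variable names above):

mf <- read.table("mandiblefull.dat", header = TRUE)
attach(mf)   # so X1, ..., X9, Sex and Group can be referred to directly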
> Y<-cbind(X1, X2, X3, X4, X5, X6, X7, X8,X9)
> (cory<-round(cor(Y), digits=3))
X1 X2 X3 X4 X5 X6 X7 X8 X9
X1 1.000 0.826 0.855 0.798 0.907 0.852 0.759 0.949 0.883
X2 0.826 1.000 0.786 0.890 0.821 0.846 0.560 0.746 0.887
X3 0.855 0.786 1.000 0.769 0.779 0.720 0.478 0.727 0.748
X4 0.798 0.890 0.769 1.000 0.740 0.809 0.471 0.715 0.823
X5 0.907 0.821 0.779 0.740 1.000 0.854 0.742 0.878 0.883
X6 0.852 0.846 0.720 0.809 0.854 1.000 0.646 0.798 0.894
X7 0.759 0.560 0.478 0.471 0.742 0.646 1.000 0.787 0.648
X8 0.949 0.746 0.727 0.715 0.878 0.798 0.787 1.000 0.838
X9 0.883 0.887 0.748 0.823 0.883 0.894 0.648 0.838 1.000
> mf.manova1<-manova(Y ~ as.factor(Group), data=mf)
> summary(mf.manova1) #default test is Pillai’s
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 4 2.5892 13.662 36 268 < 2.2e-16 ***
Residuals 72

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(mf.manova1, test="Wilks")
Df Wilks approx F num Df den Df Pr(>F)
as.factor(Group) 4 0.0021936 27.666 36 241.57 < 2.2e-16 ***
Residuals 72

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(mf.manova1, test="Roy")
Df Roy approx F num Df den Df Pr(>F)
as.factor(Group) 4 16.348 121.7 9 67 < 2.2e-16 ***
Residuals 72

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(mf.manova1, test="Hotelling-Lawley")
Df Hotelling-Lawley approx F num Df den Df Pr(>F)
as.factor(Group) 4 25.129 43.627 36 250 < 2.2e-16 ***
Residuals 72

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The correlations between the variables range from moderate (the smallest is about 0.47) to very
high (>0.9). Some correlation among the responses is expected in MANOVA; however, some of these
highly correlated variables may be largely redundant and could be considered for removal from
the analysis.
Based on all four tests, there is a significant difference (p < 0.001) between at least two
species in the mean 'size' of the mandible, as measured by the nine variables.
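If you wanted to list the most strongly correlated pairs as candidates for removal, a quick
base-R sketch using the matrix cory computed above (the 0.9 threshold is arbitrary, chosen
only for illustration):

high <- which(abs(cory) > 0.9 & upper.tri(cory), arr.ind = TRUE)
data.frame(var1 = rownames(cory)[high[, 1]],
           var2 = colnames(cory)[high[, 2]],
           r    = cory[high])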
> #to subset and run individual comparisons in MANOVA:
> mf.manova2<-manova(Y ~ as.factor(Group), data=mf,
+ subset=as.factor(Group) %in% c("5", "1"))
> summary(mf.manova2)
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 1 0.83288 8.8603 9 16 0.0001013 ***
Residuals 24

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> ##To run Hotellings on only 2 groups
> library(DescTools)
> G51<-subset(mf, Group=="5"|Group=="1")
> (HotellingsT2Test(cbind(X1, X2, X3, X4, X5, X6, X7, X8,X9) ~ Group, data=G51))
Hotelling’s two sample T2-test
data: cbind(X1, X2, X3, X4, X5, X6, X7, X8, X9) by Group
T.2 = 8.8603, df1 = 9, df2 = 16, p-value = 0.0001013
alternative hypothesis: true location difference is not equal to c(0,0,0,0,0,0,0,0,0)
When comparing just the prehistoric dogs (5) and the modern Thai dogs (1), using either MANOVA
or Hotelling's T2, the two groups are significantly different in 'size' (p < 0.001). Notice the
difference in the degrees of freedom now that only two groups are being compared.
Remember: if you do multiple pairwise comparisons between species (Group), you need to control
the overall Type I error rate by adjusting the significance cut-off, e.g. with a Bonferroni
correction (divide alpha by the number of tests, or equivalently multiply each p-value by the
number of tests). See Manly Chapter 4.4.
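The four pairwise comparisons against group 5 that follow could also be run in a loop and the
p-values Bonferroni-adjusted directly. A sketch, assuming Y and mf as above and that the
summary()$stats matrix holds the Pillai table with the p-value in its "Pr(>F)" column:

pvals <- sapply(c("1", "2", "3", "4"), function(g) {
  fit <- manova(Y ~ as.factor(Group), data = mf,
                subset = as.factor(Group) %in% c("5", g))
  summary(fit)$stats[1, "Pr(>F)"]
})
p.adjust(pvals, method = "bonferroni")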
> mf.manova3<-manova(Y ~ as.factor(Group), data=mf,
+ subset=as.factor(Group) %in% c("5", "2"))
> summary(mf.manova3)
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 1 0.9137 23.527 9 20 9.423e-09 ***
Residuals 28

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> mf.manova4<-manova(Y ~ as.factor(Group), data=mf,
+ subset=as.factor(Group) %in% c("5", "3"))
> summary(mf.manova4)
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 1 0.97745 81.894 9 17 3.222e-12 ***
Residuals 25

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> mf.manova5<-manova(Y ~ as.factor(Group), data=mf,
+ subset=as.factor(Group) %in% c("5", "4"))
> summary(mf.manova5)
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 1 0.91706 17.198 9 14 4.213e-06 ***
Residuals 22

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The mean 'size', as described by the nine variables, differs significantly (p < 0.001) between
the prehistoric dogs and each of the other species, using alpha = 0.05/4 = 0.0125 after applying
a Bonferroni correction for multiple testing.
> mf_sex<-subset(mf, (Sex=="1"|Sex=="2") & Group<5)
> mf.manova6<-manova(cbind(X1, X2, X3, X4, X5, X6, X7, X8, X9) ~ as.factor(Group) *
as.factor(Sex), data=mf_sex)
> summary(mf.manova6)
Df Pillai approx F num Df den Df Pr(>F)
as.factor(Group) 3 2.32644 20.3400 27 159 < 2.2e-16 ***
as.factor(Sex) 1 0.38789 3.5909 9 51 0.001597 **
as.factor(Group):as.factor(Sex) 3 0.58485 1.4261 27 159 0.093163 .
Residuals 59

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The interaction between Sex and Group (species) is significant only at the 10% level (p < 0.1).
A significant interaction would mean that the effect of sex on mean 'size' is not the same in
every species: within at least one of the four species there is a difference between the sexes,
and the size of that difference varies between species. This could be explored further with
individual tests by species (see the sketch below). From the test above, both Group (p < 0.001)
and Sex (p < 0.01) are significant factors in describing differences in mean 'size'. When an
interaction is significant it is generally unwise to interpret the individual main effects
separately, because the effect of one factor then depends on the level of the other, and
interpreting them independently could be misleading. However, in this case the evidence for an
interaction is not very convincing (significant only at the 10% level), so I would consider the
effects of Group and Sex separately.
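A sketch of the suggested follow-up, testing the effect of Sex within each species separately
(assuming mf_sex as defined above; with four tests a Bonferroni cut-off of 0.05/4 would apply):

for (g in c("1", "2", "3", "4")) {
  fit <- manova(cbind(X1, X2, X3, X4, X5, X6, X7, X8, X9) ~ as.factor(Sex),
                data = mf_sex, subset = Group == g)
  print(summary(fit))
}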
> G1<-subset(mf, Group=="1", select=c(X1, X2, X3, X4, X5, X6, X7, X8, X9))
> G5<-subset(mf, Group=="5", select=c(X1, X2, X3, X4, X5, X6, X7, X8, X9))
> par(mfrow=c(1,2))
> boxplot(G1, xlab="size variables for modern dogs (1)", ylab="size (mm)")
> boxplot(G5, xlab="size variables for prehistoric dogs (5)", ylab="size (mm)")