[Figure 3 panels: boxplots of Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width by Species (setosa, versicolor, virginica).]

Figure 3: Boxplots for the response variables in the iris data set classified by species.

> par(mfrow=c(2, 2))
> for (response in c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"))
+     Boxplot(iris[, response] ~ Species, data=iris, ylab=response)

As the photographs suggest, the scatterplot matrix and boxplots for the measurements reveal that versicolor and virginica are more similar to each other than either is to setosa. Further, the ellipses in the scatterplot matrix suggest that the assumption of constant within-group covariance matrices is problematic: while the shapes and sizes of the concentration ellipses for versicolor and virginica are reasonably similar, the shapes and sizes of the ellipses for setosa are different from the other two.

We proceed nevertheless to fit a multivariate one-way ANOVA model to the iris data:

> mod.iris <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
+     ~ Species, data=iris)
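The concern about unequal within-group covariance matrices can also be examined numerically. The following is a minimal base-R sketch, not part of the original analysis; the names responses, cov.by.species, within.var, and gen.var are introduced here purely for illustration, and the determinant is used only as a rough one-number summary of the size of each concentration ellipsoid.

```r
# Illustrative sketch (base R only): within-group covariance matrices by species.
responses <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
cov.by.species <- lapply(split(iris[, responses], iris$Species), cov)

# Within-group variances (diagonals), one column per species.
within.var <- sapply(cov.by.species, diag)
print(round(within.var, 3))

# Generalized variance (determinant of each covariance matrix), a rough
# measure of the overall size of each group's concentration ellipsoid.
gen.var <- sapply(cov.by.species, det)
print(signif(gen.var, 3))
```

Comparing the columns of within.var, and the three entries of gen.var, gives a numerical counterpart to the visual comparison of the ellipses; it is a descriptive check, not a formal test of homogeneity.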
> class(mod.iris)
[1] "mlm" "lm"

> mod.iris

Call:
lm(formula = cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~
    Species, data = iris)

Coefficients:
                   Sepal.Length  Sepal.Width  Petal.Length  Petal.Width
(Intercept)         5.006         3.428        1.462         0.246
Speciesversicolor   0.930        -0.658        2.798         1.080
Speciesvirginica    1.582        -0.454        4.090         1.780

> summary(mod.iris)
Response Sepal.Length :

Call:
lm(formula = Sepal.Length ~ Species, data = iris)

Residuals:
   Min     1Q Median     3Q    Max
-1.688 -0.329 -0.006  0.312  1.312

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)         5.0060     0.0728   68.76  < 2e-16
Speciesversicolor   0.9300     0.1030    9.03  8.8e-16
Speciesvirginica    1.5820     0.1030   15.37  < 2e-16

Residual standard error: 0.515 on 147 degrees of freedom
Multiple R-squared: 0.619,  Adjusted R-squared: 0.614
F-statistic: 119 on 2 and 147 DF,  p-value: <2e-16

Response Sepal.Width :

Call:
lm(formula = Sepal.Width ~ Species, data = iris)

Residuals:
   Min     1Q Median     3Q    Max
-1.128 -0.228  0.026  0.226  0.972

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)         3.4280     0.0480   71.36  < 2e-16
Speciesversicolor  -0.6580     0.0679   -9.69  < 2e-16
Speciesvirginica   -0.4540     0.0679   -6.68  4.5e-10

Residual standard error: 0.34 on 147 degrees of freedom
Multiple R-squared: 0.401,  Adjusted R-squared: 0.393
F-statistic: 49.2 on 2 and 147 DF,  p-value: <2e-16

Response Petal.Length :

Call:
lm(formula = Petal.Length ~ Species, data = iris)

Residuals:
   Min     1Q Median     3Q    Max
-1.260 -0.258  0.038  0.240  1.348

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)         1.4620     0.0609    24.0   <2e-16
Speciesversicolor   2.7980     0.0861    32.5   <2e-16
Speciesvirginica    4.0900     0.0861    47.5   <2e-16

Residual standard error: 0.43 on 147 degrees of freedom
Multiple R-squared: 0.941,  Adjusted R-squared: 0.941
F-statistic: 1.18e+03 on 2 and 147 DF,  p-value: <2e-16

Response Petal.Width :

Call:
lm(formula = Petal.Width ~ Species, data = iris)

Residuals:
   Min     1Q Median     3Q    Max
-0.626 -0.126 -0.026  0.154  0.474

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)         0.2460     0.0289     8.5    2e-14
Speciesversicolor   1.0800     0.0409    26.4   <2e-16
Speciesvirginica    1.7800     0.0409    43.5   <2e-16

Residual standard error: 0.205 on 147 degrees of freedom
Multiple R-squared: 0.929,  Adjusted R-squared: 0.928
F-statistic: 960 on 2 and 147 DF,  p-value: <2e-16
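Each block of the summary above has the form of an ordinary univariate lm summary, one per response. A minimal base-R sketch, assuming only the iris data and the model formula used here, checks that the coefficient matrix of the multivariate fit agrees with four separate least-squares fits. The names responses, separate, and max.diff are illustrative and do not appear in the original analysis.

```r
# Illustrative sketch: compare the multivariate fit with separate univariate fits.
responses <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Multivariate linear model, as fit in the text.
mod.iris <- lm(cbind(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) ~ Species,
               data=iris)

# One univariate regression per response; coefficients collected column by column.
separate <- sapply(responses, function(response)
    coef(lm(reformulate("Species", response=response), data=iris)))

# Largest absolute discrepancy between the two coefficient matrices.
max.diff <- max(abs(coef(mod.iris) - separate))
print(max.diff)
```

A value of max.diff that is zero up to floating-point error indicates that the two sets of coefficients coincide. What the multivariate fit adds is the joint residual covariance structure used by the multivariate tests.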
The lm function returns an S3 object of class c("mlm", "lm"). The printed representation of the object simply shows the estimated regression coefficients for each response, and the model summary is the same as we would obtain by performing separate least-squares regressions for the four responses.

We use the Anova function in the car package to test the null hypothesis that the four response means are identical across the three species of irises:[3]

> (manova.iris <- Anova(mod.iris))

Type II MANOVA Tests: Pillai test statistic
        Df test stat approx F num Df den Df Pr(>F)
Species  2      1.19     53.5      8    290 <2e-16

> class(manova.iris)
[1] "Anova.mlm"

> summary(manova.iris)

Type II MANOVA Tests:

Sum of squares and products for error:
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length       38.956      13.630       24.625       5.645
Sepal.Width        13.630      16.962        8.121       4.808
Petal.Length       24.625       8.121       27.223       6.272
Petal.Width         5.645       4.808        6.272       6.157

------------------------------------------

Term: Species

Sum of squares and products for the hypothesis:
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length        63.21      -19.95       165.25       71.28
Sepal.Width        -19.95       11.34       -57.24      -22.93
Petal.Length       165.25      -57.24       437.10      186.77
Petal.Width         71.28      -22.93       186.77       80.41

Multivariate Tests: Species
                 Df test stat approx F num Df den Df Pr(>F)
Pillai            2      1.19     53.5      8    290 <2e-16
Wilks             2      0.02    199.1      8    288 <2e-16
Hotelling-Lawley  2     32.48    580.5      8    286 <2e-16
Roy               2     32.19   1167.0      4    145 <2e-16

[3] The Manova function in the car package is equivalent to Anova applied to a multivariate linear model.

The Anova function returns an object of class "Anova.mlm" which, when printed, produces a multivariate-analysis-of-variance (“MANOVA”) table, by default reporting Pillai’s test statistic;
summarizing the object produces a more complete report. The object returned by Anova may also be used in further computations, for example, for displays such as HE plots (Friendly, 2007; Fox et al., 2009; Friendly, 2010). Because there is only one term (beyond the regression constant) on the right-hand side of the model, in this example the type-II test produced by default by Anova is the same as the sequential test produced by the standard R anova function:

> anova(mod.iris)
Analysis of Variance Table

             Df Pillai approx F num Df den Df Pr(>F)
(Intercept)   1  0.993     5204      4    144 <2e-16
Species       2  1.192       53      8    290 <2e-16
Residuals   147

The null hypothesis is soundly rejected.

The linearHypothesis function in the car package may be used to test more specific hypotheses about the parameters in the multivariate linear model. For example, to test for differences between setosa and the average of versicolor and virginica, and for differences between versicolor and virginica:

> linearHypothesis(mod.iris, "0.5*Speciesversicolor + 0.5*Speciesvirginica",
+     verbose=TRUE)

Hypothesis matrix:
                                             (Intercept) Speciesversicolor
0.5*Speciesversicolor + 0.5*Speciesvirginica           0               0.5
                                             Speciesvirginica
0.5*Speciesversicolor + 0.5*Speciesvirginica              0.5

Right-hand-side matrix:
                                             Sepal.Length Sepal.Width
0.5*Speciesversicolor + 0.5*Speciesvirginica            0           0
                                             Petal.Length Petal.Width
0.5*Speciesversicolor + 0.5*Speciesvirginica            0           0

Estimated linear function (hypothesis.matrix %*% coef - rhs):
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
       1.256       -0.556        3.444        1.430

Sum of squares and products for the hypothesis:
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length        52.58      -23.28       144.19       59.87
Sepal.Width        -23.28       10.30       -63.83      -26.50
Petal.Length       144.19      -63.83       395.37      164.16
Petal.Width         59.87      -26.50       164.16       68.16

Sum of squares and products for error: