REVIEW & INTERPRETATION

GGE Biplot vs. AMMI Analysis of Genotype-by-Environment Data

Weikai Yan,* Manjit S. Kang, Baoluo Ma, Sheila Woods, and Paul L. Cornelius

Published in Crop Sci 47:643–655 (2007). doi: 10.2135/cropsci2006.06.0374 © Crop Science Society of America, 677 S. Segoe Rd., Madison, WI 53711 USA. CROP SCIENCE, VOL. 47, MARCH–APRIL 2007. Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.

W. Yan and B.L. Ma, Eastern Cereal and Oilseed Research Centre (ECORC), Agric. and Agri-Food Canada (AAFC), 960 Carling Ave., Ottawa, ON, Canada, K1A 0C6; M.S. Kang, Dep. of Agronomy & Environ. Mgmt., Louisiana State Univ. Agric. Center, Baton Rouge, LA 70803-2110; S. Woods, Cereal Research Center (CRC), AAFC, 195 Dafoe Road, Winnipeg, MB, Canada, R3T 2M9; P.L. Cornelius, Dep. of Plant and Soil Sciences and Dep. of Statistics, Univ. of Kentucky, Lexington, KY 40506. ECORC contribution number: 06-688. Received 9 June 2006. *Corresponding author (yanw@agr.gc.ca).

Abbreviations: AEC, average environment coordination; AMMI, Additive Main Effect and Multiplicative Interaction; G, genotype main effect; GE, genotype-by-environment interaction; GED, genotype-by-environment data (for a single trait); GGE, genotype main effect plus genotype-by-environment interaction; IPC, interaction principal component; MET, multi-environment trials; NID, normally and independently distributed; PC, principal component; SREG, Sites (Environments) Regression model; SVD, singular value decomposition; SVP, singular value partitioning.

ABSTRACT

The use of genotype main effect (G) plus genotype-by-environment (GE) interaction (G+GE) biplot analysis by plant breeders and other agricultural researchers has increased dramatically during the past 5 yr for analyzing multi-environment trial (MET) data. Recently, however, its legitimacy was questioned by a proponent of Additive Main Effect and Multiplicative Interaction (AMMI) analysis. The objectives of this review are: (i) to compare GGE biplot analysis and AMMI analysis on three aspects of genotype-by-environment data (GED) analysis, namely mega-environment analysis, genotype evaluation, and test-environment evaluation; (ii) to discuss whether G and GE should be combined or separated in these three aspects of GED analysis; and (iii) to discuss the role and importance of model diagnosis in biplot analysis of GED. Our main conclusions are: (i) both GGE biplot analysis and AMMI analysis combine rather than separate G and GE in mega-environment analysis and genotype evaluation, (ii) the GGE biplot is superior to the AMMI1 graph in mega-environment analysis and genotype evaluation because it explains more G+GE and has the inner-product property of the biplot, (iii) the discriminating power vs. representativeness view of the GGE biplot is effective in evaluating test environments, which is not possible in AMMI analysis, and (iv) model diagnosis for each dataset is useful, but accuracy gain from model diagnosis should not be overstated.

Plant breeders and geneticists, as well as statisticians, have a long-standing interest in investigating and integrating G and GE in selecting superior genotypes in crop performance trials (Barah et al., 1981; Kang, 1988, 1993; Eskridge, 1990; Kang and Pham, 1991; Hühn, 1996; Yan et al., 2000). Many statistical methods have been developed for GED analysis, including AMMI analysis (Gauch, 1992) and GGE biplot analysis (Yan and Kang, 2003; Yan and Tinker, 2006).

The biplot (Gabriel, 1971) has become a popular data visualization tool in many scientific research areas, including psychology, medicine, business, sociology, ecology, and agricultural sciences. Earlier uses of biplots in GED analyses include Bradu and Gabriel (1978), Kempton (1984), and Cooper and DeLacy (1994). The biplot tool has become increasingly popular among plant breeders and agricultural researchers since its use in cultivar evaluation and mega-environment investigation (Yan et al.,
2000). Yan et al. (2000) referred to biplots based on singular value decomposition (SVD) of environment-centered or within-environment standardized GED as “GGE biplots,” because these biplots display both G and GE, which are the two sources of variation that are relevant to cultivar evaluation (Kang, 1988, 1993; Gauch and Zobel, 1996; Yan and Kang, 2003).

The commonly used GGE biplot is based on the Sites Regression (SREG) linear-bilinear (multiplicative) model (Cornelius et al., 1996), which can be written as

$$\bar{y}_{ij} - \mu_j = \sum_{k=1}^{t} \lambda_k \alpha_{ik} \gamma_{jk} + \varepsilon_{ij} \qquad [1]$$

where $\bar{y}_{ij}$ is the cell mean of genotype $i$ in environment $j$; $\mu_j$ is the mean value in environment $j$; $i = 1, \ldots, g$; $j = 1, \ldots, e$, $g$ and $e$ being the numbers of cultivars and environments, respectively; and $t$ is the number of principal components (PC) used or retained in the model, with $t \le \min(e, g - 1)$. The model is subject to the constraint $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_t \ge 0$ and to orthonormality constraints on the $\alpha_{ik}$ scores, that is, $\sum_{i=1}^{g} \alpha_{ik} \alpha_{ik'} = 1$ if $k = k'$ and $\sum_{i=1}^{g} \alpha_{ik} \alpha_{ik'} = 0$ if $k \ne k'$, with similar constraints on the $\gamma_{jk}$ scores [defined by replacing symbols $(i, g, \alpha)$ with $(j, e, \gamma)$]. The $\varepsilon_{ij}$ are assumed $\mathrm{NID}(0, \sigma^2 / r)$, where $r$ is the number of replications within an environment.

Least squares solution for $\mu_j$ is the empirical mean ($\bar{y}_{.j}$) for the $j$th environment, and the least squares solutions for parameters in the term $\lambda_k \alpha_{ik} \gamma_{jk}$ (for $i = 1, \ldots, g$; $j = 1, \ldots, e$) are obtained from the $k$th PC of the SVD of the matrix $Z = [z_{ij}]$, where $z_{ij} = \bar{y}_{ij} - \bar{y}_{.j}$. The maximum number of PCs available for estimating the model parameters is $p = \mathrm{Rank}(Z)$. In general, $p \le \min(e, g - 1)$, with equality holding in most cases.
For $k = 1, 2, 3, \ldots$, $\alpha_{ik}$ and $\gamma_{jk}$ have also been characterized as primary, secondary, tertiary, etc., multiplicative effects of the $i$th cultivar and $j$th environment (for first usage of such terminology in a multiplicative model context, see Seyedsadr and Cornelius, 1992). Thus, Eq. [1] may be described as modeling the deviations of the cell means from the environment means as a sum of PCs, each of which is the product of a cultivar score ($\alpha_{ik}$), an environment score ($\gamma_{jk}$), and a scale factor (the singular value, $\lambda_k$).

The GGE biplot is constructed from the first two PCs from the SVD of $Z$ with “markers,” one for each cultivar, plotted with $\hat{\lambda}_1^{f} \hat{\alpha}_{i1}$ as abscissa and $\hat{\lambda}_2^{f} \hat{\alpha}_{i2}$ as ordinate. Similarly, markers for environments are plotted with $\hat{\lambda}_1^{1-f} \hat{\gamma}_{j1}$ as abscissa and $\hat{\lambda}_2^{1-f} \hat{\gamma}_{j2}$ as ordinate. The exponent $f$, with $0 \le f \le 1$, is used to rescale the cultivar and environment scores to enhance visual interpretation of the biplot for a particular purpose. Specifically, singular values are allocated entirely to cultivar scores if $f = 1$ [this is “cultivar-focused” scaling (Yan, 2002)], or entirely to environment scores if $f = 0$ (“environment-focused” scaling); and $f = 0.5$ will allocate the square roots of the $\hat{\lambda}_k$ values to cultivar scores and also to environment scores (“symmetric” scaling).

Mathematically, a GGE biplot is a graphical representation of the rank 2 least squares approximation of the rank $p$ matrix $Z$. This representation is unique except for possible simultaneous sign changes on all $\hat{\alpha}_{i1}$ and $\hat{\gamma}_{j1}$ and/or all $\hat{\alpha}_{i2}$ and $\hat{\gamma}_{j2}$. An important property of the biplot is that the rank 2 approximation of any entry in the original matrix $Z$ can be computed by taking the inner product of the corresponding genotype and environment vectors, i.e.,

$$\left(\hat{\lambda}_1^{f} \hat{\alpha}_{i1},\ \hat{\lambda}_2^{f} \hat{\alpha}_{i2}\right) \left(\hat{\lambda}_1^{1-f} \hat{\gamma}_{j1},\ \hat{\lambda}_2^{1-f} \hat{\gamma}_{j2}\right)' = \hat{\lambda}_1 \hat{\alpha}_{i1} \hat{\gamma}_{j1} + \hat{\lambda}_2 \hat{\alpha}_{i2} \hat{\gamma}_{j2}.$$

This is known as the inner-product property of the biplot.
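The construction described above can be sketched numerically. The following is a minimal illustration, not the GGE biplot software itself: it assumes NumPy, a complete genotype-by-environment table of cell means, and a hypothetical helper name `gge_biplot_scores`; the demonstration data are synthetic and are not taken from Table 1. It centers the data by environment, takes the SVD, partitions the singular values with the exponent f, and checks that the inner product of the markers reproduces the rank 2 approximation of Z whatever the value of f.

```python
import numpy as np


def gge_biplot_scores(y, f=0.5):
    """Return genotype markers, environment markers, and singular values.

    y : (g, e) array of cell means (genotypes in rows, environments in columns).
    f : singular value partitioning exponent, 0 <= f <= 1
        (1 = cultivar-focused, 0 = environment-focused, 0.5 = symmetric).
    """
    y = np.asarray(y, dtype=float)
    z = y - y.mean(axis=0, keepdims=True)        # z_ij = ybar_ij - ybar_.j
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    geno = u[:, :2] * s[:2] ** f                 # lambda_k^f * alpha_ik
    env = vt[:2, :].T * s[:2] ** (1.0 - f)       # lambda_k^(1-f) * gamma_jk
    return geno, env, s


# Synthetic demonstration data (hypothetical yields, 6 genotypes x 4 environments).
rng = np.random.default_rng(0)
y_demo = rng.normal(loc=4.0, scale=0.8, size=(6, 4))

# Rank 2 least squares approximation of Z, computed directly from the SVD.
z_demo = y_demo - y_demo.mean(axis=0)
u_demo, s_demo, vt_demo = np.linalg.svd(z_demo, full_matrices=False)
rank2 = (u_demo[:, :2] * s_demo[:2]) @ vt_demo[:2, :]

# Inner-product property: markers reproduce the rank 2 fit for any f.
geno, env, s = gge_biplot_scores(y_demo, f=1.0)
fitted = geno @ env.T

# Share of the G+GE sum of squares displayed by the first two PCs.
explained = float((s[:2] ** 2).sum() / (s ** 2).sum())
```

Changing f moves singular values between the two sets of markers, so it alters the picture but not the fitted values; that is why the choice of scaling can follow the question being asked.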
The GGE biplot methodology (Yan et al., 2000; Yan, 2001, 2002; Yan and Kang, 2003; Yan and Tinker, 2006) consists of a set of biplot interpretation methods, whereby important questions regarding genotype evaluation and test-environment evaluation can be visually addressed. Increasingly, plant breeders and other agronomists have found GGE biplots useful in mega-environment analysis (Yan and Rajcan, 2002; Casanoves et al., 2005; Samonte et al., 2005; Yan and Tinker, 2005b; Dardanellia et al., 2006), genotype evaluation (Bhan et al., 2005; Malvar et al., 2005; Voltas et al., 2005; Kang et al., 2006), test-environment evaluation (Yan and Rajcan, 2002; Blanche and Myers, 2006; Thomason and Phillips, 2006), trait-association and trait-profile analyses (Yan and Rajcan, 2002; Morris et al., 2004; Ober et al., 2005), and heterotic pattern analysis (Yan and Hunt, 2002; Narro et al., 2003; Andio et al., 2004; Bertoia et al., 2006).

The legitimacy of GGE biplot analysis was, however, recently questioned by Gauch (2006), who concluded that, for GED analyses, AMMI analysis was either superior or equal to GGE biplot analysis. The objectives of this review and interpretation paper are: (i) to compare GGE biplot analysis and AMMI analysis on three aspects of GED analysis, namely, mega-environment analysis, genotype evaluation, and test-environment evaluation; (ii) to discuss whether G and GE should be combined or separated in GED analysis; and (iii) to discuss the importance of model diagnosis in SVD-based analysis of GED. This discussion should enhance agricultural researchers’ understanding of biplot analysis of GED.

THREE ASPECTS OF GED ANALYSIS USING GGE BIPLOTS

The analysis of GED (i.e., MET data for a single trait) should include three major aspects: (i) mega-environment analysis; (ii) test-environment evaluation; and (iii) genotype evaluation (Yan and Kang, 2003). We use the yield data of 18 winter wheat (Triticum aestivum L.) genotypes (G1 to G18) tested at nine Ontario locations (E1 to E9) (Table 1) as an example to illustrate the three aspects of biplot analysis. The same dataset was used extensively in Yan and Kang (2003) and Yan and Tinker (2006). When
supplemental information (e.g., data on environmental or genotypic covariates) is available, a fourth aspect, which is to understand the causes of G and GE, can be included (Yan and Hunt, 2001; Yan and Kang, 2003; Yan and Tinker, 2005b, 2006).

Mega-environment Analysis

A GGE biplot is constructed by plotting the first principal component (PC1) scores of the genotypes and the environments against their respective scores for the second principal component (PC2) that result from SVD of environment-centered or environment-standardized GED.

The “which-won-where” view of the GGE biplot (Yan et al., 2000) is an effective visual tool in mega-environment analysis. It consists of an irregular polygon and a set of lines drawn from the biplot origin and intersecting each of the sides at right angles. The vertices of the polygon are the genotype markers located farthest away from the biplot origin in various directions, such that all genotype markers are contained within the resulting polygon. A line that starts from the biplot origin and perpendicularly intersects a polygon side represents the set of hypothetical environments in which the two cultivars defining that side perform equally; the relative ranking of the two cultivars would be reversed in environments on opposite sides of the line (the so-called “crossover GE”). Therefore, the perpendicular lines to the polygon sides divide the biplot into sectors, each having its own winning cultivar. The winning cultivar for a sector is the vertex cultivar at the intersection of the two polygon sides whose perpendicular lines form the boundary of that sector; it is positioned usually, but not necessarily, within its winning sector (see Yan, 2002 for a detailed example).
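The geometry of the polygon and its sectors can be illustrated with a short sketch. This is an assumption-laden illustration rather than the published procedure: the function names `convex_hull_indices` and `which_won_where` are hypothetical, the data are synthetic, and the polygon is taken to be the convex hull of the genotype markers. Because the fitted value for a genotype in an environment is the inner product of their markers, the winner in any environment is found by maximizing that inner product, and the maximizer is always a polygon vertex.

```python
import numpy as np


def convex_hull_indices(points):
    """Indices of the polygon vertices (Andrew's monotone chain, counter-clockwise).

    Assumes at least three markers that are not all on one line.
    """
    pts = np.asarray(points, dtype=float)
    order = sorted(range(len(pts)), key=lambda i: (pts[i, 0], pts[i, 1]))

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return ((pts[a, 0] - pts[o, 0]) * (pts[b, 1] - pts[o, 1])
                - (pts[a, 1] - pts[o, 1]) * (pts[b, 0] - pts[o, 0]))

    def half_hull(seq):
        chain = []
        for i in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], i) <= 0:
                chain.pop()
            chain.append(i)
        return chain[:-1]          # last point is the first point of the other half

    return half_hull(order) + half_hull(order[::-1])


def which_won_where(geno, env):
    """Index of the winning genotype in each environment (rank 2 approximation)."""
    fitted = np.asarray(geno) @ np.asarray(env).T     # inner products, shape (g, e)
    return fitted.argmax(axis=0)


# Synthetic demonstration data (hypothetical yields, 8 genotypes x 5 environments).
rng = np.random.default_rng(1)
y_demo = rng.normal(loc=4.0, scale=0.8, size=(8, 5))
z = y_demo - y_demo.mean(axis=0)
u, s, vt = np.linalg.svd(z, full_matrices=False)
geno = u[:, :2] * np.sqrt(s[:2])                      # symmetric scaling, f = 0.5
env = vt[:2, :].T * np.sqrt(s[:2])

vertices = convex_hull_indices(geno)                  # polygon vertices
winners = which_won_where(geno, env)                  # one winner per environment
```

Environments that share a winner lie in the same sector, so grouping the entries of `winners` gives the environment groups that the polygon view displays graphically.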
If all environment markers fall into a single sector, this indicates that, to a rank-two approximation, a single cultivar had the highest yield in all environments. If environment markers fall into different sectors, this indicates that different cultivars won in different sectors. Revealing the which-won-where pattern of a GED set is an intrinsic property of the GGE biplot rendered by the inner-product property of the biplot (Yan and Kang, 2003). Once a GGE biplot is constructed, the polygon and the lines that divide the biplot into sectors can be drawn by hand without further calculation.

In the which-won-where view of the GGE biplot (Fig. 1) based on the data in Table 1, the nine environments fell into two sectors with different winning cultivars. Specifically, G18 was the highest yielding cultivar in E5 and E7 (but only slightly higher than several other cultivars with markers in close proximity to G18), and G8 was the highest yielding cultivar in the other environments. This crossover GE suggests that the target environments may be divided into different mega-environments.

Since a mega-environment is defined as a group of locations that consistently share the best set of genotypes or cultivars across years (Yan and Rajcan, 2002), data from multiple years are essential to decide whether or not the target region can be divided into different mega-environments. Furthermore, a definitive conclusion must be based on data in which the same (sub-)set of genotypes is tested at the same (sub-)set of test locations across multiple years. Repeatable environment grouping is necessary, but not sufficient, for declaring different mega-environments. For example, even if the target environments can be subdivided into Group 1 and Group 2 repeatedly across years, the target environment still may not be meaningfully divided if cultivar A and B win in Groups 1 and 2, respectively, in 1 yr, but the which-won-where pattern is reversed in another year.
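The distinction between a repeatable environment grouping and a repeatable which-won-where pattern can be made concrete with a toy sketch. Everything in it is hypothetical: the location labels, the cultivar labels A and B, and the helper names are invented for illustration, and reducing each location to a single winner is a simplification of "the best set of genotypes."

```python
def grouping(winners_by_location):
    """Partition of locations implied by the winners, ignoring which cultivar won."""
    groups = {}
    for location, cultivar in winners_by_location.items():
        groups.setdefault(cultivar, set()).add(location)
    return {frozenset(members) for members in groups.values()}


def same_grouping(year_a, year_b):
    """True if the locations split into the same groups in both years."""
    return grouping(year_a) == grouping(year_b)


def same_which_won_where(year_a, year_b):
    """True if every location has the same winner in both years."""
    return year_a == year_b


# Hypothetical winners at four locations in two years.
year1 = {"Loc1": "A", "Loc2": "A", "Loc3": "B", "Loc4": "B"}
year2 = {"Loc1": "B", "Loc2": "B", "Loc3": "A", "Loc4": "A"}

grouping_repeats = same_grouping(year1, year2)          # groups are identical
pattern_repeats = same_which_won_where(year1, year2)    # winners are reversed
```

In this toy case the two groups of locations recur, yet selecting A for Group 1 on the basis of the first year would have been wrong in the second, which is the situation the text describes as not meaningfully divisible.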
Table 1. Mean yield (Mg ha−1) of 18 winter wheat cultivars (G1 to G18) tested at nine Ontario locations (E1 to E9) in 1993.

Genotype  E1    E2    E3    E4    E5    E6    E7    E8    E9    Mean
G1        4.46  4.15  2.85  3.08  5.94  4.45  4.35  4.04  2.67  4.00
G2        4.42  4.77  2.91  3.51  5.70  5.15  4.96  4.39  2.94  4.31
G3        4.67  4.58  3.10  3.46  6.07  5.03  4.73  3.90  2.62  4.24
G4        4.73  4.75  3.38  3.90  6.22  5.34  4.23  4.89  3.45  4.54
G5        4.39  4.60  3.51  3.85  5.77  5.42  5.15  4.10  2.83  4.40
G6        5.18  4.48  2.99  3.77  6.58  5.05  3.99  4.27  2.78  4.34
G7        3.38  4.18  2.74  3.16  5.34  4.27  4.16  4.06  2.03  3.70
G8        4.85  4.66  4.43  3.95  5.54  5.83  4.17  5.06  3.57  4.67
G9        5.04  4.74  3.51  3.44  5.96  4.86  4.98  4.51  2.86  4.43
G10       5.20  4.66  3.60  3.76  5.94  5.35  3.90  4.45  3.30  4.46
G11       4.29  4.53  2.76  3.42  6.14  5.25  4.86  4.14  3.15  4.28
G12       3.15  3.04  2.39  2.35  4.23  4.26  3.38  4.07  2.10  3.22
G13       4.10  3.88  2.30  3.72  4.56  5.15  2.60  4.96  2.89  3.80
G14       3.34  3.85  2.42  2.78  4.63  5.09  3.28  3.92  2.56  3.54
G15       4.38  4.70  3.66  3.59  6.19  5.14  3.93  4.21  2.93  4.30
G16       4.94  4.70  2.95  3.90  6.06  5.33  4.30  4.30  3.03  4.39
G17       3.79  4.97  3.38  3.35  4.77  5.30  4.32  4.86  3.38  4.24
G18       4.24  4.65  3.61  3.91  6.64  4.83  5.01  4.36  3.11  4.48
Mean      4.36  4.44  3.14  3.49  5.68  5.06  4.24  4.36  2.90  4.19

The necessary and sufficient condition for mega-environment division is a repeatable which-won-where pattern rather than merely a repeatable environment-grouping pattern (Yan and Rajcan, 2002; Yan and Kang, 2003). If the which-won-where or crossover patterns are repeatable across years and, hence, the target environment can be divided into subregions or mega-environments, as in the barley example given in Yan and Tinker (2005b), the GE that causes the crossovers among winning genotypes can be exploited by selecting in and for each mega-environment. If the crossover GE patterns are not repeatable across years, the GE cannot be exploited. Rather, it must be avoided by selecting high yielding and stable genotypes across target environments.

Appropriate mega-environment analysis should classify the target environment into one of three possible
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved. 646 WWW.CROPS.ORG CROP SCIENCE, VOL. 47, MARCH–APRIL 2007 types (Table 2). Type 1 is the easiest target environment one can hope for, but it is usually an overoptimistic expectation. Type 2 suggests opportunities for exploiting some of the GE. Such opportunities should not be overlooked if they exist, which is the whole point of mega-environment analysis and GE analysis. Type 3 is the most challenging target environment and, unfortunately, also the most common one. Genotype evaluation and test-environment evaluation become meaningful only after the mega-environment issue is addressed. Within a single mega-environment, cultivars should be evaluated for their mean performance and stability across environments (Fig. 2); and the test environments should be evaluated for being, or not being, representative of the target environment and for their power to discriminate among genotypes (Fig. 3). Genotype Evaluation Genotype evaluation is mean ingful only for a specifi c mega-environment, and an ideal geno type should have both high mean performance and high stability within a megaenvironment. Assuming that the mega-environment differentiation in Fig. 1 is repeatable across years, genotype evaluation should be conducted for each mega-environment. Figure 2 is the “Average Environment Coordination” (AEC) view (Yan, 2001) of the GGE biplot involving the seven environments in the G8 niche identifi ed in Fig. 1. This AEC view is based on genotype-focused singular value partitioning (SVP), that is, the singular values are entirely partitioned into the genotype scores (GGE biplot option “SVP = 1”) (Yan, 2002). This AEC view with SVP = 1 is also referred to as the “Mean vs. Stability” view because it facilitates genotype comparisons based on mean performance and stability across environments within a mega-environment. 
The axis of the AEC abscissa, or “average environment axis,” is the single-arrowed line that passes through the biplot origin and the “average environment,” which is at the center of the small circle with coordinates .1 .2 (, ) γ γ ˆ ˆ , i.e., means of environment PC1 and PC2 scores. The axis of the AEC ordinate is the double-arrowed line that passes through the biplot origin and is perpendicular to the AEC abscissa. Because of the inner-product property of the biplot, the projections of the genotype markers on the “average environment axis” are proportional to the rank-two approximation of the genotype means and represent the main eff ects of the genotypes, G. The arrow shown on the axis of the AEC abscissa points in the direction of higher mean performance of the genotypes and, consequently ranks the genotypes with respect to mean performance. Unless G is too small to be meaningful, the ranking of the genotypes on the AEC abscissa is always perfectly or highly correlated with G, the correlation being 1.0 for the current example. Thus, the genotypes are ranked according to G as follows: G8 > G4 = G10 > G5 = G9 = G15 = G16 = G17 = G18 > G6 > G2 > Mean = G11 > G3 > G13 > G1 > G14 > G7 > G12. Since GGE represents G+GE and since the AEC abscissa approximates the genotypes’ contributions to G, the AEC ordinate must approximate the genotypes’ contributions to GE, which is a measure of their stability or instability. Thus, G4 was the most stable genotype, as it was located almost on the AEC abscissa and had a nearzero projection onto the AEC ordinate. This indicates that its rank was highly consistent across environments within this mega-environment. In contrast, G17 and G6 were two of the least stable genotypes with above average mean performance. Yan (2001) defi ned an “ideal” genotype on the basis of both mean performance and stability, and the genoFigure 1. The “which-won-where” view of the GGE biplot based on the G × E data in Table 1. 
The data were not transformed (“Transform = 0”), not scaled (“Scaling = 0”), and were environmentcentered (“Centering = 2”). The biplot was based on environment-focused singular value partitioning (“SVP = 2”) and therefore is appropriate for visualizing the relationships among environments. It explained 78% of the total G+GE. The genotypes are labeled as G1 to G18 and the environments are labeled as E1 to E9
types can be ranked based on their biplot distance from the ideal genotype. Dimitrios Baxevanos (personal communication, 2006) found this GGE distance to be more repeatable across years than either mean performance or a stability index.

Test Environment Evaluation

The purpose of test-environment evaluation is to identify test environments that effectively identify superior genotypes for a mega-environment. An “ideal” test environment should be both discriminating of the genotypes and representative of the mega-environment. Figure 3 is the same GGE biplot as Fig. 2 except that it is based on environment-focused scaling (Yan, 2002), that is, the singular values were entirely partitioned into the environment scores (“SVP = 2”) so that it is appropriate for studying the relationships among test environments. This type of AEC can be referred to as the “Discriminating power vs. Representativeness” view of the GGE biplot. It can be helpful in evaluating each of the test environments with respect to the following questions:

1. Is the test environment capable of discriminating among the genotypes, i.e., does it provide much information about the differences among genotypes?
2. Is it representative of the mega-environment?
3. Does it provide unique information about the genotypes?

When the data are not scaled (or standardized) (“Scaling = 0”), the length of an environment vector is proportional to the standard deviation of cultivar means in the environment, which is a measure of the discriminating power of the environment, assuming that the experimental errors of the test environments are comparable. Test environments with longer vectors (like E1 in our example) are more discriminating of the genotypes.
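The link between vector length and discriminating power can be checked numerically. The following is a minimal sketch in Python with NumPy, not the GGEbiplot software itself: the function name `gge_environment_scores` and the simulated 18 × 9 yield matrix are assumptions made for illustration only. When all principal components are retained, the environment vector length equals the standard deviation of the genotype values in that environment times the constant sqrt(g − 1), where g is the number of genotypes; in a two-dimensional biplot the relation holds only approximately.

```python
import numpy as np

def gge_environment_scores(y, n_pc=2):
    """Environment scores for an unscaled, environment-centered GGE biplot (SVP = 2)."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean(axis=0)        # Centering = 2: subtract each environment mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_pc].T * s[:n_pc]        # SVP = 2: singular values assigned to environments

# Hypothetical data: 18 genotypes (rows) x 9 environments (columns),
# simulated so that environments differ in how strongly they separate genotypes.
rng = np.random.default_rng(0)
spread = np.linspace(0.5, 2.0, 9)
y = 5.0 + rng.normal(size=(18, 9)) * spread

sd = y.std(axis=0, ddof=1)               # SD of genotype values within each environment

# All 9 PCs retained: vector length is exactly sqrt(g - 1) * SD.
lengths_full = np.linalg.norm(gge_environment_scores(y, n_pc=9), axis=1)
ratio = lengths_full / sd

# Two PCs, as in a biplot: lengths can only be shorter than the full-rank lengths.
lengths_2d = np.linalg.norm(gge_environment_scores(y, n_pc=2), axis=1)
```

Comparing `lengths_2d` with `lengths_full` for a given environment indicates how much of that environment's variation the two-dimensional biplot fails to display.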
If a test environment marker falls close to the biplot origin, that is, if the test environment has a very short vector, it means that all genotypes performed similarly in it and therefore it provided little or no information about the genotype differences. A short vector could also mean that the environment is not well represented by PC1 and PC2 if the biplot does not explain most of the GGE of the data.

A second usage of Fig. 3 is to indicate the test environments’ representativeness of the mega-environment. Since the AEC abscissa is the “average-environment axis,” test environments that have small angles with it (e.g., E2, E3, E4, E6, and E9) are more representative of the mega-environment than those that have larger angles with it (e.g., E1 and E8). This follows from the fact that when SVP = 2, the cosine of the angle between any environment vector and the “average-environment axis” approximates the correlation coefficient between the genotype values in that environment and the genotype means across the environments.

Based on Fig. 3, a test environment may be classified into one of three types (Table 3). Type 1 environments have short vectors and provide little or no information

Figure 2. The “mean vs. stability” view of the GGE biplot based on a subset of the G × E data in Table 1. The data were not transformed (“Transform = 0”), not scaled (“Scaling = 0”), and were environment-centered (“Centering = 2”). The biplot was based on genotype-focused singular value partitioning (“SVP = 1”) and therefore is appropriate for visualizing the similarities among genotypes. It explained 79.5% of the total G+GE for the subset.

Table 2. Three types of target environment based on mega-environment analysis.

Repeatable across years, with crossover GE: Type 2: target environment consisting of multiple mega-environments. Strategy: select specifically adapted genotypes for each mega-environment. A single-year multilocation trial may be sufficient.

Repeatable across years, no crossover GE: Type 1: target environment consisting of a single, simple mega-environment. Strategy: testing at a single test location in a single year suffices to select a single best cultivar.

Not repeatable across years: Type 3: target environment consisting of a single but complex mega-environment. Strategy: select a set of cultivars for the whole region based on both mean performance and stability, using data from multiyear and multilocation tests.
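The angle-based reading of representativeness discussed earlier in this section can also be illustrated numerically. The sketch below, in Python with NumPy, uses simulated data and a hypothetical helper `cosines`; it is an illustration under those assumptions rather than the procedure of any particular software. It takes the average-environment axis to be the line from the biplot origin through the mean of the environment scores. With SVP = 2 and all principal components retained, the cosine equals the correlation exactly; with only PC1 and PC2 it is an approximation whose quality depends on how much G+GE the biplot explains.

```python
import numpy as np

def cosines(scores, axis_vec):
    """Cosine of the angle between each row of `scores` and `axis_vec`."""
    norms = np.linalg.norm(scores, axis=1) * np.linalg.norm(axis_vec)
    return scores @ axis_vec / norms

# Hypothetical data: 18 genotypes x 9 environments sharing a common genotype effect.
rng = np.random.default_rng(1)
genotype_effect = rng.normal(size=(18, 1))
y = 5.0 + genotype_effect + rng.normal(size=(18, 9))

centered = y - y.mean(axis=0)              # environment-centered (Centering = 2)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
env_scores = vt.T * s                      # SVP = 2, all PCs retained
avg_env = env_scores.mean(axis=0)          # marker of the "average environment"

# Correlation between genotype values in each environment and genotype means.
genotype_means = y.mean(axis=1)
corr = np.array([np.corrcoef(y[:, j], genotype_means)[0, 1]
                 for j in range(y.shape[1])])

cos_full = cosines(env_scores, avg_env)            # exact: equals corr
cos_2d = cosines(env_scores[:, :2], avg_env[:2])   # what a PC1-PC2 biplot displays
```

The gap between `cos_2d` and `corr` for a given environment shows how far the displayed angle departs from the actual correlation for that environment.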