A comprehensive overview of the internationalisation of correspondence analysis
Correspondence Analysis: Theory, Practice and New Strategies examines the key issues of correspondence analysis, and discusses the new advances that have been made over the last 20 years. The main focus of this book is to provide a comprehensive discussion of some of the key technical and practical aspects of correspondence analysis, and to demonstrate how they may be put to use. Particular attention is given to the history and mathematical links of the developments made. These links include not just the major contributions made by researchers in Europe (which is where much of the attention surrounding correspondence analysis has focused) but also the important contributions made by researchers in other parts of the world.
Key features include:
* A comprehensive international perspective on the key developments of correspondence analysis.
* Discussion of correspondence analysis for nominal and ordinal categorical data.
* Discussion of correspondence analysis of contingency tables with varying association structures (symmetric and non-symmetric relationships between two or more categorical variables).
* Extensive treatment of many of the members of the correspondence analysis family for two-way, three-way and multiple contingency tables.
Correspondence Analysis offers a comprehensive and detailed overview of this topic which will be of value to academics, postgraduate students and researchers wanting a better understanding of correspondence analysis. Readers interested in the historical development, internationalisation and diverse applicability of correspondence analysis will also find much to enjoy in this book.
Foreword xv
Preface xvii
Part One Introduction
1 Data Visualisation 3 (41)
1.1 A Very Brief Introduction to Data 3 (7)
Visualisation
1.1.1 A Very Brief History 3 (1)
1.1.2 Introduction to Visualisation Tools 4 (2)
for Numerical Data
1.1.3 Introduction to Visualisation Tools 6 (4)
for Univariate Categorical Data
1.2 Data Visualisation for Contingency 10 (2)
Tables
1.2.1 Fourfold Displays 11 (1)
1.3 Other Plots 12 (1)
1.4 Studying Exposure to Asbestos 13 (12)
1.4.1 Asbestos and Irving J. Selikoff 13 (4)
1.4.2 Selikoff's Data 17 (1)
1.4.3 Numerical Analysis of Selikoff's 17 (1)
Data
1.4.4 A Graphical Analysis of Selikoff's 18 (2)
Data
1.4.5 Classical Correspondence Analysis 20 (2)
of Selikoff's Data
1.4.6 Other Methods of Graphical Analysis 22 (3)
1.5 Happiness Data 25 (4)
1.6 Correspondence Analysis Now 29 (5)
1.6.1 A Bibliographic Taste 29 (1)
1.6.2 The Increasing Popularity of 29 (3)
Correspondence Analysis
1.6.3 The Growth of the Correspondence 32 (2)
Analysis Family Tree
1.7 Overview of the Book 34 (1)
1.8 R Code 35 (1)
References 36 (8)
2 Pearson's Chi-Squared Statistic 44 (27)
2.1 Introduction 44 (1)
2.2 Pearson's Chi-Squared Statistic 44 (7)
2.2.1 Notation 44 (1)
2.2.2 Measuring the Departure from 45 (2)
Independence
2.2.3 Pearson's Chi-Squared Statistic 47 (1)
2.2.4 Other X² Measures of Association 48 (1)
2.2.5 The Power Divergence Statistic 49 (1)
2.2.6 Dealing with the Sample Size 50 (1)
2.3 The Goodman-Kruskal Tau Index 51 (1)
2.3.1 Other Measures and Issues 52 (1)
2.4 The 2 x 2 Contingency Table 52 (2)
2.4.1 Yates' Continuity Correction 53 (1)
2.5 Early Contingency Tables 54 (7)
2.5.1 The Impact of Adolphe Quetelet 55 (3)
2.5.2 Gavarret's (1840) Legitimate 58 (1)
Children Data
2.5.3 Finley's (1884) Tornado Data 58 (1)
2.5.4 Galton's (1892) Fingerprint Data 59 (2)
2.5.5 Final Comments 61 (1)
2.6 R Code 61 (6)
2.6.1 Expectation and Variance of the 61 (1)
Pearson Chi-Squared Statistic
2.6.2 Pearson's Chi-Squared Test of 62 (2)
Independence
2.6.3 The Cressie-Read Statistic 64 (3)
References 67 (4)
Part Two Correspondence Analysis of Two-Way 71 (302)
Contingency Tables
3 Methods of Decomposition 73 (47)
3.1 Introduction 73 (1)
3.2 Reducing Multidimensional Space 73 (1)
3.3 Profiles and Cloud of Points 74 (5)
3.4 Property of Distributional Equivalence 79 (1)
3.5 The Triplet and Classical Reciprocal 79 (5)
Averaging
3.5.1 One-Dimensional Reciprocal Averaging 80 (1)
3.5.2 Matrix Form of One-Dimensional 81 (2)
Reciprocal Averaging
3.5.3 M-Dimensional Reciprocal Averaging 83 (1)
3.5.4 Some Historical Comments 83 (1)
3.6 Solving the Triplet Using 84 (2)
Eigen-Decomposition
3.6.1 The Decomposition 84 (1)
3.6.2 Example 85 (1)
3.7 Solving the Triplet Using Singular 86 (3)
Value Decomposition
3.7.1 The Standard Decomposition 86 (2)
3.7.2 The Generalised Decomposition 88 (1)
3.8 The Generalised Triplet and Reciprocal 89 (2)
Averaging
3.9 Solving the Generalised Triplet Using 91 (9)
Gram-Schmidt Process
3.9.1 Ordered Categorical Variables and a 91 (1)
priori Scores
3.9.2 On Finding Orthogonalised Vectors 92 (2)
3.9.3 A Recurrence Formulae Approach 94 (2)
3.9.4 Changing the Basis Vector 96 (1)
3.9.5 Generalised Correlations 97 (3)
3.10 Bivariate Moment Decomposition 100 (1)
3.11 Hybrid Decomposition 100 (3)
3.11.1 An Alternative Singly Ordered 102 (1)
Approach
3.12 R Code 103 (6)
3.12.1 Eigen-Decomposition in R 103 (1)
3.12.2 Singular Value Decomposition in R 103 (1)
3.12.3 Singular Value Decomposition for 104 (2)
Matrix Approximation
3.12.4 Generating Emerson's Polynomials 106 (3)
3.13 A Preliminary Graphical Summary 109 (3)
3.14 Analysis of Analgesic Drugs 112 (3)
References 115 (5)
4 Simple Correspondence Analysis 120 (57)
4.1 Introduction 120 (1)
4.2 Notation 121 (1)
4.3 Measuring Departures from Complete 122 (2)
Independence
4.3.1 The 'Duplication Constant' 123 (1)
4.3.2 Pearson Ratios 123 (1)
4.4 Decomposing the Pearson Ratio 124 (2)
4.5 Coordinate Systems 126 (10)
4.5.1 Standard Coordinates 126 (1)
4.5.2 Principal Coordinates 127 (5)
4.5.3 Biplot Coordinates 132 (4)
4.6 Distances 136 (4)
4.6.1 Distance from the Origin 136 (1)
4.6.2 Intra-Variable Distances and the Lp 137 (1)
Metric
4.6.3 Inter-Variable Distances 138 (2)
4.7 Transition Formulae 140 (1)
4.8 Moments of the Principal Coordinates 141 (4)
4.8.1 The Mean of f_m 142 (1)
4.8.2 The Variance of f_m 142 (1)
4.8.3 The Skewness of f_m 143 (1)
4.8.4 The Kurtosis of f_m 143 (1)
4.8.5 Moments of the Asbestos Data 144 (1)
4.9 How Many Dimensions to Use? 145 (2)
4.10 R Code 147 (7)
4.11 Other Theoretical Issues 154 (2)
4.12 Some Applications of Correspondence 156 (2)
Analysis
4.13 Analysis of a Mother's Attachment to 158 (7)
Her Child
References 165 (12)
5 Non-Symmetrical Correspondence Analysis 177 (39)
5.1 Introduction 177 (3)
5.2 The Goodman-Kruskal Tau Index 180 (6)
5.2.1 The Tau Index as a Measure of the 180 (2)
Increase in Predictability
5.2.2 The Tau Index in the Context of 182 (1)
ANOVA
5.2.3 The Sensitivity of τ 182 (3)
5.2.4 A Demonstration: Revisiting 185 (1)
Selikoff's Asbestos Data
5.3 Non-Symmetrical Correspondence Analysis 186 (2)
5.3.1 The Centred Column Profile Matrix 186 (1)
5.3.2 Decomposition of τ 187 (1)
5.4 The Coordinate Systems 188 (9)
5.4.1 Standard Coordinates 188 (1)
5.4.2 Principal Coordinates 189 (4)
5.4.3 Biplot Coordinates 193 (4)
5.5 Transition Formulae 197 (2)
5.5.1 Supplementary Points 198 (1)
5.5.2 Reconstruction Formulae 198 (1)
5.6 Moments of the Principal Coordinates 199 (2)
5.6.1 The Mean of f_m 199 (1)
5.6.2 The Variance of f_m 200 (1)
5.6.3 The Skewness of f_m 201 (1)
5.6.4 The Kurtosis of f_m 201 (1)
5.7 The Distances 201 (3)
5.7.1 Column Distances 201 (2)
5.7.2 Row Distances 203 (1)
5.8 Comparison with Simple Correspondence 204 (1)
Analysis
5.9 R Code 204 (5)
5.10 Analysis of a Mother's Attachment to 209 (3)
Her Child
References 212 (4)
6 Ordered Correspondence Analysis 216 (35)
6.1 Introduction 216 (5)
6.2 Pearson's Ratio and Bivariate Moment 221 (1)
Decomposition
6.3 Coordinate Systems 222 (11)
6.3.1 Standard Coordinates 222 (1)
6.3.2 The Generalised Correlations 223 (2)
6.3.3 Principal Coordinates 225 (4)
6.3.4 Location, Dispersion and Higher 229 (1)
Order Components
6.3.5 The Correspondence Plot and 230 (2)
Generalised Correlations
6.3.6 Impact on the Choice of Scores 232 (1)
6.4 Artificial Data Revisited 233 (3)
6.4.1 On the Structure of the Association 233 (1)
6.4.2 A Graphical Summary of the 233 (1)
Association
6.4.3 An Interpretation of the Axes and 234 (1)
Components
6.4.4 The Impact of the Choice of Scores 235 (1)
6.5 Transition Formulae 236 (2)
6.6 Distance Measures 238 (1)
6.6.1 Distance from the Origin 238 (1)
6.6.2 Intra-Variable Distances 239 (1)
6.7 Singly Ordered Analysis 239 (2)
6.8 R Code 241 (7)
6.8.1 Generalised Correlations and 241 (4)
Principal Inertias
6.8.2 Doubly Ordered Correspondence 245 (3)
Analysis
References 248 (3)
7 Ordered Non-Symmetrical Correspondence 251 (51)
Analysis
7.1 Introduction 251 (1)
7.2 General Considerations 252 (2)
7.2.1 Orthogonal Polynomials Instead of 253 (1)
Singular Vectors
7.3 Doubly Ordered Non-Symmetrical 254 (3)
Correspondence Analysis
7.3.1 Bivariate Moment Decomposition 254 (1)
7.3.2 Generalised Correlations in 255 (2)
Bivariate Moment Decomposition
7.4 Singly Ordered Non-Symmetrical 257 (2)
Correspondence Analysis
7.4.1 Hybrid Decomposition for an Ordered 257 (1)
Predictor Variable
7.4.2 Hybrid Decomposition in the Case of 258 (1)
Ordered Response Variables
7.4.3 Generalised Correlations in Hybrid 258 (1)
Decomposition
7.5 Coordinate Systems for Ordered 259 (6)
Non-Symmetrical Correspondence Analysis
7.5.1 Polynomial Plots for Doubly Ordered 260 (2)
Non-Symmetrical Correspondence Analysis
7.5.2 Polynomial Biplot for Doubly 262 (1)
Ordered Non-Symmetrical Correspondence
Analysis
7.5.3 Polynomial Plot for Singly Ordered 262 (1)
Non-Symmetrical Correspondence Analysis
with an Ordered Predictor Variable
7.5.4 Polynomial Biplot for Singly 263 (1)
Ordered Non-Symmetrical Correspondence
Analysis with an Ordered Predictor
Variable
7.5.5 Polynomial Plot for Singly Ordered 264 (1)
Non-Symmetrical Correspondence Analysis
with an Ordered Response Variable
7.5.6 Polynomial Biplot for Singly 265 (1)
Ordered Non-Symmetrical Correspondence
Analysis with an Ordered Response Variable
7.6 Tests of Asymmetric Association 265 (1)
7.7 Distances in Ordered Non-Symmetrical 266 (3)
Correspondence Analysis
7.7.1 Distances in Doubly Ordered 267 (2)
Non-Symmetrical Correspondence Analysis
7.7.2 Distances in Singly Ordered 269 (1)
Non-Symmetrical Correspondence Analysis
7.8 Doubly Ordered Non-Symmetrical 269 (8)
Correspondence Analysis of Asbestos Data
7.8.1 Trends 270 (7)
7.9 Singly Ordered Non-Symmetrical 277 (6)
Correspondence Analysis of Drug Data
7.9.1 Predictability of Ordered Rows 278 (5)
Given Columns
7.10 R Code for Ordered Non-Symmetrical 283 (17)
Correspondence Analysis
References 300 (2)
8 External Stability and Confidence Regions 302 (35)
8.1 Introduction 302 (1)
8.2 On the Statistical Significance of a 303 (1)
Point
8.3 Circular Confidence Regions for 304 (2)
Classical Correspondence Analysis
8.4 Elliptical Confidence Regions for 306 (5)
Classical Correspondence Analysis
8.4.1 The Information in the Optimal 306 (2)
Correspondence Plot
8.4.2 The Information in the First Two 308 (1)
Dimensions
8.4.3 Eccentricity of Elliptical Regions 309 (1)
8.4.4 Comparison of Confidence Regions 309 (2)
8.5 Confidence Regions for Non-Symmetrical 311 (2)
Correspondence Analysis
8.5.1 Circular Regions in Non-Symmetrical 312 (1)
Correspondence Analysis
8.5.2 Elliptical Regions in 312 (1)
Non-Symmetrical Correspondence Analysis
8.6 Approximate p-values and Classical 313 (2)
Correspondence Analysis
8.6.1 Approximate p-values Based on 313 (1)
Confidence Circles
8.6.2 Approximate p-values Based on 314 (1)
Confidence Ellipses
8.7 Approximate p-values and 315 (1)
Non-Symmetrical Correspondence Analysis
8.8 Bootstrap Elliptical Confidence Regions 315 (1)
8.9 Ringrose's Bootstrap Confidence Regions 316 (2)
8.9.1 Confidence Ellipses and Covariance 317 (1)
Matrix
8.10 Confidence Regions and Selikoff's 318 (4)
Asbestos Data
8.11 Confidence Regions and Mother-Child 322 (3)
Attachment Data
8.12 R Code 325 (10)
8.12.1 Calculating the Path of a 326 (1)
Confidence Ellipse
8.12.2 Constructing Elliptical Regions in 327 (8)
a Correspondence Plot
References 335 (2)
9 Variants of Correspondence Analysis 337 (36)
9.1 Introduction 337 (1)
9.2 Correspondence Analysis Using Adjusted 337 (3)
Standardised Residuals
9.3 Correspondence Analysis Using the 340 (2)
Freeman-Tukey Statistic
9.4 Correspondence Analysis of Ranked Data 342 (1)
9.5 R Code 343 (10)
9.5.1 Adjusted Standardised Residuals 343 (6)
9.5.2 Freeman-Tukey Statistic 349 (4)
9.6 The Correspondence Analysis Family 353 (12)
9.6.1 Detrended Correspondence Analysis 353 (1)
9.6.2 Canonical Correspondence Analysis 354 (1)
9.6.3 Inverse Correspondence Analysis 355 (1)
9.6.4 Ordered Correspondence Analysis 355 (1)
9.6.5 Grade Correspondence Analysis 355 (1)
9.6.6 Symbolic Correspondence Analysis 356 (1)
9.6.7 Correspondence Analysis of 356 (4)
Proximity Data
9.6.8 Residual (Scaling) Correspondence 360 (2)
Analysis
9.6.9 Log-Ratio Correspondence Analysis 362 (2)
9.6.10 Parametric Correspondence Analysis 364 (1)
9.6.11 Subset Correspondence Analysis 364 (1)
9.6.12 Foucart's Correspondence Analysis 365 (1)
9.7 Other Techniques 365 (1)
References 366 (7)
Part Three Correspondence Analysis of Multi-Way 373 (144)
Contingency Tables
10 Coding and Multiple Correspondence Analysis 375 (76)
10.1 Introduction to Coding 375 (2)
10.2 Coding Data 377 (5)
10.2.1 B-Splines 377 (3)
10.2.2 Crisp Coding 380 (2)
10.2.3 Fuzzy Coding 382 (1)
10.3 Coding Ordered Categorical Variables 382 (2)
by Orthogonal Polynomials
10.4 Burt Matrix 384 (2)
10.5 An Introduction to Multiple 386 (2)
Correspondence Analysis
10.6 Multiple Correspondence Analysis 388 (7)
10.6.1 Notation 388 (1)
10.6.2 Decomposition Methods 389 (4)
10.6.3 Coordinates, Transition Formulae 393 (2)
and Adjusted Inertia
10.7 Variants of Multiple Correspondence 395 (3)
Analysis
10.7.1 Joint Correspondence Analysis 396 (1)
10.7.2 Stacking and Concatenation 397 (1)
10.8 Ordered Multiple Correspondence 398 (7)
Analysis
10.8.1 Orthogonal Polynomials in Multiple 398 (1)
Correspondence Analysis
10.8.2 Hybrid Decomposition of Multiple 399 (1)
Indicator Tables
10.8.3 Two Ordered Variables and Their 400 (1)
Contingency Table
10.8.4 Test of Statistical Significance 401 (2)
10.8.5 Properties of Ordered Multiple 403 (1)
Correspondence Analysis
10.8.6 Graphical Displays in Ordered 404 (1)
Multiple Correspondence Analysis
10.9 Applications 405 (12)
10.9.1 Customer Satisfaction in Health 406 (5)
Care Services
10.9.2 Two Quality Aspects 411 (6)
10.10 R Code 417 (27)
10.10.1 B-Spline Function 417 (4)
10.10.2 Crisp and Fuzzy Coding Using 421 (4)
B-Splines in R
10.10.3 Crisp Coding and the Burt Table 425 (3)
by Indicator Functions in R
10.10.4 Classical and Multiple 428 (16)
Correspondence Analysis in R
References 444 (7)
11 Symmetrical and Non-Symmetrical Three-Way 451 (66)
Correspondence Analysis
11.1 Introduction 451 (2)
11.2 Notation 453 (1)
11.3 Symmetric and Asymmetric Association 454 (1)
in Three-Way Contingency Tables
11.4 Partitioning Three-Way Measures of 455 (8)
Association
11.4.1 Partitioning Pearson's Three-Way 457 (1)
Statistic
11.4.2 Partitioning Marcotorchino's and 458 (2)
Gray-Williams' Three-Way Indices
11.4.3 Marcotorchino's Index 460 (1)
11.4.4 Partitioning the Three-Way Delta 461 (2)
Index
11.4.5 Three-Way Delta Index 463 (1)
11.5 Formal Tests of Predictability 463 (3)
11.5.1 Testing Pearson's Statistic 464 (1)
11.5.2 Testing Marcotorchino's Index 464 (1)
11.5.3 Testing the Delta Index 465 (1)
11.5.4 Discussion 465 (1)
11.6 Tucker3 Decomposition for Three-Way 466 (1)
Tables
11.7 Correspondence Analysis of Three-Way 467 (3)
Contingency Tables
11.7.1 Symmetrically Associated Variables 467 (1)
11.7.2 Asymmetrically Associated Variables 468 (1)
11.7.3 Additional Property 469 (1)
11.8 Modelling of Partial and Marginal 470 (1)
Dependence
11.9 Graphical Representation 471 (3)
11.9.1 Interactive Plot 471 (1)
11.9.2 Interactive Biplot 472 (2)
11.9.3 Category Contribution 474 (1)
11.10 On the Application of Partitions 474 (3)
11.10.1 Olive Data: Partitioning the 474 (2)
Asymmetric Association
11.10.2 Job Satisfaction Data: 476 (1)
Partitioning the Asymmetric Association
11.11 On the Application of Three-Way 477 (13)
Correspondence Analysis
11.11.1 Job Satisfaction and Three-Way 477 (6)
Symmetrical Correspondence Analysis
11.11.2 Job Satisfaction and Three-Way 483 (7)
Non-Symmetrical Correspondence Analysis
11.12 R Code 490 (21)
References 511 (6)
Part Four The Computation of Correspondence 517 (28)
Analysis
12 Computing and Correspondence Analysis 519 (26)
12.1 Introduction 519 (1)
12.2 A Look Through Time 519 (4)
12.2.1 Pre-1990 519 (1)
12.2.2 From 1990 to 2000 520 (2)
12.2.3 The Early 2000s 522 (1)
12.3 The Impact of R 523 (10)
12.3.1 Overview of Correspondence 523 (1)
Analysis in R
12.3.2 MASS 524 (1)
12.3.3 Nenadic and Greenacre's (2007) ca 525 (2)
12.3.4 Murtagh (2005) 527 (3)
12.3.5 ade4 530 (3)
12.4 Some Stand-Alone Programs 533 (7)
12.4.1 JMP 533 (1)
12.4.2 SPSS 533 (1)
12.4.3 PAST 534 (1)
12.4.4 DtmVic5.6+ 535 (5)
References 540 (5)
Index 545