class: center, inverse background-image: url("blindman2.jpg") --- <style type="text/css"> body, td { font-size: 15px; } code.r{ font-size: 15px; } pre { font-size: 20px } .huge .remark-code { /*Change made here*/ font-size: 200% !important; } .tiny .remark-code { /*Change made here*/ font-size: 80% !important; } </style> ## Press record --- ## Intended learning goals <br/><br/> Motivate the utilisation of path and CFA models; argue how they connect to other models that we covered in the course. <br/><br/> Calculate the number of free parameters and the degrees of freedom of a proposed model. <br/><br/> Build a model in the R statistical environment, estimate it, and interpret the coefficients. <br/><br/> Criticise, modify, compare, and evaluate the fit of the proposed models. <br/><br/> --- ## Latent space of measures Principal Component Analysis (PCA) <br/><br/> Exploratory Factor Analysis (EFA) <br/><br/> Confirmatory Factor Analysis (CFA) ??? Differences between PCA and EFA:<br/><br/> [Link 1](https://stats.stackexchange.com/a/95106)<br/><br/> [Link 2](https://stats.stackexchange.com/a/288646) --- ## Exploratory factor analysis (EFA) Multivariate statistical procedure (Spearman): understanding and accounting for variation and covariation among a set of observed variables by postulating __latent__ structures (factors)<br/><br/> Factor: an unobservable variable that influences more than one observed measure and accounts for their intercorrelations <br/><br/> If we partial out the latent construct, then the intercorrelations would be zero <br/><br/> Factor analysis decomposes variance: __a) common variance (communality)__ and __b) unique variance__ ??? Thorough example of EFA in R: https://psu-psychology.github.io/psy-597-SEM/06_factor_models/factor_models.html#overview --- ## EFA versus CFA Both reproduce the observed relationships between measured variables with a smaller number of latent factors <br/><br/> EFA is a data-driven approach: weak or no assumptions about the number of latent dimensions and the factor loadings (relations between indicators and factors) <br/> <br/> CFA is a theory-driven approach: strong a priori assumptions about both <br/><br/> EFA is used earlier in the process of questionnaire development and construct validation --- ## Factor model <img src="image1.png" width="90%" style="display: block; margin: auto;" /> --- ## Factor or measurement model A linear regression where the main predictor is latent (unobserved): <br/> `$$y_1=\tau_1+\lambda_1*\eta+\epsilon_1$$`<br/><br/> `\(y_1=\tau_1+\lambda_1*\eta+\epsilon_1\)`<br/> `\(y_2=\tau_2+\lambda_2*\eta+\epsilon_2\)`<br/> `\(y_3=\tau_3+\lambda_3*\eta+\epsilon_3\)`<br/><br/> `\(\tau\)` - the item intercepts or means<br/> `\(\lambda\)` - factor loadings - regression coefficients <br/> `\(\epsilon\)` - error variances and covariances <br/> `\(\eta\)` - the latent predictor of the items<br/> `\(\psi\)` - factor variances and covariances <br/> --- ## Exploratory factor model <img src="image2.png" width="90%" style="display: block; margin: auto;" /> --- ## Confirmatory factor model <img src="image3.png" width="90%" style="display: block; margin: auto;" /> --- ## CFA: Reflective and formative <img src="RefVsForm.png" width="90%" style="display: block; margin: auto;" /> --- ## Defining latent variables LVs are not measured directly; however, we can still infer them from the observed data. To do so, we need to define their scale: 1. Marker variable: a single factor loading constrained to 1 <br/><br/> 2.
Standardized latent variables: setting variance of variable to 1 (Z-score) <br/><br/> 3. Effects-coding: constraints that all of the loadings to one LV average 1.0 or that their sum is equal to number of indicators ??? https://www.researchgate.net/publication/255606342_A_Non-arbitrary_Method_of_Identifying_and_Scaling_Latent_Variables_in_SEM_and_MACS_Models --- ## Indicator variable and Standardizing LVs <img src="IndVar.png" width="70%" style="display: block; margin: auto;" /> --- ## Effect coding <img src="Effect.png" width="40%" style="display: block; margin: auto;" /> --- ## Identification of the CFA Total number of parameters that we can estimate: `\(\frac{variables*(variables+1)}{2}\)` <br/> <br/> <br/> ```r Matrix<-cov(vars) Matrix[upper.tri(Matrix)]<-NA knitr::kable(Matrix, format = 'html') ``` <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> TimeOnTummy </th> <th style="text-align:right;"> PreciseLegMoves </th> <th style="text-align:right;"> PreciseHandMoves </th> <th style="text-align:right;"> Babbling </th> <th style="text-align:right;"> Screeching </th> <th style="text-align:right;"> VocalImitation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> TimeOnTummy </td> <td style="text-align:right;"> 24.8022903 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> PreciseLegMoves </td> <td style="text-align:right;"> 8.8224852 </td> <td style="text-align:right;"> 22.819361 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> PreciseHandMoves </td> <td style="text-align:right;"> 10.8533001 </td> <td style="text-align:right;"> 9.267157 </td> <td style="text-align:right;"> 24.266277 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> Babbling </td> <td style="text-align:right;"> 1.3525042 </td> <td style="text-align:right;"> 4.039860 </td> <td style="text-align:right;"> 3.519092 </td> <td style="text-align:right;"> 24.687576 </td> <td style="text-align:right;"> NA </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> Screeching </td> <td style="text-align:right;"> 0.5893004 </td> <td style="text-align:right;"> 3.116331 </td> <td style="text-align:right;"> 1.720064 </td> <td style="text-align:right;"> 8.081221 </td> <td style="text-align:right;"> 21.159437 </td> <td style="text-align:right;"> NA </td> </tr> <tr> <td style="text-align:left;"> VocalImitation </td> <td style="text-align:right;"> 2.7149160 </td> <td style="text-align:right;"> 4.457381 </td> <td style="text-align:right;"> 4.325576 </td> <td style="text-align:right;"> 11.314745 </td> <td style="text-align:right;"> 4.809383 </td> <td style="text-align:right;"> 25.01231 </td> </tr> </tbody> </table> --- ## Theory and previous results Previous work in this area found that two __congeneric__ latent factors explain covariances of our six indicators: motoric and verbal latent component <br/><br/> <img src="image4.png" width="65%" style="display: block; margin: auto;" /> --- ## Estimated number of parameters <img src="parameters.png" width="50%" style="display: block; margin: auto;" /> -- 
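The tallies below can be checked by hand. A minimal sketch in R, using only quantities already shown on these slides (6 indicators, marker-variable scaling, no intercepts):

```r
# Degrees of freedom = unique (co)variances in the data minus free model parameters
p <- 6                           # number of observed indicators
total_params <- p * (p + 1) / 2  # 21 unique variances and covariances
free_params  <- 4 + 6 + 3        # loadings + residual variances + factor (co)variances = 13
total_params - free_params       # 8 degrees of freedom, matching the lavaan output below
```
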
Loadings `\((\lambda)\)`: 4 parameters<br/><br/> Residual variances `\((\epsilon)\)` : 6 parameters<br/><br/> Factor variances and covariances `\((\psi)\)` : 3 parameters<br/><br/> With intercepts: + 6 --- ## Syntax in R .center[ <img src="Rsyntax.png", width = "60%"> <br/> ] --- ## Coding of our model ```r #install.packages('lavaan') require(lavaan) model1<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation ' fit1<-cfa(model1, data=Babies) ``` <style type="text/css"> pre { max-height: 300px; overflow-y: auto; } pre[class] { max-height: 100px; } </style> <style type="text/css"> .scroll-100 { max-height: 100px; overflow-y: auto; background-color: inherit; } </style> --- ## Results of the model ```r summary(fit1) ``` ``` ## lavaan 0.6-9 ended normally after 74 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 13 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TimeOnTummy 1.000 ## PreciseLegMovs 0.910 0.240 3.791 0.000 ## PreciseHandMvs 1.099 0.293 3.746 0.000 ## verbal =~ ## Babbling 1.000 ## Screeching 0.494 0.182 2.718 0.007 ## VocalImitation 0.716 0.246 2.906 0.004 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 3.433 1.901 1.806 0.071 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 15.031 3.181 4.725 0.000 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 ## .Babbling 8.805 5.149 1.710 0.087 ## .Screeching 17.139 2.748 6.236 0.000 ## .VocalImitation 16.742 3.514 4.764 0.000 ## motor 9.523 3.625 2.627 0.009 ## verbal 15.635 5.947 2.629 0.009 ``` --- ## Results: visual <img src="image6.png" width="90%" style="display: block; margin: auto;" /> --- ## Interpretation of the coefficients: factor loadings - When unstandardized and loaded on a single factor, then unstandardized regression coefficients. 
Model predicted difference in the LVs between groups that differ in 1-unit on the predictor <br/> <br/> - When loaded on multiple factors, then regression coefficients become contingent on other factors (check Lecture 1, slide 11) <br/> <br/> - When standardized and loaded on a single factor (congeneric structure), then standardized loadings are estimated correlations between indicators and LVs <br/> <br/> - When standardized and loaded on a multiple factors, then same as the second option only standardized (beta weights) <br/> --- ## Results of the model ```r summary(fit1, standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 74 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 13 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor =~ ## TimeOnTummy 1.000 3.086 0.623 ## PreciseLegMovs 0.910 0.240 3.791 0.000 2.807 0.591 ## PreciseHandMvs 1.099 0.293 3.746 0.000 3.392 0.692 ## verbal =~ ## Babbling 1.000 3.954 0.800 ## Screeching 0.494 0.182 2.718 0.007 1.952 0.426 ## VocalImitation 0.716 0.246 2.906 0.004 2.832 0.569 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor ~~ ## verbal 3.433 1.901 1.806 0.071 0.281 0.281 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .TimeOnTummy 15.031 3.181 4.725 0.000 15.031 0.612 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 14.709 0.651 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 12.515 0.521 ## .Babbling 8.805 5.149 1.710 0.087 8.805 0.360 ## .Screeching 17.139 2.748 6.236 0.000 17.139 0.818 ## .VocalImitation 16.742 3.514 4.764 0.000 16.742 0.676 ## motor 9.523 3.625 2.627 0.009 1.000 1.000 ## verbal 15.635 5.947 2.629 0.009 1.000 1.000 ``` --- ## Scaling LVs: variance = 1 ```r model2<-' motor =~ NA*TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ NA*Babbling + Screeching + VocalImitation motor ~~ 1*motor verbal ~~ 1*verbal ' fit2<-cfa(model2, data=Babies) summary(fit2, standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 33 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 13 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor =~ ## TimeOnTummy 3.086 0.587 5.254 0.000 3.086 0.623 ## PreciseLegMovs 2.807 0.557 5.042 0.000 2.807 0.591 ## PreciseHandMvs 3.392 0.597 5.680 0.000 3.392 0.692 ## verbal =~ ## Babbling 3.954 0.752 5.259 0.000 3.954 0.800 ## Screeching 1.952 0.548 3.560 0.000 1.952 0.426 ## VocalImitation 2.832 0.646 4.381 0.000 2.832 0.569 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor ~~ ## verbal 0.281 0.139 2.028 0.043 0.281 0.281 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor 1.000 1.000 1.000 ## verbal 1.000 1.000 1.000 ## .TimeOnTummy 15.031 3.181 4.725 0.000 15.031 0.612 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 14.709 0.651 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 12.515 0.521 ## .Babbling 8.805 5.149 1.710 0.087 
8.805 0.360 ## .Screeching 17.139 2.748 6.236 0.000 17.139 0.818 ## .VocalImitation 16.742 3.514 4.764 0.000 16.742 0.676 ``` --- ## Scaling LVs: effect coding ```r model3<-' motor =~ NA*TimeOnTummy+a*TimeOnTummy + b*PreciseLegMoves + c*PreciseHandMoves verbal =~ NA*Babbling+a1*Babbling + b1*Screeching + c1*VocalImitation a+b+c==3 a1+b1+c1==3 ' fit3<-cfa(model3, data=Babies) summary(fit3, standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 58 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 15 ## Number of equality constraints 2 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor =~ ## TimOnTmmy (a) 0.997 0.153 6.524 0.000 3.086 0.623 ## PrcsLgMvs (b) 0.907 0.146 6.227 0.000 2.807 0.591 ## PrcsHndMv (c) 1.096 0.162 6.770 0.000 3.392 0.692 ## verbal =~ ## Babbling (a1) 1.358 0.236 5.753 0.000 3.954 0.800 ## Screechng (b1) 0.670 0.158 4.236 0.000 1.952 0.426 ## VoclImttn (c1) 0.972 0.189 5.140 0.000 2.832 0.569 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor ~~ ## verbal 2.536 1.341 1.891 0.059 0.281 0.281 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .TimeOnTummy 15.031 3.181 4.725 0.000 15.031 0.612 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 14.709 0.651 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 12.515 0.521 ## .Babbling 8.805 5.149 1.710 0.087 8.805 0.360 ## .Screeching 17.139 2.748 6.236 0.000 17.139 0.818 ## .VocalImitation 16.742 3.514 4.764 0.000 16.742 0.676 ## motor 9.581 2.070 4.628 0.000 1.000 1.000 ## verbal 8.483 1.896 4.475 0.000 1.000 1.000 ## ## Constraints: ## |Slack| ## a+b+c - (3) 0.000 ## a1+b1+c1 - (3) 0.000 ``` --- ## Adding intercepts ```r model3<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation TimeOnTummy ~ 1 PreciseLegMoves ~ 1 PreciseHandMoves ~ 1 Babbling ~ 1 Screeching ~ 1 VocalImitation ~ 1' fit3<-cfa(model3, data=Babies) summary(fit3, standardized=TRUE, fit.measures=T) ``` ``` ## lavaan 0.6-9 ended normally after 74 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 19 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Model Test Baseline Model: ## ## Test statistic 88.357 ## Degrees of freedom 15 ## P-value 0.000 ## ## User Model versus Baseline Model: ## ## Comparative Fit Index (CFI) 1.000 ## Tucker-Lewis Index (TLI) 1.118 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -1756.127 ## Loglikelihood unrestricted model (H1) -1754.439 ## ## Akaike (AIC) 3550.253 ## Bayesian (BIC) 3599.751 ## Sample-size adjusted Bayesian (BIC) 3539.744 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.000 ## 90 Percent confidence interval - lower 0.000 ## 90 Percent confidence interval - upper 0.047 ## P-value RMSEA <= 0.05 0.954 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.034 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor =~ ## TimeOnTummy 1.000 3.086 
0.623 ## PreciseLegMovs 0.910 0.240 3.791 0.000 2.807 0.591 ## PreciseHandMvs 1.099 0.293 3.746 0.000 3.392 0.692 ## verbal =~ ## Babbling 1.000 3.954 0.800 ## Screeching 0.494 0.182 2.718 0.007 1.952 0.426 ## VocalImitation 0.716 0.246 2.906 0.004 2.832 0.569 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor ~~ ## verbal 3.433 1.901 1.806 0.071 0.281 0.281 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .TimeOnTummy 31.423 0.496 63.415 0.000 31.423 6.341 ## .PreciseLegMovs 30.600 0.475 64.380 0.000 30.600 6.438 ## .PreciseHandMvs 29.799 0.490 60.798 0.000 29.799 6.080 ## .Babbling 29.946 0.494 60.573 0.000 29.946 6.057 ## .Screeching 29.785 0.458 65.078 0.000 29.785 6.508 ## .VocalImitation 30.409 0.498 61.110 0.000 30.409 6.111 ## motor 0.000 0.000 0.000 ## verbal 0.000 0.000 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .TimeOnTummy 15.031 3.181 4.725 0.000 15.031 0.612 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 14.709 0.651 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 12.515 0.521 ## .Babbling 8.805 5.149 1.710 0.087 8.805 0.360 ## .Screeching 17.139 2.748 6.236 0.000 17.139 0.818 ## .VocalImitation 16.742 3.514 4.764 0.000 16.742 0.676 ## motor 9.523 3.625 2.627 0.009 1.000 1.000 ## verbal 15.635 5.947 2.629 0.009 1.000 1.000 ``` --- ## Indices of global model fit ```r summary(fit1, fit.measures=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 74 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 13 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 3.376 ## Degrees of freedom 8 ## P-value (Chi-square) 0.909 ## ## Model Test Baseline Model: ## ## Test statistic 88.357 ## Degrees of freedom 15 ## P-value 0.000 ## ## User Model versus Baseline Model: ## ## Comparative Fit Index (CFI) 1.000 ## Tucker-Lewis Index (TLI) 1.118 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -1756.127 ## Loglikelihood unrestricted model (H1) -1754.439 ## ## Akaike (AIC) 3538.253 ## Bayesian (BIC) 3572.120 ## Sample-size adjusted Bayesian (BIC) 3531.063 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.000 ## 90 Percent confidence interval - lower 0.000 ## 90 Percent confidence interval - upper 0.047 ## P-value RMSEA <= 0.05 0.954 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.038 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TimeOnTummy 1.000 ## PreciseLegMovs 0.910 0.240 3.791 0.000 ## PreciseHandMvs 1.099 0.293 3.746 0.000 ## verbal =~ ## Babbling 1.000 ## Screeching 0.494 0.182 2.718 0.007 ## VocalImitation 0.716 0.246 2.906 0.004 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 3.433 1.901 1.806 0.071 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 15.031 3.181 4.725 0.000 ## .PreciseLegMovs 14.709 2.867 5.130 0.000 ## .PreciseHandMvs 12.515 3.338 3.749 0.000 ## .Babbling 8.805 5.149 1.710 0.087 ## .Screeching 17.139 2.748 6.236 0.000 ## .VocalImitation 16.742 3.514 4.764 0.000 ## motor 9.523 3.625 2.627 0.009 ## verbal 15.635 5.947 2.629 0.009 ``` ??? 
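A minimal sketch for the presenter (assuming the `semTools` package is available, as used later for `compareFit()`): the indices reported above can also be extracted directly, and factor reliabilities computed from the same fitted object.

```r
# Pull selected global fit indices from the fitted lavaan object
fitMeasures(fit1, c('chisq', 'df', 'pvalue', 'cfi', 'tli', 'rmsea', 'srmr'))

# Composite reliability (alpha, omega) per latent factor
#install.packages('semTools')
require(semTools)
reliability(fit1)
```
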
Measurement error in CFA: https://rdrr.io/cran/semTools/man/reliability.html --- class: inverse, middle, center # Structural equation model --- ## SEM, finally <img src="SEM.png" width="80%" style="display: block; margin: auto;" /> --- ## Estimation of SEM ```r model4<-' #CFA model motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation #Path model motor ~ Age + Weight verbal ~ Age + Weight ' fit4<-sem(model4, data=Babies) ``` --- ## Structural equation model: Results ```r summary(fit4, standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 81 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 17 ## ## Number of observations 100 ## ## Model Test User Model: ## ## Test statistic 13.018 ## Degrees of freedom 16 ## P-value (Chi-square) 0.671 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor =~ ## TimeOnTummy 1.000 3.064 0.618 ## PreciseLegMovs 0.919 0.242 3.803 0.000 2.816 0.592 ## PreciseHandMvs 1.111 0.295 3.765 0.000 3.403 0.694 ## verbal =~ ## Babbling 1.000 3.498 0.708 ## Screeching 0.583 0.189 3.089 0.002 2.040 0.446 ## VocalImitation 0.899 0.263 3.422 0.001 3.144 0.632 ## ## Regressions: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## motor ~ ## Age -0.016 0.045 -0.355 0.723 -0.005 -0.043 ## Weight 0.000 0.001 0.085 0.932 0.000 0.010 ## verbal ~ ## Age -0.041 0.051 -0.803 0.422 -0.012 -0.097 ## Weight 0.002 0.001 2.108 0.035 0.001 0.263 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .motor ~~ ## .verbal 3.292 1.738 1.894 0.058 0.320 0.320 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .TimeOnTummy 15.164 3.159 4.800 0.000 15.164 0.618 ## .PreciseLegMovs 14.662 2.862 5.122 0.000 14.662 0.649 ## .PreciseHandMvs 12.440 3.327 3.740 0.000 12.440 0.518 ## .Babbling 12.204 3.740 3.263 0.001 12.204 0.499 ## .Screeching 16.784 2.717 6.177 0.000 16.784 0.801 ## .VocalImitation 14.880 3.440 4.326 0.000 14.880 0.601 ## .motor 9.372 3.577 2.620 0.009 0.998 0.998 ## .verbal 11.299 4.205 2.687 0.007 0.923 0.923 ``` --- class: inverse, middle, center # Measurement invariance --- ## Measurement invariance Compare our model between the groups: <br/> - Configural invarience: Model fitted for each group separately<br/><br/> - Metric invariance: restriction of the factor loadings, but intercepts are allowed to vary <br/><br/> - Scalar invariance: restriction of the both, factor loadings and intercepts<br/><br/> - Strict invariance: restriction on factor loadings, intercepts and residual variances --- ## Configural invariance ```r modelMI<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation ' fitMIC<-cfa(modelMI, data=Babies, group='Gender') summary(fitMIC) ``` ``` ## lavaan 0.6-9 ended normally after 141 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 38 ## ## Number of observations per group: ## Girls 48 ## Boys 52 ## ## Model Test User Model: ## ## Test statistic 15.880 ## Degrees of freedom 16 ## P-value (Chi-square) 0.461 ## Test statistic for each group: ## Girls 6.246 ## Boys 9.634 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## ## Group 1 [Girls]: ## ## Latent Variables: ## Estimate Std.Err z-value 
P(>|z|) ## motor =~ ## TimeOnTummy 1.000 ## PreciseLegMovs 0.754 0.282 2.677 0.007 ## PreciseHandMvs 0.982 0.345 2.849 0.004 ## verbal =~ ## Babbling 1.000 ## Screeching 0.428 0.237 1.809 0.070 ## VocalImitation 0.632 0.313 2.021 0.043 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 7.298 3.772 1.935 0.053 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 31.723 0.769 41.272 0.000 ## .PreciseLegMovs 30.863 0.734 42.020 0.000 ## .PreciseHandMvs 30.204 0.725 41.681 0.000 ## .Babbling 29.693 0.781 37.997 0.000 ## .Screeching 29.281 0.669 43.755 0.000 ## .VocalImitation 30.893 0.750 41.215 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 15.328 5.201 2.947 0.003 ## .PreciseLegMovs 18.476 4.570 4.042 0.000 ## .PreciseHandMvs 12.648 4.739 2.669 0.008 ## .Babbling 11.319 8.343 1.357 0.175 ## .Screeching 18.192 4.110 4.426 0.000 ## .VocalImitation 19.781 5.256 3.764 0.000 ## motor 13.031 6.402 2.036 0.042 ## verbal 17.992 9.733 1.849 0.065 ## ## ## Group 2 [Boys]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TimeOnTummy 1.000 ## PreciseLegMovs 1.168 0.448 2.607 0.009 ## PreciseHandMvs 1.106 0.409 2.706 0.007 ## verbal =~ ## Babbling 1.000 ## Screeching 0.410 0.238 1.727 0.084 ## VocalImitation 0.562 0.302 1.861 0.063 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal -0.787 2.007 -0.392 0.695 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 31.147 0.634 49.148 0.000 ## .PreciseLegMovs 30.357 0.611 49.675 0.000 ## .PreciseHandMvs 29.426 0.660 44.593 0.000 ## .Babbling 30.179 0.617 48.873 0.000 ## .Screeching 30.251 0.620 48.790 0.000 ## .VocalImitation 29.962 0.655 45.744 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 13.783 3.710 3.715 0.000 ## .PreciseLegMovs 9.737 3.951 2.465 0.014 ## .PreciseHandMvs 13.963 4.139 3.373 0.001 ## .Babbling -0.009 9.707 -0.001 0.999 ## .Screeching 16.649 3.652 4.559 0.000 ## .VocalImitation 16.035 4.395 3.649 0.000 ## motor 7.101 3.990 1.779 0.075 ## verbal 19.837 10.457 1.897 0.058 ``` --- ## Metric invariance: 1 ```r modelMI<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation ' fitMIM<-cfa(modelMI, data=Babies, group='Gender',group.equal='loadings') summary(fitMIM) ``` ``` ## lavaan 0.6-9 ended normally after 135 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 38 ## Number of equality constraints 4 ## ## Number of observations per group: ## Girls 48 ## Boys 52 ## ## Model Test User Model: ## ## Test statistic 16.556 ## Degrees of freedom 20 ## P-value (Chi-square) 0.682 ## Test statistic for each group: ## Girls 6.606 ## Boys 9.951 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## ## Group 1 [Girls]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.949 0.247 3.840 0.000 ## PrcsHnM (.p3.) 1.037 0.267 3.880 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.430 0.164 2.619 0.009 ## VclImtt (.p6.) 
0.601 0.211 2.854 0.004 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 7.124 3.399 2.096 0.036 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 31.723 0.757 41.929 0.000 ## .PreciseLegMovs 30.863 0.751 41.121 0.000 ## .PreciseHandMvs 30.204 0.722 41.828 0.000 ## .Babbling 29.693 0.782 37.967 0.000 ## .Screeching 29.281 0.671 43.661 0.000 ## .VocalImitation 30.893 0.747 41.353 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 16.616 4.612 3.603 0.000 ## .PreciseLegMovs 17.266 4.551 3.794 0.000 ## .PreciseHandMvs 13.341 4.238 3.148 0.002 ## .Babbling 10.817 7.130 1.517 0.129 ## .Screeching 18.163 4.020 4.518 0.000 ## .VocalImitation 20.094 4.867 4.128 0.000 ## motor 10.861 4.723 2.300 0.021 ## verbal 18.541 8.400 2.207 0.027 ## ## ## Group 2 [Boys]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.949 0.247 3.840 0.000 ## PrcsHnM (.p3.) 1.037 0.267 3.880 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.430 0.164 2.619 0.009 ## VclImtt (.p6.) 0.601 0.211 2.854 0.004 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal -0.695 2.209 -0.315 0.753 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 31.147 0.643 48.419 0.000 ## .PreciseLegMovs 30.357 0.600 50.563 0.000 ## .PreciseHandMvs 29.426 0.662 44.430 0.000 ## .Babbling 30.179 0.617 48.879 0.000 ## .Screeching 30.251 0.619 48.885 0.000 ## .VocalImitation 29.962 0.657 45.613 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 12.951 3.622 3.576 0.000 ## .PreciseLegMovs 11.037 3.171 3.481 0.000 ## .PreciseHandMvs 13.593 3.843 3.537 0.000 ## .Babbling 1.076 6.336 0.170 0.865 ## .Screeching 16.449 3.442 4.779 0.000 ## .VocalImitation 15.667 3.841 4.079 0.000 ## motor 8.566 3.646 2.349 0.019 ## verbal 18.747 7.377 2.541 0.011 ``` --- ## Metric invariance: 2 ```r #install.packages('semTools') require(semTools) summary(compareFit(fitMIC, fitMIM)) ``` ``` ## ################### Nested Model Comparison ######################### ## Chi-Squared Difference Test ## ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq) ## fitMIC 16 3573.3 3672.3 15.880 ## fitMIM 20 3566.0 3654.6 16.556 0.67673 4 0.9542 ## ## ####################### Model Fit Indices ########################### ## chisq df pvalue rmsea cfi tli srmr aic bic ## fitMIC 15.880† 16 .461 .000† 1.000† 1.003 .065 3573.335 3672.332 ## fitMIM 16.556 20 .682 .000† 1.000† 1.067† .064† 3566.012† 3654.588† ## ## ################## Differences in Fit Indices ####################### ## df rmsea cfi tli srmr aic bic ## fitMIM - fitMIC 4 0 0 0.064 -0.001 -7.323 -17.744 ``` --- ## Scalar invariance: 1 ```r modelMI<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation ' fitMISc<-cfa(modelMI, data=Babies, group='Gender',group.equal=c('loadings','intercepts')) summary(fitMISc) ``` ``` ## lavaan 0.6-9 ended normally after 153 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 40 ## Number of equality constraints 10 ## ## Number of observations per group: ## Girls 48 ## Boys 52 ## ## Model Test User Model: ## ## Test statistic 19.394 ## Degrees of freedom 24 ## P-value (Chi-square) 0.731 ## Test statistic for each group: ## Girls 8.252 ## Boys 11.142 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated 
(h1) model Structured ## ## ## Group 1 [Girls]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.949 0.244 3.880 0.000 ## PrcsHnM (.p3.) 1.045 0.266 3.925 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.410 0.163 2.510 0.012 ## VclImtt (.p6.) 0.560 0.206 2.715 0.007 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 7.181 3.418 2.101 0.036 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.16.) 31.753 0.661 48.007 0.000 ## .PrcsLgM (.17.) 30.920 0.638 48.461 0.000 ## .PrcsHnM (.18.) 30.141 0.661 45.611 0.000 ## .Babblng (.19.) 29.775 0.767 38.801 0.000 ## .Scrchng (.20.) 29.718 0.504 58.930 0.000 ## .VclImtt (.21.) 30.219 0.575 52.515 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 16.671 4.605 3.620 0.000 ## .PreciseLegMovs 17.293 4.543 3.807 0.000 ## .PreciseHandMvs 13.265 4.238 3.130 0.002 ## .Babbling 9.888 7.683 1.287 0.198 ## .Screeching 18.494 4.071 4.543 0.000 ## .VocalImitation 20.960 4.956 4.229 0.000 ## motor 10.803 4.679 2.309 0.021 ## verbal 19.542 8.992 2.173 0.030 ## ## ## Group 2 [Boys]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.949 0.244 3.880 0.000 ## PrcsHnM (.p3.) 1.045 0.266 3.925 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.410 0.163 2.510 0.012 ## VclImtt (.p6.) 0.560 0.206 2.715 0.007 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal -0.874 2.208 -0.396 0.692 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.16.) 31.753 0.661 48.007 0.000 ## .PrcsLgM (.17.) 30.920 0.638 48.461 0.000 ## .PrcsHnM (.18.) 30.141 0.661 45.611 0.000 ## .Babblng (.19.) 29.775 0.767 38.801 0.000 ## .Scrchng (.20.) 29.718 0.504 58.930 0.000 ## .VclImtt (.21.) 
30.219 0.575 52.515 0.000 ## motor -0.628 0.766 -0.819 0.413 ## verbal 0.405 0.985 0.411 0.681 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TimeOnTummy 12.970 3.609 3.594 0.000 ## .PreciseLegMovs 11.054 3.159 3.500 0.000 ## .PreciseHandMvs 13.564 3.847 3.526 0.000 ## .Babbling -0.129 7.015 -0.018 0.985 ## .Screeching 16.806 3.500 4.802 0.000 ## .VocalImitation 16.307 3.880 4.203 0.000 ## motor 8.523 3.612 2.359 0.018 ## verbal 19.957 8.027 2.486 0.013 ``` --- ## Scalar invariance: 2 ```r summary(compareFit(fitMIM,fitMISc)) ``` ``` ## ################### Nested Model Comparison ######################### ## Chi-Squared Difference Test ## ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq) ## fitMIM 20 3566.0 3654.6 16.556 ## fitMISc 24 3560.8 3639.0 19.394 2.8375 4 0.5854 ## ## ####################### Model Fit Indices ########################### ## chisq df pvalue rmsea cfi tli srmr aic bic ## fitMIM 16.556† 20 .682 .000† 1.000† 1.067 .064† 3566.012 3654.588 ## fitMISc 19.394 24 .731 .000† 1.000† 1.074† .071 3560.850† 3639.005† ## ## ################## Differences in Fit Indices ####################### ## df rmsea cfi tli srmr aic bic ## fitMISc - fitMIM 4 0 0 0.008 0.007 -5.162 -15.583 ``` --- ## Strict invariance: 1 ```r modelMI<-' motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves verbal =~ Babbling + Screeching + VocalImitation ' fitMISt<-cfa(modelMI, data=Babies, group='Gender',group.equal=c('loadings','intercepts','residuals')) summary(fitMISt) ``` ``` ## lavaan 0.6-9 ended normally after 107 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 40 ## Number of equality constraints 16 ## ## Number of observations per group: ## Girls 48 ## Boys 52 ## ## Model Test User Model: ## ## Test statistic 25.722 ## Degrees of freedom 30 ## P-value (Chi-square) 0.689 ## Test statistic for each group: ## Girls 10.852 ## Boys 14.870 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## ## Group 1 [Girls]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.916 0.237 3.858 0.000 ## PrcsHnM (.p3.) 1.046 0.270 3.872 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.357 0.152 2.352 0.019 ## VclImtt (.p6.) 0.497 0.194 2.561 0.010 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal 7.237 3.475 2.083 0.037 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.16.) 31.754 0.663 47.879 0.000 ## .PrcsLgM (.17.) 30.903 0.624 49.533 0.000 ## .PrcsHnM (.18.) 30.145 0.673 44.815 0.000 ## .Babblng (.19.) 29.707 0.776 38.266 0.000 ## .Scrchng (.20.) 29.700 0.506 58.645 0.000 ## .VclImtt (.21.) 30.290 0.582 52.081 0.000 ## motor 0.000 ## verbal 0.000 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.p7.) 14.722 3.138 4.691 0.000 ## .PrcsLgM (.p8.) 14.342 2.843 5.045 0.000 ## .PrcsHnM (.p9.) 13.256 3.161 4.194 0.000 ## .Babblng (.10.) 1.930 7.776 0.248 0.804 ## .Scrchng (.11.) 18.072 2.752 6.568 0.000 ## .VclImtt (.12.) 19.193 3.335 5.755 0.000 ## motor 11.438 4.800 2.383 0.017 ## verbal 27.035 9.847 2.745 0.006 ## ## ## Group 2 [Boys]: ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) ## motor =~ ## TmOnTmm 1.000 ## PrcsLgM (.p2.) 0.916 0.237 3.858 0.000 ## PrcsHnM (.p3.) 1.046 0.270 3.872 0.000 ## verbal =~ ## Babblng 1.000 ## Scrchng (.p5.) 0.357 0.152 2.352 0.019 ## VclImtt (.p6.) 
0.497 0.194 2.561 0.010 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) ## motor ~~ ## verbal -0.554 2.239 -0.248 0.804 ## ## Intercepts: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.16.) 31.754 0.663 47.879 0.000 ## .PrcsLgM (.17.) 30.903 0.624 49.533 0.000 ## .PrcsHnM (.18.) 30.145 0.673 44.815 0.000 ## .Babblng (.19.) 29.707 0.776 38.266 0.000 ## .Scrchng (.20.) 29.700 0.506 58.645 0.000 ## .VclImtt (.21.) 30.290 0.582 52.081 0.000 ## motor -0.635 0.772 -0.823 0.411 ## verbal 0.459 0.994 0.462 0.644 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) ## .TmOnTmm (.p7.) 14.722 3.138 4.691 0.000 ## .PrcsLgM (.p8.) 14.342 2.843 5.045 0.000 ## .PrcsHnM (.p9.) 13.256 3.161 4.194 0.000 ## .Babblng (.10.) 1.930 7.776 0.248 0.804 ## .Scrchng (.11.) 18.072 2.752 6.568 0.000 ## .VclImtt (.12.) 19.193 3.335 5.755 0.000 ## motor 8.157 3.558 2.292 0.022 ## verbal 18.233 8.616 2.116 0.034 ``` --- ## Strict invariance: 2 ```r summary(compareFit(fitMISc,fitMISt)) ``` ``` ## ################### Nested Model Comparison ######################### ## Chi-Squared Difference Test ## ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq) ## fitMISc 24 3560.8 3639.0 19.394 ## fitMISt 30 3555.2 3617.7 25.722 6.328 6 0.3875 ## ## ####################### Model Fit Indices ########################### ## chisq df pvalue rmsea cfi tli srmr aic bic ## fitMISc 19.394† 24 .731 .000† 1.000† 1.074† .071† 3560.850 3639.005 ## fitMISt 25.722 30 .689 .000† 1.000† 1.055 .079 3555.178† 3617.702† ## ## ################## Differences in Fit Indices ####################### ## df rmsea cfi tli srmr aic bic ## fitMISt - fitMISc 6 0 0 -0.019 0.008 -5.672 -21.303 ``` --- ## Where are the differences ```r lavTestScore(fitMISc) ``` ``` ## $test ## ## total score test: ## ## test X2 df p.value ## 1 score 3.387 10 0.971 ## ## $uni ## ## univariate score tests: ## ## lhs op rhs X2 df p.value ## 1 .p2. == .p25. 0.479 1 0.489 ## 2 .p3. == .p26. 0.015 1 0.903 ## 3 .p5. == .p28. 0.006 1 0.939 ## 4 .p6. == .p29. 0.047 1 0.828 ## 5 .p16. == .p39. 0.007 1 0.936 ## 6 .p17. == .p40. 0.021 1 0.885 ## 7 .p18. == .p41. 0.045 1 0.831 ## 8 .p19. == .p42. 0.278 1 0.598 ## 9 .p20. == .p43. 0.958 1 0.328 ## 10 .p21. == .p44. 1.948 1 0.163 ``` --- class: inverse, middle, center # Examples --- ## Theory: Home-advantage in sports <br/><br/> <img src="theoreticalEx.png" width="90%" style="display: block; margin: auto;" /> --- ## Specification and results: HA in sports <img src="HA.png" width="70%" style="display: block; margin: auto;" /> --- class: inverse, middle, center # Practical aspect --- ## Theory and data Holzinger and Swineford data (1939) - [LINK](https://www.rdocumentation.org/packages/psychTools/versions/2.0.8/topics/holzinger.swineford) <br/> ```r #install.packages('sem') require(sem) data('HS.data') ``` <img src="image7.png" width="70%" style="display: block; margin: auto;" /> --- ## Checking the data ```r dim(HS.data) ``` ``` ## [1] 301 32 ``` ```r summary(HS.data[,c('visual','cubes','flags','paragrap','sentence','wordm','addition','counting','straight')]) ``` ``` ## visual cubes flags paragrap sentence ## Min. : 4.00 Min. : 9.00 Min. : 2 Min. : 0.000 Min. : 4.00 ## 1st Qu.:25.00 1st Qu.:21.00 1st Qu.:11 1st Qu.: 7.000 1st Qu.:14.00 ## Median :30.00 Median :24.00 Median :17 Median : 9.000 Median :18.00 ## Mean :29.61 Mean :24.35 Mean :18 Mean : 9.183 Mean :17.36 ## 3rd Qu.:34.00 3rd Qu.:27.00 3rd Qu.:25 3rd Qu.:11.000 3rd Qu.:21.00 ## Max. :51.00 Max. :37.00 Max. :36 Max. :19.000 Max. :28.00 ## wordm addition counting straight ## Min. 
: 1.0 Min. : 30.00 Min. : 61.0 Min. :100.0 ## 1st Qu.:10.0 1st Qu.: 80.00 1st Qu.: 97.0 1st Qu.:171.0 ## Median :14.0 Median : 94.00 Median :110.0 Median :195.0 ## Mean :15.3 Mean : 96.24 Mean :110.5 Mean :193.4 ## 3rd Qu.:19.0 3rd Qu.:113.00 3rd Qu.:122.0 3rd Qu.:219.0 ## Max. :43.0 Max. :171.00 Max. :200.0 Max. :333.0 ``` You can also plot univariate probability density functions (see Lecture 1 and 2) --- ## Multivariate normality ```r #install.packages('psych') require(psych) scatter.hist(x=HS.data$visual,y=HS.data$cubes, density = T, ellipse = T) ``` <img src="Week4_files/figure-html/unnamed-chunk-38-1.png" width="40%" style="display: block; margin: auto;" /> --- ## Model identification (LVs scale marker variable) Calculate number of parameters: <br/> <br/> 6 loadings + 9 residual variances + 3 LVs variances + 3 LVs covariances = 21 free parameters<br/><br/> Total number of parameters = 9*(9+1)/2 = 45<br/><br/> Overidentified model! --- ## Model ```r detach('package:sem') fact3<-' spatial =~ visual + cubes + flags verbal =~ paragrap + sentence + wordm speed =~ addition + counting + straight ' fact3fit<-cfa(fact3, data=HS.data) summary(fact3fit, fit.measures=TRUE ,standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 150 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 21 ## ## Number of observations 301 ## ## Model Test User Model: ## ## Test statistic 85.172 ## Degrees of freedom 24 ## P-value (Chi-square) 0.000 ## ## Model Test Baseline Model: ## ## Test statistic 918.592 ## Degrees of freedom 36 ## P-value 0.000 ## ## User Model versus Baseline Model: ## ## Comparative Fit Index (CFI) 0.931 ## Tucker-Lewis Index (TLI) 0.896 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -9578.017 ## Loglikelihood unrestricted model (H1) -9535.431 ## ## Akaike (AIC) 19198.034 ## Bayesian (BIC) 19275.883 ## Sample-size adjusted Bayesian (BIC) 19209.283 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.092 ## 90 Percent confidence interval - lower 0.071 ## 90 Percent confidence interval - upper 0.114 ## P-value RMSEA <= 0.05 0.001 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.065 ## ## Parameter Estimates: ## ## Standard errors Standard ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial =~ ## visual 1.000 5.397 0.772 ## cubes 0.369 0.066 5.553 0.000 1.992 0.424 ## flags 0.973 0.146 6.682 0.000 5.250 0.581 ## verbal =~ ## paragrap 1.000 2.969 0.852 ## sentence 1.484 0.087 17.015 0.000 4.406 0.855 ## wordm 2.161 0.129 16.703 0.000 6.416 0.838 ## speed =~ ## addition 1.000 14.235 0.570 ## counting 1.026 0.144 7.152 0.000 14.611 0.723 ## straight 1.696 0.237 7.154 0.000 24.143 0.665 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial ~~ ## verbal 7.347 1.323 5.552 0.000 0.459 0.459 ## speed 36.098 7.755 4.655 0.000 0.470 0.470 ## verbal ~~ ## speed 11.991 3.401 3.525 0.000 0.284 0.284 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .visual 19.774 4.091 4.833 0.000 19.774 0.404 ## .cubes 18.139 1.628 11.145 0.000 18.139 0.820 ## .flags 54.033 5.801 9.314 0.000 54.033 0.662 ## .paragrap 3.340 0.429 7.778 0.000 3.340 0.275 ## .sentence 7.140 0.934 7.642 0.000 7.140 0.269 ## .wordm 17.455 2.109 8.278 0.000 17.455 0.298 ## .addition 421.390 42.927 9.816 0.000 421.390 0.675 ## .counting 195.314 29.676 6.582 0.000 195.314 0.478 ## .straight 735.487 
91.898 8.003 0.000 735.487 0.558 ## spatial 29.127 5.238 5.561 0.000 1.000 1.000 ## verbal 8.816 1.009 8.737 0.000 1.000 1.000 ## speed 202.639 45.503 4.453 0.000 1.000 1.000 ``` --- ## Explained variance - R2 ```r inspect(fact3fit,'r2') ``` ``` ## visual cubes flags paragrap sentence wordm addition counting ## 0.596 0.180 0.338 0.725 0.731 0.702 0.325 0.522 ## straight ## 0.442 ``` --- ## Checking multivariate normality ```r #install.packages('MVN') require(MVN) test<-mvn(HS.data[,c('visual','cubes','flags','paragrap','sentence','wordm','addition','counting','straight')], mvnTest = 'royston') test$multivariateNormality ``` ``` ## Test H p value MVN ## 1 Royston 125.7982 6.647398e-23 NO ``` --- ## What can we do 1. Bootstrap our results <br/><br/> 2. Use robust standard errors <br/><br/> 3. Change test statistic (eg. Satorra Bentler) <br/><br/> --- ## Robust standard errors ```r fact3fitRob<-cfa(fact3, data=HS.data, se='robust.sem',test='satorra.bentler') summary(fact3fitRob,standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 150 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 21 ## ## Number of observations 301 ## ## Model Test User Model: ## Standard Robust ## Test Statistic 85.172 80.756 ## Degrees of freedom 24 24 ## P-value (Chi-square) 0.000 0.000 ## Scaling correction factor 1.055 ## Satorra-Bentler correction ## ## Parameter Estimates: ## ## Standard errors Robust.sem ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial =~ ## visual 1.000 5.397 0.772 ## cubes 0.369 0.069 5.358 0.000 1.992 0.424 ## flags 0.973 0.153 6.364 0.000 5.250 0.581 ## verbal =~ ## paragrap 1.000 2.969 0.852 ## sentence 1.484 0.089 16.763 0.000 4.406 0.855 ## wordm 2.161 0.139 15.498 0.000 6.416 0.838 ## speed =~ ## addition 1.000 14.235 0.570 ## counting 1.026 0.132 7.755 0.000 14.611 0.723 ## straight 1.696 0.208 8.170 0.000 24.143 0.665 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial ~~ ## verbal 7.347 1.480 4.965 0.000 0.459 0.459 ## speed 36.098 7.596 4.752 0.000 0.470 0.470 ## verbal ~~ ## speed 11.991 3.811 3.146 0.002 0.284 0.284 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .visual 19.774 4.982 3.969 0.000 19.774 0.404 ## .cubes 18.139 1.719 10.552 0.000 18.139 0.820 ## .flags 54.033 5.413 9.981 0.000 54.033 0.662 ## .paragrap 3.340 0.450 7.423 0.000 3.340 0.275 ## .sentence 7.140 0.929 7.689 0.000 7.140 0.269 ## .wordm 17.455 2.267 7.701 0.000 17.455 0.298 ## .addition 421.390 41.596 10.130 0.000 421.390 0.675 ## .counting 195.314 29.691 6.578 0.000 195.314 0.478 ## .straight 735.487 88.246 8.335 0.000 735.487 0.558 ## spatial 29.127 6.024 4.836 0.000 1.000 1.000 ## verbal 8.816 1.087 8.109 0.000 1.000 1.000 ## speed 202.639 43.714 4.636 0.000 1.000 1.000 ``` --- ## Modification indices ```r mi <- modindices(fact3fitRob) mi ``` ``` ## lhs op rhs mi epc sepc.lv sepc.all sepc.nox ## 25 spatial =~ paragrap 1.216 0.038 0.207 0.059 0.059 ## 26 spatial =~ sentence 7.446 -0.140 -0.756 -0.147 -0.147 ## 27 spatial =~ wordm 2.839 0.130 0.701 0.092 0.092 ## 28 spatial =~ addition 18.568 -1.611 -8.697 -0.348 -0.348 ## 29 spatial =~ counting 4.192 -0.692 -3.736 -0.185 -0.185 ## 30 spatial =~ straight 36.044 3.446 18.600 0.512 0.512 ## 31 verbal =~ visual 8.923 0.702 2.083 0.298 0.298 ## 32 verbal =~ cubes 0.018 -0.015 -0.045 -0.010 -0.010 ## 33 verbal =~ flags 9.162 -0.725 -2.153 -0.238 -0.238 ## 34 
verbal =~ addition 0.074 -0.139 -0.414 -0.017 -0.017 ## 35 verbal =~ counting 3.403 -0.811 -2.407 -0.119 -0.119 ## 36 verbal =~ straight 4.693 1.646 4.886 0.135 0.135 ## 37 speed =~ visual 0.014 0.006 0.091 0.013 0.013 ## 38 speed =~ cubes 1.572 -0.034 -0.490 -0.104 -0.104 ## 39 speed =~ flags 0.711 0.047 0.671 0.074 0.074 ## 40 speed =~ paragrap 0.002 -0.001 -0.008 -0.002 -0.002 ## 41 speed =~ sentence 0.199 -0.008 -0.109 -0.021 -0.021 ## 42 speed =~ wordm 0.260 0.013 0.187 0.024 0.024 ## 43 visual ~~ cubes 3.619 -4.421 -4.421 -0.233 -0.233 ## 44 visual ~~ flags 0.929 -6.633 -6.633 -0.203 -0.203 ## 45 visual ~~ paragrap 3.547 1.408 1.408 0.173 0.173 ## 46 visual ~~ sentence 0.521 -0.795 -0.795 -0.067 -0.067 ## 47 visual ~~ wordm 0.049 0.370 0.370 0.020 0.020 ## 48 visual ~~ addition 5.461 -17.860 -17.860 -0.196 -0.196 ## 49 visual ~~ counting 0.602 -4.804 -4.804 -0.077 -0.077 ## 50 visual ~~ straight 7.252 29.650 29.650 0.246 0.246 ## 51 cubes ~~ flags 8.529 6.986 6.986 0.223 0.223 ## 52 cubes ~~ paragrap 0.535 -0.406 -0.406 -0.052 -0.052 ## 53 cubes ~~ sentence 0.022 -0.122 -0.122 -0.011 -0.011 ## 54 cubes ~~ wordm 0.786 1.099 1.099 0.062 0.062 ## 55 cubes ~~ addition 8.918 -16.784 -16.784 -0.192 -0.192 ## 56 cubes ~~ counting 0.053 -0.993 -0.993 -0.017 -0.017 ## 57 cubes ~~ straight 1.907 10.864 10.864 0.094 0.094 ## 58 flags ~~ paragrap 0.143 -0.381 -0.381 -0.028 -0.028 ## 59 flags ~~ sentence 7.860 -4.163 -4.163 -0.212 -0.212 ## 60 flags ~~ wordm 1.856 3.068 3.068 0.100 0.100 ## 61 flags ~~ addition 0.641 -8.191 -8.191 -0.054 -0.054 ## 62 flags ~~ counting 0.054 -1.857 -1.857 -0.018 -0.018 ## 63 flags ~~ straight 4.097 29.208 29.208 0.147 0.147 ## 64 paragrap ~~ sentence 2.519 2.222 2.222 0.455 0.455 ## 65 paragrap ~~ wordm 6.207 -4.923 -4.923 -0.645 -0.645 ## 66 paragrap ~~ addition 6.015 6.818 6.818 0.182 0.182 ## 67 paragrap ~~ counting 3.856 -4.173 -4.173 -0.163 -0.163 ## 68 paragrap ~~ straight 0.187 -1.681 -1.681 -0.034 -0.034 ## 69 sentence ~~ wordm 0.920 2.829 2.829 0.253 0.253 ## 70 sentence ~~ addition 1.188 -4.459 -4.459 -0.081 -0.081 ## 71 sentence ~~ counting 0.338 1.818 1.818 0.049 0.049 ## 72 sentence ~~ straight 0.982 5.664 5.664 0.078 0.078 ## 73 wordm ~~ addition 0.270 -3.226 -3.226 -0.038 -0.038 ## 74 wordm ~~ counting 0.283 2.526 2.526 0.043 0.043 ## 75 wordm ~~ straight 0.105 -2.810 -2.810 -0.025 -0.025 ## 76 addition ~~ counting 33.716 245.021 245.021 0.854 0.854 ## 77 addition ~~ straight 5.108 -153.592 -153.592 -0.276 -0.276 ## 78 counting ~~ straight 14.764 -302.949 -302.949 -0.799 -0.799 ``` --- ## Change the model ```r fact3A<-' spatial =~ visual + cubes + flags + straight + addition verbal =~ paragrap + sentence + wordm speed =~ addition + counting + straight ' fact3AfitRob<-cfa(fact3A, data=HS.data,se='robust.sem',test='satorra.bentler') summary(fact3AfitRob, fit.measures=TRUE ,standardized=TRUE) ``` ``` ## lavaan 0.6-9 ended normally after 200 iterations ## ## Estimator ML ## Optimization method NLMINB ## Number of model parameters 23 ## ## Number of observations 301 ## ## Model Test User Model: ## Standard Robust ## Test Statistic 46.251 44.242 ## Degrees of freedom 22 22 ## P-value (Chi-square) 0.002 0.003 ## Scaling correction factor 1.045 ## Satorra-Bentler correction ## ## Model Test Baseline Model: ## ## Test statistic 918.592 789.304 ## Degrees of freedom 36 36 ## P-value 0.000 0.000 ## Scaling correction factor 1.164 ## ## User Model versus Baseline Model: ## ## Comparative Fit Index (CFI) 0.973 0.970 ## Tucker-Lewis Index (TLI) 0.955 0.952 ## ## 
Robust Comparative Fit Index (CFI) 0.973 ## Robust Tucker-Lewis Index (TLI) 0.957 ## ## Loglikelihood and Information Criteria: ## ## Loglikelihood user model (H0) -9558.556 -9558.556 ## Loglikelihood unrestricted model (H1) -9535.431 -9535.431 ## ## Akaike (AIC) 19163.112 19163.112 ## Bayesian (BIC) 19248.376 19248.376 ## Sample-size adjusted Bayesian (BIC) 19175.433 19175.433 ## ## Root Mean Square Error of Approximation: ## ## RMSEA 0.061 0.058 ## 90 Percent confidence interval - lower 0.036 0.033 ## 90 Percent confidence interval - upper 0.085 0.082 ## P-value RMSEA <= 0.05 0.220 0.271 ## ## Robust RMSEA 0.059 ## 90 Percent confidence interval - lower 0.033 ## 90 Percent confidence interval - upper 0.084 ## ## Standardized Root Mean Square Residual: ## ## SRMR 0.042 0.042 ## ## Parameter Estimates: ## ## Standard errors Robust.sem ## Information Expected ## Information saturated (h1) model Structured ## ## Latent Variables: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial =~ ## visual 1.000 5.283 0.755 ## cubes 0.397 0.068 5.864 0.000 2.100 0.447 ## flags 1.014 0.140 7.234 0.000 5.355 0.593 ## straight 2.255 0.573 3.936 0.000 11.911 0.328 ## addition -1.049 0.532 -1.971 0.049 -5.541 -0.222 ## verbal =~ ## paragrap 1.000 2.965 0.850 ## sentence 1.489 0.089 16.725 0.000 4.415 0.857 ## wordm 2.163 0.140 15.491 0.000 6.413 0.838 ## speed =~ ## addition 1.000 18.934 0.758 ## counting 0.775 0.141 5.486 0.000 14.678 0.726 ## straight 0.921 0.144 6.404 0.000 17.434 0.480 ## ## Covariances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## spatial ~~ ## verbal 6.864 1.441 4.764 0.000 0.438 0.438 ## speed 39.193 14.124 2.775 0.006 0.392 0.392 ## verbal ~~ ## speed 15.065 5.610 2.686 0.007 0.268 0.268 ## ## Variances: ## Estimate Std.Err z-value P(>|z|) Std.lv Std.all ## .visual 20.990 4.397 4.774 0.000 20.990 0.429 ## .cubes 17.700 1.664 10.635 0.000 17.700 0.801 ## .flags 52.914 4.821 10.976 0.000 52.914 0.649 ## .straight 709.836 81.046 8.758 0.000 709.836 0.538 ## .addition 317.043 59.743 5.307 0.000 317.043 0.508 ## .paragrap 3.366 0.452 7.448 0.000 3.366 0.277 ## .sentence 7.064 0.922 7.663 0.000 7.064 0.266 ## .wordm 17.493 2.271 7.702 0.000 17.493 0.298 ## .counting 193.360 35.564 5.437 0.000 193.360 0.473 ## spatial 27.911 5.407 5.162 0.000 1.000 1.000 ## verbal 8.790 1.087 8.085 0.000 1.000 1.000 ## speed 358.496 94.408 3.797 0.000 1.000 1.000 ``` --- ## Compare the models ```r diff<-compareFit(fact3fitRob, fact3AfitRob) summary(diff) ``` ``` ## ################### Nested Model Comparison ######################### ## Scaled Chi-Squared Difference Test (method = "satorra.bentler.2001") ## ## lavaan NOTE: ## The "Chisq" column contains standard test statistics, not the ## robust test that should be reported per model. A robust difference ## test is a function of two standard (not robust) statistics. ## ## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq) ## fact3AfitRob 22 19163 19248 46.251 ## fact3fitRob 24 19198 19276 85.172 33.645 2 4.943e-08 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 ## ## ####################### Model Fit Indices ########################### ## chisq.scaled df.scaled pvalue.scaled rmsea.robust cfi.robust ## fact3AfitRob 44.242† 22 .003 .059† .973† ## fact3fitRob 80.756 24 .000 .091 .932 ## tli.robust srmr aic bic ## fact3AfitRob .957† .042† 19163.112† 19248.376† ## fact3fitRob .898 .065 19198.034 19275.883 ## ## ################## Differences in Fit Indices ####################### ## df.scaled rmsea.robust cfi.robust tli.robust srmr ## fact3fitRob - fact3AfitRob 2 0.032 -0.042 -0.059 0.023 ## aic bic ## fact3fitRob - fact3AfitRob 34.922 27.508 ``` --- ## Model development <img src="Loop.png" width="50%" style="display: block; margin: auto;" /> --- ## Important aspects: theory - Understanding the differences between Exploratory FA and Confirmatory FA <br/> - How the linear model is defined in CFA<br/> - Scaling of the latent variables <br/> - Interpretation of the coefficients <br/> - Number of free parameters versus total number of parameters <br/> --- ## Important aspects: practice - Specifying and estimating a CFA model <br/> - Scaling the LVs by using a marker variable or by fixing the LV variance to 1<br/> - Adding intercepts to your CFA model <br/> - Making a full SEM model <br/> --- ## Literature Confirmatory Factor Analysis for Applied Research by Timothy A. Brown <br/> <br/> Chapter 9 of Principles and Practice of Structural Equation Modeling by Rex B. Kline <br/><br/> Latent Variable Modeling Using R: A Step-by-Step Guide by A. Alexander Beaujean <br/><br/> --- ## Thank you for your attention