Lecture 4: Confirmatory Factor Analysis

class: center, inverse
background-image: url("blindman2.jpg")
---


## Press record

---

## Corrections from previous lecture

---

## R code

[LINK](https://nvaci.github.io/Lecture_4_code/Lecture_4Rcode.html)

---

## In the previous episode

We talked about: 
- Path modelling: observed variables 
- Theoretical pathways of influence 
- Causal interpretations

---

## Confirmatory factor analysis 
- What is CFA 
- Major benefits 
- How to implement it in the lavaan package in R 
- Full structural equation model: CFA + path 
- Measurement invariance and modification indices

---

## Exploratory factor analysis (EFA)

A multivariate statistical procedure (Spearman) for understanding and accounting for the variation and covariation among a set of observed variables by postulating __latent__ structures (factors)

Factor: an unobservable variable that influences more than one observed measure and accounts for their intercorrelations

If we partial out the latent construct, the intercorrelations among its indicators should be zero

Factor analysis decomposes variance into __a) common variance (communality)__ and __b) unique variance + random error variance (uniqueness)__
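
As a small numeric illustration of this decomposition, the sketch below takes the standardized loadings that the motor factor receives later in this lecture (0.623, 0.591, 0.692) and splits each indicator's standardized variance into a common and a unique part. It assumes that each indicator loads on a single factor, so that communality is simply the squared standardized loading; the object names are made up for the example.

```r
# Standardized loadings of the three motor indicators
# (values copied from the lavaan output shown later in this lecture)
std_loadings <- c(TimeOnTummy      = 0.623,
                  PreciseLegMoves  = 0.591,
                  PreciseHandMoves = 0.692)

# With one factor per indicator, communality = squared standardized loading
communality <- std_loadings^2

# Uniqueness = what is left of the standardized variance (which is 1)
uniqueness <- 1 - communality

round(cbind(communality, uniqueness), 3)
```

The uniqueness column corresponds to the `Std.all` residual variances reported for these indicators (0.612, 0.651, 0.521).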

???
Thorough example of EFA in R: https://psu-psychology.github.io/psy-597-SEM/06_factor_models/factor_models.html#overview
---

## EFA versus CFA

Both aim to reproduce the observed relationships between measured variables with a smaller number of latent factors

EFA is a data-driven approach: weak or no assumptions about the number of latent dimensions and the factor loadings (relations between indicators and factors)

CFA is a theory-driven approach: strong assumptions about both

EFA is used earlier in the process of questionnaire development and construct validation

---

## Factor model

---

## Factor or measurement model

A linear regression in which the main predictor is latent (unobserved):

`$y_1=\tau_1+\lambda_1*\eta+\epsilon_1$`  
`$y_2=\tau_2+\lambda_2*\eta+\epsilon_2$`  
`$y_3=\tau_3+\lambda_3*\eta+\epsilon_3$`

`$\tau$` - item intercepts (means)  
`$\lambda$` - factor loadings (regression coefficients)  
`$\epsilon$` - errors (residuals), with their variances and covariances  
`$\eta$` - the latent predictor of the items  
`$\psi$` - factor variances and covariances  
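
These equations imply a specific covariance structure for the indicators: the model-implied covariance matrix is `ΛΨΛ' + Θ`, where `Θ` (a symbol not used on this slide) holds the error variances. The sketch below builds that matrix for one factor with three indicators. All numeric values are made up for illustration.

```r
# Hypothetical parameter values for one factor with three indicators
lambda <- matrix(c(1.0, 0.9, 1.1), ncol = 1)  # factor loadings
psi    <- matrix(9.5)                          # factor variance
theta  <- diag(c(15, 14.7, 12.5))              # error variances (uncorrelated errors)

# Model-implied covariance matrix of the indicators
sigma <- lambda %*% psi %*% t(lambda) + theta
sigma
```

The off-diagonal elements depend only on the loadings and the factor variance, which is how the factor accounts for the covariation among indicators; the error variances only enter the diagonal.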
---

## Exploratory factor model

---

## Confirmatory factor model

---

## Assumptions

1. Errors have a mean of zero: `$E(\epsilon_i)=0$`  
2. Latent factors have a mean of zero: `$E(\eta)=0$`  
3. Errors are uncorrelated with each other: `$cov(\epsilon_i,\epsilon_{-i})=0$`  
4. Latent factors are uncorrelated with each other: `$cov(\eta_i,\eta_{-i})=0$`  
5. Latent factors are uncorrelated with errors: `$cov(\epsilon_i,\eta_i)=0$`
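
A quick simulation can make these assumptions concrete. The sketch below generates two indicators from a single factor with zero-mean, mutually uncorrelated errors, and then checks that the indicators are correlated only through the factor. The loadings, error SDs, and sample size are arbitrary choices for illustration, and the factor is treated as observed here, which is only possible in a simulation.

```r
set.seed(1)
n <- 10000

eta <- rnorm(n)                        # latent factor, mean 0
y1  <- 0.8 * eta + rnorm(n, sd = 0.6)  # errors: mean 0, independent of eta
y2  <- 0.7 * eta + rnorm(n, sd = 0.7)  # errors: independent of y1's errors

# The indicators correlate because they share eta
cor(y1, y2)

# Partial out eta: correlate what is left of each indicator given eta
r1 <- resid(lm(y1 ~ eta))
r2 <- resid(lm(y2 ~ eta))
cor(r1, r2)                            # close to zero
```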

---

## Parameters

<img src="image3.png" width="90%" style="display: block; margin: auto;" />
---

## Let's simulate some data: Babies

```r
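# install.packages('faux')
library(faux)
set.seed(456)

# Target correlations: two blocks of three indicators (motor, verbal),
# with weak (.1) correlations between the blocks.
# NOTE: the original had .3 in row 2, column 3 but .2 in the mirrored cell.
# A correlation matrix must be symmetric, so .2 is used in both cells
# (an assumed value; simulated numbers may differ from the output shown).
cmat <- c( 1, .4, .4, .1, .1, .1,
          .4,  1, .2, .1, .1, .1,
          .4, .2,  1, .1, .1, .1,
          .1, .1, .1,  1, .4, .4,
          .1, .1, .1, .4,  1, .2,
          .1, .1, .1, .4, .2,  1)

# 100 observations on 6 variables, each with mean 30 and SD 5
vars <- rnorm_multi(n = 100, vars = 6, mu = 30, sd = 5, r = cmat)

names(vars) <- c('TimeOnTummy', 'PreciseLegMoves', 'PreciseHandMoves',
                 'Babbling', 'Screeching', 'VocalImitation')

# Babies is assumed to exist already (Age, Weight, Height, Gender are used later)
Babies <- cbind(Babies, vars)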
```

---

## Our data

```r
options(digits=3)
head(Babies[,8:13])
```

```
##   TimeOnTummy PreciseLegMoves PreciseHandMoves Babbling Screeching
## 1        27.6            21.2             27.9     26.0       26.1
## 2        28.6            31.8             29.5     35.4       29.7
## 3        32.7            30.6             34.7     33.7       31.0
## 4        22.8            33.4             20.8     22.5       29.9
## 5        31.9            30.8             32.5     26.3       22.4
## 6        30.3            25.1             23.7     35.7       33.1
##   VocalImitation
## 1           27.9
## 2           36.4
## 3           30.9
## 4           27.4
## 5           21.9
## 6           26.4
```

---

## Our theory and knowledge (specification)

In our previous work, we postulated two __congeneric__ latent factors that explain the variances and covariances of our six indicators: a motor and a verbal latent component

---

## Latent Variable's Scale

As LVs are not directly observed, we need to set their scale:

1. Standardized latent variables: fix the variance of the LV to 1 (Z-score)  
2. Marker variable: constrain a single factor loading to 1  
3. Effects coding: constrain the loadings of one LV to average 1.0, i.e. their sum equals the number of indicators
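
The three scaling options can be sketched in lavaan as follows. This is a minimal example under stated assumptions: the data frame `toy` is simulated from a hypothetical one-factor population model (`pop_model`) so that the block runs on its own, and the labels `l1`, `l2`, `l3` are arbitrary names introduced for the constraint. `std.lv = TRUE` is lavaan's option for fixing LV variances to 1.

```r
library(lavaan)

# Stand-in data simulated from a hypothetical one-factor population model
pop_model <- 'motor =~ 1*TimeOnTummy + 0.9*PreciseLegMoves + 1.1*PreciseHandMoves'
toy <- simulateData(pop_model, sample.nobs = 300, seed = 123)

cfa_model <- 'motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves'

# 1. Standardized LV: factor variance fixed to 1, all loadings free
fit_stdlv <- cfa(cfa_model, data = toy, std.lv = TRUE)

# 2. Marker variable (lavaan default): first loading fixed to 1
fit_marker <- cfa(cfa_model, data = toy)

# 3. Effects coding: all loadings free, but constrained to sum to 3
model_effects <- '
motor =~ NA*TimeOnTummy + l1*TimeOnTummy + l2*PreciseLegMoves + l3*PreciseHandMoves
l1 + l2 + l3 == 3
'
fit_effects <- cfa(model_effects, data = toy)

# Loadings under the three scalings, side by side
sapply(list(std.lv = fit_stdlv, marker = fit_marker, effects = fit_effects),
       function(f) lavInspect(f, "est")$lambda[, "motor"])
```

The three fits describe the same model, so only the metric of the loadings and the factor variance changes between columns.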

---

## Syntax in R

.center[
<img src="Rsyntax.png" width="60%"> 
]
---

## Reflective versus formative LVs

Reflective: indicators of the construct are caused by the construct (e.g. intelligence)

- Covariances between indicators are zero once we partial out the LVs

Formative: indicators cause the latent variable (e.g. value of a car)

- No assumptions about the covariances

---

## Coding of our model

```r
#install.packages('lavaan')
require(lavaan)
model1<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
'

fit1<-cfa(model1, data=Babies)
```

---

<style>
pre[class] {
 max-height: 100px;
}
</style>

## Results of the model

```r
summary(fit1)
```

---

## Results 2

---

## Results 3

---

## Interpretation of the coefficients: factor loadings

- When unstandardized and loaded on a single factor, loadings are unstandardized regression coefficients: the model-predicted difference in the indicator between cases that differ by 1 unit on the LV

- When an indicator loads on multiple factors, the regression coefficients become contingent on the other factors (check Lecture 1, slide 11)

- When standardized and loaded on a single factor (congeneric structure), standardized loadings are the estimated correlations between indicators and LVs

- When standardized and loaded on multiple factors, the same as the second option, only standardized (beta weights)

---

## Results of the model

```r
summary(fit1, standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 74 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         13
##                                                       
##   Number of observations                           100
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 3.376
##   Degrees of freedom                                 8
##   P-value (Chi-square)                           0.909
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor =~                                                              
##     TimeOnTummy       1.000                               3.086    0.623
##     PreciseLegMovs    0.910    0.240    3.791    0.000    2.807    0.591
##     PreciseHandMvs    1.099    0.293    3.746    0.000    3.392    0.692
##   verbal =~                                                             
##     Babbling          1.000                               3.954    0.800
##     Screeching        0.494    0.182    2.718    0.007    1.952    0.426
##     VocalImitation    0.716    0.246    2.906    0.004    2.832    0.569
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor ~~                                                              
##     verbal            3.433    1.901    1.806    0.071    0.281    0.281
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .TimeOnTummy      15.031    3.181    4.725    0.000   15.031    0.612
##    .PreciseLegMovs   14.709    2.867    5.130    0.000   14.709    0.651
##    .PreciseHandMvs   12.515    3.338    3.749    0.000   12.515    0.521
##    .Babbling          8.805    5.149    1.710    0.087    8.805    0.360
##    .Screeching       17.139    2.748    6.236    0.000   17.139    0.818
##    .VocalImitation   16.742    3.514    4.764    0.000   16.742    0.676
##     motor             9.523    3.625    2.627    0.009    1.000    1.000
##     verbal           15.635    5.947    2.629    0.009    1.000    1.000
```
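
To see how the `Std.lv` and `Std.all` columns relate to the unstandardized estimates, the sketch below recomputes them by hand for `PreciseLegMoves`, using the rounded values printed in the output above. The variable names are made up for the example, and the results agree with the printed columns only up to rounding of the inputs.

```r
# Unstandardized estimates for PreciseLegMoves, copied from the output above
lambda <- 0.910    # loading on motor
psi    <- 9.523    # variance of motor
theta  <- 14.709   # residual variance of PreciseLegMoves

# Std.lv: loading rescaled as if the latent variable had variance 1
std_lv <- lambda * sqrt(psi)

# Model-implied variance of the indicator
var_y <- lambda^2 * psi + theta

# Std.all: both the latent variable and the indicator standardized
std_all <- std_lv / sqrt(var_y)

round(c(Std.lv = std_lv, Std.all = std_all), 3)
```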

---

## Scaling LVs: variance = 1

```r
model2<-'
motor =~ NA*TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ NA*Babbling + Screeching + VocalImitation
motor ~~ 1*motor
verbal ~~ 1*verbal
'

fit2<-cfa(model2, data=Babies)
summary(fit2, standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 33 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         13
##                                                       
##   Number of observations                           100
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                 3.376
##   Degrees of freedom                                 8
##   P-value (Chi-square)                           0.909
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor =~                                                              
##     TimeOnTummy       3.086    0.587    5.254    0.000    3.086    0.623
##     PreciseLegMovs    2.807    0.557    5.042    0.000    2.807    0.591
##     PreciseHandMvs    3.392    0.597    5.680    0.000    3.392    0.692
##   verbal =~                                                             
##     Babbling          3.954    0.752    5.259    0.000    3.954    0.800
##     Screeching        1.952    0.548    3.560    0.000    1.952    0.426
##     VocalImitation    2.832    0.646    4.381    0.000    2.832    0.569
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor ~~                                                              
##     verbal            0.281    0.139    2.028    0.043    0.281    0.281
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     motor             1.000                               1.000    1.000
##     verbal            1.000                               1.000    1.000
##    .TimeOnTummy      15.031    3.181    4.725    0.000   15.031    0.612
##    .PreciseLegMovs   14.709    2.867    5.130    0.000   14.709    0.651
##    .PreciseHandMvs   12.515    3.338    3.749    0.000   12.515    0.521
##    .Babbling          8.805    5.149    1.710    0.087    8.805    0.360
##    .Screeching       17.139    2.748    6.236    0.000   17.139    0.818
##    .VocalImitation   16.742    3.514    4.764    0.000   16.742    0.676
```

---

## Adding intercepts

```r
model3<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
TimeOnTummy ~ 1
PreciseLegMoves ~ 1
PreciseHandMoves ~ 1
Babbling ~ 1
Screeching ~ 1 
VocalImitation ~ 1'
fit3<-cfa(model3, data=Babies)
summary(fit3, standardized=TRUE, fit.measures=TRUE)
```

```
## lavaan 0.6-7 ended normally after 74 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         19
## 
##   Number of observations                           100
## 
## Model Test User Model:
## 
##   Test statistic                                 3.376
##   Degrees of freedom                                 8
##   P-value (Chi-square)                           0.909
## 
## Model Test Baseline Model:
## 
##   Test statistic                                88.357
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.118
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -1756.127
##   Loglikelihood unrestricted model (H1)      -1754.439
## 
##   Akaike (AIC)                                3550.253
##   Bayesian (BIC)                              3599.751
##   Sample-size adjusted Bayesian (BIC)         3539.744
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.047
##   P-value RMSEA <= 0.05                          0.954
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.034
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor =~
##     TimeOnTummy       1.000                               3.086    0.623
##     PreciseLegMovs    0.910    0.240    3.791    0.000    2.807    0.591
##     PreciseHandMvs    1.099    0.293    3.746    0.000    3.392    0.692
##   verbal =~
##     Babbling          1.000                               3.954    0.800
##     Screeching        0.494    0.182    2.718    0.007    1.952    0.426
##     VocalImitation    0.716    0.246    2.906    0.004    2.832    0.569
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   motor ~~
##     verbal            3.433    1.901    1.806    0.071    0.281    0.281
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .TimeOnTummy      31.423    0.496   63.415    0.000   31.423    6.341
##    .PreciseLegMovs   30.600    0.475   64.380    0.000   30.600    6.438
##    .PreciseHandMvs   29.799    0.490   60.798    0.000   29.799    6.080
##    .Babbling         29.946    0.494   60.573    0.000   29.946    6.057
##    .Screeching       29.785    0.458   65.078    0.000   29.785    6.508
##    .VocalImitation   30.409    0.498   61.110    0.000   30.409    6.111
##     motor             0.000                               0.000    0.000
##     verbal            0.000                               0.000    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .TimeOnTummy      15.031    3.181    4.725    0.000   15.031    0.612
##    .PreciseLegMovs   14.709    2.867    5.130    0.000   14.709    0.651
##    .PreciseHandMvs   12.515    3.338    3.749    0.000   12.515    0.521
##    .Babbling          8.805    5.149    1.710    0.087    8.805    0.360
##    .Screeching       17.139    2.748    6.236    0.000   17.139    0.818
##    .VocalImitation   16.742    3.514    4.764    0.000   16.742    0.676
##     motor             9.523    3.625    2.627    0.009    1.000    1.000
##     verbal           15.635    5.947    2.629    0.009    1.000    1.000
```

---

## Identification (Lecture 3)

Total number of known values (unique variances and covariances): `$\frac{vars*(vars+1)}{2}=\frac{6*7}{2}=21$`

Variance-covariance matrix:

```r
cov(Babies[,8:13])
```

```
##                  TimeOnTummy PreciseLegMoves PreciseHandMoves Babbling
## TimeOnTummy           24.802            8.82            10.85     1.35
## PreciseLegMoves        8.822           22.82             9.27     4.04
## PreciseHandMoves      10.853            9.27            24.27     3.52
## Babbling               1.353            4.04             3.52    24.69
## Screeching             0.589            3.12             1.72     8.08
## VocalImitation         2.715            4.46             4.33    11.31
##                  Screeching VocalImitation
## TimeOnTummy           0.589           2.71
## PreciseLegMoves       3.116           4.46
## PreciseHandMoves      1.720           4.33
## Babbling              8.081          11.31
## Screeching           21.159           4.81
## VocalImitation        4.809          25.01
```

---
## Estimated number of parameters

Loadings `$(\lambda)$`: 4 parameters 
Residual variances `$(\epsilon)$` : 6 parameters 
Factor variances and covariances `$(\psi)$` : 3 parameters 
With intercepts: + 6 
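
The counting above can be turned into a short degrees-of-freedom check. This is plain arithmetic in base R, assuming the default marker-variable scaling (one loading per factor fixed to 1); the variable names are made up for the example.

```r
n_indicators <- 6

# Known values: unique variances and covariances of the indicators
knowns <- n_indicators * (n_indicators + 1) / 2

# Free parameters with marker-variable scaling
free_loadings   <- 4   # 6 loadings minus the 2 fixed to 1
free_residuals  <- 6   # one residual variance per indicator
free_factor_cov <- 3   # 2 factor variances + 1 factor covariance
free_params <- free_loadings + free_residuals + free_factor_cov

df <- knowns - free_params
c(knowns = knowns, free = free_params, df = df)

# With intercepts: the 6 observed means join the knowns and
# 6 intercepts join the free parameters, so df does not change
df_with_means <- (knowns + n_indicators) - (free_params + 6)
df_with_means
```

Both counts give 8 degrees of freedom, which matches the `Degrees of freedom` line in the lavaan summaries with 13 and with 19 free parameters.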
---

## Indices of global model fit

```r
summary(fit1, fit.measures=TRUE)
```

```
## lavaan 0.6-7 ended normally after 74 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         13
## 
##   Number of observations                           100
## 
## Model Test User Model:
## 
##   Test statistic                                 3.376
##   Degrees of freedom                                 8
##   P-value (Chi-square)                           0.909
## 
## Model Test Baseline Model:
## 
##   Test statistic                                88.357
##   Degrees of freedom                                15
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.118
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -1756.127
##   Loglikelihood unrestricted model (H1)      -1754.439
## 
##   Akaike (AIC)                                3538.253
##   Bayesian (BIC)                              3572.120
##   Sample-size adjusted Bayesian (BIC)         3531.063
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.047
##   P-value RMSEA <= 0.05                          0.954
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.038
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~
##     TimeOnTummy       1.000
##     PreciseLegMovs    0.910    0.240    3.791    0.000
##     PreciseHandMvs    1.099    0.293    3.746    0.000
##   verbal =~
##     Babbling          1.000
##     Screeching        0.494    0.182    2.718    0.007
##     VocalImitation    0.716    0.246    2.906    0.004
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~
##     verbal            3.433    1.901    1.806    0.071
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      15.031    3.181    4.725    0.000
##    .PreciseLegMovs   14.709    2.867    5.130    0.000
##    .PreciseHandMvs   12.515    3.338    3.749    0.000
##    .Babbling          8.805    5.149    1.710    0.087
##    .Screeching       17.139    2.748    6.236    0.000
##    .VocalImitation   16.742    3.514    4.764    0.000
##     motor             9.523    3.625    2.627    0.009
##     verbal           15.635    5.947    2.629    0.009
```
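
Several of these indices can be recomputed from the chi-square statistics and the log-likelihood printed above. The sketch below uses the common textbook formulas with base R only; the variable names are made up for the example, and the RMSEA is written with N in the denominator (some texts use N - 1, which makes no difference here because the chi-square is smaller than its degrees of freedom).

```r
# Values copied from the summary above
chisq_m <- 3.376;  df_m <- 8     # user model
chisq_b <- 88.357; df_b <- 15    # baseline model
loglik  <- -1756.127             # user model log-likelihood
k <- 13                          # free parameters
n <- 100                         # observations

# CFI: 1 minus the ratio of user-model to baseline-model misfit
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# TLI: compares chi-square/df ratios; it is not bounded by 1
tli <- ((chisq_b / df_b) - (chisq_m / df_m)) / ((chisq_b / df_b) - 1)

# RMSEA: misfit per degree of freedom, 0 when chi-square < df
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

# Information criteria
aic <- -2 * loglik + 2 * k
bic <- -2 * loglik + k * log(n)

round(c(CFI = cfi, TLI = tli, RMSEA = rmsea, AIC = aic, BIC = bic), 3)
```

The TLI above 1 is not an error: it occurs whenever the model chi-square is smaller than its degrees of freedom.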

---
class: inverse, middle, center
# Structural equation model
---

## Structural equation model 1

```r
model4<-'
#CFA model
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation

#Path model
motor ~ Age + Weight
verbal ~ Age + Weight
'

fit4<-sem(model4, data=Babies)
```
---
## Structural equation model 1: Results

```r
summary(fit4, standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 81 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         17
##                                                       
##   Number of observations                           100
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                13.018
##   Degrees of freedom                                16
##   P-value (Chi-square)                           0.671
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##   motor =~                                                                  
##     TimeOnTummy         1.000                                 3.064    0.618
##     PreciseLegMovs      0.919    0.242    3.803    0.000      2.816    0.592
##     PreciseHandMvs      1.111    0.295    3.765    0.000      3.403    0.694
##   verbal =~                                                                 
##     Babbling            1.000                                 3.498    0.708
##     Screeching          0.583    0.189    3.089    0.002      2.040    0.446
##     VocalImitation      0.899    0.263    3.422    0.001      3.144    0.632
## 
## Regressions:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##   motor ~                                                                   
##     Age                -0.016    0.045   -0.355    0.723     -0.005   -0.043
##     Weight              0.000    0.001    0.085    0.932      0.000    0.010
##   verbal ~                                                                  
##     Age                -0.041    0.051   -0.803    0.422     -0.012   -0.097
##     Weight              0.002    0.001    2.108    0.035      0.001    0.263
## 
## Covariances:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##  .motor ~~                                                                  
##    .verbal              3.292    1.738    1.894    0.058      0.320    0.320
## 
## Variances:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##    .TimeOnTummy        15.164    3.159    4.800    0.000     15.164    0.618
##    .PreciseLegMovs     14.662    2.862    5.122    0.000     14.662    0.649
##    .PreciseHandMvs     12.440    3.327    3.740    0.000     12.440    0.518
##    .Babbling           12.204    3.740    3.263    0.001     12.204    0.499
##    .Screeching         16.784    2.717    6.177    0.000     16.784    0.801
##    .VocalImitation     14.880    3.440    4.326    0.000     14.880    0.601
##    .motor               9.372    3.577    2.620    0.009      0.998    0.998
##    .verbal             11.299    4.205    2.687    0.007      0.923    0.923
```
---

## Structural equation model 2

```r
model5<-'
#CFA model
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation

#Path model
Height ~ Age
motor ~ Age + Weight + Height
verbal ~ Age + Weight + Height
'

fit5<-sem(model5, data=Babies)
```

---

## Structural equation model 2: Results

```r
summary(fit5, standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 83 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         21
##                                                       
##   Number of observations                           100
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                36.805
##   Degrees of freedom                                21
##   P-value (Chi-square)                           0.018
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##   motor =~                                                                  
##     TimeOnTummy         1.000                                 3.066    0.618
##     PreciseLegMovs      0.954    0.248    3.850    0.000      2.924    0.614
##     PreciseHandMvs      1.084    0.283    3.838    0.000      3.324    0.677
##   verbal =~                                                                 
##     Babbling            1.000                                 3.444    0.687
##     Screeching          0.606    0.182    3.326    0.001      2.089    0.454
##     VocalImitation      0.979    0.251    3.894    0.000      3.371    0.669
## 
## Regressions:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##   Height ~                                                                  
##     Age                 0.143    0.064    2.251    0.024      0.143    0.220
##   motor ~                                                                   
##     Age                -0.005    0.046   -0.109    0.914     -0.002   -0.013
##     Weight              0.000    0.001    0.543    0.587      0.000    0.066
##     Height             -0.082    0.072   -1.144    0.253     -0.027   -0.144
##   verbal ~                                                                  
##     Age                -0.021    0.050   -0.426    0.670     -0.006   -0.051
##     Weight              0.003    0.001    2.817    0.005      0.001    0.349
##     Height             -0.140    0.078   -1.788    0.074     -0.041   -0.218
## 
## Covariances:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##  .motor ~~                                                                  
##    .verbal              2.976    1.621    1.836    0.066      0.314    0.314
## 
## Variances:
##                    Estimate    Std.Err  z-value  P(>|z|)   Std.lv    Std.all
##    .TimeOnTummy        15.221    3.132    4.860    0.000     15.221    0.618
##    .PreciseLegMovs     14.103    2.876    4.904    0.000     14.103    0.623
##    .PreciseHandMvs     13.052    3.217    4.057    0.000     13.052    0.542
##    .Babbling           13.265    3.353    3.957    0.000     13.265    0.528
##    .Screeching         16.837    2.702    6.231    0.000     16.837    0.794
##    .VocalImitation     14.055    3.335    4.215    0.000     14.055    0.553
##    .Height             27.352    3.868    7.071    0.000     27.352    0.952
##    .motor               9.158    3.477    2.634    0.008      0.974    0.974
##    .verbal              9.796    3.520    2.783    0.005      0.826    0.826
```

---
class: inverse, middle, center
# Measurement invariance 
---

## Measurement invariance

Compare our model between the groups:  
 - Configural invariance: the same model structure is fitted in each group, with all parameters free to differ 
 - Metric invariance: factor loadings are constrained to be equal across groups, but intercepts are allowed to vary 
 - Scalar invariance: both factor loadings and intercepts are constrained to be equal 
 - Strict invariance: factor loadings, intercepts and residual variances are constrained to be equal
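
Each step adds constraints to the previous one, so adjacent models are nested and can be compared with a chi-square difference test. The sketch below does this by hand for the configural and metric fits, using the test statistics reported for them in this lecture (15.880 on 16 df and 16.556 on 20 df); the variable names are made up for the example.

```r
# Test statistics copied from the configural and metric fits in this lecture
chisq_configural <- 15.880; df_configural <- 16
chisq_metric     <- 16.556; df_metric     <- 20

# Chi-square difference test for the added equality constraints
delta_chisq <- chisq_metric - chisq_configural
delta_df    <- df_metric - df_configural
p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)

round(c(delta_chisq = delta_chisq, delta_df = delta_df, p = p_value), 3)
```

A non-significant difference suggests that constraining the loadings does not worsen the fit, so metric invariance is tenable. In lavaan, `lavTestLRT(fitMIC, fitMIM)` runs this comparison directly on the fitted objects.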

---

## Configural invariance

```r
modelMI<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
'

fitMIC<-cfa(modelMI, data=Babies, group='Gender')
summary(fitMIC)
```

```
## lavaan 0.6-7 ended normally after 141 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         38
##                                                       
##   Number of observations per group:                   
##     Girls                                           48
##     Boys                                            52
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                15.880
##   Degrees of freedom                                16
##   P-value (Chi-square)                           0.461
##   Test statistic for each group:
##     Girls                                        6.246
##     Boys                                         9.634
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [Girls]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TimeOnTummy       1.000                           
##     PreciseLegMovs    0.754    0.282    2.677    0.007
##     PreciseHandMvs    0.982    0.345    2.849    0.004
##   verbal =~                                           
##     Babbling          1.000                           
##     Screeching        0.428    0.237    1.809    0.070
##     VocalImitation    0.632    0.313    2.021    0.043
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal            7.298    3.772    1.935    0.053
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      31.723    0.769   41.272    0.000
##    .PreciseLegMovs   30.863    0.734   42.020    0.000
##    .PreciseHandMvs   30.204    0.725   41.681    0.000
##    .Babbling         29.693    0.781   37.997    0.000
##    .Screeching       29.281    0.669   43.755    0.000
##    .VocalImitation   30.893    0.750   41.215    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      15.328    5.201    2.947    0.003
##    .PreciseLegMovs   18.476    4.570    4.042    0.000
##    .PreciseHandMvs   12.648    4.739    2.669    0.008
##    .Babbling         11.319    8.343    1.357    0.175
##    .Screeching       18.192    4.110    4.426    0.000
##    .VocalImitation   19.781    5.256    3.764    0.000
##     motor            13.031    6.402    2.036    0.042
##     verbal           17.992    9.733    1.849    0.065
## 
## 
## Group 2 [Boys]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TimeOnTummy       1.000                           
##     PreciseLegMovs    1.168    0.448    2.607    0.009
##     PreciseHandMvs    1.106    0.409    2.706    0.007
##   verbal =~                                           
##     Babbling          1.000                           
##     Screeching        0.410    0.238    1.727    0.084
##     VocalImitation    0.562    0.302    1.861    0.063
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal           -0.787    2.007   -0.392    0.695
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      31.147    0.634   49.148    0.000
##    .PreciseLegMovs   30.357    0.611   49.675    0.000
##    .PreciseHandMvs   29.426    0.660   44.593    0.000
##    .Babbling         30.179    0.617   48.873    0.000
##    .Screeching       30.251    0.620   48.790    0.000
##    .VocalImitation   29.962    0.655   45.744    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      13.783    3.710    3.715    0.000
##    .PreciseLegMovs    9.737    3.951    2.465    0.014
##    .PreciseHandMvs   13.963    4.139    3.373    0.001
##    .Babbling         -0.009    9.707   -0.001    0.999
##    .Screeching       16.649    3.652    4.559    0.000
##    .VocalImitation   16.035    4.395    3.649    0.000
##     motor             7.101    3.990    1.779    0.075
##     verbal           19.837   10.457    1.897    0.058
```

---

## Metric invariance: 1

```r
modelMI<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
'

fitMIM<-cfa(modelMI, data=Babies, group='Gender',group.equal='loadings')
summary(fitMIM)
```

```
## lavaan 0.6-7 ended normally after 135 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         38
##   Number of equality constraints                     4
##                                                       
##   Number of observations per group:                   
##     Girls                                           48
##     Boys                                            52
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                16.556
##   Degrees of freedom                                20
##   P-value (Chi-square)                           0.682
##   Test statistic for each group:
##     Girls                                        6.606
##     Boys                                         9.951
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [Girls]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.949    0.247    3.840    0.000
##     PrcsHnM (.p3.)    1.037    0.267    3.880    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.430    0.164    2.619    0.009
##     VclImtt (.p6.)    0.601    0.211    2.854    0.004
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal            7.124    3.399    2.096    0.036
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      31.723    0.757   41.929    0.000
##    .PreciseLegMovs   30.863    0.751   41.121    0.000
##    .PreciseHandMvs   30.204    0.722   41.828    0.000
##    .Babbling         29.693    0.782   37.967    0.000
##    .Screeching       29.281    0.671   43.661    0.000
##    .VocalImitation   30.893    0.747   41.353    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      16.616    4.612    3.603    0.000
##    .PreciseLegMovs   17.266    4.551    3.794    0.000
##    .PreciseHandMvs   13.341    4.238    3.148    0.002
##    .Babbling         10.817    7.130    1.517    0.129
##    .Screeching       18.163    4.020    4.518    0.000
##    .VocalImitation   20.094    4.867    4.128    0.000
##     motor            10.861    4.723    2.300    0.021
##     verbal           18.541    8.400    2.207    0.027
## 
## 
## Group 2 [Boys]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.949    0.247    3.840    0.000
##     PrcsHnM (.p3.)    1.037    0.267    3.880    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.430    0.164    2.619    0.009
##     VclImtt (.p6.)    0.601    0.211    2.854    0.004
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal           -0.695    2.209   -0.315    0.753
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      31.147    0.643   48.419    0.000
##    .PreciseLegMovs   30.357    0.600   50.563    0.000
##    .PreciseHandMvs   29.426    0.662   44.430    0.000
##    .Babbling         30.179    0.617   48.879    0.000
##    .Screeching       30.251    0.619   48.885    0.000
##    .VocalImitation   29.962    0.657   45.613    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      12.951    3.622    3.576    0.000
##    .PreciseLegMovs   11.037    3.171    3.481    0.000
##    .PreciseHandMvs   13.593    3.843    3.537    0.000
##    .Babbling          1.076    6.336    0.170    0.865
##    .Screeching       16.449    3.442    4.779    0.000
##    .VocalImitation   15.667    3.841    4.079    0.000
##     motor             8.566    3.646    2.349    0.019
##     verbal           18.747    7.377    2.541    0.011
```

---

## Metric invariance: 2

```r
#install.packages('semTools')
require(semTools)
# Configural (fitMIC) versus metric (fitMIM): does constraining the loadings worsen fit?
summary(compareFit(fitMIC, fitMIM))
```

```
## ################### Nested Model Comparison #########################
## Chi-Squared Difference Test
## 
##        Df  AIC  BIC Chisq Chisq diff Df diff Pr(>Chisq)
## fitMIC 16 3573 3672  15.9                              
## fitMIM 20 3566 3655  16.6      0.677       4       0.95
## 
## ####################### Model Fit Indices ###########################
##          chisq df pvalue    cfi    tli       aic       bic rmsea  srmr
## fitMIC 15.880† 16   .461 1.000† 1.003  3573.335  3672.332  .000† .065 
## fitMIM 16.556  20   .682 1.000† 1.067† 3566.012† 3654.588† .000† .064†
## 
## ################## Differences in Fit Indices #######################
##                 df cfi   tli   aic   bic rmsea   srmr
## fitMIM - fitMIC  4   0 0.064 -7.32 -17.7     0 -0.001
```
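
---

## Metric invariance: the difference test by hand

A minimal sketch of the arithmetic behind the nested comparison. It uses the rounded chi-square and df values printed in the `compareFit()` table, so it matches that table only up to rounding. The variable names are illustrative.

```r
# Rounded values copied from the compareFit() output
chisq_configural <- 15.880; df_configural <- 16
chisq_metric     <- 16.556; df_metric     <- 20

# Difference in chi-square and in degrees of freedom
chisq_diff <- chisq_metric - chisq_configural
df_diff    <- df_metric - df_configural

# Upper-tail probability of the chi-square distribution
p_diff <- pchisq(chisq_diff, df = df_diff, lower.tail = FALSE)
round(c(chisq_diff = chisq_diff, df_diff = df_diff, p = p_diff), 3)
```

A large p-value means that equal loadings do not fit noticeably worse than the configural model.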

---

## Scalar invariance: 1

```r
modelMI<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
'

fitMISc<-cfa(modelMI, data=Babies, group='Gender',group.equal=c('loadings','intercepts'))
summary(fitMISc)
```

```
## lavaan 0.6-7 ended normally after 153 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         40
##   Number of equality constraints                    10
##                                                       
##   Number of observations per group:                   
##     Girls                                           48
##     Boys                                            52
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                19.394
##   Degrees of freedom                                24
##   P-value (Chi-square)                           0.731
##   Test statistic for each group:
##     Girls                                        8.252
##     Boys                                        11.142
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [Girls]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.949    0.244    3.880    0.000
##     PrcsHnM (.p3.)    1.045    0.266    3.925    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.410    0.163    2.510    0.012
##     VclImtt (.p6.)    0.560    0.206    2.715    0.007
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal            7.181    3.418    2.101    0.036
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.16.)   31.753    0.661   48.007    0.000
##    .PrcsLgM (.17.)   30.920    0.638   48.461    0.000
##    .PrcsHnM (.18.)   30.141    0.661   45.611    0.000
##    .Babblng (.19.)   29.775    0.767   38.801    0.000
##    .Scrchng (.20.)   29.718    0.504   58.930    0.000
##    .VclImtt (.21.)   30.219    0.575   52.515    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      16.671    4.605    3.620    0.000
##    .PreciseLegMovs   17.293    4.543    3.807    0.000
##    .PreciseHandMvs   13.265    4.238    3.130    0.002
##    .Babbling          9.888    7.683    1.287    0.198
##    .Screeching       18.494    4.071    4.543    0.000
##    .VocalImitation   20.960    4.956    4.229    0.000
##     motor            10.803    4.679    2.309    0.021
##     verbal           19.542    8.992    2.173    0.030
## 
## 
## Group 2 [Boys]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.949    0.244    3.880    0.000
##     PrcsHnM (.p3.)    1.045    0.266    3.925    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.410    0.163    2.510    0.012
##     VclImtt (.p6.)    0.560    0.206    2.715    0.007
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal           -0.874    2.208   -0.396    0.692
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.16.)   31.753    0.661   48.007    0.000
##    .PrcsLgM (.17.)   30.920    0.638   48.461    0.000
##    .PrcsHnM (.18.)   30.141    0.661   45.611    0.000
##    .Babblng (.19.)   29.775    0.767   38.801    0.000
##    .Scrchng (.20.)   29.718    0.504   58.930    0.000
##    .VclImtt (.21.)   30.219    0.575   52.515    0.000
##     motor            -0.628    0.766   -0.819    0.413
##     verbal            0.405    0.985    0.411    0.681
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TimeOnTummy      12.970    3.609    3.594    0.000
##    .PreciseLegMovs   11.054    3.159    3.500    0.000
##    .PreciseHandMvs   13.564    3.847    3.526    0.000
##    .Babbling         -0.129    7.015   -0.018    0.985
##    .Screeching       16.806    3.500    4.802    0.000
##    .VocalImitation   16.307    3.880    4.203    0.000
##     motor             8.523    3.612    2.359    0.018
##     verbal           19.957    8.027    2.486    0.013
```

---

## Scalar invariance: 2

```r
summary(compareFit(fitMIM,fitMISc))
```

```
## ################### Nested Model Comparison #########################
## Chi-Squared Difference Test
## 
##         Df  AIC  BIC Chisq Chisq diff Df diff Pr(>Chisq)
## fitMIM  20 3566 3655  16.6                              
## fitMISc 24 3561 3639  19.4       2.84       4       0.59
## 
## ####################### Model Fit Indices ###########################
##           chisq df pvalue    cfi    tli       aic       bic rmsea  srmr
## fitMIM  16.556† 20   .682 1.000† 1.067  3566.012  3654.588  .000† .064†
## fitMISc 19.394  24   .731 1.000† 1.074† 3560.850† 3639.005† .000† .071 
## 
## ################## Differences in Fit Indices #######################
##                  df cfi   tli   aic   bic rmsea  srmr
## fitMISc - fitMIM  4   0 0.008 -5.16 -15.6     0 0.007
```

---

## Strict invariance: 1

```r
modelMI<-'
motor =~ TimeOnTummy + PreciseLegMoves + PreciseHandMoves
verbal =~ Babbling + Screeching + VocalImitation
'

fitMISt<-cfa(modelMI, data=Babies, group='Gender',group.equal=c('loadings','intercepts','residuals'))
summary(fitMISt)
```

```
## lavaan 0.6-7 ended normally after 107 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         40
##   Number of equality constraints                    16
##                                                       
##   Number of observations per group:                   
##     Girls                                           48
##     Boys                                            52
##                                                       
## Model Test User Model:
##                                                       
##   Test statistic                                25.722
##   Degrees of freedom                                30
##   P-value (Chi-square)                           0.689
##   Test statistic for each group:
##     Girls                                       10.852
##     Boys                                        14.870
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## 
## Group 1 [Girls]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.916    0.237    3.858    0.000
##     PrcsHnM (.p3.)    1.046    0.270    3.872    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.357    0.152    2.352    0.019
##     VclImtt (.p6.)    0.497    0.194    2.561    0.010
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal            7.237    3.475    2.083    0.037
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.16.)   31.754    0.663   47.879    0.000
##    .PrcsLgM (.17.)   30.903    0.624   49.533    0.000
##    .PrcsHnM (.18.)   30.145    0.673   44.815    0.000
##    .Babblng (.19.)   29.707    0.776   38.266    0.000
##    .Scrchng (.20.)   29.700    0.506   58.645    0.000
##    .VclImtt (.21.)   30.290    0.582   52.081    0.000
##     motor             0.000                           
##     verbal            0.000                           
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.p7.)   14.722    3.138    4.691    0.000
##    .PrcsLgM (.p8.)   14.342    2.843    5.045    0.000
##    .PrcsHnM (.p9.)   13.256    3.161    4.194    0.000
##    .Babblng (.10.)    1.930    7.776    0.248    0.804
##    .Scrchng (.11.)   18.072    2.752    6.568    0.000
##    .VclImtt (.12.)   19.193    3.335    5.755    0.000
##     motor            11.438    4.800    2.383    0.017
##     verbal           27.035    9.847    2.745    0.006
## 
## 
## Group 2 [Boys]:
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor =~                                            
##     TmOnTmm           1.000                           
##     PrcsLgM (.p2.)    0.916    0.237    3.858    0.000
##     PrcsHnM (.p3.)    1.046    0.270    3.872    0.000
##   verbal =~                                           
##     Babblng           1.000                           
##     Scrchng (.p5.)    0.357    0.152    2.352    0.019
##     VclImtt (.p6.)    0.497    0.194    2.561    0.010
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   motor ~~                                            
##     verbal           -0.554    2.239   -0.248    0.804
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.16.)   31.754    0.663   47.879    0.000
##    .PrcsLgM (.17.)   30.903    0.624   49.533    0.000
##    .PrcsHnM (.18.)   30.145    0.673   44.815    0.000
##    .Babblng (.19.)   29.707    0.776   38.266    0.000
##    .Scrchng (.20.)   29.700    0.506   58.645    0.000
##    .VclImtt (.21.)   30.290    0.582   52.081    0.000
##     motor            -0.635    0.772   -0.823    0.411
##     verbal            0.459    0.994    0.462    0.644
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .TmOnTmm (.p7.)   14.722    3.138    4.691    0.000
##    .PrcsLgM (.p8.)   14.342    2.843    5.045    0.000
##    .PrcsHnM (.p9.)   13.256    3.161    4.194    0.000
##    .Babblng (.10.)    1.930    7.776    0.248    0.804
##    .Scrchng (.11.)   18.072    2.752    6.568    0.000
##    .VclImtt (.12.)   19.193    3.335    5.755    0.000
##     motor             8.157    3.558    2.292    0.022
##     verbal           18.233    8.616    2.116    0.034
```

---

## Strict invariance: 2

```r
summary(compareFit(fitMISc,fitMISt))
```

```
## ################### Nested Model Comparison #########################
## Chi-Squared Difference Test
## 
##         Df  AIC  BIC Chisq Chisq diff Df diff Pr(>Chisq)
## fitMISc 24 3561 3639  19.4                              
## fitMISt 30 3555 3618  25.7       6.33       6       0.39
## 
## ####################### Model Fit Indices ###########################
##           chisq df pvalue    cfi    tli       aic       bic rmsea  srmr
## fitMISc 19.394† 24   .731 1.000† 1.074† 3560.850  3639.005  .000† .071†
## fitMISt 25.722  30   .689 1.000† 1.055  3555.178† 3617.702† .000† .079 
## 
## ################## Differences in Fit Indices #######################
##                   df cfi    tli   aic   bic rmsea  srmr
## fitMISt - fitMISc  6   0 -0.019 -5.67 -21.3     0 0.008
```
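
---

## Invariance ladder: where the degrees of freedom come from

A rough sketch of the df bookkeeping across the four models. The parameter and constraint counts are read from the `summary()` headers. The configural count of 38 is not printed in this part of the slides, so it is an assumption, inferred from that model's 16 degrees of freedom.

```r
p      <- 6   # observed indicators
groups <- 2   # Girls, Boys

# Sample moments with a mean structure: variances/covariances plus means, per group
moments <- groups * (p * (p + 1) / 2 + p)

# "Number of free parameters" and "Number of equality constraints" per model
free        <- c(configural = 38, metric = 38, scalar = 40, strict = 40)
constraints <- c(configural = 0,  metric = 4,  scalar = 10, strict = 16)

# Degrees of freedom = moments - (free parameters - equality constraints)
df <- moments - (free - constraints)
df
```

The scalar and strict models have two extra free parameters because the latent means of the second group are estimated once the intercepts are constrained.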

---

## Where are the differences

```r
lavTestScore(fitMISc)
```

```
## $test
## 
## total score test:
## 
##    test   X2 df p.value
## 1 score 3.39 10   0.971
## 
## $uni
## 
## univariate score tests:
## 
##      lhs op   rhs    X2 df p.value
## 1   .p2. == .p25. 0.479  1   0.489
## 2   .p3. == .p26. 0.015  1   0.903
## 3   .p5. == .p28. 0.006  1   0.939
## 4   .p6. == .p29. 0.047  1   0.828
## 5  .p16. == .p39. 0.007  1   0.936
## 6  .p17. == .p40. 0.021  1   0.885
## 7  .p18. == .p41. 0.045  1   0.831
## 8  .p19. == .p42. 0.278  1   0.598
## 9  .p20. == .p43. 0.958  1   0.328
## 10 .p21. == .p44. 1.948  1   0.163
```
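
---

## Where are the differences: decoding the labels

The `.p2.`, `.p25.` labels are lavaan's internal parameter labels. The sketch below shows one way to translate them with `parTable()`. As an assumption for self-containment, it uses lavaan's built-in `HolzingerSwineford1939` data as a stand-in for `Babies`, and the model and object names are illustrative.

```r
library(lavaan)

# Stand-in data: lavaan's built-in HolzingerSwineford1939, grouped by school
model <- '
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
'
fit <- cfa(model, data = HolzingerSwineford1939, group = 'school',
           group.equal = c('loadings', 'intercepts'))

# Univariate score tests, one row per released equality constraint
score <- lavTestScore(fit)$uni

# The parameter table links each plabel (.p2., .p3., ...) to lhs/op/rhs
pt <- parTable(fit)
score$parameter <- paste(pt$lhs, pt$op, pt$rhs)[match(score$lhs, pt$plabel)]

# Largest X2 first: constraints whose release would improve fit the most
score[order(score$X2, decreasing = TRUE), ]
```

The same lookup should work on `fitMISc` in place of `fit`.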

---
class: inverse, middle, center
# Modification indices
---

## Modification indices

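What if we freed one of the fixed parameters?  
Each modification index (`mi`) approximates the drop in the model chi-square if that parameter were added; `epc` is the expected parameter change: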

```r
mi <- modindices(fit1)
# Keep only candidate cross-loadings; the trailing comma selects rows, not columns
mi[mi$op == "=~", ]
```

```
##                 lhs op              rhs    mi    epc sepc.lv
## 16            motor =~         Babbling 0.997 -0.265  -0.817
## 17            motor =~       Screeching 0.001  0.005   0.014
## 18            motor =~   VocalImitation 1.157  0.224   0.690
## 19           verbal =~      TimeOnTummy 1.354 -0.178  -0.705
## 20           verbal =~  PreciseLegMoves 1.061  0.150   0.593
## 21           verbal =~ PreciseHandMoves 0.028  0.026   0.104
## 22      TimeOnTummy ~~  PreciseLegMoves 0.028  0.870   0.870
## 23      TimeOnTummy ~~ PreciseHandMoves 1.061  7.239   7.239
## 24      TimeOnTummy ~~         Babbling 0.811 -1.728  -1.728
## 25      TimeOnTummy ~~       Screeching 0.171 -0.771  -0.771
## 26      TimeOnTummy ~~   VocalImitation 0.019  0.270   0.270
## 27  PreciseLegMoves ~~ PreciseHandMoves 1.354 -7.123  -7.123
## 28  PreciseLegMoves ~~         Babbling 0.038  0.362   0.362
## 29  PreciseLegMoves ~~       Screeching 0.525  1.311   1.311
## 30  PreciseLegMoves ~~   VocalImitation 0.240  0.920   0.920
## 31 PreciseHandMoves ~~         Babbling 0.010 -0.190  -0.190
## 32 PreciseHandMoves ~~       Screeching 0.045 -0.383  -0.383
## 33 PreciseHandMoves ~~   VocalImitation 0.233  0.917   0.917
## 34         Babbling ~~       Screeching 1.157  6.158   6.158
## 35         Babbling ~~   VocalImitation 0.001  0.267   0.267
## 36       Screeching ~~   VocalImitation 0.997 -3.737  -3.737
```

---
class: inverse, middle, center
# Examples
---

## Theory: Home-advantage in sports

<img src="theoreticalEx.png" width="90%" style="display: block; margin: auto;" />
---

## Specification and results: HA in sports

---

## Bayesian CFA with structural model

<img src="BayesianMod.png" width="50%" style="display: block; margin: auto;" />
---
class: inverse, middle, center
# Practical aspect
---

## Theory and data

Holzinger and Swineford data (1939) - [LINK](https://www.rdocumentation.org/packages/psychTools/versions/2.0.8/topics/holzinger.swineford)

```r
#install.packages('sem')
require(sem)
data('HS.data')
```

<img src="image7.png" width="70%" style="display: block; margin: auto;" />
---

## Checking the data

```r
dim(HS.data)
```

```
## [1] 301  32
```

```r
summary(HS.data[,c('visual','cubes','flags','paragrap','sentence','wordm','addition','counting','straight')])
```

```
##      visual         cubes          flags       paragrap        sentence   
##  Min.   : 4.0   Min.   : 9.0   Min.   : 2   Min.   : 0.00   Min.   : 4.0  
##  1st Qu.:25.0   1st Qu.:21.0   1st Qu.:11   1st Qu.: 7.00   1st Qu.:14.0  
##  Median :30.0   Median :24.0   Median :17   Median : 9.00   Median :18.0  
##  Mean   :29.6   Mean   :24.4   Mean   :18   Mean   : 9.18   Mean   :17.4  
##  3rd Qu.:34.0   3rd Qu.:27.0   3rd Qu.:25   3rd Qu.:11.00   3rd Qu.:21.0  
##  Max.   :51.0   Max.   :37.0   Max.   :36   Max.   :19.00   Max.   :28.0  
##      wordm         addition        counting      straight  
##  Min.   : 1.0   Min.   : 30.0   Min.   : 61   Min.   :100  
##  1st Qu.:10.0   1st Qu.: 80.0   1st Qu.: 97   1st Qu.:171  
##  Median :14.0   Median : 94.0   Median :110   Median :195  
##  Mean   :15.3   Mean   : 96.2   Mean   :110   Mean   :193  
##  3rd Qu.:19.0   3rd Qu.:113.0   3rd Qu.:122   3rd Qu.:219  
##  Max.   :43.0   Max.   :171.0   Max.   :200   Max.   :333
```

You can also plot the univariate probability density functions (see Lectures 1 and 2).

---

## Multivariate normality

```r
#install.packages('psych')
require(psych)
# Bivariate check: scatterplot with marginal histograms, densities and a data ellipse
scatter.hist(x=HS.data$visual,y=HS.data$cubes, density = TRUE, ellipse = TRUE)
```

<img src="CFA_files/figure-html/unnamed-chunk-40-1.png" width="40%" style="display: block; margin: auto;" />
---

## Model identification (LVs scaled by marker variable)

Count the free parameters:

6 loadings + 9 residual variances + 3 LV variances + 3 LV covariances = 21 free parameters

Unique observed variances and covariances (known information) = 9*(9+1)/2 = 45

45 - 21 = 24 degrees of freedom: overidentified model!
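
---

## Model identification: degrees of freedom

The counting on the previous slide can be sketched in a few lines of R. The variable names are illustrative. The result should agree with the "Degrees of freedom" line that lavaan prints for the three-factor model.

```r
p     <- 9                  # observed variables
known <- p * (p + 1) / 2    # unique variances and covariances

# Free parameters with marker-variable scaling (first loading of each LV fixed to 1)
free <- c(loadings = 6, residual_variances = 9,
          lv_variances = 3, lv_covariances = 3)

df <- known - sum(free)
c(known = known, free = sum(free), df = df)
```

A positive df means the model is overidentified and its fit can be tested.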
---

## Model

```r
# sem exports its own cfa(); detach it so that cfa() below is lavaan's
detach('package:sem')
fact3<-'
spatial =~ visual + cubes + flags
verbal =~ paragrap + sentence + wordm
speed =~ addition + counting + straight
'

fact3fit<-cfa(fact3, data=HS.data)
summary(fact3fit, fit.measures=TRUE ,standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 150 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         21
## 
##   Number of observations                           301
## 
## Model Test User Model:
## 
##   Test statistic                                85.172
##   Degrees of freedom                                24
##   P-value (Chi-square)                           0.000
## 
## Model Test Baseline Model:
## 
##   Test statistic                               918.592
##   Degrees of freedom                                36
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.931
##   Tucker-Lewis Index (TLI)                       0.896
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9578.017
##   Loglikelihood unrestricted model (H1)      -9535.431
## 
##   Akaike (AIC)                               19198.034
##   Bayesian (BIC)                             19275.883
##   Sample-size adjusted Bayesian (BIC)        19209.283
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.092
##   90 Percent confidence interval - lower         0.071
##   90 Percent confidence interval - upper         0.114
##   P-value RMSEA <= 0.05                          0.001
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.065
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial =~                                                            
##     visual            1.000                               5.397    0.772
##     cubes             0.369    0.066    5.553    0.000    1.992    0.424
##     flags             0.973    0.146    6.682    0.000    5.250    0.581
##   verbal =~                                                             
##     paragrap          1.000                               2.969    0.852
##     sentence          1.484    0.087   17.015    0.000    4.406    0.855
##     wordm             2.161    0.129   16.703    0.000    6.416    0.838
##   speed =~                                                              
##     addition          1.000                              14.235    0.570
##     counting          1.026    0.144    7.152    0.000   14.611    0.723
##     straight          1.696    0.237    7.154    0.000   24.143    0.665
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial ~~                                                            
##     verbal            7.347    1.323    5.552    0.000    0.459    0.459
##     speed            36.098    7.755    4.655    0.000    0.470    0.470
##   verbal ~~                                                             
##     speed            11.991    3.401    3.525    0.000    0.284    0.284
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .visual           19.774    4.091    4.833    0.000   19.774    0.404
##    .cubes            18.139    1.628   11.145    0.000   18.139    0.820
##    .flags            54.033    5.801    9.314    0.000   54.033    0.662
##    .paragrap          3.340    0.429    7.778    0.000    3.340    0.275
##    .sentence          7.140    0.934    7.642    0.000    7.140    0.269
##    .wordm            17.455    2.109    8.278    0.000   17.455    0.298
##    .addition        421.390   42.927    9.816    0.000  421.390    0.675
##    .counting        195.314   29.676    6.582    0.000  195.314    0.478
##    .straight        735.487   91.898    8.003    0.000  735.487    0.558
##     spatial          29.127    5.238    5.561    0.000    1.000    1.000
##     verbal            8.816    1.009    8.737    0.000    1.000    1.000
##     speed           202.639   45.503    4.453    0.000    1.000    1.000
```

---

## Explained variance - R2

```r
inspect(fact3fit,'r2')
```

```
##   visual    cubes    flags paragrap sentence    wordm addition counting 
##    0.596    0.180    0.338    0.725    0.731    0.702    0.325    0.522 
## straight 
##    0.442
```
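
---

## Explained variance: where R2 comes from

A small sketch that reproduces the R2 values from the standardized solution. The `Std.all` loadings are typed in from the `summary()` output of `fact3fit`, so agreement holds only up to rounding. The object names are illustrative.

```r
# Std.all loadings copied from the summary() output of fact3fit
std_loading <- c(visual = 0.772, cubes = 0.424, flags = 0.581,
                 paragrap = 0.852, sentence = 0.855, wordm = 0.838,
                 addition = 0.570, counting = 0.723, straight = 0.665)

# Each indicator loads on a single factor, so R2 is the squared standardized loading
r2 <- std_loading^2
round(r2, 3)

# Equivalently: 1 minus the standardized residual variance (shown here for visual)
1 - 0.404
```

Indicators with a low R2, such as `cubes`, are explained only weakly by their factor.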

---

## Checking multivariate normality

```r
#install.packages('MVN')
require(MVN)
test<-mvn(HS.data[,c('visual','cubes','flags','paragrap','sentence','wordm','addition','counting','straight')], mvnTest = 'royston')
test$multivariateNormality
```

```
##      Test   H  p value MVN
## 1 Royston 126 6.65e-23  NO
```
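
---

## Checking multivariate normality: a visual complement

As a complement to the formal test, a chi-square Q-Q plot of squared Mahalanobis distances can show where the departure from normality comes from. This base-R sketch is not part of the MVN package. It assumes the `sem` package is installed so that `HS.data` can be loaded without attaching it.

```r
# Load HS.data without attaching sem (assumes the sem package is installed)
data('HS.data', package = 'sem')
vars <- c('visual','cubes','flags','paragrap','sentence','wordm',
          'addition','counting','straight')
X <- na.omit(HS.data[, vars])

# Squared Mahalanobis distance of each case from the centroid
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Under multivariate normality d2 is approximately chi-square with ncol(X) df
expected <- qchisq(ppoints(nrow(X)), df = ncol(X))

plot(expected, sort(d2),
     xlab = 'Chi-square quantiles', ylab = 'Squared Mahalanobis distance')
abline(0, 1)   # points far above the line suggest heavy tails or outliers
```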

---

## What can we do

1. Bootstrap our results 
2. Use robust standard errors 
3. Change the test statistic (e.g. Satorra-Bentler)
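
---

## Bootstrapped standard errors

A minimal sketch of the first option, using lavaan's `se = 'bootstrap'`. The number of draws (200) and the seed are arbitrary choices to keep the example quick, and the object name `fact3fitBoot` is illustrative. It assumes the `sem` package is installed so that `HS.data` can be loaded without attaching it.

```r
library(lavaan)
data('HS.data', package = 'sem')   # assumes sem is installed; it is not attached

fact3 <- '
spatial =~ visual + cubes + flags
verbal =~ paragrap + sentence + wordm
speed =~ addition + counting + straight
'

set.seed(456)   # arbitrary seed so that the resampling is reproducible
# 200 draws keeps the run short; more draws are preferable in practice
fact3fitBoot <- cfa(fact3, data = HS.data, se = 'bootstrap', bootstrap = 200)

# Percentile confidence intervals based on the bootstrap draws
parameterEstimates(fact3fitBoot, boot.ci.type = 'perc')
```

The point estimates are the same as in the ML fit; only the standard errors and confidence intervals change.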

---

## Robust standard errors

```r
fact3fitRob<-cfa(fact3, data=HS.data, se='robust.sem',test='satorra.bentler')
summary(fact3fitRob,standardized=TRUE)
```

```
## lavaan 0.6-7 ended normally after 150 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         21
##                                                       
##   Number of observations                           301
##                                                       
## Model Test User Model:
##                                               Standard      Robust
##   Test Statistic                                85.172      80.756
##   Degrees of freedom                                24          24
##   P-value (Chi-square)                           0.000       0.000
##   Scaling correction factor                                  1.055
##        Satorra-Bentler correction                                 
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial =~                                                            
##     visual            1.000                               5.397    0.772
##     cubes             0.369    0.069    5.358    0.000    1.992    0.424
##     flags             0.973    0.153    6.364    0.000    5.250    0.581
##   verbal =~                                                             
##     paragrap          1.000                               2.969    0.852
##     sentence          1.484    0.089   16.763    0.000    4.406    0.855
##     wordm             2.161    0.139   15.498    0.000    6.416    0.838
##   speed =~                                                              
##     addition          1.000                              14.235    0.570
##     counting          1.026    0.132    7.755    0.000   14.611    0.723
##     straight          1.696    0.208    8.170    0.000   24.143    0.665
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial ~~                                                            
##     verbal            7.347    1.480    4.965    0.000    0.459    0.459
##     speed            36.098    7.596    4.752    0.000    0.470    0.470
##   verbal ~~                                                             
##     speed            11.991    3.811    3.146    0.002    0.284    0.284
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .visual           19.774    4.982    3.969    0.000   19.774    0.404
##    .cubes            18.139    1.719   10.552    0.000   18.139    0.820
##    .flags            54.033    5.413    9.981    0.000   54.033    0.662
##    .paragrap          3.340    0.450    7.423    0.000    3.340    0.275
##    .sentence          7.140    0.929    7.689    0.000    7.140    0.269
##    .wordm            17.455    2.267    7.701    0.000   17.455    0.298
##    .addition        421.390   41.596   10.130    0.000  421.390    0.675
##    .counting        195.314   29.691    6.578    0.000  195.314    0.478
##    .straight        735.487   88.246    8.335    0.000  735.487    0.558
##     spatial          29.127    6.024    4.836    0.000    1.000    1.000
##     verbal            8.816    1.087    8.109    0.000    1.000    1.000
##     speed           202.639   43.714    4.636    0.000    1.000    1.000
```

---

## Modification indices

```r
mi <- modindices(fact3fitRob)
mi
```

```
##         lhs op      rhs     mi      epc  sepc.lv sepc.all sepc.nox
## 25  spatial =~ paragrap  1.216    0.038    0.207    0.059    0.059
## 26  spatial =~ sentence  7.446   -0.140   -0.756   -0.147   -0.147
## 27  spatial =~    wordm  2.839    0.130    0.701    0.092    0.092
## 28  spatial =~ addition 18.568   -1.611   -8.697   -0.348   -0.348
## 29  spatial =~ counting  4.192   -0.692   -3.736   -0.185   -0.185
## 30  spatial =~ straight 36.044    3.446   18.600    0.512    0.512
## 31   verbal =~   visual  8.923    0.702    2.083    0.298    0.298
## 32   verbal =~    cubes  0.018   -0.015   -0.045   -0.010   -0.010
## 33   verbal =~    flags  9.162   -0.725   -2.153   -0.238   -0.238
## 34   verbal =~ addition  0.074   -0.139   -0.414   -0.017   -0.017
## 35   verbal =~ counting  3.403   -0.811   -2.407   -0.119   -0.119
## 36   verbal =~ straight  4.693    1.646    4.886    0.135    0.135
## 37    speed =~   visual  0.014    0.006    0.091    0.013    0.013
## 38    speed =~    cubes  1.572   -0.034   -0.490   -0.104   -0.104
## 39    speed =~    flags  0.711    0.047    0.671    0.074    0.074
## 40    speed =~ paragrap  0.002   -0.001   -0.008   -0.002   -0.002
## 41    speed =~ sentence  0.199   -0.008   -0.109   -0.021   -0.021
## 42    speed =~    wordm  0.260    0.013    0.187    0.024    0.024
## 43   visual ~~    cubes  3.619   -4.421   -4.421   -0.233   -0.233
## 44   visual ~~    flags  0.929   -6.633   -6.633   -0.203   -0.203
## 45   visual ~~ paragrap  3.547    1.408    1.408    0.173    0.173
## 46   visual ~~ sentence  0.521   -0.795   -0.795   -0.067   -0.067
## 47   visual ~~    wordm  0.049    0.370    0.370    0.020    0.020
## 48   visual ~~ addition  5.461  -17.860  -17.860   -0.196   -0.196
## 49   visual ~~ counting  0.602   -4.804   -4.804   -0.077   -0.077
## 50   visual ~~ straight  7.252   29.650   29.650    0.246    0.246
## 51    cubes ~~    flags  8.529    6.986    6.986    0.223    0.223
## 52    cubes ~~ paragrap  0.535   -0.406   -0.406   -0.052   -0.052
## 53    cubes ~~ sentence  0.022   -0.122   -0.122   -0.011   -0.011
## 54    cubes ~~    wordm  0.786    1.099    1.099    0.062    0.062
## 55    cubes ~~ addition  8.918  -16.784  -16.784   -0.192   -0.192
## 56    cubes ~~ counting  0.053   -0.993   -0.993   -0.017   -0.017
## 57    cubes ~~ straight  1.907   10.864   10.864    0.094    0.094
## 58    flags ~~ paragrap  0.143   -0.381   -0.381   -0.028   -0.028
## 59    flags ~~ sentence  7.860   -4.163   -4.163   -0.212   -0.212
## 60    flags ~~    wordm  1.856    3.068    3.068    0.100    0.100
## 61    flags ~~ addition  0.641   -8.191   -8.191   -0.054   -0.054
## 62    flags ~~ counting  0.054   -1.857   -1.857   -0.018   -0.018
## 63    flags ~~ straight  4.097   29.208   29.208    0.147    0.147
## 64 paragrap ~~ sentence  2.519    2.222    2.222    0.455    0.455
## 65 paragrap ~~    wordm  6.207   -4.923   -4.923   -0.645   -0.645
## 66 paragrap ~~ addition  6.015    6.818    6.818    0.182    0.182
## 67 paragrap ~~ counting  3.856   -4.173   -4.173   -0.163   -0.163
## 68 paragrap ~~ straight  0.187   -1.681   -1.681   -0.034   -0.034
## 69 sentence ~~    wordm  0.920    2.829    2.829    0.253    0.253
## 70 sentence ~~ addition  1.188   -4.459   -4.459   -0.081   -0.081
## 71 sentence ~~ counting  0.338    1.818    1.818    0.049    0.049
## 72 sentence ~~ straight  0.982    5.664    5.664    0.078    0.078
## 73    wordm ~~ addition  0.270   -3.226   -3.226   -0.038   -0.038
## 74    wordm ~~ counting  0.283    2.526    2.526    0.043    0.043
## 75    wordm ~~ straight  0.105   -2.810   -2.810   -0.025   -0.025
## 76 addition ~~ counting 33.716  245.021  245.021    0.854    0.854
## 77 addition ~~ straight  5.108 -153.592 -153.592   -0.276   -0.276
## 78 counting ~~ straight 14.764 -302.949 -302.949   -0.799   -0.799
```
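
Each modification index approximates the drop in the model chi-square if that single parameter were freed, so values above the 1-df critical value (about 3.84 at alpha = .05) are the usual candidates for a closer look. A minimal sketch in base R, using a handful of rows copied by hand from the table above; the data frame `mi_rows` is purely illustrative. With a fitted model, `modindices()` in lavaan also accepts `sort.` and `minimum.value` arguments for the same purpose.

```r
# A few rows copied by hand from the modification-index table above
# (mi_rows is an illustrative name, not an object from the lecture code)
mi_rows <- data.frame(
  lhs = c("addition", "counting", "cubes", "cubes", "wordm"),
  op  = "~~",
  rhs = c("counting", "straight", "addition", "flags", "straight"),
  mi  = c(33.716, 14.764, 8.918, 8.529, 0.105),
  stringsAsFactors = FALSE
)

# 1-df chi-square critical value at alpha = .05
crit <- qchisq(0.95, df = 1)

# Keep only the flagged parameters, largest index first
flagged <- mi_rows[mi_rows$mi > crit, ]
flagged <- flagged[order(flagged$mi, decreasing = TRUE), ]
print(flagged)
```

A large index is a prompt to go back to theory, not a licence to free parameters mechanically.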

---

## Change the model

```r
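# Revised model: straight and addition load on both spatial and speed
fact3A <- '
spatial =~ visual + cubes + flags + straight + addition
verbal  =~ paragrap + sentence + wordm
speed   =~ addition + counting + straight
'

# Robust standard errors and the Satorra-Bentler scaled test statistic
fact3AfitRob <- cfa(fact3A, data = HS.data, se = 'robust.sem', test = 'satorra.bentler')
summary(fact3AfitRob, fit.measures = TRUE, standardized = TRUE)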
```

```
## lavaan 0.6-7 ended normally after 200 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of free parameters                         23
## 
##   Number of observations                           301
## 
## Model Test User Model:
##                                                Standard      Robust
##   Test Statistic                                46.251      44.242
##   Degrees of freedom                                22          22
##   P-value (Chi-square)                           0.002       0.003
##   Scaling correction factor                                  1.045
##     Satorra-Bentler correction
## 
## Model Test Baseline Model:
## 
##   Test statistic                               918.592     789.304
##   Degrees of freedom                                36          36
##   P-value                                        0.000       0.000
##   Scaling correction factor                                  1.164
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    0.973       0.970
##   Tucker-Lewis Index (TLI)                       0.955       0.952
## 
##   Robust Comparative Fit Index (CFI)                         0.973
##   Robust Tucker-Lewis Index (TLI)                            0.957
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -9558.556   -9558.556
##   Loglikelihood unrestricted model (H1)      -9535.431   -9535.431
## 
##   Akaike (AIC)                               19163.112   19163.112
##   Bayesian (BIC)                             19248.376   19248.376
##   Sample-size adjusted Bayesian (BIC)        19175.433   19175.433
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.061       0.058
##   90 Percent confidence interval - lower         0.036       0.033
##   90 Percent confidence interval - upper         0.085       0.082
##   P-value RMSEA <= 0.05                          0.220       0.271
## 
##   Robust RMSEA                                               0.059
##   90 Percent confidence interval - lower                     0.033
##   90 Percent confidence interval - upper                     0.084
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.042       0.042
## 
## Parameter Estimates:
## 
##   Standard errors                           Robust.sem
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial =~
##     visual            1.000                               5.283    0.755
##     cubes             0.397    0.068    5.864    0.000    2.100    0.447
##     flags             1.014    0.140    7.234    0.000    5.355    0.593
##     straight          2.255    0.573    3.936    0.000   11.911    0.328
##     addition         -1.049    0.532   -1.971    0.049   -5.541   -0.222
##   verbal =~
##     paragrap          1.000                               2.965    0.850
##     sentence          1.489    0.089   16.725    0.000    4.415    0.857
##     wordm             2.163    0.140   15.491    0.000    6.413    0.838
##   speed =~
##     addition          1.000                              18.934    0.758
##     counting          0.775    0.141    5.486    0.000   14.678    0.726
##     straight          0.921    0.144    6.404    0.000   17.434    0.480
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   spatial ~~
##     verbal            6.864    1.441    4.764    0.000    0.438    0.438
##     speed            39.193   14.124    2.775    0.006    0.392    0.392
##   verbal ~~
##     speed            15.065    5.610    2.686    0.007    0.268    0.268
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .visual           20.990    4.397    4.774    0.000   20.990    0.429
##    .cubes            17.700    1.664   10.635    0.000   17.700    0.801
##    .flags            52.914    4.821   10.976    0.000   52.914    0.649
##    .straight        709.836   81.046    8.758    0.000  709.836    0.538
##    .addition        317.043   59.743    5.307    0.000  317.043    0.508
##    .paragrap          3.366    0.452    7.448    0.000    3.366    0.277
##    .sentence          7.064    0.922    7.663    0.000    7.064    0.266
##    .wordm            17.493    2.271    7.702    0.000   17.493    0.298
##    .counting        193.360   35.564    5.437    0.000  193.360    0.473
##     spatial          27.911    5.407    5.162    0.000    1.000    1.000
##     verbal            8.790    1.087    8.085    0.000    1.000    1.000
##     speed           358.496   94.408    3.797    0.000    1.000    1.000
```
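
The 22 degrees of freedom in the summary above can be checked by hand. With p = 9 observed variables and no mean structure, the data supply p(p+1)/2 unique variances and covariances, and the model estimates q = 23 free parameters (8 free loadings, 9 residual variances, 3 factor variances and 3 factor covariances):

`$$df = \frac{p(p+1)}{2} - q = \frac{9 \times 10}{2} - 23 = 45 - 23 = 22$$`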

---

## Compare the models

```r
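# compareFit() is from semTools; the result is not named 'diff' so it does not shadow base::diff()
modelComparison <- compareFit(fact3fitRob, fact3AfitRob)
summary(modelComparison)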
```

```
## ################### Nested Model Comparison #########################
## Scaled Chi-Squared Difference Test (method = "satorra.bentler.2001")
## 
## lavaan NOTE:
##     The "Chisq" column contains standard test statistics, not the
##     robust test that should be reported per model. A robust difference
##     test is a function of two standard (not robust) statistics.
##  
##              Df   AIC   BIC Chisq Chisq diff Df diff Pr(>Chisq)    
## fact3AfitRob 22 19163 19248  46.2                                  
## fact3fitRob  24 19198 19276  85.2       33.6       2    4.9e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ####################### Model Fit Indices ###########################
##              chisq.scaled df.scaled pvalue.scaled cfi.robust tli.robust
## fact3AfitRob      44.242†        22          .003      .973†      .957†
## fact3fitRob       80.756         24          .000      .932       .898 
##                     aic        bic rmsea.robust  srmr
## fact3AfitRob 19163.112† 19248.376†        .059† .042†
## fact3fitRob  19198.034  19275.883         .091  .065 
## 
## ################## Differences in Fit Indices #######################
##                            df.scaled cfi.robust tli.robust  aic  bic
## fact3fitRob - fact3AfitRob         2     -0.042     -0.059 34.9 27.5
##                            rmsea.robust  srmr
## fact3fitRob - fact3AfitRob        0.032 0.023
```
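
The scaled difference of 33.6 is not simply the difference between the two robust chi-squares (80.756 - 44.242). A rough sketch of the Satorra-Bentler (2001) calculation follows, using rounded values copied from the output above. The object names are illustrative, and the scaling factor for `fact3fitRob` is an assumption, recovered here as the ratio of its standard to its scaled statistic. Because the inputs are rounded, the result will only approximate the reported value.

```r
# Values copied (rounded) from the output above; object names are illustrative
T0 <- 85.2;   df0 <- 24   # fact3fitRob: standard chi-square and df
T1 <- 46.251; df1 <- 22   # fact3AfitRob: standard chi-square and df
T0_scaled <- 80.756       # robust (scaled) chi-square, fact3fitRob
T1_scaled <- 44.242       # robust (scaled) chi-square, fact3AfitRob

# Scaling correction factors: standard / scaled statistic
c0 <- T0 / T0_scaled
c1 <- T1 / T1_scaled

# Correction factor for the difference test
cd <- (df0 * c0 - df1 * c1) / (df0 - df1)

# Scaled difference statistic and its p-value
TRd <- (T0 - T1) / cd
p   <- pchisq(TRd, df = df0 - df1, lower.tail = FALSE)

round(c(c0 = c0, c1 = c1, cd = cd, TRd = TRd), 3)
p
```

The difference test is computed from the two standard statistics and their scaling factors, which is what the lavaan NOTE in the output refers to.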

---

## Model development

---

## Important aspects: theory

- Understanding the differences between exploratory (EFA) and confirmatory (CFA) factor analysis
- How the linear model is defined in CFA
- Assumptions of CFA
- Scaling of the latent variables
- Interpretation of the coefficients
- Number of free parameters versus total number of parameters

---

## Important aspects: practice

- Specifying and estimating a CFA model
- Scaling the LVs with a marker variable or by fixing the LV variances to 1
- Adding intercepts to your CFA model
- Building a full SEM (CFA + path model)
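
As a compact reminder of the scaling and intercept points, here is a minimal sketch using lavaan's built-in `HolzingerSwineford1939` data (variables `x1` to `x9`). That data set is an assumption made so the example is self-contained; it is not the `HS.data` set analysed in the lecture, and all object names are illustrative. The same three-factor model is fitted with marker-variable scaling, with latent variances fixed to 1, and with intercepts added.

```r
library(lavaan)

# Three-factor model on lavaan's built-in data (illustrative, not HS.data)
model <- '
visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
'

# Default scaling: first loading of each factor fixed to 1 (marker variable)
fit_marker <- cfa(model, data = HolzingerSwineford1939)

# Alternative scaling: all loadings free, latent variances fixed to 1
fit_stdlv <- cfa(model, data = HolzingerSwineford1939, std.lv = TRUE)

# Same model with intercepts (mean structure) added
fit_means <- cfa(model, data = HolzingerSwineford1939, meanstructure = TRUE)

# The scaling choice changes the metric of the parameters, not the model fit
fitMeasures(fit_marker, c("chisq", "df"))
fitMeasures(fit_stdlv,  c("chisq", "df"))
```

Comparing `parameterEstimates(fit_marker)` with `parameterEstimates(fit_stdlv)` shows how the loadings and factor variances are re-expressed under each scaling.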

---

## Literature

Confirmatory Factor Analysis for Applied Research by Timothy A. Brown

Chapter 9 of Principles and Practice of Structural Equation Modeling by Rex B. Kline

Latent Variable Modeling Using R: A Step-by-Step Guide by A. Alexander Beaujean

---

## Thank you for your attention