Multigroup models
As an example, we will fit the model from the lavaan
tutorial with loadings constrained to equality across groups.
We first load the example data and split it between groups:
using StructuralEquationModels
dat = example_data("holzinger_swineford")
dat_g1 = dat[dat.school .== "Pasteur", :]
dat_g2 = dat[dat.school .== "Grant-White", :]
We then specify our model via the graph interface:
latent_vars = [:visual, :textual, :speed]
observed_vars = Symbol.(:x, 1:9)
graph = @StenoGraph begin
    # measurement model
    visual → fixed(1, 1)*x1 + label(:λ₂, :λ₂)*x2 + label(:λ₃, :λ₃)*x3
    textual → fixed(1, 1)*x4 + label(:λ₅, :λ₅)*x5 + label(:λ₆, :λ₆)*x6
    speed → fixed(1, 1)*x7 + label(:λ₈, :λ₈)*x8 + label(:λ₉, :λ₉)*x9
    # variances and covariances
    _(observed_vars) ↔ _(observed_vars)
    _(latent_vars) ⇔ _(latent_vars)
end
You can pass multiple arguments to fixed() and label(), one for each group. Parameters with the same label (within and across groups) are constrained to be equal. To fix a parameter in one group but estimate it freely in the other, you may write fixed(NaN, 4.3), where NaN marks the group in which the parameter remains free.
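As a minimal sketch of the labelling rule, the fragment below relaxes the equality constraint on a single loading by giving it a different label in each group. The names graph_partial, :λ₂_g1 and :λ₂_g2 are hypothetical and chosen only for illustration; the fragment assumes the same graph syntax as the model above and has not been checked against a specific package version.

```julia
using StructuralEquationModels

# Hypothetical variation of the first measurement equation above
graph_partial = @StenoGraph begin
    # x2 carries a different label in each group, so its loading is
    # estimated separately per group; x3 keeps one shared label and
    # therefore stays constrained to equality across groups
    visual → fixed(1, 1)*x1 + label(:λ₂_g1, :λ₂_g2)*x2 + label(:λ₃, :λ₃)*x3
end
```

Comparing a model like this with the fully constrained one is a common way to probe whether a single loading differs between groups.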
You can then use the resulting graph to specify an EnsembleParameterTable:
groups = [:Pasteur, :Grant_White]
partable = EnsembleParameterTable(;
    graph = graph,
    observed_vars = observed_vars,
    latent_vars = latent_vars,
    groups = groups)
EnsembleParameterTable with groups: |Grant_White||Pasteur|
Grant_White:
--------- ---------------- --------- ------- ------------- --------- ----------
from parameter_type to free value_fixed start estimat ⋯
Symbol Symbol Symbol Bool Float64 Float64 Float6 ⋯
--------- ---------------- --------- ------- ------------- --------- ----------
visual → x1 false 1.0 0.0 0. ⋯
visual → x2 true 0.0 0.0 0. ⋯
visual → x3 true 0.0 0.0 0. ⋯
textual → x4 false 1.0 0.0 0. ⋯
textual → x5 true 0.0 0.0 0. ⋯
textual → x6 true 0.0 0.0 0. ⋯
speed → x7 false 1.0 0.0 0. ⋯
speed → x8 true 0.0 0.0 0. ⋯
speed → x9 true 0.0 0.0 0. ⋯
x1 ↔ x1 true 0.0 0.0 0. ⋯
x2 ↔ x2 true 0.0 0.0 0. ⋯
x3 ↔ x3 true 0.0 0.0 0. ⋯
x4 ↔ x4 true 0.0 0.0 0. ⋯
x5 ↔ x5 true 0.0 0.0 0. ⋯
x6 ↔ x6 true 0.0 0.0 0. ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
--------- ---------------- --------- ------- ------------- --------- ----------
2 columns and 9 rows omitted
Latent Variables: [:visual, :textual, :speed]
Observed Variables: [:x1, :x2, :x3, :x4, :x5, :x6, :x7, :x8, :x9]
Pasteur:
--------- ---------------- --------- ------- ------------- --------- ----------
from parameter_type to free value_fixed start estimat ⋯
Symbol Symbol Symbol Bool Float64 Float64 Float6 ⋯
--------- ---------------- --------- ------- ------------- --------- ----------
visual → x1 false 1.0 0.0 0. ⋯
visual → x2 true 0.0 0.0 0. ⋯
visual → x3 true 0.0 0.0 0. ⋯
textual → x4 false 1.0 0.0 0. ⋯
textual → x5 true 0.0 0.0 0. ⋯
textual → x6 true 0.0 0.0 0. ⋯
speed → x7 false 1.0 0.0 0. ⋯
speed → x8 true 0.0 0.0 0. ⋯
speed → x9 true 0.0 0.0 0. ⋯
x1 ↔ x1 true 0.0 0.0 0. ⋯
x2 ↔ x2 true 0.0 0.0 0. ⋯
x3 ↔ x3 true 0.0 0.0 0. ⋯
x4 ↔ x4 true 0.0 0.0 0. ⋯
x5 ↔ x5 true 0.0 0.0 0. ⋯
x6 ↔ x6 true 0.0 0.0 0. ⋯
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
--------- ---------------- --------- ------- ------------- --------- ----------
2 columns and 9 rows omitted
Latent Variables: [:visual, :textual, :speed]
Observed Variables: [:x1, :x2, :x3, :x4, :x5, :x6, :x7, :x8, :x9]
The parameter table can be used to create a Dict of RAMMatrices, with the group names as keys and the group-specific RAMMatrices as values:
specification = RAMMatrices(partable)
Dict{Symbol, RAMMatrices} with 2 entries:
:Grant_White => RAMMatrices…
:Pasteur => RAMMatrices…
That is, you can access the group-specific RAMMatrices as specification[:group_name].
Instead of choosing the workflow "Graph -> EnsembleParameterTable -> RAMMatrices", you may also directly specify RAMMatrices for each group (for an example see this test).
The next step is to construct the model:
model_g1 = Sem(
    specification = specification[:Pasteur],
    data = dat_g1
)
model_g2 = Sem(
    specification = specification[:Grant_White],
    data = dat_g2
)
model_ml_multigroup = SemEnsemble(model_g1, model_g2)
SemEnsemble
- Number of Models: 2
- Weights: [0.52, 0.48]
- optimizer: SemOptimizerOptim
Models:
===============================================
---------------------- 1 ----------------------
Structural Equation Model
- Loss Functions
SemML
- Fields
observed: SemObservedData
imply: RAM
optimizer: SemOptimizerOptim
---------------------- 2 ----------------------
Structural Equation Model
- Loss Functions
SemML
- Fields
observed: SemObservedData
imply: RAM
optimizer: SemOptimizerOptim
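The weights shown in the output are consistent with each group's share of the total sample. The sketch below reproduces them with plain Julia, assuming (this is not stated above) that the default weight of a group is its number of observations divided by the total; the variable names are illustrative only.

```julia
# Group sizes in the Holzinger-Swineford data: Pasteur, then Grant-White,
# in the same order as the models passed to SemEnsemble
n_obs = [156, 145]

# Assumption: each group's weight is its proportion of the total sample
weights = n_obs ./ sum(n_obs)

println(round.(weights, digits = 2))  # prints [0.52, 0.48]
```

Under this assumption, the larger group (Pasteur) contributes slightly more to the combined objective than the smaller one.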
We now fit the model and inspect the parameter estimates:
solution = sem_fit(model_ml_multigroup)
update_estimate!(partable, solution)
sem_summary(partable)
--------------------------------- Variables ---------------------------------
Latent variables: visual textual speed
Observed variables: x1 x2 x3 x4 x5 x6 x7 x8 x9
Group: Grant_White
---------------------------- Parameter Estimates -----------------------------
Loadings:
visual
to estimate identifier value_fixed start free from type
x1 0.0 const 1.0 0.0 0.0 visual →
x2 0.6 λ₂ 0.0 0.0 1.0 visual →
x3 0.78 λ₃ 0.0 0.0 1.0 visual →
textual
to estimate identifier value_fixed start free from type
x4 0.0 const 1.0 0.0 0.0 textual →
x5 1.08 λ₅ 0.0 0.0 1.0 textual →
x6 0.91 λ₆ 0.0 0.0 1.0 textual →
speed
to estimate identifier value_fixed start free from type
x7 0.0 const 1.0 0.0 0.0 speed →
x8 1.2 λ₈ 0.0 0.0 1.0 speed →
x9 1.04 λ₉ 0.0 0.0 1.0 speed →
Directed Effects:
from to estimate identifier value_fixed start free
Variances:
from to estimate identifier value_fixed start free
x1 ↔ x1 0.65 g2_1 0.0 0.0 1.0
x2 ↔ x2 0.94 g2_2 0.0 0.0 1.0
x3 ↔ x3 0.61 g2_3 0.0 0.0 1.0
x4 ↔ x4 0.33 g2_4 0.0 0.0 1.0
x5 ↔ x5 0.39 g2_5 0.0 0.0 1.0
x6 ↔ x6 0.44 g2_6 0.0 0.0 1.0
x7 ↔ x7 0.6 g2_7 0.0 0.0 1.0
x8 ↔ x8 0.41 g2_8 0.0 0.0 1.0
x9 ↔ x9 0.54 g2_9 0.0 0.0 1.0
visual ↔ visual 0.73 g2_10 0.0 0.0 1.0
textual ↔ textual 0.91 g2_13 0.0 0.0 1.0
speed ↔ speed 0.48 g2_15 0.0 0.0 1.0
Covariances:
from to estimate identifier value_fixed start free
textual ↔ visual 0.44 g2_11 0.0 0.0 1.0
speed ↔ visual 0.32 g2_12 0.0 0.0 1.0
speed ↔ textual 0.23 g2_14 0.0 0.0 1.0
Group: Pasteur
---------------------------- Parameter Estimates -----------------------------
Loadings:
visual
to estimate identifier value_fixed start free from type
x1 0.0 const 1.0 0.0 0.0 visual →
x2 0.6 λ₂ 0.0 0.0 1.0 visual →
x3 0.78 λ₃ 0.0 0.0 1.0 visual →
textual
to estimate identifier value_fixed start free from type
x4 0.0 const 1.0 0.0 0.0 textual →
x5 1.08 λ₅ 0.0 0.0 1.0 textual →
x6 0.91 λ₆ 0.0 0.0 1.0 textual →
speed
to estimate identifier value_fixed start free from type
x7 0.0 const 1.0 0.0 0.0 speed →
x8 1.2 λ₈ 0.0 0.0 1.0 speed →
x9 1.04 λ₉ 0.0 0.0 1.0 speed →
Directed Effects:
from to estimate identifier value_fixed start free
Variances:
from to estimate identifier value_fixed start free
x1 ↔ x1 0.55 g1_1 0.0 0.0 1.0
x2 ↔ x2 1.27 g1_2 0.0 0.0 1.0
x3 ↔ x3 0.89 g1_3 0.0 0.0 1.0
x4 ↔ x4 0.44 g1_4 0.0 0.0 1.0
x5 ↔ x5 0.51 g1_5 0.0 0.0 1.0
x6 ↔ x6 0.27 g1_6 0.0 0.0 1.0
x7 ↔ x7 0.85 g1_7 0.0 0.0 1.0
x8 ↔ x8 0.52 g1_8 0.0 0.0 1.0
x9 ↔ x9 0.66 g1_9 0.0 0.0 1.0
visual ↔ visual 0.81 g1_10 0.0 0.0 1.0
textual ↔ textual 0.92 g1_13 0.0 0.0 1.0
speed ↔ speed 0.31 g1_15 0.0 0.0 1.0
Covariances:
from to estimate identifier value_fixed start free
textual ↔ visual 0.42 g1_11 0.0 0.0 1.0
speed ↔ visual 0.17 g1_12 0.0 0.0 1.0
speed ↔ textual 0.18 g1_14 0.0 0.0 1.0
Other things you can query about your fitted model (fit measures, standard errors, etc.) are described in the section Model inspection and work the same way for multigroup models.