A first model

In this tutorial, we will fit an example structural equation model (SEM) with our package. The example is taken from the lavaan tutorial, so it may be familiar. It looks like this:

[Figure: Visualization of the Political Democracy model]

We assume the StructuralEquationModels package is already installed. To use it in the current session, we run

using StructuralEquationModels

We first define the graph of our model in a syntax similar to that of the R package lavaan:

observed_vars = [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]
latent_vars = [:ind60, :dem60, :dem65]

graph = @StenoGraph begin

    # loadings
    ind60 → fixed(1)*x1 + x2 + x3
    dem60 → fixed(1)*y1 + y2 + y3 + y4
    dem65 → fixed(1)*y5 + y6 + y7 + y8

    # latent regressions
    ind60 → dem60
    dem60 → dem65
    ind60 → dem65

    # variances
    _(observed_vars) ↔ _(observed_vars)
    _(latent_vars) ↔ _(latent_vars)

    # covariances
    y1 ↔ y5
    y2 ↔ y4 + y6
    y3 ↔ y7
    y8 ↔ y4 + y6

end

Time to first model

When you execute the code from this tutorial for the first time in a fresh Julia session, you may be surprised by how long it takes. This is not because the implementation is slow, but because Julia compiles each function the first time it is called. Try running the example a second time: all function calls after the first one are considerably faster.
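The effect is not specific to this package and can be observed with any Julia function. The following is a minimal sketch using only Base Julia; `sum_of_squares` is a hypothetical helper introduced for illustration, and the actual timings will differ between machines:

```julia
# Hypothetical helper, used only to illustrate compilation latency.
sum_of_squares(x) = sum(abs2, x)

x = rand(1_000)

# The first call includes the time needed to compile the method.
t_first = @elapsed sum_of_squares(x)

# The second call reuses the already compiled code.
t_second = @elapsed sum_of_squares(x)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

In a fresh session, the first timing is typically much larger than the second, even though both calls do the same work.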

We then use this graph to define a ParameterTable object:

partable = ParameterTable(
    latent_vars = latent_vars,
    observed_vars = observed_vars,
    graph = graph)

 -------- ---------------- -------- ------- ------------- --------- ------------
    from   parameter_type       to    free   value_fixed     start   estimate  ⋯
  Symbol           Symbol   Symbol    Bool       Float64   Float64    Float64  ⋯
 -------- ---------------- -------- ------- ------------- --------- ------------
   ind60                →       x1   false           1.0       0.0        0.0  ⋯
   ind60                →       x2    true           0.0       0.0        0.0  ⋯
   ind60                →       x3    true           0.0       0.0        0.0  ⋯
   dem60                →       y1   false           1.0       0.0        0.0  ⋯
   dem60                →       y2    true           0.0       0.0        0.0  ⋯
   dem60                →       y3    true           0.0       0.0        0.0  ⋯
   dem60                →       y4    true           0.0       0.0        0.0  ⋯
   dem65                →       y5   false           1.0       0.0        0.0  ⋯
   dem65                →       y6    true           0.0       0.0        0.0  ⋯
   dem65                →       y7    true           0.0       0.0        0.0  ⋯
   dem65                →       y8    true           0.0       0.0        0.0  ⋯
   ind60                →    dem60    true           0.0       0.0        0.0  ⋯
   dem60                →    dem65    true           0.0       0.0        0.0  ⋯
   ind60                →    dem65    true           0.0       0.0        0.0  ⋯
      x1                ↔       x1    true           0.0       0.0        0.0  ⋯
    ⋮            ⋮            ⋮        ⋮          ⋮           ⋮         ⋮      ⋱
 -------- ---------------- -------- ------- ------------- --------- ------------
                                                    1 column and 19 rows omitted
Latent Variables:    [:ind60, :dem60, :dem65] 
Observed Variables:  [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]
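As a plausibility check, the number of rows and free parameters in this table can be tallied by hand from the graph. The sketch below uses only Base Julia and hard-codes the counts read off the model definition above; the variable names are illustrative and not part of the package:

```julia
# Counts read off the graph definition (illustrative names, not package API).
n_loadings    = 3 + 4 + 4       # indicators of ind60, dem60, dem65
n_regressions = 3               # ind60 → dem60, dem60 → dem65, ind60 → dem65
n_variances   = 11 + 3          # observed plus latent variables
n_covariances = 1 + 2 + 1 + 2   # y1 ↔ y5, y2 ↔ y4 + y6, y3 ↔ y7, y8 ↔ y4 + y6
n_fixed       = 3               # one loading per latent variable is fixed to 1

n_rows = n_loadings + n_regressions + n_variances + n_covariances
n_free = n_rows - n_fixed

println("rows in the parameter table: ", n_rows)
println("free parameters:             ", n_free)
```

The row count agrees with the 15 displayed plus 19 omitted rows of the table.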

Next, we load the example data

data = example_data("political_democracy")

and specify our model as

model = Sem(
    specification = partable,
    data = data
)

Structural Equation Model 
- Loss Functions 
   SemML
- Fields 
   observed:    SemObservedData 
   imply:       RAM 
   optimizer:   SemOptimizerOptim

We can now fit the model via

model_fit = sem_fit(model)

Fitted Structural Equation Model 
=============================================== 
--------------------- Model ------------------- 

Structural Equation Model 
- Loss Functions 
   SemML
- Fields 
   observed:    SemObservedData 
   imply:       RAM 
   optimizer:   SemOptimizerOptim 

------------- Optimization result ------------- 

 * Status: success

 * Candidate solution
    Final objective value:     2.120543e+01

 * Found with
    Algorithm:     L-BFGS

 * Convergence measures
    |x - x'|               = 3.43e-05 ≰ 1.5e-08
    |x - x'|/|x'|          = 4.59e-06 ≰ 0.0e+00
    |f(x) - f(x')|         = 7.47e-10 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.52e-11 ≤ 1.0e-10
    |g(x)|                 = 8.51e-05 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    189
    f(x) calls:    561
    ∇f(x) calls:   561

and compute fit measures as

fit_measures(model_fit)

Dict{Symbol, Union{Missing, Float64}} with 8 entries:
  :minus2ll => 3106.66
  :AIC      => 3168.66
  :BIC      => 3240.5
  :df       => 35.0
  :χ²       => 37.6169
  :p_value  => 0.350263
  :RMSEA    => 0.0315738
  :n_par    => 31.0
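Several of these measures are related by standard formulas, which makes the output easy to sanity-check. The sketch below recomputes df, AIC, BIC, and RMSEA from the reported values using only Base Julia. It assumes the usual definitions (AIC = -2LL + 2k, BIC = -2LL + k log(N), and RMSEA with N rather than N - 1 in the denominator) and N = 75 observations in the Political Democracy data. Whether the package uses exactly these conventions internally is an assumption, although they reproduce the numbers shown here:

```julia
# Values copied from the output above (rounded as displayed).
minus2ll = 3106.66
chi2     = 37.6169
n_par    = 31
n_obs    = 75   # observations in the Political Democracy data
n_man    = 11   # observed variables

# Degrees of freedom: unique covariance-matrix elements minus free parameters.
df = n_man * (n_man + 1) ÷ 2 - n_par

aic = minus2ll + 2 * n_par
bic = minus2ll + log(n_obs) * n_par

# RMSEA, assuming N (not N - 1) in the denominator.
rmsea = sqrt(max(chi2 - df, 0) / (df * n_obs))

println((df = df, AIC = aic, BIC = round(bic, digits = 2), RMSEA = round(rmsea, digits = 4)))
```

Small deviations in the last digits are expected because the inputs are rounded.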

We can also get a bit more information about the fitted model via the sem_summary() function:

sem_summary(model_fit)


Fitted Structural Equation Model

--------------------------------- Properties --------------------------------- 

Optimization algorithm:      L-BFGS
Convergence:                 true
No. iterations/evaluations:  189

Number of parameters:        31
Number of observations:      75.0

----------------------------------- Model ----------------------------------- 

Structural Equation Model
- Loss Functions
   SemML
- Fields
   observed:    SemObservedData
   imply:       RAM
   optimizer:   SemOptimizerOptim

To investigate the parameter estimates, we can update our partable object to contain the new estimates:

update_estimate!(partable, model_fit)

 -------- ---------------- -------- ------- ------------- --------- ------------
    from   parameter_type       to    free   value_fixed     start    estimate ⋯
  Symbol           Symbol   Symbol    Bool       Float64   Float64     Float64 ⋯
 -------- ---------------- -------- ------- ------------- --------- ------------
   ind60                →       x1   false           1.0       0.0         0.0 ⋯
   ind60                →       x2    true           0.0       0.0     2.18035 ⋯
   ind60                →       x3    true           0.0       0.0     1.81848 ⋯
   dem60                →       y1   false           1.0       0.0         0.0 ⋯
   dem60                →       y2    true           0.0       0.0     1.25672 ⋯
   dem60                →       y3    true           0.0       0.0     1.05773 ⋯
   dem60                →       y4    true           0.0       0.0     1.26476 ⋯
   dem65                →       y5   false           1.0       0.0         0.0 ⋯
   dem65                →       y6    true           0.0       0.0     1.18572 ⋯
   dem65                →       y7    true           0.0       0.0     1.27954 ⋯
   dem65                →       y8    true           0.0       0.0     1.26596 ⋯
   ind60                →    dem60    true           0.0       0.0     1.48297 ⋯
   dem60                →    dem65    true           0.0       0.0    0.837319 ⋯
   ind60                →    dem65    true           0.0       0.0    0.572344 ⋯
      x1                ↔       x1    true           0.0       0.0   0.0826497 ⋯
    ⋮            ⋮            ⋮        ⋮          ⋮           ⋮          ⋮     ⋱
 -------- ---------------- -------- ------- ------------- --------- ------------
                                                    1 column and 19 rows omitted
Latent Variables:    [:ind60, :dem60, :dem65] 
Observed Variables:  [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]

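and investigate the solution with sem_summary(). Note that for the loadings fixed to 1 (x1, y1, and y5), the estimate column shows 0.0; their value is listed under value_fixed: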

sem_summary(partable)


--------------------------------- Variables --------------------------------- 

Latent variables:    ind60 dem60 dem65
Observed variables:  x1 x2 x3 y1 y2 y3 y4 y5 y6 y7 y8

---------------------------- Parameter Estimates ----------------------------- 

Loadings: 

ind60

  to   estimate   identifier   value_fixed   start   free   from    type

  x1   0.0        const        1.0           0.0     0.0    ind60   →
  x2   2.18       θ_1          0.0           0.0     1.0    ind60   →
  x3   1.82       θ_2          0.0           0.0     1.0    ind60   →

dem60

  to   estimate   identifier   value_fixed   start   free   from    type

  y1   0.0        const        1.0           0.0     0.0    dem60   →
  y2   1.26       θ_3          0.0           0.0     1.0    dem60   →
  y3   1.06       θ_4          0.0           0.0     1.0    dem60   →
  y4   1.26       θ_5          0.0           0.0     1.0    dem60   →

dem65

  to   estimate   identifier   value_fixed   start   free   from    type

  y5   0.0        const        1.0           0.0     0.0    dem65   →
  y6   1.19       θ_6          0.0           0.0     1.0    dem65   →
  y7   1.28       θ_7          0.0           0.0     1.0    dem65   →
  y8   1.27       θ_8          0.0           0.0     1.0    dem65   →

Directed Effects: 

  from        to      estimate   identifier   value_fixed   start   free

  ind60   →   dem60   1.48       θ_9          0.0           0.0     1.0
  dem60   →   dem65   0.84       θ_10         0.0           0.0     1.0
  ind60   →   dem65   0.57       θ_11         0.0           0.0     1.0

Variances: 

  from        to      estimate   identifier   value_fixed   start   free

  x1      ↔   x1      0.08       θ_12         0.0           0.0     1.0
  x2      ↔   x2      0.12       θ_13         0.0           0.0     1.0
  x3      ↔   x3      0.47       θ_14         0.0           0.0     1.0
  y1      ↔   y1      1.92       θ_15         0.0           0.0     1.0
  y2      ↔   y2      7.47       θ_16         0.0           0.0     1.0
  y3      ↔   y3      5.14       θ_17         0.0           0.0     1.0
  y4      ↔   y4      3.19       θ_18         0.0           0.0     1.0
  y5      ↔   y5      2.38       θ_19         0.0           0.0     1.0
  y6      ↔   y6      5.02       θ_20         0.0           0.0     1.0
  y7      ↔   y7      3.48       θ_21         0.0           0.0     1.0
  y8      ↔   y8      3.3        θ_22         0.0           0.0     1.0
  ind60   ↔   ind60   0.45       θ_23         0.0           0.0     1.0
  dem60   ↔   dem60   4.01       θ_24         0.0           0.0     1.0
  dem65   ↔   dem65   0.17       θ_25         0.0           0.0     1.0

Covariances: 

  from       to   estimate   identifier   value_fixed   start   free

  y1     ↔   y5   0.63       θ_26         0.0           0.0     1.0
  y2     ↔   y4   1.33       θ_27         0.0           0.0     1.0
  y2     ↔   y6   2.18       θ_28         0.0           0.0     1.0
  y3     ↔   y7   0.81       θ_29         0.0           0.0     1.0
  y8     ↔   y4   0.35       θ_30         0.0           0.0     1.0
  y8     ↔   y6   1.37       θ_31         0.0           0.0     1.0
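To illustrate how these estimates fit together, the model-implied variance of an indicator can be assembled from its loading, the variance of its latent variable, and its residual variance. The sketch below does this for the indicators of ind60, using the rounded estimates from the summary and Base Julia only. It relies on the standard identity Var(x) = λ² Var(η) + Var(ε), which applies directly here because ind60 has no predictors. The variable names are illustrative, and the loading of x1 is taken from value_fixed because it is not estimated:

```julia
# Rounded estimates copied from the summary above (illustrative names).
loadings  = (x1 = 1.0,  x2 = 2.18, x3 = 1.82)   # x1 is fixed to 1
residuals = (x1 = 0.08, x2 = 0.12, x3 = 0.47)   # θ_12 to θ_14
var_ind60 = 0.45                                # θ_23

for name in keys(loadings)
    # Part of the indicator's variance that is due to the latent variable.
    explained = loadings[name]^2 * var_ind60
    # Model-implied variance: explained part plus residual variance.
    implied = explained + residuals[name]
    println(name, ": implied variance ≈ ", round(implied, digits = 2),
            ", share explained by ind60 ≈ ", round(explained / implied, digits = 2))
end
```

Because the inputs are rounded to two digits, the results are approximate.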

Congratulations, you have fitted and inspected your very first model! We recommend continuing with Our Concept of a Structural Equation Model.