data(cake, package='lme4')

Henrique Laureano

Last modification on 2022-09-14 10:51:24


if (!requireNamespace('pacman', quietly=TRUE))
    install.packages('pacman')
pacman::p_load(lme4, car, ggplot2, tibble, dplyr, marginaleffects,
               broom, equatiomatic, patchwork)
theme_set(theme_classic(base_size=14))

Data


The data is described in ?cake as

Data on the breakage angle of chocolate cakes made with three different
recipes and baked at six different temperatures.

An interesting note

The experimental notes suggest that the replicate numbering represents
temporal ordering.

is addressed at the end.

Looking at the data

data(cake, package='lme4')
summary(cake)
   replicate   recipe temperature     angle            temp    
 1      : 18   A:90   175:45      Min.   :18.00   Min.   :175  
 2      : 18   B:90   185:45      1st Qu.:26.00   1st Qu.:185  
 3      : 18   C:90   195:45      Median :31.00   Median :200  
 4      : 18          205:45      Mean   :32.12   Mean   :200  
 5      : 18          215:45      3rd Qu.:36.75   3rd Qu.:215  
 6      : 18          225:45      Max.   :63.00   Max.   :225  
 (Other):162                                                   

We see that the baking temperature is provided in two forms: numerically and as a factor. Below we plot a histogram of the breakage angle to see whether it is reasonable to assume a normal distribution for our model - the easiest / simplest choice in terms of the estimation procedure.

hist(cake$angle)

Figure 1: Histogram of the breakage angle of chocolate cakes (our response variable).

Well, our response is not symmetric. Nevertheless, since we have a dependence structure (still to be commented on), we'll start with a normal distribution. We hope that the dependence structure compensates for this - let's say, misspecification - by accommodating a good amount of the variability.

Before talking about the dependence structure,

we still need to state our hypotheses / goals:

  • Is there a difference between the recipes in terms of the breakage angle of chocolate cakes? If so, estimate it;
  • The same for the baking temperatures;
  • And is there an interaction between recipes and temperatures?

We can't forget the replications!
However, there is no practical interest in estimating differences between replications; we only need to accommodate this extra level of variability. The idea is that there may be dependence within each replication. Maybe the same cook baked all the cakes of a given recipe at all the considered temperatures, so it makes sense to believe that bakings from that cook are similar - dependent on each other - and independent of those from a different cook following a different recipe.

We consider that there is a possible dependence, or shared variability, within each replication of each recipe, resulting in 15 replications \(\times\) 3 recipes = 45 different levels. This forms a dependence structure that we do not want to draw inferences about, but that we need to accommodate properly so as not to inflate our parameter estimates and mislead our final inferences. In practical terms, we just add an extra normal distribution (with zero mean) with 45 sampled values, and what we want is to estimate / predict its standard deviation.
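To make that counting concrete, here is a small base-R sketch using hypothetical factors that mirror the design (not the dataset itself):

```r
# 45 grouping levels: 15 replications crossed with 3 recipes.
# Hypothetical factors mirroring the 3 x 6 x 15 design:
recipe    <- gl(3, 90, labels=c('A', 'B', 'C'))       # 3 recipes, 90 obs each
replicate <- factor(rep(rep(1:15, each=6), times=3))  # 15 reps x 6 temperatures
grouping  <- interaction(recipe, replicate, sep=':')
nlevels(grouping)  # 45
length(grouping)   # 270
```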

The two following figures complete the summary of the data.
Below we see how the replications vary with temperature in terms of the angle, split by recipe. In red, for each recipe, we have the mean over the replications at each temperature. We see a clear difference between replications but not between recipes. There is some difference between temperatures, but at this point, looking at this figure, I'm not confident enough to state that it is statistically significant.

cake|>
    dplyr::mutate(recipe=paste('recipe:', recipe))|>
    ggplot()+
    aes(x=temperature, y=angle, group=replicate)+
    geom_vline(xintercept=unique(cake$temperature),
               linetype='dashed', color='gray')+
    geom_line()+
    stat_summary(aes(group=1),
                 fun=mean, geom='line', color='red', size=1)+
    facet_wrap(~ recipe)+
    theme(strip.text.x=element_text(face='bold', size=14))

Figure 2: Breakage angles according to temperature, per recipe, grouped by replication. The red line is the mean.

Now we invert the view: we look at the angles according to the replications and group them by temperature. Again, we don't see a recipe effect. However, we can see a pattern of dependence between replications and a small temperature effect. Generally, the higher the temperature, the larger the angle. Nevertheless, the biggest angles aren't obtained at the highest temperatures. If there is a temperature effect, it is probably small.

cake|>
    dplyr::mutate(recipe=paste('recipe:', recipe))|>
    ggplot()+
    aes(x=replicate, y=angle, group=temperature, color=temperature)+
    geom_line()+
    stat_summary(aes(group=1), 
                 fun=mean, geom='line', color='red', size=1)+
    facet_wrap(~ recipe)+
    theme(legend.position='bottom',
          strip.text.x   =element_text(face='bold', size=14))

Figure 3: Breakage angles according to replication, per recipe, grouped by temperature. The red line is the mean.

The last point here is that this experiment is balanced: all levels have the same number of data points, and in total we're working with a sample of size 270 - not huge, but also not small.
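That balance can be checked directly; a quick sketch (guarded, so the table is only printed if {lme4} is available):

```r
# Balanced design: 3 recipes x 6 temperatures x 15 replications = 270 cakes
stopifnot(3 * 6 * 15 == 270)
if (requireNamespace('lme4', quietly=TRUE)) {
    data(cake, package='lme4')
    # every recipe x temperature cell should hold exactly 15 observations
    print(table(cake$recipe, cake$temperature))
}
```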

Linear Mixed Model


We state a linear mixed model, i.e. a normal distribution for the response with fixed and random/latent effects in the linear predictor.

We tried three terms as fixed effects: the recipes; the temperatures; and their interaction. Only the temperatures appeared statistically significant (we performed a standard covariate selection procedure). For the random effect, i.e. to accommodate the extra variance, we used the 45 levels created by the combination of replications and recipes. The final model can be fitted as follows

mod <- lme4::lmer(angle ~ temperature + (1 | recipe : replicate), cake)
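The covariate selection procedure itself is not shown above; a minimal sketch of how it could go, comparing nested fixed-effect structures with likelihood-ratio tests (ML fits, i.e. REML=FALSE, so the likelihoods are comparable; guarded so the fitting only runs if {lme4} is available):

```r
# Candidate fixed-effect structures for a covariate selection sketch:
forms <- list(
    temp.only = angle ~ temperature          + (1 | recipe:replicate),
    additive  = angle ~ recipe + temperature + (1 | recipe:replicate),
    interact  = angle ~ recipe * temperature + (1 | recipe:replicate)
)
if (requireNamespace('lme4', quietly=TRUE)) {
    data(cake, package='lme4')
    # REML=FALSE (maximum likelihood) so the nested models can be
    # compared via likelihood-ratio tests with anova()
    fits <- lapply(forms, lme4::lmer, data=cake, REML=FALSE)
    print(anova(fits$temp.only, fits$additive, fits$interact))
}
```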

Temperature was tried both numerically and as a factor. The results were essentially the same, so in the final model we chose to work with it as a factor, to then be able to compare all the temperatures - since there are just a few. This model is mathematically written as

equatiomatic::extract_eq(mod,
                         wrap=TRUE, terms_per_line=1, use_coef=TRUE,
                         swap_subscript_names=c('.L'='185',
                                                '.Q'='195',
                                                '.C'='205',
                                                '^4'='215',
                                                '^5'='225'),
                         operator_location='start')

\[ \begin{aligned} \operatorname{\widehat{angle}}_{i} &\sim N \left(\mu, \sigma^2 \right) \\ \mu &=32.12_{\alpha_{j[i]}}\\ &\quad + 6.61_{\beta_{1}}(\operatorname{temperature}_{\operatorname{185}})\\ &\quad - 0.39_{\beta_{2}}(\operatorname{temperature}_{\operatorname{195}})\\ &\quad - 0.55_{\beta_{3}}(\operatorname{temperature}_{\operatorname{205}})\\ &\quad - 1.3_{\beta_{4}}(\operatorname{temperature}_{\operatorname{215}})\\ &\quad - 0.91_{\beta_{5}}(\operatorname{temperature}_{\operatorname{225}}) \\ \alpha_{j} &\sim N \left(0, 6.35 \right) \text{, for recipe:replicate j = 1,} \dots \text{,J} \end{aligned} \]

The estimated parameters are already shown in the equation.
Note how large the random-effect (recipe : replicate) standard deviation is.

Goodness of fit


Before commenting on the results, we check whether the fitted model is adequate, i.e. whether the proposed model fits the data well. In the residual plots below we don't see any critical pattern or behavior that would lead to a bad evaluation.

par(mfrow=c(1, 2))
plot(fitted(mod), resid(mod)) ; abline(a=0, b=0, lty=2)
qqnorm(resid(mod)) ; qqline(resid(mod))

Figure 4: Left: Fitted / predicted values vs. residuals; Right: quantile-quantile plot of the model residuals.

Model summary


Below we have the summary of our model.
We see that the scaled residuals are well behaved and, mainly, we get the latent-effect variance estimates. We have a variance of 40.29 for the 45 levels of replications crossed with recipes, while the unexplained (residual) variance is 20.48. Thus, our extra structure proves necessary: there is a clear difference between replications and dependence within each replication.
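These two variance components imply an intraclass correlation - the share of the total variance attributed to the recipe : replicate levels - which quantifies how strong this within-group dependence is:

```r
# Intraclass correlation from the variance components in summary(mod)
var.group <- 40.29  # recipe:replicate variance
var.resid <- 20.48  # residual variance
icc <- var.group / (var.group + var.resid)
round(icc, 3)  # 0.663: about two thirds of the variability is between groups
```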

The rest is better discussed through the outputs and figures that follow.

summary(mod)
Linear mixed model fit by REML ['lmerMod']
Formula: angle ~ temperature + (1 | recipe:replicate)
   Data: cake

REML criterion at convergence: 1683.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.49523 -0.63378 -0.09213  0.55451  2.84177 

Random effects:
 Groups           Name        Variance Std.Dev.
 recipe:replicate (Intercept) 40.29    6.348   
 Residual                     20.48    4.525   
Number of obs: 270, groups:  recipe:replicate, 45

Fixed effects:
              Estimate Std. Error t value
(Intercept)    32.1222     0.9855  32.595
temperature.L   6.6109     0.6746   9.800
temperature.Q  -0.3855     0.6746  -0.572
temperature.C  -0.5483     0.6746  -0.813
temperature^4  -1.2977     0.6746  -1.924
temperature^5  -0.9141     0.6746  -1.355

Correlation of Fixed Effects:
            (Intr) tmpr.L tmpr.Q tmpr.C tmpr^4
temperatr.L 0.000                             
temperatr.Q 0.000  0.000                      
temperatr.C 0.000  0.000  0.000               
temperatr^4 0.000  0.000  0.000  0.000        
temperatr^5 0.000  0.000  0.000  0.000  0.000 

To examine the temperature findings, let's start by computing an analysis of variance (ANOVA). We see a strong difference between the temperatures.

car::Anova(mod)
Analysis of Deviance Table (Type II Wald chisquare tests)

Response: angle
             Chisq Df Pr(>Chisq)    
temperature 102.57  5  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We can compute the marginal means of our response, the breakage angle of chocolate cakes, for each temperature and even for each level of the latent effect - even though our interest is not to draw inferences about the latter, just to accommodate it.

(mm <- marginaleffects::marginalmeans(mod))
          term value marginalmean std.error conf.low conf.high
1       recipe     A     33.04413 0.9854924 31.11260  34.97566
2       recipe     B     31.68175 0.9854924 29.75022  33.61328
3       recipe     C     31.64078 0.9854924 29.70925  33.57231
4    replicate     1     45.68456 0.9854924 43.75303  47.61609
5    replicate     2     44.50656 0.9854924 42.57503  46.43809
6    replicate     3     36.56788 0.9854924 34.63635  38.49941
7    replicate     4     33.18754 0.9854924 31.25601  35.11907
8    replicate     5     32.36807 0.9854924 30.43654  34.29959
9    replicate     6     28.98772 0.9854924 27.05619  30.91925
10   replicate     7     27.70729 0.9854924 25.77576  29.63882
11   replicate     8     27.75851 0.9854924 25.82698  29.69004
12   replicate     9     27.70729 0.9854924 25.77576  29.63882
13   replicate    10     29.09016 0.9854924 27.15863  31.02169
14   replicate    11     30.06329 0.9854924 28.13176  31.99482
15   replicate    12     31.03642 0.9854924 29.10489  32.96795
16   replicate    13     31.90711 0.9854924 29.97558  33.83864
17   replicate    14     28.21946 0.9854924 26.28794  30.15099
18   replicate    15     27.04147 0.9854924 25.10994  28.97300
19 temperature   175     27.97778 1.1620640 25.70017  30.25538
20 temperature   185     29.95556 1.1620640 27.67795  32.23316
21 temperature   195     31.42222 1.1620640 29.14462  33.69983
22 temperature   205     32.17778 1.1620640 29.90017  34.45538
23 temperature   215     35.84444 1.1620640 33.56684  38.12205
24 temperature   225     35.35556 1.1620640 33.07795  37.63316

Still, it is better to look at all of this graphically.

mm|>
    ggplot()+
    aes(x=marginalmean, y=value, xmin=conf.low, xmax=conf.high)+
    geom_pointrange(shape=21)+
    facet_wrap(~ term, scales='free')+
    labs(x=NULL, y=NULL)+
    theme(strip.text.x=element_text(face='bold', size=14))

Figure 5: Marginal means with a 95% confidence interval for all recipes, replications and temperatures.

Above, we see that there is no practical difference between recipes; that the first two replications have larger means than the others; and that the breakage angle increases with temperature, but reaches a plateau or maximum around 215.

Besides computing the mean breakage angle for each temperature, we can compute and test the significance of all the pairwise differences (here, contrasts).

(comp.temp <-
     marginaleffects::comparisons(
                          mod,
                          variables=list(temperature='pairwise')
                      )|>
     broom::tidy()|>
     dplyr::select(!c(type, term, std.error))|>
     dplyr::relocate(conf.low , .after=estimate)|>
     dplyr::relocate(conf.high, .after=conf.low)
)
    contrast   estimate   conf.low conf.high  statistic      p.value
1  185 - 175  1.9777778  0.1080152  3.847540  2.0731901 3.815459e-02
2  195 - 175  3.4444444  1.5746819  5.314207  3.6106119 3.054754e-04
3  205 - 175  4.2000000  2.3302374  6.069763  4.4026171 1.069528e-05
4  215 - 175  7.8666667  5.9969041  9.736429  8.2461718 2.220446e-16
5  225 - 175  7.3777778  5.5080152  9.247540  7.7336978 1.043610e-14
6  195 - 185  1.4666667 -0.4030959  3.336429  1.5374219 1.241900e-01
7  205 - 185  2.2222222  0.3524597  4.091985  2.3294271 1.983645e-02
8  215 - 185  5.8888889  4.0191263  7.758651  6.1729817 6.701397e-10
9  225 - 185  5.4000000  3.5302374  7.269763  5.6605078 1.509258e-08
10 205 - 195  0.7555556 -1.1142070  2.625318  0.7920052 4.283576e-01
11 215 - 195  4.4222222  2.5524597  6.291985  4.6355598 3.559727e-06
12 225 - 195  3.9333333  2.0635708  5.803096  4.1230859 3.738302e-05
13 215 - 205  3.6666667  1.7969041  5.536429  3.8435546 1.212650e-04
14 225 - 205  3.1777778  1.3080152  5.047540  3.3310807 8.650952e-04
15 225 - 215 -0.4888889 -2.3586515  1.380874 -0.5124740 6.083193e-01
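As a sanity check, the first contrast (185 - 175) can be recovered by hand from the outputs above: in this balanced design the random intercept cancels out of within-group temperature contrasts, so the standard error depends on the residual variance alone. A back-of-the-envelope sketch using the rounded values printed earlier:

```r
# Hand computation of the 185 - 175 contrast from earlier outputs
m175   <- 27.97778          # marginal mean at 175 (table above)
m185   <- 29.95556          # marginal mean at 185
sigma2 <- 20.48             # residual variance from summary(mod)
est <- m185 - m175          # 1.97778, matching the contrast table
# each temperature mean averages 45 observations, one per recipe:replicate
# group; the random intercepts cancel, so var(diff) = 2 * sigma2 / 45
se  <- sqrt(2 * sigma2 / 45)   # ~0.954
z   <- est / se                # ~2.073, matching the reported statistic
c(estimate=est, std.error=se, statistic=z)
```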

The same contrasts are shown graphically below, where it is simpler to see which differences are not statistically significant. Almost all differences in breakage angle between baking temperatures are statistically significant. Just three aren't - the ones whose 95% confidence interval crosses zero.

comp.temp|>
    ggplot()+
    aes(x=estimate, y=contrast, xmin=conf.low, xmax=conf.high)+
    labs(x=NULL, y=NULL)+
    scale_x_continuous(breaks=seq(-2.5, 10, by=1.25))+
    geom_vline(xintercept=seq(-2.5, 10, by=1.25),
               linetype='dashed', color='gray')+
    geom_vline(xintercept=0, linetype='dashed', color='red')+
    geom_pointrange(shape=21)

Figure 6: Estimated differences (contrasts) between temperatures with respective 95% confidence intervals.

Random effects


Here we look at the 45 estimated / predicted latent effects. They vary considerably.

ranef.mod <- tibble::as_tibble(ranef(mod))|>
    dplyr::select(!c(grpvar, term))|>
    dplyr::mutate(
               recipe     =gsub(':.*$', '', grp, perl=TRUE),
               replication=factor(gsub('^.:', '', grp), levels=1:15)
           )
summary(ranef.mod$condval)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -7.642  -3.800  -2.264   0.000   1.731  15.253 

We can also split them by recipe.

ranef.mod|>
    dplyr::group_by(recipe)|>
    dplyr::summarise(mean  =  mean(condval),
                     sd    =    sd(condval),
                     median=median(condval))
# A tibble: 3 × 4
  recipe   mean    sd median
  <chr>   <dbl> <dbl>  <dbl>
1 A       0.922  5.08 -0.881
2 B      -0.440  6.46 -2.57 
3 C      -0.481  6.92 -2.26 
p1 <-
    ranef.mod|>
    dplyr::mutate(recipe=paste('recipe:', recipe))|>
    ggplot()+
    aes(x=replication, y=condval)+
    geom_line(group=1, size=1)+
    facet_wrap(~ recipe)+
    theme(strip.text.x=element_text(face='bold', size=14))
p2 <-
    ranef.mod|>
    ggplot()+
    aes(x=replication, y=condval, group=recipe, color=recipe)+
    geom_vline(xintercept=unique(cake$replicate),
               linetype='dashed', color='gray')+
    geom_line(size=1)+
    theme(legend.position='bottom')
p1 / p2

Figure 7: Scatterplot of the predicted random effects according to replications per recipe.

We have already looked at the estimated breakage-angle means per replication and recipe. Here, to see which levels appear to have larger variability, we look at the predicted random effects. Again, there are no considerable differences between recipes, and larger values for the first replications. It's as if the first replications served as practice. Very interesting.

Final remarks


There is

  • no statistically significant difference between recipes in terms of the breakage angle of chocolate cakes;

  • no statistically significant interaction between recipes and baking temperatures in terms of the breakage angle of chocolate cakes;

  • a statistically significant difference between baking temperatures in terms of the breakage angle of chocolate cakes;

  • a non-negligible difference between recipe replications, i.e. the replications don't present an equal, purely random variance. The data ask for a model with a dependence / latent-effects structure.

We also addressed the note about the replicate numbering representing temporal ordering.

Basically, instead of considering the replication as a factor, we consider it as numeric in the model's latent effects. Say we call the numeric representation of the replication rep; to accommodate this new dependence structure we just need to enter the latent structure as

(rep | recipe)

into the {lme4::lmer()} model specification.
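For completeness, a sketch of this alternative fit (the variable name rep is ours; guarded so it only runs if {lme4} is available):

```r
# Alternative specification: replicate numbering as a numeric temporal index
if (requireNamespace('lme4', quietly=TRUE)) {
    data(cake, package='lme4')
    # convert the factor to its numeric labels (not its internal codes)
    cake$rep <- as.numeric(as.character(cake$replicate))
    mod.time <- lme4::lmer(angle ~ temperature + (rep | recipe), cake)
    print(summary(mod.time))
}
```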

We did it, and basically: the fixed-effect inferences don't change, and the model fit presents some numerical instability and convergence problems - issues we didn't have when treating the replication as a factor.

References


The analysis was performed with the R (R Core Team 2022) language and environment for statistical computing. The following R packages were used: {lme4} (Bates et al. 2015), {car} (Fox and Weisberg 2019), {tibble} (Müller and Wickham 2022), {dplyr} (Wickham et al. 2022), {ggplot2} (Wickham 2016), {marginaleffects} (Arel-Bundock 2022), {broom} (Robinson, Hayes, and Couch 2022), {equatiomatic} (Anderson, Heiss, and Sumners 2022) and {patchwork} (Pedersen 2020).

Anderson, Daniel, Andrew Heiss, and Jay Sumners. 2022. equatiomatic: Transform Models into ’LaTeX’ Equations. R package version 0.3.1. https://CRAN.R-project.org/package=equatiomatic.
Arel-Bundock, Vincent. 2022. marginaleffects: Marginal Effects, Marginal Means, Predictions, and Contrasts. R package version 0.5.0. https://CRAN.R-project.org/package=marginaleffects.
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48. https://doi.org/10.18637/jss.v067.i01.
Fox, John, and Sanford Weisberg. 2019. An R Companion to Applied Regression. Third Edition. Thousand Oaks CA: Sage. https://socialsciences.mcmaster.ca/jfox/Books/Companion/.
Müller, Kirill, and Hadley Wickham. 2022. tibble: Simple Data Frames. R package version 3.1.7. https://CRAN.R-project.org/package=tibble.
Pedersen, Thomas Lin. 2020. patchwork: The Composer of Plots. R package version 1.1.1. https://CRAN.R-project.org/package=patchwork.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Robinson, David, Alex Hayes, and Simon Couch. 2022. broom: Convert Statistical Objects into Tidy Tibbles. R package version 0.8.0. https://CRAN.R-project.org/package=broom.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4. https://ggplot2.tidyverse.org.
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2022. dplyr: A Grammar of Data Manipulation. R package version 1.0.9. https://CRAN.R-project.org/package=dplyr.