<p><a href="https://www.zeileis.org/">Achim Zeileis</a> (Achim.Zeileis@R-project.org), updated 2024-03-01T00:31:35+01:00. <br/>Research homepage of Achim Zeileis, Universität Innsbruck. <br/>Department of Statistics, Faculty of Economics and Statistics. <br/>Universitätsstr. 15, 6020 Innsbruck, Austria. <br/>Tel: +43/512/507-70403</p>
<h1><a href="https://www.zeileis.org/news/growth_curve_trees/">Subgroup detection in linear growth curve models</a></h1>
<p>2023-11-13</p>
<p>New arXiv working paper showing how generalized linear mixed effects model (GLMM) trees, along with their R implementation in the glmertree package, can be used to identify subgroups with differently shaped trajectories in linear growth curve models.</p>
<h2 id="citation">Citation</h2>
<p>Marjolein Fokkema, Achim Zeileis (2023). “Subgroup Detection in Linear Growth Curve Models with Generalized Linear Mixed Model (GLMM) Trees.” <em>arXiv.org E-Print Archive</em> arXiv:2309.05862 [stat.ME]. <a href="https://doi.org/10.48550/arXiv.2309.05862">doi:10.48550/arXiv.2309.05862</a></p>
<h2 id="abstract">Abstract</h2>
<p>Growth curve models are popular tools for studying the development of a response variable within subjects over time. Heterogeneity between subjects is common in such models, and researchers are typically interested in explaining or predicting this heterogeneity. We show how generalized linear mixed effects model (GLMM) trees can be used to identify subgroups with differently shaped trajectories in linear growth curve models. Originally developed for clustered cross-sectional data, GLMM trees are extended here to longitudinal data. 
The resulting extended GLMM trees are directly applicable to growth curve models as an important special case. In simulated and real-world data, we assess the performance of the extensions and compare against other partitioning methods for growth curve models. Extended GLMM trees perform more accurately than the original algorithm and LongCART, and similarly accurate as structural equation model (SEM) trees. In addition, GLMM trees allow for modeling both discrete and continuous time series, are less sensitive to (mis-)specification of the random-effects structure and are much faster to compute.</p> <p><a href="https://arxiv.org/pdf/2309.05862">Read full paper ›</a></p> <h2 id="software">Software</h2> <p><a href="https://CRAN.R-project.org/package=glmertree">https://CRAN.R-project.org/package=glmertree</a></p> <h2 id="illustration">Illustration</h2> <p>As an example, heterogeneity of science ability trajectories among a sample of 250 children is analyzed. The data are from the Early Childhood Longitudinal Study-Kindergarten (ECLS-K) class of 1998-1999 in the USA. Assessments took place from kindergarten in 1998 through 8th grade in 2007. Here we focus on assessments from kindergarten, 1st, 3rd, 5th, and 8th grade. The time since kindergarten was scaled to the number of months to the power of 2/3 in order to obtain approximately linear trajectories.</p> <p>A linear mixed-effect model tree is used to detect heterogeneity in a linear model for the growth of science ability over time. This employs a random intercept for each individual in order to account for the longitudinal nature of the data. 
The tree tests for differences in the baseline science abilities (i.e., the fixed-effect intercepts of the growth curve models) as well as the growth over time (i.e., the corresponding fixed-effect slopes), using eleven socio-demographic and behavioral characteristics of the children, assessed at baseline, as potential splitting variables.</p> <p>The plot below shows the resulting tree which identifies socio-economic status (SES), gross motor skills (GMOTOR), and internalizing problems (INTERN) as the splitting variables. The x-axes represent the number of months after the baseline assessment, y-axes represent science ability. Gray lines depict observed individual trajectories, red lines depict average growth curve within each terminal node, as estimated with a linear mixed-effect model comprising node-specific fixed effects of time and a random intercept with respect to individuals. The table presents numerical estimates of fixed intercepts and slopes.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-11-13-growth_curve_trees/lmertree.png"><img src="https://www.zeileis.org/assets/posts/2023-11-13-growth_curve_trees/lmertree.png" alt="Linear growth curve model tree for science ability among children." /></a></p> <p>Five subgroups are identified, corresponding to the terminal nodes of the tree, each with a different estimate of the fixed intercept and slope. Groups of children with higher SES also have higher intercepts, indicating higher average science ability. The group of children with lower SES (node 2) is further split based on gross motor skills, with higher motor skills resulting in a higher intercept. The group of children with intermediate levels of SES (node 6) is further split based on internalizing problems, with lower internalizing problems resulting in a higher intercept. 
The two groups (or nodes) with higher intercepts also have higher slopes, indicating that children with higher ability also gain more ability over time.</p>
<h1><a href="https://www.zeileis.org/news/fifawomen2023/">Probabilistic forecasting for the FIFA Women's World Cup 2023</a></h1>
<p>2023-07-17</p>
<p>Winning probabilities for all teams in the FIFA Women's World Cup are obtained using a consensus model based on quoted bookmakers' odds. The favorite is defending World Champion United States, followed by European Champion England, and Spain.</p>
<div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> Football fans around the world anticipate the FIFA Women's World Cup 2023, which will take place in Australia and New Zealand from 20 July to 20 August 2023. 32 of the world's best teams compete to determine the new World Champion. Here, a predictive model is established to forecast the most likely outcome of the tournament. The forecast is based on the expert knowledge of 24 bookmakers and betting exchanges using a model averaging approach. 
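<p>As a rough sketch of what such a model averaging step could look like, the R snippet below removes the bookmakers' margins from quoted odds, averages the resulting probabilities on the log-odds scale, and transforms back. The three teams, the two bookmakers, and all odds are made up for illustration, and the simple rescaling used to remove the margin is an assumption that need not match the adjustment behind the actual forecast.</p>
<pre><code class="language-{r}">## hypothetical decimal odds (rows: bookmakers, columns: teams)
odds <- rbind(
  bookmaker1 = c(A = 1.9, B = 3.5, C = 4.5),
  bookmaker2 = c(A = 2.0, B = 3.4, C = 4.2)
)

## implied probabilities sum to more than 1 because of the profit margin
raw <- 1 / odds
overround <- rowSums(raw) - 1

## assumed adjustment: rescale each bookmaker's probabilities to sum to 1
p_adj <- raw / rowSums(raw)

## average on the log-odds scale across bookmakers, then transform back
consensus <- colMeans(qlogis(p_adj))
p_win <- plogis(consensus)

## renormalize so that the consensus probabilities sum to 1
p_win <- p_win / sum(p_win)
round(p_win, 3)
</code></pre>
<p>Averaging on the log-odds scale rather than on the probability scale keeps the consensus from being dominated by a single bookmaker quoting an extreme probability for one team.</p>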
</div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.fifa.com/tournaments/womens/womensworldcup" alt="FIFA Women's World Cup 2023 web page"><img src="https://upload.wikimedia.org/wikipedia/en/2/24/Logo_of_the_2023_FIFA_Women%27s_World_Cup.svg" alt="FIFA Women's World Cup 2023 logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The model is the so-called bookmaker consensus model which has been proposed by Leitner, Hornik, and Zeileis (2010, <em>International Journal of Forecasting</em>, <a href="https://doi.org/10.1016/j.ijforecast.2009.10.001">doi:10.1016/j.ijforecast.2009.10.001</a>) and successfully applied in previous football tournaments, either by itself or in combination with even more refined <a href="https://www.zeileis.org/news/fifa2022/">machine learning techniques</a>.</p> <p>As in the <a href="https://www.zeileis.org/news/fifawomen2019/">FIFA Women’s World Cup 2019</a>, the forecast shows that the United States are the clear favorite with a forecasted winning probability of 21.5%, followed by England with a winning probability of 15.7% and Spain with 13.1%. Three other teams are still a bit ahead of the rest: Germany with 9.7%, France with 7.5%, and co-host Australia with 7.4%. More details are displayed in the following barchart.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_win.html"><img src="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>These probabilistic forecasts have been obtained by model-based averaging of the quoted winning odds for all teams across bookmakers. 
More precisely, the odds are first adjusted for the bookmakers’ profit margins (“overrounds”, on average 8.6%), averaged on the log-odds scale to a consensus rating, and then transformed back to winning probabilities. The raw bookmakers’ odds as well as the forecasts for all teams are also available in machine-readable form in <a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/wwc2023.csv">wwc2023.csv</a>.</p> <p>Although forecasting the winning probabilities for the FIFA Women’s World Cup 2023 is probably of most interest, the bookmaker consensus forecasts can also be employed to infer team-specific abilities using an “inverse” tournament simulation:</p> <ol> <li>If team abilities are available, pairwise winning probabilities can be derived for each possible match (see below).</li> <li>Given pairwise winning probabilities, the whole tournament can be easily simulated to see which team proceeds to which stage in the tournament and which team finally wins.</li> <li>Such a tournament simulation can then be run sufficiently often (here 100,000 times) to obtain relative frequencies for each team winning the tournament.</li> </ol> <p>Using this idea, abilities in step 1 can be chosen such that the simulated winning probabilities in step 3 closely match those from the bookmaker consensus shown above.</p> <h2 id="pairwise-comparisons">Pairwise comparisons</h2> <p>A classical approach to obtain winning probabilities in pairwise comparisons (i.e., matches between teams/players) is the Bradley-Terry model, which is similar to the Elo rating, popular in sports. 
The Bradley-Terry approach models the probability that Team A beats Team B in terms of their associated abilities (or strengths):</p> <math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><mrow><mi fontstyle="normal">Pr</mi><mo stretchy="false">(</mo><mi>A</mi><mtext> beats </mtext><mi>B</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub></mrow><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub><mo>+</mo><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>B</mi></mrow></msub></mrow></mfrac><mo>.</mo></mrow></mstyle></math>
<p>Coupled with the “inverse” simulation of the tournament, as described in steps 1-3 above, this yields pairwise probabilities for each possible match. The following heatmap shows the probabilistic forecasts for each match with light gray signalling approximately equal chances and green vs. purple signalling advantages for Team A or B, respectively.</p>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_match.html">Interactive full-width graphic</a></p>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_match.html"><img src="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_match.png" alt="Heatmap: Match probabilities" /></a></p>
<h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2>
<p>As every single match can be simulated with the pairwise probabilities above, it is also straightforward to simulate the entire tournament (here: 100,000 times), providing “survival” probabilities for each team across the different stages.</p>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_surv.html">Interactive full-width graphic</a></p>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_surv.html"><img 
src="https://www.zeileis.org/assets/posts/2023-07-17-fifawomen2023/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <p>For example, this shows that the probability for the United States to reach any stage of the tournament is higher than for any other team to reach the same stage. In fact, their survival probabilities are decreasing rather slowly because they can most likely avoid the other favorites for the title until the semifinal. Conversely, Germany’s chances to reach the round of 16 are almost as high (87.6%) as those of the United States but their chances to reach the quarterfinal are much lower (55.7%) because they are most likely to play the strongest expected runner-up, Brazil, in the round of 16.</p> <p>In addition to the curves shown in the plot above, further probabilities of interest can be obtained from the simulation. For example, the probability for the “dream final” between the top favorites, World Champion United States and European Champion England, is 9.1%. The most likely first semi-final is between the United States and Spain with a probability of 13.5%. For the second semi-final it is less clear who is the most likely opponent of England because there are three possible pairings with almost the same probability (around 7%): Against Australia, France, or Germany. This shows that this half of the tournament tree is somewhat more contested with a less certain outcome.</p> <h2 id="odds-and-ends">Odds and ends</h2> <p>The bookmaker consensus model has performed well in previous tournaments, often predicting winners or finalists correctly. However, all forecasts are probabilistic, clearly below 100%, and thus by no means certain. It would also be possible to post-process the bookmaker consensus along with data from historic matches, player ratings, and other information about the teams using <a href="https://www.zeileis.org/news/fifa2022/">machine learning techniques</a>. 
However, due to lack of time for more refined forecasts at the end of a busy academic year, at least the bookmaker consensus is provided as a solid basic forecast.</p>
<p>As a final remark: Betting on the outcome based on the results presented here is not recommended. This is not only because the winning probabilities are clearly far below 100% but, more importantly, because the bookmakers have a profit margin of 8.6%, which ensures that the best chances of making money based on sports betting lie with them.</p>
<p>Enjoy the FIFA Women’s World Cup 2023!</p>
<h1><a href="https://www.zeileis.org/news/coat/">Tree models for assessing covariate-dependent method agreement</a></h1>
<p>2023-07-12</p>
<p>New arXiv working paper introducing conditional method agreement trees (COAT) which can capture the dependency of a Bland-Altman analysis on covariates. It is accompanied by an R implementation in the CRAN package coat.</p>
<h2 id="citation">Citation</h2>
<p>Siranush Karapetyan, Achim Zeileis, André Henriksen, Alexander Hapfelmeier (2023). “Tree Models for Assessing Covariate-Dependent Method Agreement.” <em>arXiv.org E-Print Archive</em> arXiv:2306.04456 [stat.ME]. <a href="https://doi.org/10.48550/arXiv.2306.04456">doi:10.48550/arXiv.2306.04456</a></p>
<h2 id="abstract">Abstract</h2>
<p>Method comparison studies explore the agreement of measurements made by two or more methods. Commonly, agreement is evaluated by the well-established Bland-Altman analysis. However, the underlying assumption is that differences between measurements are identically distributed for all observational units and in all application settings. 
We introduce the concept of conditional method agreement and propose a respective modeling approach to alleviate this constraint. Therefore, the Bland-Altman analysis is embedded in the framework of recursive partitioning to explicitly define subgroups with heterogeneous agreement in dependence of covariates in an exploratory analysis. Three different modeling approaches, conditional inference trees with an appropriate transformation of the modeled differences (CTreeTrafo), distributional regression trees (DistTree), and model-based trees (MOB) are considered. The performance of these models is evaluated in terms of type-I error probability and power in several simulation studies. Further, the adjusted rand index (ARI) is used to quantify the models’ ability to uncover given subgroups. An application example to real data of accelerometer device measurements is used to demonstrate the applicability. Additionally, a two-sample Bland-Altman test is proposed for exploratory or confirmatory hypothesis testing of differences in agreement between subgroups. Results indicate that all models were able to detect given subgroups with high accuracy as the sample size increased. Relevant covariates that may affect agreement could be detected in the application to accelerometer data. We conclude that conditional method agreement trees (COAT) enable the exploratory analysis of method agreement in dependence of covariates and the respective exploratory or confirmatory hypothesis testing of group differences. 
It is made publicly available through the R package coat.</p> <p><a href="https://arxiv.org/pdf/2306.04456">Read full paper ›</a></p> <h2 id="links">Links</h2> <p>R package: <a href="https://CRAN.R-project.org/package=coat">https://CRAN.R-project.org/package=coat</a></p> <p>Presentation slides: <a href="https://www.zeileis.org/papers/Psychoco-2023.pdf">Psychoco 2023</a></p> <h2 id="illustration">Illustration</h2> <p>The paper presents an illustration in which measurements of activity energy expenditure (in 24 hours) from two different accelerometers (ActiGraph vs. Actiheart) are compared and their dependence on age, gender, weight, etc. is assessed. As the data is not freely available, we show below another illustration taken from the <a href="https://CRAN.R-project.org/package=MethComp">MethComp</a> package.</p> <p>The <code class="language-plaintext highlighter-rouge">scint</code> data provides measurements of the relative kidney function (renal function, percent of total) for 111 patients. The reference method is DMSA static scintigraphy and it is compared here with DTPA dynamic scintigraphy. 
The question we aim to answer using the new COAT method is:</p>
<p><em>Does the agreement between DTPA and DMSA depend on the age and/or the gender of the patient?</em></p>
<p>First, the package and the data are loaded, and the data are reshaped to wide format:</p>
<pre><code class="language-{r}">library("coat")
data("scint", package = "MethComp")
scint_wide <- reshape(scint, v.names = "y", timevar = "meth", idvar = "item", direction = "wide")
</code></pre>
<p>Then, COAT can be applied using the <code class="language-plaintext highlighter-rouge">coat()</code> function, by default leveraging <code class="language-plaintext highlighter-rouge">ctree()</code> from the <a href="https://CRAN.R-project.org/package=partykit">partykit</a> package in the background:</p>
<pre><code class="language-{r}">tr1 <- coat(y.DTPA + y.DMSA ~ age + sex, data = scint_wide)
print(tr1)
## Conditional method agreement tree (COAT)
##
## Model formula:
## y.DTPA + y.DMSA ~ age + sex
##
## Fitted party:
## [1] root
## |   [2] age <= 35: Bias = -0.49, SD = 3.42
## |   [3] age > 35: Bias = 0.25, SD = 7.04
##
## Number of inner nodes: 1
## Number of terminal nodes: 2
</code></pre>
<p>This shows that the measurement differences between the two scintigraphies vary clearly between young and old patients. While the average difference between the measurements (bias) is close to zero for both age groups, the corresponding standard deviation (SD) is substantially larger (and hence the limits of agreement wider) for the older subgroup. 
This is better brought out graphically by the corresponding tree display with the classical <a href="https://en.wikipedia.org/wiki/Bland%E2%80%93Altman_plot">Bland-Altman plots</a> in the terminal nodes.</p>
<pre><code class="language-{r}">plot(tr1)
</code></pre>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-12-coat/scint1.png"><img src="https://www.zeileis.org/assets/posts/2023-07-12-coat/scint1.png" alt="COAT model tree for kidney function measurements where agreement depends on the age of the patients." /></a></p>
<p>As the Bland-Altman plot for the older subgroup suggests that the bias between the methods may also depend on the mean measurement, we fit a second COAT tree. In addition to age and gender we also include the mean renal function measurement from DTPA and DMSA as a third potential split variable.</p>
<pre><code class="language-{r}">tr2 <- coat(y.DTPA + y.DMSA ~ age + sex, data = scint_wide, means = TRUE)
print(tr2)
## Conditional method agreement tree (COAT)
##
## Model formula:
## y.DTPA + y.DMSA ~ age + sex
##
## Fitted party:
## [1] root
## |   [2] means(y.DTPA, y.DMSA) <= 31: Bias = 4.80, SD = 6.61
## |   [3] means(y.DTPA, y.DMSA) > 31
## |   |   [4] means(y.DTPA, y.DMSA) <= 53.5: Bias = -0.38, SD = 3.33
## |   |   [5] means(y.DTPA, y.DMSA) > 53.5: Bias = -4.27, SD = 3.90
##
## Number of inner nodes: 2
## Number of terminal nodes: 3
plot(tr2)
</code></pre>
<p><a href="https://www.zeileis.org/assets/posts/2023-07-12-coat/scint2.png"><img src="https://www.zeileis.org/assets/posts/2023-07-12-coat/scint2.png" alt="COAT model tree for kidney function measurements where agreement depends on the mean measurements." 
/></a></p>
<p>This tree reveals three subgroups. Only the middle group (with renal function between 31 and 53.5 percent) has both a small bias and a small standard deviation for the scintigraphy differences, while for the other two subgroups the bias and/or the standard deviation are larger.</p>
<h1><a href="https://www.zeileis.org/news/ctv/">CRAN Task Views: The next generation</a></h1>
<p>2023-05-31</p>
<p>New arXiv working paper on the relaunch of the CRAN Task View Initiative providing better infrastructure and workflows for proposing and maintaining CRAN Task Views and fostering interactions with the R community.</p>
<h2 id="citation">Citation</h2>
<p>Achim Zeileis, Roger Bivand, Dirk Eddelbuettel, Kurt Hornik, Nathalie Vialaneix (2023). “CRAN Task Views: The Next Generation.” <em>arXiv.org E-Print Archive</em> arXiv:2305.17573 [stat.CO]. <a href="https://doi.org/10.48550/arXiv.2305.17573">doi:10.48550/arXiv.2305.17573</a></p>
<h2 id="abstract">Abstract</h2>
<div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> CRAN Task Views have been available on the Comprehensive R Archive Network since 2005. They provide guidance about which CRAN packages are relevant for tasks related to a certain topic, and can also facilitate automatic installation of all corresponding packages. Motivated by challenges from the growth of CRAN and the R community as a whole since 2005, all of the task views infrastructure and workflows were rethought and relaunched in 2021/22 in order to facilitate maintenance, and to foster deeper interactions with the R community. 
The redesign encompasses the establishment of a group of CRAN Task View Editors, moving all task view sources to dedicated GitHub repositories, adopting well-documented workflows with a code of conduct, and leveraging R/Markdown files (rather than XML) for the content of the task views. </div> <div class="small-4 medium-3 large-2 columns"> <a href="https://github.com/cran-task-views/ctv/" alt="The CRAN Task View Initiative"><img src="https://www.zeileis.org/assets/posts/2023-05-31-ctv/logo_alpha.png" alt="Logo: CRAN Task Views" /></a> </div> </div>
<p><a href="https://arxiv.org/pdf/2305.17573">Read full paper ›</a></p>
<h2 id="links">Links</h2>
<p>CRAN Task Views: <a href="https://CRAN.R-project.org/web/views/">https://CRAN.R-project.org/web/views/</a></p>
<p>CRAN Task View Initiative: <a href="https://github.com/cran-task-views/ctv/">https://github.com/cran-task-views/ctv/</a></p>
<p>R package: <a href="https://CRAN.R-project.org/package=ctv">https://CRAN.R-project.org/package=ctv</a></p>
<h1><a href="https://www.zeileis.org/news/simulate_cvd/">Color vision deficiency emulation fixed in colorspace 2.1-0</a></h1>
<p>2023-05-08</p>
<p>The color vision deficiency emulation provided by R package colorspace was inaccurate for some highly-saturated colors due to a bug that was fixed in version 2.1-0. 
The (typically small) differences are illustrated for a range of palettes.</p>
<h2 id="background">Background</h2>
<p>Functions for emulating <a href="https://en.wikipedia.org/wiki/Color_blindness">color vision deficiencies</a> have been part of the R package <a href="http://colorspace.R-Forge.R-project.org/">colorspace</a> for several years now (since the release of version 1.4-0 in January 2019). They are crucial for assessing how well data visualizations work for viewers affected by color vision deficiencies (about 8% of all males and 0.5% of all females) and for illustrating problems with <a href="http://colorspace.R-Forge.R-project.org/articles/endrainbow.html">poor color choices</a>.</p>
<p>The <code class="language-plaintext highlighter-rouge">colorspace</code> package implements the physiologically-based model of <a href="https://doi.org/10.1109/TVCG.2009.113">Machado, Oliveira, and Fernandes (2009)</a> who provide a unified approach to various forms of deficiencies, in particular encompassing deuteranomaly (green cone cells defective), protanomaly (red cone cells defective), and tritanomaly (blue cone cells defective). See the <a href="http://colorspace.R-Forge.R-project.org/articles/color_vision_deficiency.html">corresponding package vignette</a> for more details.</p>
<h2 id="bug-and-fix">Bug and fix</h2>
<p>Recently, an inaccuracy in the <code class="language-plaintext highlighter-rouge">colorspace</code> implementation of the Machado <em>et al.</em> method was reported by Matthew Petroff and fixed in <code class="language-plaintext highlighter-rouge">colorspace</code> 2.1-0 (released earlier this year) with some advice and guidance from Kenneth Knoblauch.</p>
<p>More specifically, Machado <em>et al.</em> provide linear transformations of RGB (red-green-blue) coordinates that simulate the different color vision deficiencies. 
Following some illustrations from the supplementary materials of Machado <em>et al.</em>, earlier versions of the <code class="language-plaintext highlighter-rouge">colorspace</code> package had applied the transformations directly to gamma-corrected sRGB coordinates that can be obtained from color hex codes. However, the paper implicitly relies on a linear RGB space (see page 1294, column 1) where the linear matrix transformations for simulating color vision deficiencies should be applied. Therefore, a new argument <code class="language-plaintext highlighter-rouge">linear = TRUE</code> has been added to <code class="language-plaintext highlighter-rouge">simulate_cvd()</code> (and hence in <code class="language-plaintext highlighter-rouge">deutan()</code>, <code class="language-plaintext highlighter-rouge">protan()</code>, and <code class="language-plaintext highlighter-rouge">tritan()</code>) that first maps the provided colors to linearized RGB coordinates, applies the color vision deficiency transformation, and then maps back to gamma-corrected sRGB coordinates. Optionally, <code class="language-plaintext highlighter-rouge">linear = FALSE</code> can be used to restore the behavior from previous versions where the transformations are applied directly to the sRGB coordinates.</p> <h2 id="illustration">Illustration</h2> <p>For most colors the difference between the two strategies (in linear vs. 
gamma-corrected RGB coordinates) is negligible but for some highly-saturated colors it becomes more noticeable, e.g., for red, purple, or orange.</p>
<p>To illustrate this we set up a small convenience function <code class="language-plaintext highlighter-rouge">cvd_compare()</code> that contrasts both approaches for all three types of color vision deficiencies using the <a href="http://colorspace.R-Forge.R-project.org/reference/swatchplot.html">swatchplot()</a> function from <code class="language-plaintext highlighter-rouge">colorspace</code>.</p>
<pre><code class="language-{r}">cvd_compare <- function(pal) {
  x <- list(
    "Original" = rbind(pal),
    "Deutan" = rbind(
      "linear = TRUE " = colorspace::deutan(pal, linear = TRUE),
      "linear = FALSE" = colorspace::deutan(pal, linear = FALSE)
    ),
    "Protan" = rbind(
      "linear = TRUE " = colorspace::protan(pal, linear = TRUE),
      "linear = FALSE" = colorspace::protan(pal, linear = FALSE)
    ),
    "Tritan" = rbind(
      "linear = TRUE " = colorspace::tritan(pal, linear = TRUE),
      "linear = FALSE" = colorspace::tritan(pal, linear = FALSE)
    )
  )
  rownames(x$Original) <- deparse(substitute(pal))
  colorspace::swatchplot(x)
}
</code></pre>
<p>Subsequently, we apply this function to a selection of <a href="https://www.zeileis.org/news/coloring/">new base R palettes</a> that have been available since R 4.0.0 in functions <code class="language-plaintext highlighter-rouge">palette.colors()</code> and <code class="language-plaintext highlighter-rouge">hcl.colors()</code>. 
First, it is shown that for many palettes the two strategies lead to almost equivalent output: e.g., for the default qualitative palette in <code class="language-plaintext highlighter-rouge">palette.colors()</code>, Okabe-Ito (excluding black and gray), and the default sequential palette in <code class="language-plaintext highlighter-rouge">hcl.colors()</code>, Viridis.</p>
<pre><code class="language-{r}">cvd_compare(palette.colors()[2:8])
cvd_compare(hcl.colors(7))
</code></pre>
<p><a href="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_okabeito.svg"><img src="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_okabeito.svg" alt="Comparison of color vision deficiency emulations for Okabe-Ito palette" /></a> <a href="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_viridis.svg"><img src="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_viridis.svg" alt="Comparison of color vision deficiency emulations for Viridis palette" /></a></p>
<p>The comparison shows that both emulations lead to very similar output, bringing out clearly that both palettes are rather robust under color vision deficiencies.</p>
<p>However, for palettes with more flashy colors (especially highly-saturated red, purple, or orange) the differences may be noticeable and practically relevant. 
This is illustrated using two sequential HCL palettes, PuRd (inspired by ColorBrewer.org) and Rocket (from the Viridis family):</p>
<pre><code class="language-{r}">cvd_compare(hcl.colors(7, "PuRd"))
cvd_compare(hcl.colors(7, "Rocket"))
</code></pre>
<p><a href="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_purd.svg"><img src="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_purd.svg" alt="Comparison of color vision deficiency emulations for PuRd palette" /></a> <a href="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_rocket.svg"><img src="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_rocket.svg" alt="Comparison of color vision deficiency emulations for Rocket palette" /></a></p>
<p>The comparison shows that the emulation differs in particular for colors 2, 3, and 4 in both palettes, leading to slightly different insights regarding the properties of the palettes.</p>
<p>The differences can become even more pronounced for fully-saturated colors like those in the infamous rainbow palette, shown below.</p>
<pre><code class="language-{r}">cvd_compare(rainbow(7))
</code></pre>
<p><a href="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_rainbow.svg"><img src="https://www.zeileis.org/assets/posts/2023-05-08-simulate_cvd/cvd_compare_rainbow.svg" alt="Comparison of color vision deficiency emulations for rainbow palette" /></a></p>
<p>Luckily, for palettes with better perceptual properties the differences between the old erroneous version and the new fixed one are typically rather small. 
Hence, we hope that the bug did not affect prior work too much and that the fixed version is even more useful for all users of the package.</p>2023-05-08T00:00:00+02:00https://www.zeileis.org/news/coloring/Coloring in R's blind spot2023-05-05T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/New arXiv working paper on the new color palette functions palette.colors() and hcl.colors() in base R since version 4.0.0.<p>New arXiv working paper on the new color palette functions palette.colors() and hcl.colors() in base R since version 4.0.0.</p> <h2 id="citation">Citation</h2> <p>Achim Zeileis, Paul Murrell (2023). “Coloring in R’s Blind Spot.” <em>arXiv.org E-Print Archive</em> arXiv:2303.04918 [stat.CO]. <a href="https://doi.org/10.48550/arXiv.2303.04918">doi:10.48550/arXiv.2303.04918</a></p> <h2 id="abstract">Abstract</h2> <p>Prior to version 4.0.0 R had a poor default color palette (using highly saturated red, green, blue, etc.) and provided very few alternative palettes, most of which also had poor perceptual properties (like the infamous rainbow palette). Starting with version 4.0.0 R gained a new and much improved default palette and, in addition, a selection of more than 100 well-established palettes are now available via the functions <code class="language-plaintext highlighter-rouge">palette.colors()</code> and <code class="language-plaintext highlighter-rouge">hcl.colors()</code>. The former provides a range of popular qualitative palettes for categorical data while the latter closely approximates many popular sequential and diverging palettes by systematically varying the perceptual hue, chroma, luminance (HCL) properties in the palette. 
This paper provides an overview of these new color functions and the palettes they provide along with advice about which palettes are appropriate for specific tasks, especially with regard to making them accessible to viewers with color vision deficiencies.</p> <h2 id="software">Software</h2> <p>Package <code class="language-plaintext highlighter-rouge">grDevices</code> in base <a href="https://www.R-project.org/">R</a> provides <code class="language-plaintext highlighter-rouge">palette.colors()</code> and <code class="language-plaintext highlighter-rouge">hcl.colors()</code> and accompanying functionality since version R 4.0.0.</p> <p>Package <code class="language-plaintext highlighter-rouge">colorspace</code> (<a href="https://CRAN.R-project.org/package=colorspace">CRAN</a>, <a href="https://colorspace.R-Forge.R-project.org/">Web page</a>) provides color vision deficiency emulation along with many other color tools. See also below for the recent bug fix in color vision deficiency emulation.</p> <p>Replication code: <a href="https://www.zeileis.org/assets/posts/2023-05-05-coloring/coloring.R">coloring.R</a>, <a href="https://www.zeileis.org/assets/posts/2023-05-05-coloring/paletteGrid.R">paletteGrid.R</a></p> <h2 id="highlights">Highlights</h2> <p>The table below provides an overview of the new base R palette functionality: For each main type of palette, the <em>Purpose</em> row describes what sort of data the type of palette is appropriate for, the <em>Generate</em> row gives the functions that can be used to generate palettes of that type, the <em>List</em> row names the functions that can be used to list available palettes, and the <em>Robust</em> row identifies two or three good default palettes of that type.</p> <table> <thead> <tr> <th style="text-align: left"> </th> <th style="text-align: left">Qualitative</th> <th style="text-align: left">Sequential</th> <th style="text-align: left">Diverging</th> </tr> </thead> <tbody> <tr> <td style="text-align: 
left"><em>Purpose</em></td> <td style="text-align: left">Categorical data</td> <td style="text-align: left">Ordered or numeric data<br />(high → low)</td> <td style="text-align: left">Ordered or numeric with central value<br />(high ← neutral → low)</td> </tr> <tr> <td style="text-align: left"><em>Generate</em></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">palette.colors()</code>,<br /><code class="language-plaintext highlighter-rouge">hcl.colors()</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">hcl.colors()</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">hcl.colors()</code></td> </tr> <tr> <td style="text-align: left"><em>List</em></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">palette.pals()</code>,<br /><code class="language-plaintext highlighter-rouge">hcl.pals("qualitative")</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">hcl.pals("sequential")</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">hcl.pals("diverging")</code>,<br /><code class="language-plaintext highlighter-rouge">hcl.pals("divergingx")</code></td> </tr> <tr> <td style="text-align: left"><em>Robust</em></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">"Okabe-Ito"</code>, <code class="language-plaintext highlighter-rouge">"R4"</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">"Blues 3"</code>, <code class="language-plaintext highlighter-rouge">"YlGnBu"</code>, <code class="language-plaintext highlighter-rouge">"Viridis"</code></td> <td style="text-align: left"><code class="language-plaintext highlighter-rouge">"Purple-Green"</code>,<br /><code class="language-plaintext highlighter-rouge">"Blue-Red 3"</code></td> </tr> </tbody> </table> <p>Based on this, the color defaults in 
base R were adapted. In particular, the old default palette was replaced by the <code class="language-plaintext highlighter-rouge">"R4"</code> palette, using very similar hues but avoiding the garish colors with extreme variations in brightness (see below for an example).</p> <p>Recently, the recommended package <a href="https://CRAN.R-project.org/package=lattice">lattice</a> also changed its default color theme (in version 0.21-8), using the qualitative <code class="language-plaintext highlighter-rouge">"Okabe-Ito"</code> palette as the symbol and fill color and the sequential <code class="language-plaintext highlighter-rouge">"YlGnBu"</code> palette for shading regions.</p> <h2 id="qualitative-palettes-in-palettecolors">Qualitative palettes in palette.colors</h2> <p>All palettes provided by the <code class="language-plaintext highlighter-rouge">palette.colors()</code> function are shown below (except the old default <code class="language-plaintext highlighter-rouge">"R3"</code> palette which is only implemented for backward compatibility).</p> <p><a href="https://www.zeileis.org/assets/posts/2023-05-05-coloring/palette-colors.png"><img src="https://www.zeileis.org/assets/posts/2023-05-05-coloring/palette-colors.png" alt="Qualitative palettes provided in palette.colors()" /></a></p> <p>Lighter palettes are typically more useful for shading areas, e.g., in bar plots or similar displays. Darker and more colorful palettes are usually better for coloring points or lines. The palettes <code class="language-plaintext highlighter-rouge">"R4"</code> and <code class="language-plaintext highlighter-rouge">"Okabe-Ito"</code> are particularly noteworthy because they have been designed to be reasonably robust under color vision deficiencies.</p> <p>This is illustrated in a time series line plot of the base R <code class="language-plaintext highlighter-rouge">EuStockMarkets</code> data. 
The three rows show different <code class="language-plaintext highlighter-rouge">palette.colors()</code> palettes: The old <code class="language-plaintext highlighter-rouge">"R3"</code> default palette (top), the new <code class="language-plaintext highlighter-rouge">"R4"</code> default palette (middle), and the <code class="language-plaintext highlighter-rouge">"Okabe-Ito"</code> palette (bottom). The columns contrast normal vision (left) and emulated deuteranope vision (right), the most common type of color vision deficiency. A color legend is used in the first row and direct labels in the other rows.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-05-05-coloring/EuStockMarkets.png"><img src="https://www.zeileis.org/assets/posts/2023-05-05-coloring/EuStockMarkets.png" alt="Illustration of qualitative palettes" /></a></p> <p>We can see that the <code class="language-plaintext highlighter-rouge">"R3"</code> colors are highly saturated and they vary in luminance (brightness). For example, the cyan line is noticeably lighter than the others. Furthermore, for deuteranope viewers, the CAC and the SMI lines are difficult to distinguish from each other (exacerbated by the use of a color legend that makes matching the lines to labels almost impossible). Moreover, the FTSE line is more difficult to distinguish from the white background, compared to the other lines. The <code class="language-plaintext highlighter-rouge">"R4"</code> palette is an improvement: the luminance is more even and the colors are less saturated, plus the colors are more distinguishable for deuteranope viewers (aided by the use of direct color labels instead of a legend). 
The <code class="language-plaintext highlighter-rouge">"Okabe-Ito"</code> palette works even better, particularly for deuteranope viewers.</p> <h2 id="sequential-and-diverging-palettes-in-hclcolors">Sequential and diverging palettes in hcl.colors</h2> <p>In addition to qualitative palettes, the <code class="language-plaintext highlighter-rouge">hcl.colors()</code> function provides a wide range of sequential and diverging palettes designed for numeric or ordered data without or with a neutral reference value, respectively. There are more than 100 such palettes, many of which closely approximate palettes from well-established sources such as ColorBrewer.org, the Viridis family, CARTO colors, or Crameri’s scientific colors. The graphic below depicts just a subset of the multi-hue sequential palettes for illustration.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-05-05-coloring/hcl-colors.png"><img src="https://www.zeileis.org/assets/posts/2023-05-05-coloring/hcl-colors.png" alt="Some of the multi-hue sequential palettes provided in hcl.colors()" /></a></p> <p>Some empirical examples and more insights are provided in the working paper linked above.</p>2023-05-05T00:00:00+02:00https://www.zeileis.org/news/lightning_amplification/Amplification of Lightning in the European Alps 1980-20192023-05-04T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Detailed measurements of lightning as well as reanalyses of atmospheric conditions enable the reconstruction of lightning probabilities over large spatial and temporal domains. Using flexible additive regression models it is shown that lightning activity in the high European Alps has doubled from the 1980s to the 2010s.<p>Detailed measurements of lightning as well as reanalyses of atmospheric conditions enable the reconstruction of lightning probabilities over large spatial and temporal domains. 
Using flexible additive regression models it is shown that lightning activity in the high European Alps has doubled from the 1980s to the 2010s.</p> <h2 id="citation">Citation</h2> <p>Thorsten Simon, Georg J. Mayr, Deborah Morgenstern, Nikolaus Umlauf, Achim Zeileis (2023). “Amplification of Annual and Diurnal Cycles of Alpine Lightning.” <em>Climate Dynamics</em>, Forthcoming. <a href="https://doi.org/10.1007/s00382-023-06786-8">doi:10.1007/s00382-023-06786-8</a></p> <h2 id="abstract">Abstract</h2> <p>The response of lightning to a changing climate is not fully understood. Historic trends of proxies known for fostering convective environments suggest an increase of lightning over large parts of Europe. Since lightning results from the interaction of processes on many scales, as many of these processes as possible must be considered for a comprehensive answer. Recent achievements of decade-long seamless lightning measurements and hourly reanalyses of atmospheric conditions including cloud micro-physics combined with flexible regression techniques have made a reliable reconstruction of cloud-to-ground lightning down to its seasonally varying diurnal cycle feasible. The European Eastern Alps and their surroundings are chosen as reconstruction region since this domain includes a large variety of land-cover, topographical and atmospheric circulation conditions. The most intense changes over the four decades from 1980 to 2019 occurred over the high Alps where lightning activity doubled in the 2010s compared to the 1980s. There, the lightning season reaches a higher maximum and starts one month earlier. Diurnally, the peak is up to 50% stronger with more lightning strikes in the afternoon and evening hours. 
Signals along the southern and northern alpine rim are similar but weaker whereas the flatlands surrounding the Alps have no significant trend.</p> <h2 id="software">Software</h2> <p>R packages <code class="language-plaintext highlighter-rouge">bamlss</code> (<a href="https://CRAN.R-project.org/package=bamlss">CRAN</a>, <a href="http://www.bamlss.org/">Web page</a>) and <code class="language-plaintext highlighter-rouge">mgcv</code> (<a href="https://CRAN.R-project.org/package=mgcv">CRAN</a>).</p> <h2 id="highlights">Highlights</h2> <p>The study links two sources of information which are both available in a spatio-temporal resolution of 32 km x 32 km and one hour:</p> <ol> <li>Measurements from the lightning location system ALDIS, available in homogeneous quality for the period 2010-2019.</li> <li>40 single-level atmospheric parameters from ECMWF’s fifth reanalysis (ERA5), available from 1980 onward, along with 45 further atmospheric variables derived from vertical profiles etc.</li> </ol> <p>The idea is to learn the link between the lightning observations and the ERA5 atmospheric parameters on the time period where both data sources are available (2010-2019). Subsequently, probabilistic predictions can be made for lightning occurrence on the entire time period starting in 1980, i.e., including the period where only atmospheric parameters but no high-quality lightning detection observations are available. This then allows tracking how the probability for lightning occurrence has evolved over the decades, both in terms of the annual seasonal cycles and the diurnal cycle.</p> <p>The probabilistic model learned on this challenging data set is a generalized additive model (GAM) using a binary logit link and smooth spline terms for all explanatory variables based on the atmospheric parameters and additional spatio-temporal information. 
In order to deal with variable selection due to the large number of explanatory variables, the model is estimated by gradient boosting (as opposed to the classical maximum likelihood technique) combined with stability selection. These have been implemented using the R packages <code class="language-plaintext highlighter-rouge">mgcv</code> and <code class="language-plaintext highlighter-rouge">bamlss</code>.</p> <p>Based on the probabilistic predictions from this boosted binary GAM, the figure below shows reconstructed annual cycles of probabilities for lightning events averaged over the four decades from the 1980s to the 2010s (color coded). The light curves in the background are aggregations to the day of the year. The dark curves in the foreground are smoothed versions of the light curves. This shows that the peak in summer is much more pronounced and starts earlier for the High Alps and the Southern Alpine rim while there are only minor changes at the Northern Alpine rim and the surrounding flatlands.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-05-04-lightning_amplification/cycle-seasonal.png"><img src="https://www.zeileis.org/assets/posts/2023-05-04-lightning_amplification/cycle-seasonal.png" alt="Seasonal cycles of reconstructed lightning probabilities over four decades" /></a></p> <p>To aggregate these changes even further and capture climate changes, linear trends are fitted to the reconstructed probabilities for June (afternoons, 13-19 UTC) over time. The figure below shows the spatial distribution of these linear climate trends: Color luminance gives the slope per decade of a linear regression for mean probability of lightning within an hour in percent. Desaturated colors in the grids indicate that the linear trends for these grids are not significant at the 5% level. 
Again, this highlights the pronounced changes in the High Alps and the Southern Alpine rim while there are no significant changes in the surrounding flatlands.</p> <p><a href="https://www.zeileis.org/assets/posts/2023-05-04-lightning_amplification/map-slopes.png"><img src="https://www.zeileis.org/assets/posts/2023-05-04-lightning_amplification/map-slopes.png" alt="Map of linear climate change for reconstructed lightning probabilities" /></a></p> <p>For more details and further insights see the full paper linked above.</p>2023-05-04T00:00:00+02:00https://www.zeileis.org/news/fifa2022/Machine learning of a 2022 FIFA World Cup multiverse2022-11-14T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Probabilistic forecasts for the 2022 FIFA World Cup are obtained by using a hybrid model that combines data from three advanced statistical models through random forests. The favorite is Brazil, followed by Argentina, Netherlands, Germany, and France.<p>Probabilistic forecasts for the 2022 FIFA World Cup are obtained by using a hybrid model that combines data from three advanced statistical models through random forests. The favorite is Brazil, followed by Argentina, Netherlands, Germany, and France.</p> <div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> The 2022 FIFA World Cup will take place in Qatar from 20 November to 18 December 2022. 32 of the best teams from all around the world compete to determine the new World Champion. Although the event is overshadowed by many issues, both ethical and sportive, we decided for scientific purposes to employ our machine learning approach that we successfully used in previous tournaments for making probabilistic forecasts. More specifically, our approach yields probabilistic forecasts for all possible matches which can then be used to explore the likely course of the tournament along with its most likely champion by simulation. 
</div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.fifa.com/fifaplus/en/tournaments/mens/worldcup/qatar2022" alt="2022 FIFA World Cup web page"><img src="https://upload.wikimedia.org/wikipedia/en/e/e3/2022_FIFA_World_Cup.svg" alt="2022 FIFA World Cup logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The forecast is based on a conditional inference random forest learner that blends information capturing the past, present, and future of the competing football teams: <em>Insights from the past</em> are captured in an ability estimate for every team based on historic matches. <em>Expectations about the future</em> in the upcoming tournament are captured in an ability estimate for every team based on odds from international bookmakers. <em>The present status</em> of the teams (and their countries) is represented by covariates such as market value or the types of players in the team as well as country-specific socio-economic factors like population or GDP. The random forest model is learned using the previous five FIFA World Cup tournaments from 2002 to 2018 as training data and then applied to current information to obtain a forecast for the 2022 FIFA World Cup. More precisely, the random forest is calibrated to predict the likely distribution of goals for each team in all possible matches in the tournament. This allows simulating the outcome of each match in normal time as well as potential extra time and penalties in order to obtain probabilities for a <em>win</em>, <em>draw</em>, or <em>loss</em>. Moreover, because every individual match can be simulated like that, a “multiverse” of potential courses of the entire tournament can be created yielding overall winning probabilities for each team. 
The results show that - 20 years after winning the title the last time - Brazil is the clear favorite for the World Cup with a winning probability of 15.0%, followed by Argentina with 11.2%, the Netherlands with 9.7%, Germany with 9.2%, and France with 9.1%. The winning probabilities for all teams are shown in the barchart below with more information linked in the interactive full-width version.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_win.html"><img src="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>The full study has been conducted by an international team of researchers: <a href="https://www.statistik.tu-dortmund.de/groll.html">Andreas Groll</a>, <a href="https://de.linkedin.com/in/neele-hormann-70164123a">Neele Hormann</a>, <a href="https://wwwfr.uni.lu/recherche/fstm/dmath/people/christophe_ley">Christophe Ley</a>, <a href="https://www.sg.tum.de/epidemiologie/team/schauberger/">Gunther Schauberger</a>, <a href="https://biblio.ugent.be/person/2C617710-F0EE-11E1-A9DE-61C894A0A6B4">Hans Van Eetvelde</a>, <a href="https://www.zeileis.org/">Achim Zeileis</a>. The core of the contribution is a hybrid approach that starts out from three state-of-the-art forecasting methods, based on disparate sets of information, and lets an adaptive machine learning model decide how to blend the different sources of information.</p> <ul> <li> <p><em>Historic information: Match abilities.</em><br /> An ability estimate is obtained for every team based on “retrospective” data, namely all historic national matches over the last 8 years. A <em>bivariate Poisson model</em> with team-specific fixed effects is fitted to the number of goals scored by both teams in each match. 
However, rather than equally weighting all matches to obtain <em>average</em> team abilities (or team strengths) over the entire history period, an exponential weighting scheme is employed. This assigns more weight to more recent results and thus yields an estimate of <em>current</em> team abilities. More details can be found in <a href="https://doi.org/10.1177/1471082X18817650">Ley, Van de Wiele, Van Eetvelde (2019)</a>.</p> </li> <li> <p><em>Future expectation: Bookmaker consensus abilities.</em><br /> Another ability estimate for every team is obtained based on “prospective” data, namely the odds of 28 international bookmakers that reflect their expert expectations for the tournament. Using an enhanced version of the <em>bookmaker consensus model</em> from <a href="https://doi.org/10.1016/j.ijforecast.2009.10.001">Leitner, Zeileis, Hornik (2010)</a>, the bookmaker odds are first adjusted for the bookmakers’ profit margins (“overround”) and then averaged (on a logit scale) to obtain a consensus for the winning probability of each team. To correct for the effects of the tournament draw (that might have led to easier or harder groups for some teams), an “inverse” simulation approach is used to infer which team abilities are most likely to lead up to these winning probabilities.</p> </li> <li> <p><em>Combination with present status: Hybrid random forests.</em><br /> Finally, machine learning is used to combine these highly aggregated ability estimates with a broad range of further relevant covariates reflecting the current states of the different teams and the countries they come from. Such a hybrid approach was first suggested by <a href="https://doi.org/10.1515/jqas-2018-0060">Groll, Ley, Schauberger, Van Eetvelde (2019)</a>. A random forest learner is trained to decide how to blend the different ability estimates with team-specific features that are typically less informative but still powerful enough to enhance the forecasts. 
The features considered comprise team-specific details (e.g., market value, FIFA rank, team structure) as well as country-specific socio-economic factors (population and GDP per capita). By combining a large ensemble of rather weakly informative regression trees in a random forest, the relative importances of all the covariates can be inferred automatically. The resulting predicted number of goals for each team can then finally be used to simulate the entire tournament 100,000 times.</p> </li> </ul> <h2 id="match-probabilities">Match probabilities</h2> <p>Using the hybrid random forest an expected number of goals is obtained for both teams in each possible match. The covariate information used for this is the difference between the two teams in each of the variables listed above, i.e., the difference in historic match abilities (on a log scale), the difference in bookmaker consensus abilities (on a log scale), difference in market values (on a log scale), etc. Assuming a bivariate Poisson distribution with the expected numbers of goals for both teams, we can compute the probability that a certain match ends in a <em>win</em>, a <em>draw</em>, or a <em>loss</em>. The same can be repeated in overtime, if necessary, and a coin flip is used to decide penalties, if needed.</p> <p>The following heatmap shows for each possible combination of teams the probability that one team beats the other team in a knockout match. The color scheme uses green vs. purple to signal probabilities above vs. below 50%, respectively. 
The tooltips for each match in the interactive version of the graphic also print the probabilities for the match to end in a <em>win</em>, <em>draw</em>, or <em>loss</em> after normal time.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_match.html"><img src="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>Based on the simulation of individual pairwise matches, as described above, we can create a “multiverse” of potential courses of the entire tournament (here: 100,000). The chances of the teams’ “survival” throughout the tournament can then be described by the proportions of multiverses in which they reach the different stages from the round of 16 to winning the overall title.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_surv.html"><img src="https://www.zeileis.org/assets/posts/2022-11-14-fifa2022/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <h2 id="odds-and-ends">Odds and ends</h2> <p>All our forecasts are probabilistic, clearly below 100%, and by no means certain. Thus, although we can quantify this uncertainty in terms of probabilities from a multiverse of tournaments, it is far from being predetermined which of these possible tournaments we will see in our universe.</p> <p>Unfortunately, the experience of observing the actual tournament will be far less exciting and joyful than usual for us as researchers/forecasters and also as football fans due to the special circumstances. 
In addition to the widely discussed ethical problems regarding this FIFA World Cup, there are also sportive issues that are absolutely critical: The climate in Qatar is extraordinarily hot which necessitated shifting the event to the winter months. Therefore, all major football leagues in Europe and South America have to interrupt their usual schedule in order to accommodate the tournament. This gives the national teams less time for preparation and the players less time for recovery before and after the World Cup. In combination with the extreme climate conditions this also increases the risk of injuries. Hence, having a team with many players in the international European leagues (Champions League, Europa League, Europa Conference League) might actually be a handicap rather than a strength this year.</p> <p>All of these factors make the forecast of the tournament outcome more difficult as variables that have been highly predictive in previous World Cups might not work or work differently.</p> <p>Finally, more from the perspective of football fans (rather than professional forecasters) we are sad that all the usual joy and anticipation of a football World Cup has been crushed by the terrible circumstances this year: starting from the alleged bribery and corruption in the FIFA assignment process, to the human rights and working conditions in Qatar, and the lack of sustainability in the construction and operation of the stadiums.</p>2022-11-14T00:00:00+01:00https://www.zeileis.org/news/weuro2022/Probabilistic forecasting for the UEFA Women's Euro 20222022-07-04T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Using a consensus model based on quoted bookmakers' odds winning probabilities for all competing teams in the UEFA Women's Euro are obtained: The favorite is Spain, followed by host England, France, and the Netherlands as the defending champion.<p>Using a consensus model based on quoted bookmakers' odds winning probabilities for all 
competing teams in the UEFA Women's Euro are obtained: The favorite is Spain, followed by host England, France, and the Netherlands as the defending champion.</p> <div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> Football fans throughout Europe and the world anticipate the UEFA Women's Euro 2022 that will take place in England from 6 July to 31 July 2022. 16 of the best European teams compete to determine the new European Champion. Here, a predictive model is established to forecast what the most likely outcome of the tournament will be. The forecast is based on the expert knowledge of 16 bookmakers and betting exchanges using a model averaging approach. </div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.uefa.com/womenseuro/" alt="UEFA Women's Euro 2022 web page"><img src="https://upload.wikimedia.org/wikipedia/en/0/0b/UEFA_Women%27s_Euro_2022_logo.svg" alt="UEFA Women's Euro 2022 logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The model is the so-called bookmaker consensus model which has been proposed by Leitner, Zeileis, and Hornik (2010, <em>International Journal of Forecasting</em>, <a href="https://doi.org/10.1016/j.ijforecast.2009.10.001">https://doi.org/10.1016/j.ijforecast.2009.10.001</a>) and successfully applied in previous football tournaments, either by itself or in combination with even more refined <a href="https://www.zeileis.org/news/euro2020/">machine learning techniques</a>.</p> <p>This time the forecast shows that Spain is the favorite with a forecasted winning probability of 19.6%, closely followed by England with a winning probability of 16.6%. Four teams also have double-digit winning probabilities: France with 13.5%, the Netherlands with 13.3%, Germany with 10.3%, and Sweden with 10.1%. 
More details are displayed in the following barchart.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>These probabilistic forecasts have been obtained by model-based averaging the quoted winning odds for all teams across bookmakers. More precisely, the odds are first adjusted for the bookmakers’ profit margins (“overrounds”, on average 20.1%), averaged on the log-odds scale to a consensus rating, and then transformed back to winning probabilities. The raw bookmakers’ odds as well as the forecasts for all teams are also available in machine-readable form in <a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/weuro2022.csv">weuro2022.csv</a>.</p> <p>Although forecasting the winning probabilities for the UEFA Women’s Euro 2022 is probably of most interest, the bookmaker consensus forecasts can also be employed to infer team-specific abilities using an “inverse” tournament simulation:</p> <ol> <li>If team abilities are available, pairwise winning probabilities can be derived for each possible match (see below).</li> <li>Given pairwise winning probabilities, the whole tournament can be easily simulated to see which team proceeds to which stage in the tournament and which team finally wins.</li> <li>Such a tournament simulation can then be run sufficiently often (here 100,000 times) to obtain relative frequencies for each team winning the tournament.</li> </ol> <p>Using this idea, abilities in step 1 can be chosen such that the simulated winning probabilities in step 3 closely match those from the bookmaker consensus shown above.</p> <h2 id="pairwise-comparisons">Pairwise comparisons</h2> <p>A classical approach to obtain winning probabilities in pairwise comparisons 
(i.e., matches between teams/players) is the Bradley-Terry model, which is similar to the Elo rating, popular in sports. The Bradley-Terry approach models the probability that a Team A beats a Team B by their associated abilities (or strengths):</p> <math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><mrow><mi fontstyle="normal">Pr</mi><mo stretchy="false">(</mo><mi>A</mi><mtext> beats </mtext><mi>B</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub></mrow><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub><mo>+</mo><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>B</mi></mrow></msub></mrow></mfrac><mo>.</mo></mrow></mstyle></math> <p>Coupled with the “inverse” simulation of the tournament, as described in steps 1 to 3 above, this yields pairwise probabilities for each possible match. The following heatmap shows the probabilistic forecasts for each match with light gray signalling approximately equal chances and green vs. 
purple signalling advantages for Team A or B, respectively.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>As every single match can be simulated with the pairwise probabilities above, it is also straightforward to simulate the entire tournament (here: 100,000 times) providing “survival” probabilities for each team across the different stages.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <p>For example, this shows that Spain’s chances of reaching the quarterfinals are lower than those of England and France, but its chances of reaching the semifinals are higher. The reason is that Spain plays another one of the strongest six teams in its group (Germany) but can likely avoid another of these six teams in the quarterfinal. Conversely, England and France do not have another of the six top teams in their groups but most likely play one in their quarterfinals (Germany for England, and the Netherlands or Sweden for France).</p> <p>This effect of the tournament draw is also brought out by another display that highlights the likely flow of all teams through the tournament simultaneously. 
Compared to the survival curves shown above, this visualization brings out more clearly at which stages of the tournament the strong teams are most likely to meet.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.png" alt="Sankey diagram" /></a></p> <h2 id="odds-and-ends">Odds and ends</h2> <p>The bookmaker consensus model has performed well in previous tournaments, often predicting winners or finalists correctly. However, all forecasts are probabilistic, clearly below 100%, and thus by no means certain. It would also be possible to post-process the bookmaker consensus along with data from historic matches, player ratings, and other information about the teams using <a href="https://www.zeileis.org/news/euro2020/">machine learning techniques</a>. However, due to lack of time for more refined forecasts at the end of a busy academic year, at least the bookmaker consensus is provided as a solid basic forecast.</p> <p>As a final remark: Betting on the outcome based on the results presented here is not recommended. Not only because the winning probabilities are clearly far below 100% but, more importantly, because the bookmakers have a sizeable profit margin of about 20.1%, which ensures that the best chances of making money based on sports betting lie with them!</p> <p>In a few days we will start learning which of the probable paths through the tournament, shown above, will actually come true. 
Enjoy the UEFA Women’s Euro 2022!</p>2022-07-04T00:00:00+02:00https://www.zeileis.org/news/causal_forests/Model-based causal forests for heterogeneous treatment effects2022-07-02T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/A new arXiv paper investigates which building blocks of random forests, especially causal forests and model-based forests, make them work for heterogeneous treatment effect estimation, both in randomized trials and observational studies.<p>A new arXiv paper investigates which building blocks of random forests, especially causal forests and model-based forests, make them work for heterogeneous treatment effect estimation, both in randomized trials and observational studies.</p> <h3 id="citation">Citation</h3> <p>Susanne Dandl, Torsten Hothorn, Heidi Seibold, Erik Sverdrup, Stefan Wager, Achim Zeileis (2022). “What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work?.” <em>arXiv.org E-Print Archive</em> arXiv:2206.10323 [stat.ME]. <a href="https://doi.org/10.48550/arXiv.2206.10323">doi:10.48550/arXiv.2206.10323</a></p> <h3 id="abstract">Abstract</h3> <p>Estimation of heterogeneous treatment effects (HTE) is of prime importance in many disciplines, ranging from personalized medicine to economics among many others. Random forests have been shown to be a flexible and powerful approach to HTE estimation in both randomized trials and observational studies. In particular “causal forests”, introduced by <a href="https://doi.org/10.1214/18-aos1709">Athey, Tibshirani, and Wager (2019)</a>, along with the R implementation in package <a href="https://CRAN.R-project.org/package=grf"><em>grf</em></a>, were rapidly adopted. 
A related approach, called “model-based forests”, that is geared towards randomized trials and simultaneously captures effects of both prognostic and predictive variables, was introduced by <a href="https://doi.org/10.1177/0962280217693034">Seibold, Zeileis, and Hothorn (2018)</a> along with a modular implementation in the R package <a href="https://CRAN.R-project.org/package=model4you"><em>model4you</em></a>.</p> <p>Here, we present a unifying view that goes beyond the <em>theoretical</em> motivations and investigates which <em>computational</em> elements make causal forests so successful and how these can be blended with the strengths of model-based forests. To do so, we show that both methods can be understood in terms of the same parameters and model assumptions for an additive model under <em>L</em><sub>2</sub> loss. This theoretical insight allows us to implement several flavors of “model-based causal forests” and dissect their different elements <em>in silico</em>.</p> <p>The original causal forests and model-based forests are compared with the new blended versions in a benchmark study exploring both randomized trials and observational settings. In the randomized setting, both approaches performed akin. If confounding was present in the data generating process, we found local centering of the treatment indicator with the corresponding propensities to be the main driver for good performance. Local centering of the outcome was less important, and might be replaced or enhanced by simultaneous split selection with respect to both prognostic and predictive effects. This lays the foundation for future research combining random forests for HTE estimation with other types of models.</p> <p>We demonstrate the practical aspects of such a model-agnostic approach to HTE estimation analyzing the effect of cesarean section on postpartum blood loss in comparison to vaginal delivery. 
Clearly, randomization is hardly possible in this setup, and we present a tailored model-based forest for skewed and interval-censored data to infer possible predictive variables and their impact on the treatment effect.</p> <h3 id="benchmark-study">Benchmark study</h3> <p>To investigate which elements of the different random forest algorithms in causal forests (cf) vs. model-based forests (mob) contribute to more precise estimation of heterogeneous treatment effects, a large simulation experiment was carried out, using normal outcomes, different predictive and prognostic effects, and a varying number of observations (N) and covariates (P).</p> <p>In addition to the original cf (from <em>grf</em>) and mob (from <em>model4you</em>) algorithms, three blended versions (based on <em>model4you</em>) were assessed: mob(\(\widehat W\)) (model-based forests after centering of the treatment indicator), mob(\(\widehat W\), \(\widehat Y\)) (model-based forests after centering of both the treatment indicator and the outcome), and mobcf (model-based forests after centering of both the treatment indicator and the outcome, only testing for splits in the treatment effect).</p> <p>Four data-generation setups are considered, as proposed by Nie and Wager (2021): Setup A has complicated confounding but a relatively simple treatment effect function. Setup B has no confounding. Setup C has strong confounding but a constant treatment effect. In Setup D the treatment and control arms are completely unrelated.</p> <p>Overall, the results in the figure below show that centering of the treatment indicator as in mob(\(\widehat W\)) is the most relevant ingredient to random forests for HTE estimation in observational studies. 
If possible, additionally centering the outcome in combination with simultaneous estimation of predictive and prognostic effects in mob(\(\widehat W\), \(\widehat Y\)) is recommended as it always performs as well as mob(\(\widehat W\)) and mobcf but may yield relevant improvements in some scenarios. Other technical aspects of tree and forest induction did not contribute to major performance differences. The overall strong performance of mob(\(\widehat W\), \(\widehat Y\)), combining centering of outcome and treatment from causal forests with joint estimation of prognostic and predictive effects, suggests that alternative split criteria sensitive to both intercepts and treatment effects might be able to improve the performance of causal forests.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig1.png"><img src="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig1.png" alt="Results for the experimental setups in Section 4.1 of the arXiv working paper. Direct comparison of the adaptive versions of causal forests, model-based forests without centering (mob), mob imitating causal forests (mobcf), mob with centered W (mob(W)) and with additionally centered Y (mob(W, Y))." /></a></p> <p>For more details and more results see the <a href="https://doi.org/10.48550/arXiv.2206.10323">arXiv working paper</a>.</p> <h3 id="empirical-application">Empirical application</h3> <p>To illustrate how model-based causal forests can be tailored for specific situations, the effect of cesarean sections vs. vaginal deliveries (treatment) on the amount of postpartum blood loss (outcome) is investigated. Clearly, covariates like maternal age, birth weight, gestational age, or multifetal pregnancy potentially have an impact on both the treatment and the outcome. As randomizing the mode of delivery is impossible, methods for HTE estimation from observational data are needed. 
Moreover, blood loss is a skewed variable that is additionally impossible to measure exactly in the sometimes hectic environment of a delivery ward. It is hence treated as interval-censored. To accommodate all these features, a model-based causal forest is fitted by using <code class="language-plaintext highlighter-rouge">pmforest()</code> from <em>model4you</em> in combination with:</p> <ul> <li>Centering of the treatment variable to account for the observational nature of the data.</li> <li>A transformation model (based on a Bernstein polynomial) to flexibly capture the skewness of the outcome variable.</li> <li>Interval censoring of the outcome observations.</li> </ul> <p>The dependency of the treatment effect on the prepartum variables is visualized in the figure below, using scatter plots for continuous covariates and boxplots for categorical covariates. While some variables have virtually no influence on the treatment effect (e.g., mother’s age), others are associated with clear effect differences. In particular, higher gestational age, higher neonatal weight, and no multifetal pregnancy are associated with a higher risk for elevated blood loss due to cesarean section compared to vaginal delivery.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig5.png"><img src="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig5.png" alt="Dependency plots of the individual treatment effects calculated by the model-based transformation forest. Values > 0 mean that cesarean section increases the blood loss compared to vaginal delivery. Blue lines and diamond points depict (smooth conditional) mean effects." /></a></p> <p>For more details see the <a href="https://doi.org/10.48550/arXiv.2206.10323">arXiv working paper</a>.</p>2022-07-02T00:00:00+02:00