https://www.zeileis.org/Achim Zeileis2022-08-05T16:31:15+02:00Research homepage of Achim Zeileis, Universität Innsbruck. <br/>Department of Statistics, Faculty of Economics and Statistics. <br/>Universitätsstr. 15, 6020 Innsbruck, Austria. <br/>Tel: +43/512/507-70403Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Jekyllhttps://www.zeileis.org/news/weuro2022/Probabilistic forecasting for the UEFA Women's Euro 20222022-07-04T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Using a consensus model based on quoted bookmakers' odds, winning probabilities for all competing teams in the UEFA Women's Euro are obtained: The favorite is Spain, followed by host England, France, and the Netherlands as the defending champion.<p>Using a consensus model based on quoted bookmakers' odds, winning probabilities for all competing teams in the UEFA Women's Euro are obtained: The favorite is Spain, followed by host England, France, and the Netherlands as the defending champion.</p> <div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> Football fans throughout Europe and the world anticipate the UEFA Women's Euro 2022 that will take place in England from 6 July to 31 July 2022. Sixteen of the best European teams compete to determine the new European Champion. Here, a predictive model is established to forecast the most likely outcome of the tournament. The forecast is based on the expert knowledge of 16 bookmakers and betting exchanges using a model averaging approach. 
</div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.uefa.com/womenseuro/" alt="UEFA Women's Euro 2022 web page"><img src="https://upload.wikimedia.org/wikipedia/en/0/0b/UEFA_Women%27s_Euro_2022_logo.svg" alt="UEFA Women's Euro 2022 logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The model is the so-called bookmaker consensus model, which was proposed by Leitner, Hornik, and Zeileis (2010, <em>International Journal of Forecasting</em>, <a href="https://doi.org/10.1016/j.ijforecast.2009.10.001">https://doi.org/10.1016/j.ijforecast.2009.10.001</a>) and successfully applied in previous football tournaments, either by itself or in combination with even more refined <a href="https://www.zeileis.org/news/euro2020/">machine learning techniques</a>.</p> <p>This time the forecast shows that Spain is the favorite with a forecasted winning probability of 19.6%, closely followed by England with a winning probability of 16.6%. Four more teams have double-digit winning probabilities: France with 13.5%, the Netherlands with 13.3%, Germany with 10.3%, and Sweden with 10.1%. More details are displayed in the following barchart.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>These probabilistic forecasts have been obtained by model-based averaging of the quoted winning odds for all teams across bookmakers. More precisely, the odds are first adjusted for the bookmakers’ profit margins (“overrounds”, on average 20.1%), averaged on the log-odds scale to a consensus rating, and then transformed back to winning probabilities. 
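</p> <p>The three steps just described can be sketched in a few lines of R. This is a simplified illustration, not the exact code behind the forecasts: the decimal odds below are made-up numbers for three hypothetical bookmakers and only four teams, whereas the actual forecasts use the full quotes for all 16 teams:</p>

```r
## Hypothetical decimal odds for four teams from three bookmakers
## (made-up numbers, for illustration only).
odds <- cbind(
  bm1 = c(ESP = 5.0, ENG = 6.0, FRA = 7.5, NED = 7.5),
  bm2 = c(4.8, 5.8, 7.0, 8.0),
  bm3 = c(5.2, 6.2, 7.2, 7.8)
)

## Step 1: adjust for the overround. The inverse odds of each bookmaker
## sum to more than 1; normalizing them removes the profit margin.
prob <- apply(odds, 2, function(o) (1/o) / sum(1/o))

## Step 2: average on the log-odds scale to a consensus rating.
consensus <- rowMeans(qlogis(prob))

## Step 3: transform back to winning probabilities (and renormalize).
pwin <- plogis(consensus)
pwin <- pwin / sum(pwin)
round(pwin, 3)
```

<p>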
The raw bookmakers’ odds as well as the forecasts for all teams are also available in machine-readable form in <a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/weuro2022.csv">weuro2022.csv</a>.</p> <p>Although forecasting the winning probabilities for the UEFA Women’s Euro 2022 is probably of most interest, the bookmaker consensus forecasts can also be employed to infer team-specific abilities using an “inverse” tournament simulation:</p> <ol> <li>If team abilities are available, pairwise winning probabilities can be derived for each possible match (see below).</li> <li>Given pairwise winning probabilities, the whole tournament can be easily simulated to see which team proceeds to which stage in the tournament and which team finally wins.</li> <li>Such a tournament simulation can then be run sufficiently often (here 100,000 times) to obtain relative frequencies for each team winning the tournament.</li> </ol> <p>Using this idea, abilities in step 1 can be chosen such that the simulated winning probabilities in step 3 closely match those from the bookmaker consensus shown above.</p> <h2 id="pairwise-comparisons">Pairwise comparisons</h2> <p>A classical approach to obtain winning probabilities in pairwise comparisons (i.e., matches between teams/players) is the Bradley-Terry model, which is similar to the Elo rating, popular in sports. 
The Bradley-Terry approach models the probability that a Team A beats a Team B by their associated abilities (or strengths):</p> <math xmlns="http://www.w3.org/1998/Math/MathML"><mstyle displaystyle="true"><mrow><mi fontstyle="normal">Pr</mi><mo stretchy="false">(</mo><mi>A</mi><mtext> beats </mtext><mi>B</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub></mrow><mrow><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>A</mi></mrow></msub><mo>+</mo><msub><mrow><mi fontstyle="italic">ability</mi></mrow><mrow><mi>B</mi></mrow></msub></mrow></mfrac><mo>.</mo></mrow></mstyle></math> <p>Coupled with the “inverse” simulation of the tournament, as described in steps 1-3 above, this yields pairwise probabilities for each possible match. The following heatmap shows the probabilistic forecasts for each match with light gray signalling approximately equal chances and green vs. purple signalling advantages for Team A or B, respectively.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>As every single match can be simulated with the pairwise probabilities above, it is also straightforward to simulate the entire tournament (here: 100,000 times), providing “survival” probabilities for each team across the different stages.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_surv.png" 
alt="Line plot: Survival probabilities" /></a></p> <p>For example, this shows that Spain’s chances of reaching one of the quarterfinals are lower than England’s and France’s, but its chances of reaching one of the semifinals are higher. The reason is that Spain plays another of the six strongest teams in its group (Germany) but can likely avoid another of these six teams in the quarterfinal. Conversely, England and France do not have another of the six top teams in their group but most likely play one in their quarterfinals (Germany and Netherlands or Sweden, respectively).</p> <p>This effect of the tournament draw is also brought out by another display that highlights the likely flow of all teams through the tournament simultaneously. Compared to the survival curves shown above, this visualization brings out more clearly at which stages of the tournament the strong teams are most likely to meet.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.html"><img src="https://www.zeileis.org/assets/posts/2022-07-04-weuro2022/p_sankey.png" alt="Sankey diagram" /></a></p> <h2 id="odds-and-ends">Odds and ends</h2> <p>The bookmaker consensus model has performed well in previous tournaments, often predicting winners or finalists correctly. However, all forecasts are probabilistic, clearly below 100%, and thus by no means certain. It would also be possible to post-process the bookmaker consensus along with data from historic matches, player ratings, and other information about the teams using <a href="https://www.zeileis.org/news/euro2020/">machine learning techniques</a>. 
However, due to lack of time for more refined forecasts at the end of a busy academic year, at least the bookmaker consensus is provided as a solid basic forecast.</p> <p>As a final remark: Betting on the outcome based on the results presented here is not recommended. Not only because the winning probabilities are clearly far below 100% but, more importantly, because the bookmakers have a sizeable profit margin of about 20.1% which assures that the best chances of making money based on sports betting lie with them!</p> <p>In a few days we will start learning which of the probable paths through the tournament, shown above, will actually come true. Enjoy the UEFA Women’s Euro 2022!</p>2022-07-04T00:00:00+02:00https://www.zeileis.org/news/causal_forests/Model-based causal forests for heterogeneous treatment effects2022-07-02T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/A new arXiv paper investigates which building blocks of random forests, especially causal forests and model-based forests, make them work for heterogeneous treatment effect estimation, both in randomized trials and observational studies.<p>A new arXiv paper investigates which building blocks of random forests, especially causal forests and model-based forests, make them work for heterogeneous treatment effect estimation, both in randomized trials and observational studies.</p> <h3 id="citation">Citation</h3> <p>Susanne Dandl, Torsten Hothorn, Heidi Seibold, Erik Sverdrup, Stefan Wager, Achim Zeileis (2022). “What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work?.” <em>arXiv.org E-Print Archive</em> arXiv:2206.10323 [stat.ME]. <a href="https://doi.org/10.48550/arXiv.2206.10323">doi:10.48550/arXiv.2206.10323</a></p> <h3 id="abstract">Abstract</h3> <p>Estimation of heterogeneous treatment effects (HTE) is of prime importance in many disciplines, ranging from personalized medicine to economics among many others. 
Random forests have been shown to be a flexible and powerful approach to HTE estimation in both randomized trials and observational studies. In particular, “causal forests”, introduced by <a href="https://doi.org/10.1214/18-aos1709">Athey, Tibshirani, and Wager (2019)</a>, along with the R implementation in package <a href="https://CRAN.R-project.org/package=grf"><em>grf</em></a>, were rapidly adopted. A related approach, called “model-based forests”, which is geared towards randomized trials and simultaneously captures effects of both prognostic and predictive variables, was introduced by <a href="https://doi.org/10.1177/0962280217693034">Seibold, Zeileis, and Hothorn (2018)</a> along with a modular implementation in the R package <a href="https://CRAN.R-project.org/package=model4you"><em>model4you</em></a>.</p> <p>Here, we present a unifying view that goes beyond the <em>theoretical</em> motivations and investigates which <em>computational</em> elements make causal forests so successful and how these can be blended with the strengths of model-based forests. To do so, we show that both methods can be understood in terms of the same parameters and model assumptions for an additive model under <em>L</em><sub>2</sub> loss. This theoretical insight allows us to implement several flavors of “model-based causal forests” and dissect their different elements <em>in silico</em>.</p> <p>The original causal forests and model-based forests are compared with the new blended versions in a benchmark study exploring both randomized trials and observational settings. In the randomized setting, both approaches performed similarly. If confounding was present in the data generating process, we found local centering of the treatment indicator with the corresponding propensities to be the main driver for good performance. Local centering of the outcome was less important, and might be replaced or enhanced by simultaneous split selection with respect to both prognostic and predictive effects. 
This lays the foundation for future research combining random forests for HTE estimation with other types of models.</p> <p>We demonstrate the practical aspects of such a model-agnostic approach to HTE estimation by analyzing the effect of cesarean section on postpartum blood loss in comparison to vaginal delivery. Clearly, randomization is hardly possible in this setup, and we present a tailored model-based forest for skewed and interval-censored data to infer possible predictive variables and their impact on the treatment effect.</p> <h3 id="benchmark-study">Benchmark study</h3> <p>To investigate which elements of the different random forest algorithms in causal forests (cf) vs. model-based forests (mob) contribute to more precise estimation of heterogeneous treatment effects, a large simulation experiment was carried out, using normal outcomes, different predictive and prognostic effects, and a varying number of observations (N) and covariates (P).</p> <p>In addition to the original cf (from <em>grf</em>) and mob (from <em>model4you</em>) algorithms, three blended versions (based on <em>model4you</em>) were assessed: mob(\(\widehat W\)) (model-based forests after centering of the treatment indicator), mob(\(\widehat W\), \(\widehat Y\)) (model-based forests after centering of both the treatment indicator and the outcome), and mobcf (model-based forests after centering of both the treatment indicator and the outcome, only testing for splits in the treatment effect).</p> <p>Four data-generation setups are considered, as proposed by Nie and Wager (2021): Setup A has complicated confounding but a relatively simple treatment effect function. Setup B has no confounding. Setup C has strong confounding but a constant treatment effect. 
In Setup D the treatment and control arms are completely unrelated.</p> <p>Overall, the results in the figure below show that centering of the treatment indicator as in mob(\(\widehat W\)) is the most relevant ingredient of random forests for HTE estimation in observational studies. If possible, additionally centering the outcome in combination with simultaneous estimation of predictive and prognostic effects in mob(\(\widehat W\), \(\widehat Y\)) is recommended, as it always performs as well as mob(\(\widehat W\)) and mobcf and may yield relevant improvements in some scenarios. Other technical aspects of tree and forest induction did not contribute to major performance differences. The overall strong performance of mob(\(\widehat W\), \(\widehat Y\)), combining centering of outcome and treatment from causal forests with joint estimation of prognostic and predictive effects, suggests that alternative split criteria sensitive to both intercepts and treatment effects might be able to improve the performance of causal forests.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig1.png"><img src="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig1.png" alt="Results for the experimental setups in Section 4.1 of the arXiv working paper. Direct comparison of the adaptive versions of causal forests, model-based forests without centering (mob), mob imitating causal forests (mobcf), mob with centered W (mob(W)) and additionally centered Y (mob(W, Y))." /></a></p> <p>For more details and more results see the <a href="https://doi.org/10.48550/arXiv.2206.10323">arXiv working paper</a>.</p> <h3 id="empirical-application">Empirical application</h3> <p>To illustrate how model-based causal forests can be tailored for specific situations, the effect of cesarean sections vs. vaginal deliveries (treatment) on the amount of postpartum blood loss (outcome) is investigated. 
Clearly, covariates like maternal age, birth weight, gestational age, or multifetal pregnancy potentially have an impact on both the treatment and the outcome. As randomizing the mode of delivery is impossible, methods for HTE estimation from observational data are needed. Moreover, blood loss is a skewed variable that is additionally impossible to measure exactly in the sometimes hectic environment of a delivery ward. It is hence treated as interval-censored. To accommodate all these features, a model-based causal forest is fitted using <code class="language-plaintext highlighter-rouge">pmforest()</code> from <em>model4you</em> in combination with:</p> <ul> <li>Centering of the treatment variable to account for the observational nature of the data.</li> <li>A transformation model (based on a Bernstein polynomial) to flexibly capture the skewness of the outcome variable.</li> <li>Interval censoring of the outcome observations.</li> </ul> <p>The dependency of the treatment effect on the prepartum variables is visualized in the figure below, using scatter plots for continuous covariates and boxplots for categorical covariates. While some variables have virtually no influence on the treatment effect (e.g., mother’s age), others are associated with clear effect differences. In particular, higher gestational age, higher neonatal weight, and the absence of multifetal pregnancy are associated with a higher risk of elevated blood loss due to cesarean section compared to vaginal delivery.</p> <p><a href="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig5.png"><img src="https://www.zeileis.org/assets/posts/2022-07-02-causal_forests/fig5.png" alt="Dependency plots of the individual treatment effects calculated by the model-based transformation forest. Values > 0 mean that cesarean section increases the blood loss compared to vaginal delivery. Blue lines and diamond points depict (smooth conditional) mean effects." 
/></a></p> <p>For more details see the <a href="https://doi.org/10.48550/arXiv.2206.10323">arXiv working paper</a>.</p>2022-07-02T00:00:00+02:00https://www.zeileis.org/news/user2022/distributions3 @ useR! 20222022-06-27T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Conference presentation about the 'distributions3' package for S3 probability distributions (and 'topmodels' for graphical model assessment) at useR! 2022: Slides, video, replication code, and vignette.<p>Conference presentation about the 'distributions3' package for S3 probability distributions (and 'topmodels' for graphical model assessment) at useR! 2022: Slides, video, replication code, and vignette.</p> <h2 id="abstract">Abstract</h2> <p><em>(Authors: <a href="https://www.zeileis.org">Achim Zeileis</a>, <a href="https://moritzlang.org/">Moritz N. Lang</a>, <a href="https://www.alexpghayes.com/">Alex Hayes</a>)</em></p> <p>The <a href="https://alexpghayes.github.io/distributions3/">distributions3</a> package provides a beginner-friendly and lightweight interface to probability distributions. It allows users to create distribution objects in the S3 paradigm that are essentially data frames of parameters, for which standard methods are available: e.g., evaluation of the probability density, cumulative distribution, and quantile functions as well as random samples. It has been designed such that it can be employed in introductory statistics and probability courses. By not only providing objects for a single distribution but also for vectors of distributions, users can transition seamlessly to a representation of probabilistic forecasts from regression models such as GLM (generalized linear model), GAMLSS (generalized additive models for location, scale, and shape), etc. 
We show how the package can be used both in teaching and in applied statistical modeling, for interpreting fitted models and assessing their goodness of fit (“by hand” and via the <a href="https://topmodels.R-Forge.R-project.org/">topmodels</a> package).</p> <h2 id="resources">Resources</h2> <p>Links to: <a href="https://www.zeileis.org/papers/useR-2022.pdf">PDF slides</a>, <a href="https://www.youtube.com/watch?v=rs7ha1F5S0k">YouTube video</a>, <a href="https://www.zeileis.org/assets/posts/2022-06-27-user2022/code.R">R code</a>, <a href="https://www.zeileis.org/news/poisson/">vignette/blog post</a>.</p> <p><a href="https://www.zeileis.org/papers/useR-2022.pdf"><img src="https://www.zeileis.org/assets/posts/2022-06-27-user2022/slides.png" alt="PDF slides" /></a></p> <p><a href="https://www.youtube.com/watch?v=rs7ha1F5S0k"><img src="https://www.zeileis.org/assets/posts/2022-06-27-user2022/youtube.png" alt="YouTube video" /></a></p> <p><a href="https://www.zeileis.org/assets/posts/2022-06-27-user2022/code.R"><img src="https://www.zeileis.org/assets/posts/2022-06-27-user2022/code.png" alt="R code" /></a></p> <p><a href="https://www.zeileis.org/news/poisson/"><img src="https://www.zeileis.org/assets/posts/2022-06-27-user2022/vignette.png" alt="vignette/blog post" /></a></p>2022-06-27T00:00:00+02:00https://www.zeileis.org/news/poisson/The Poisson distribution: From basic probability theory to regression models2022-06-23T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Brief introduction to the Poisson distribution for modeling count data using the distributions3 package. The distribution is illustrated using the number of goals scored at the 2018 FIFA World Cup, suitable for self-study or as a classroom exercise.<p>Brief introduction to the Poisson distribution for modeling count data using the distributions3 package. 
The distribution is illustrated using the number of goals scored at the 2018 FIFA World Cup, suitable for self-study or as a classroom exercise.</p> <h2 id="the-poisson-distribution">The Poisson distribution</h2> <p>The classic basic probability distribution employed for modeling count data is the Poisson distribution. Its probability mass function \(f(y; \lambda)\) yields the probability for a random variable \(Y\) to take a count \(y \in \{0, 1, 2, \dots\}\) based on the distribution parameter \(\lambda > 0\):</p> <p>\[\text{Pr}(Y = y) = f(y; \lambda) = \frac{\exp\left(-\lambda\right) \cdot \lambda^y}{y!}.\]</p> <p>The Poisson distribution has many distinctive features, e.g., its expectation and variance are equal, both given by the parameter \(\lambda\). Thus, \(\text{E}(Y) = \lambda\) and \(\text{Var}(Y) = \lambda\). Moreover, the Poisson distribution is related to other basic probability distributions. Namely, it can be obtained as the limit of the binomial distribution when the number of attempts is high and the success probability low. Also, the Poisson distribution can be approximated by a normal distribution when \(\lambda\) is large. See <a href="#Wiki+Poisson">Wikipedia (2022)</a> for further properties and references.</p> <p>Here, we leverage the <code class="language-plaintext highlighter-rouge">distributions3</code> package (<a href="#CRAN+distributions3">Hayes <em>et al.</em> 2022</a>) to work with the Poisson distribution in R. In <code class="language-plaintext highlighter-rouge">distributions3</code>, Poisson distribution objects can be generated with the <code class="language-plaintext highlighter-rouge">Poisson()</code> function. 
Subsequently, methods for generic functions can be used to print the objects; extract mean and variance; evaluate density, cumulative distribution, or quantile function; or simulate random samples.</p> <pre><code class="language-{r}">library("distributions3")
Y <- Poisson(lambda = 1.5)
print(Y)
## [1] "Poisson distribution (lambda = 1.5)"
mean(Y)
## [1] 1.5
variance(Y)
## [1] 1.5
pdf(Y, 0:5)
## [1] 0.22313 0.33470 0.25102 0.12551 0.04707 0.01412
cdf(Y, 0:5)
## [1] 0.2231 0.5578 0.8088 0.9344 0.9814 0.9955
quantile(Y, c(0.1, 0.5, 0.9))
## [1] 0 1 3
set.seed(0)
random(Y, 5)
## [1] 3 1 1 2 3
</code></pre> <p>Using the <code class="language-plaintext highlighter-rouge">plot()</code> method, the distribution can also be visualized, which we use here to show how the probabilities for the counts \(0, 1, \dots, 15\) change when the parameter is \(\lambda = 0.5, 2, 5, 10\).</p> <pre><code class="language-{r}">plot(Poisson(0.5), main = expression(lambda == 0.5), xlim = c(0, 15))
plot(Poisson(2), main = expression(lambda == 2), xlim = c(0, 15))
plot(Poisson(5), main = expression(lambda == 5), xlim = c(0, 15))
plot(Poisson(10), main = expression(lambda == 10), xlim = c(0, 15))
</code></pre> <p><a href="https://www.zeileis.org/assets/posts/2022-06-23-poisson/density.png"><img src="https://www.zeileis.org/assets/posts/2022-06-23-poisson/density.png" alt="Probability density for Poisson distributions with means 0.5, 2, 5, and 10" /></a></p> <p>In the following we will illustrate how this infrastructure can be leveraged to obtain predicted probabilities for the number of goals in soccer matches from the 2018 FIFA World Cup.</p> <h2 id="goals-in-the-2018-fifa-world-cup">Goals in the 2018 FIFA World Cup</h2> <p>To investigate the number of goals scored per match in the 2018 FIFA World Cup, the <code class="language-plaintext highlighter-rouge">FIFA2018</code> data set provides two rows, one for each team, for each of the 64 matches during the tournament. 
In the following, we treat the goals scored by the two teams in the same match as independent, which is a realistic assumption for this particular data set. We just remark briefly that there are also bivariate generalizations of the Poisson distribution that would allow for correlated observations but which are not considered here.</p> <p>In addition to the goals, the data set provides some basic meta-information for the matches (an ID, team name abbreviations, type of match, group vs. knockout stage) as well as some further covariates that we will revisit later in this document. The data looks like this:</p> <pre><code class="language-{r}">data("FIFA2018", package = "distributions3")
head(FIFA2018)
##   goals team match type stage logability difference
## 1     5  RUS     1    A group     0.1531     0.8638
## 2     0  KSA     1    A group    -0.7108    -0.8638
## 3     0  EGY     2    A group    -0.2066    -0.4438
## 4     1  URU     2    A group     0.2372     0.4438
## 5     3  RUS     3    A group     0.1531     0.3597
## 6     1  EGY     3    A group    -0.2066    -0.3597
</code></pre> <p>For now, we will focus on the <code class="language-plaintext highlighter-rouge">goals</code> variable only. A brief summary yields</p> <pre><code class="language-{r}">summary(FIFA2018$goals)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     0.0     0.0     1.0     1.3     2.0     6.0 
</code></pre> <p>showing that the teams scored between \(0\) and \(6\) goals per match with an average of \(\bar y = 1.3\) from the observations \(y_i\) (\(i = 1, \dots, 128\)). The corresponding table of observed relative frequencies is:</p> <pre><code class="language-{r}">observed <- proportions(table(FIFA2018$goals))
observed
## 
##        0        1        2        3        4        5        6 
## 0.257812 0.375000 0.250000 0.078125 0.015625 0.015625 0.007812 
</code></pre> <p>This confirms that goals are relatively rare events in a soccer game with each team scoring zero to two goals per match in almost 90 percent of the matches. 
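</p> <p>The “almost 90 percent” can be verified directly from the table of observed relative frequencies shown above:</p>

```r
## Observed relative frequencies of goals per team/match (from above).
observed <- c("0" = 0.257812, "1" = 0.375000, "2" = 0.250000,
              "3" = 0.078125, "4" = 0.015625, "5" = 0.015625,
              "6" = 0.007812)

## Share of team/match observations with at most two goals.
sum(observed[c("0", "1", "2")])
## [1] 0.882812
```

<p>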
Below we show that this observed frequency distribution can be approximated very well by a Poisson distribution, which can subsequently be used to obtain predicted probabilities for the goals scored in a match.</p> <h2 id="basic-fitted-distribution">Basic fitted distribution</h2> <p>In a first step, we simply assume that goals are scored with a constant mean over all teams and matches and hence just fit a single Poisson distribution for the number of goals. To do so, we obtain a point estimate of the Poisson parameter by using the empirical mean \(\hat \lambda = \bar y = 1.3\) and set up the corresponding distribution object:</p> <pre><code class="language-{r}">p_const <- Poisson(lambda = mean(FIFA2018$goals))
p_const
## [1] "Poisson distribution (lambda = 1.3)"
</code></pre> <p>In the technical details below we show that this actually corresponds to maximum likelihood estimation for this distribution. It could also be fitted via <code class="language-plaintext highlighter-rouge">fit_mle(Poisson(1), FIFA2018$goals)</code> in <code class="language-plaintext highlighter-rouge">distributions3</code>.</p> <p>As already illustrated above, the expected probabilities of observing counts of \(0, 1, \dots, 6\) goals for this Poisson distribution can be extracted using the <code class="language-plaintext highlighter-rouge">pdf()</code> method. A comparison with the observed empirical frequencies yields</p> <pre><code class="language-{r}">expected <- pdf(p_const, 0:6)
cbind(observed, expected)
##   observed expected
## 0 0.257812 0.273385
## 1 0.375000 0.354546
## 2 0.250000 0.229901
## 3 0.078125 0.099384
## 4 0.015625 0.032222
## 5 0.015625 0.008358
## 6 0.007812 0.001806
</code></pre> <p>By and large, all observed and expected frequencies are rather close. However, it is not reasonable that all teams score goals with the same probabilities, which would imply that winning or losing could just be attributed to “luck” or “random variation” alone. 
Therefore, while a certain level of randomness will certainly remain, we should also consider that there are stronger and weaker teams in the tournament.</p> <h2 id="poisson-regression-and-probabilistic-forecasting">Poisson regression and probabilistic forecasting</h2> <p>To account for different expected performances from the teams in the 2018 FIFA World Cup, the <code class="language-plaintext highlighter-rouge">FIFA2018</code> data provides an estimated <code class="language-plaintext highlighter-rouge">logability</code> for each team. These have been estimated by <a href="#Zeileis+Leitner+Hornik:2018">Zeileis <em>et al.</em> (2018)</a> prior to the start of the tournament (2018-05-20) based on quoted odds from 26 online bookmakers using the bookmaker consensus model of <a href="#Leitner+Zeileis+Hornik:2010">Leitner <em>et al.</em> (2010)</a>. The <code class="language-plaintext highlighter-rouge">difference</code> in <code class="language-plaintext highlighter-rouge">logability</code> between a team and its opponent is a useful predictor for the number of <code class="language-plaintext highlighter-rouge">goals</code> scored.</p> <p>Consequently, we fit a generalized linear model (GLM) to the data that links the expected number of goals per team/match \(\lambda_i\) to the linear predictor \(x_i^\top \beta\) with regressor vector \(x_i^\top = (1, \mathtt{difference}_i)\) and corresponding coefficient vector \(\beta\) using a log-link: \(\log(\lambda_i) = x_i^\top \beta\). The maximum likelihood estimator \(\hat \beta\) with corresponding inference, predictions, residuals, etc. 
can be obtained using the <code class="language-plaintext highlighter-rouge">glm()</code> function from base R with <code class="language-plaintext highlighter-rouge">family = poisson</code>:</p> <pre><code class="language-{r}">m <- glm(goals ~ difference, data = FIFA2018, family = poisson)
summary(m)
## 
## Call:
## glm(formula = goals ~ difference, family = poisson, data = FIFA2018)
## 
## Deviance Residuals: 
##    Min      1Q  Median      3Q     Max  
## -2.144  -1.155  -0.175   0.528   2.327  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(&gt;|z|)    
## (Intercept)   0.2127     0.0813    2.62   0.0088 ** 
## difference    0.4134     0.1058    3.91  9.3e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 144.20 on 127 degrees of freedom
## Residual deviance: 128.69 on 126 degrees of freedom
## AIC: 359.4
## 
## Number of Fisher Scoring iterations: 5
</code></pre> <p>Both parameters can be interpreted. First, the intercept corresponds to the expected log-goals per team in a match of two equally strong teams, i.e., with zero difference in log-abilities. 
The corresponding prediction for the number of goals can either be obtained manually from the extracted <code class="language-plaintext highlighter-rouge">coef()</code> by applying <code class="language-plaintext highlighter-rouge">exp()</code> (as the inverse of the log-link).</p> <pre><code class="language-{r}">lambda_zero <- exp(coef(m)[1])
lambda_zero
## (Intercept) 
##       1.237 
</code></pre> <p>Or, equivalently, the <code class="language-plaintext highlighter-rouge">predict()</code> function can be used with <code class="language-plaintext highlighter-rouge">type = "response"</code> in order to get the expected \(\hat \lambda_i\) (rather than just the linear predictor \(x_i^\top \hat \beta\) that is predicted by default).</p> <pre><code class="language-{r}">predict(m, newdata = data.frame(difference = 0), type = "response")
##     1 
## 1.237 
</code></pre> <p>As above, we can also set up a <code class="language-plaintext highlighter-rouge">Poisson()</code> distribution object and obtain the associated expected probability distribution for zero to six goals in a match of two equally strong teams:</p> <pre><code class="language-{r}">p_zero <- Poisson(lambda = lambda_zero)
pdf(p_zero, 0:6)
## [1] 0.290242 0.359041 0.222074 0.091571 0.028319 0.007006 0.001445
</code></pre> <p>Note that <code class="language-plaintext highlighter-rouge">distributions3</code> also provides a convenience function <code class="language-plaintext highlighter-rouge">prodist()</code> that allows one to obtain <code class="language-plaintext highlighter-rouge">p_zero</code> in a single step via <code class="language-plaintext highlighter-rouge">prodist(m, newdata = data.frame(difference = 0))</code>.</p> <p>Second, the slope of \(0.413\) can be interpreted as an ability elasticity of the number of goals scored. This is because the difference of the log-abilities can also be understood as the log of the ability ratio. 
Thus, when the ability ratio increases by \(1\) percent, the expected number of goals increases approximately by \(0.413\) percent.</p> <p>This yields a different predicted Poisson distribution for each team/match in the tournament. We can set up the vector of all \(128\) <code class="language-plaintext highlighter-rouge">Poisson()</code> distribution objects by extracting the vector of all fitted point estimates \((\hat \lambda_1, \dots, \hat \lambda_{128})^\top\):</p> <pre><code class="language-{r}">p_reg <- Poisson(lambda = fitted(m))
length(p_reg)
## [1] 128
head(p_reg)
##                                       1                                       2 
## "Poisson distribution (lambda = 1.768)" "Poisson distribution (lambda = 0.866)" 
##                                       3                                       4 
## "Poisson distribution (lambda = 1.030)" "Poisson distribution (lambda = 1.486)" 
##                                       5                                       6 
## "Poisson distribution (lambda = 1.435)" "Poisson distribution (lambda = 1.066)" 
</code></pre> <p>Again, the convenience function <code class="language-plaintext highlighter-rouge">prodist(m)</code> could also be used to directly extract <code class="language-plaintext highlighter-rouge">p_reg</code>.</p> <p>Note that specific elements from the vector <code class="language-plaintext highlighter-rouge">p_reg</code> of Poisson distributions can be extracted as usual, e.g., with an index like <code class="language-plaintext highlighter-rouge">p_reg[i]</code> or using the <code class="language-plaintext highlighter-rouge">head()</code> and <code class="language-plaintext highlighter-rouge">tail()</code> functions etc.</p> <p>As an illustration, the following goal distributions could be expected for the FIFA World Cup final (in the last two rows of the data) that France won 4-2 against Croatia:</p> <pre><code class="language-{r}">tail(FIFA2018, 2)
##     goals team match  type    stage logability difference
## 127     4  FRA    64 Final knockout     0.8866      0.629
## 128     2  CRO    64 Final knockout     0.2576     -0.629
p_final <- tail(p_reg, 2)
p_final
##                                     127                                     128 
## "Poisson distribution (lambda = 1.604)" "Poisson distribution (lambda = 0.954)" 
pdf(p_final, 0:6)
##        d_0    d_1    d_2     d_3     d_4      d_5       d_6
## 127 0.2010 0.3225 0.2587 0.13836 0.05550 0.017808 0.0047618
## 128 0.3853 0.3675 0.1752 0.05572 0.01329 0.002534 0.0004029
</code></pre> <p>This shows that France was expected to score more goals than Croatia but both teams scored more goals than expected, albeit not improbably many.</p> <h2 id="further-details-and-extensions">Further details and extensions</h2> <p>Assuming independence of the number of goals scored, we can obtain the table of possible match results (after normal time) by multiplying the marginal probabilities (again only up to six goals). In R this can be done using the <code class="language-plaintext highlighter-rouge">outer()</code> function which by default performs a multiplication of its arguments.</p> <pre><code class="language-{r}">res <- outer(pdf(p_final[1], 0:6), pdf(p_final[2], 0:6))
round(100 * res, digits = 2)
##       [,1]  [,2] [,3] [,4] [,5] [,6] [,7]
## [1,]  7.74  7.39 3.52 1.12 0.27 0.05 0.01
## [2,] 12.43 11.85 5.65 1.80 0.43 0.08 0.01
## [3,]  9.97  9.51 4.53 1.44 0.34 0.07 0.01
## [4,]  5.33  5.08 2.42 0.77 0.18 0.04 0.01
## [5,]  2.14  2.04 0.97 0.31 0.07 0.01 0.00
## [6,]  0.69  0.65 0.31 0.10 0.02 0.00 0.00
## [7,]  0.18  0.17 0.08 0.03 0.01 0.00 0.00
</code></pre> <p>For example, we can see from this table that the expected probability for France winning against Croatia 1-0 is \(12.43\) percent while the probability that France loses 0-1 is only \(7.39\) percent.</p> <p>The advantage of France can also be brought out more clearly by aggregating the probabilities for winning (lower triangular matrix), a draw (diagonal), or losing (upper triangular matrix). 
In R these can be computed as:</p> <pre><code class="language-{r}">sum(res[lower.tri(res)]) ## France wins
## [1] 0.5245
sum(diag(res))           ## draw
## [1] 0.2498
sum(res[upper.tri(res)]) ## France loses
## [1] 0.2243
</code></pre> <p>Note that these probabilities do not sum up to \(1\) because we only considered up to six goals per team but more goals can actually occur with a small probability.</p> <p>Next, we update the expected frequencies table by averaging across the expectations per team/match from the regression model.</p> <pre><code class="language-{r}">expected <- pdf(p_reg, 0:6)
head(expected)
##      d_0    d_1    d_2     d_3     d_4      d_5       d_6
## 1 0.1707 0.3017 0.2667 0.15721 0.06949 0.024571 0.0072403
## 2 0.4208 0.3642 0.1576 0.04548 0.00984 0.001703 0.0002457
## 3 0.3571 0.3677 0.1893 0.06498 0.01673 0.003444 0.0005911
## 4 0.2262 0.3362 0.2498 0.12377 0.04599 0.013669 0.0033857
## 5 0.2380 0.3417 0.2452 0.11732 0.04210 0.012086 0.0028914
## 6 0.3444 0.3671 0.1957 0.06954 0.01853 0.003952 0.0007022
expected <- colMeans(expected)
cbind(observed, expected)
##   observed expected
## 0 0.257812 0.294374
## 1 0.375000 0.340171
## 2 0.250000 0.214456
## 3 0.078125 0.098236
## 4 0.015625 0.036595
## 5 0.015625 0.011727
## 6 0.007812 0.003333
</code></pre> <p>As before, observed and expected frequencies are reasonably close, emphasizing that the model has a good marginal fit for this data. To bring out the discrepancies graphically we show the frequencies on a square root scale using a so-called <em>hanging rootogram</em> (<a href="#Kleiber+Zeileis:2016">Kleiber & Zeileis 2016</a>). The gray bars represent the square-root of the observed frequencies “hanging” from the square-root of the expected frequencies in the red line. 
The offset around the x-axis thus shows the difference between the two frequencies which is reasonably close to zero.</p> <pre><code class="language-{r}">bp <- barplot(sqrt(observed), offset = sqrt(expected) - sqrt(observed),
  xlab = "Goals", ylab = "sqrt(Frequency)")
lines(bp, sqrt(expected), type = "o", pch = 19, lwd = 2, col = 2)
abline(h = 0, lty = 2)
</code></pre> <p><a href="https://www.zeileis.org/assets/posts/2022-06-23-poisson/rootogram.png"><img src="https://www.zeileis.org/assets/posts/2022-06-23-poisson/rootogram.png" alt="Rootogram for the number of goals in the 2018 FIFA World Cup modeled by a Poisson regression model" /></a></p> <p>Finally, we want to point out that while the log-abilities (and thus their differences) had been obtained based on bookmakers' odds prior to the tournament, the calibration of the intercept and slope coefficients was done “in-sample”. This means that we have used the data from the tournament itself for estimating the GLM and the evaluation above can only be made <em>ex post</em>. Alternatively, one could have used previous FIFA World Cups for calibrating the coefficients so that probabilistic forecasts for the outcome of all matches (and thus the entire tournament) could have been obtained <em>ex ante</em>. This is the approach used by <a href="#Groll+Ley+Schauberger:2019">Groll <em>et al.</em> (2019)</a> and <a href="#Groll+Hvattum+Ley:2021">Groll <em>et al.</em> (2021)</a> who additionally added further explanatory variables and used flexible machine learning regression techniques rather than a simple Poisson GLM.</p> <h2 id="technical-details-maximum-likelihood-estimation-of-lambda">Technical details: Maximum likelihood estimation of \(\lambda\)</h2> <p>Fitting a single Poisson distribution with constant \(\lambda\) to \(n\) independent observations \(y_1, \dots, y_n\) using maximum likelihood estimation can be done analytically using basic algebra. 
First, we set up the log-likelihood function \(\ell\) as the sum of the log-densities per observation:</p> <p>\[\begin{align*} \ell(\lambda; y_1, \dots, y_n) & = \sum_{i = 1}^n \log f(y_i; \lambda) \end{align*}\]</p> <p>For solving the first-order condition analytically below we need the score function, i.e., the derivative of the log-likelihood with respect to the parameter \(\lambda\). The derivative of the sum is simply the sum of the derivatives:</p> <p>\[\begin{align*} \ell^\prime(\lambda; y_1, \dots, y_n) & = \sum_{i = 1}^n \left\{ \log f(y_i; \lambda) \right\}^\prime \\ & = \sum_{i = 1}^n \left\{ -\lambda + y_i \cdot \log(\lambda) - \log(y_i!) \right\}^\prime \\ & = \sum_{i = 1}^n \left\{ -1 + y_i \cdot \frac{1}{\lambda} \right\} \\ & = -n + \frac{1}{\lambda} \sum_{i = 1}^n y_i \end{align*}\]</p> <p>The first-order condition for maximizing the log-likelihood sets its derivative to zero. This can be solved as follows:</p> <p>\[\begin{align*} \ell^\prime(\lambda; y_1, \dots, y_n) & = 0 \\ -n + \frac{1}{\lambda} \sum_{i = 1}^n y_i & = 0 \\ n \cdot \lambda & = \sum_{i = 1}^n y_i \\ \lambda & = \frac{1}{n} \sum_{i = 1}^n y_i = \bar y \end{align*}\]</p> <p>Thus, the maximum likelihood estimator is simply the empirical mean \(\hat \lambda = \bar y.\)</p> <p>Unfortunately, when the parameter \(\lambda\) is not constant but depends on a linear predictor through a log link \(\log(\lambda_i) = x_i^\top \beta\), the corresponding log-likelihood of the regression coefficients \(\beta\) can not be maximized as easily. 
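Before turning to the regression case, the analytical result \(\hat \lambda = \bar y\) can also be double-checked numerically. The following small sketch is an illustration only (the simulated data, seed, and \(\lambda = 1.4\) are made up, not part of the original analysis): it fits an intercept-only Poisson GLM and compares the fitted mean with the empirical mean.

```r
## Illustration only: for an intercept-only Poisson GLM the fitted mean
## exp(beta_0) coincides with the empirical mean, i.e., the analytical MLE.
set.seed(1)
y <- rpois(500, lambda = 1.4)       # simulated counts (made-up example)
m0 <- glm(y ~ 1, family = poisson)  # constant lambda, log link
exp(coef(m0))                       # fitted mean
mean(y)                             # empirical mean, identical up to numerical precision
```

The same logic underlies the intercept in the regression model above: with a zero linear predictor, `exp()` of the intercept is the fitted expected count.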
There is no closed-form solution for the maximum likelihood estimator \(\hat \beta\) which is why the <code class="language-plaintext highlighter-rouge">glm()</code> function employs an iterative numerical algorithm (so-called iteratively weighted least squares) for fitting the model.</p> <h2 id="references">References</h2> <ul> <li><span id="Groll+Hvattum+Ley:2021">Groll A, Hvattum LM, Ley C, Popp F, Schauberger G, Van Eetvelde H, Zeileis A (2021). “Hybrid Machine Learning Forecasts for the UEFA EURO 2020.” arXiv 2106.05799. arXiv.org E-Print Archive. <a href="https://arxiv.org/abs/2106.05799">https://arxiv.org/abs/2106.05799</a></span></li> <li><span id="Groll+Ley+Schauberger:2019">Groll A, Ley C, Schauberger G, Van Eetvelde H (2019). “A Hybrid Random Forest to Predict Soccer Matches in International Tournaments.” <em>Journal of Quantitative Analysis in Sports</em> <strong>15</strong>(4), 271-87. <a href="https://doi.org/10.1515/jqas-2018-0060">https://doi.org/10.1515/jqas-2018-0060</a></span></li> <li><span id="CRAN+distributions3">Hayes A, Moller-Trane R, Jordan D, Northrop P, Lang M, Zeileis A (2022). “distributions3: Probability Distributions as S3 Objects.” R package version 0.2.0, <a href="https://CRAN.R-project.org/package=distributions3">https://CRAN.R-project.org/package=distributions3</a></span></li> <li><span id="Kleiber+Zeileis:2016">Kleiber C, Zeileis A (2016). “Visualizing Count Data Regressions Using Rootograms.” <em>The American Statistician</em> <strong>70</strong>(3), 296-303. <a href="https://doi.org/10.1080/00031305.2016.1173590">https://doi.org/10.1080/00031305.2016.1173590</a></span></li> <li><span id="Leitner+Zeileis+Hornik:2010">Leitner C, Zeileis A, Hornik K (2010). “Forecasting Sports Tournaments by Ratings of (Prob)abilities: A Comparison for the EURO 2008.” <em>International Journal of Forecasting</em> <strong>26</strong>(3), 471-81. 
<a href="https://doi.org/10.1016/j.ijforecast.2009.10.001">https://doi.org/10.1016/j.ijforecast.2009.10.001</a></span></li> <li><span id="Wiki+Poisson">Wikipedia (2022). “Poisson Distribution - Wikipedia, the Free Encyclopedia.” <a href="https://en.wikipedia.org/wiki/Poisson_distribution">https://en.wikipedia.org/wiki/Poisson_distribution</a>, accessed 2022-02-21.</span></li> <li><span id="Zeileis+Leitner+Hornik:2018">Zeileis A, Leitner C, Hornik K (2018). “Probabilistic Forecasts for the 2018 FIFA World Cup Based on the Bookmaker Consensus Model.” Working Paper 2018-09. Working Papers in Economics & Statistics, Research Platform Empirical & Experimental Economics, Universität Innsbruck. <a href="https://EconPapers.RePEc.org/RePEc:inn:wpaper:2018-09">https://EconPapers.RePEc.org/RePEc:inn:wpaper:2018-09</a></span></li> </ul> <p>Published 2022-06-23.</p> <h1 id="euro2020knockout">Updated forecasts for the UEFA Euro 2020 knockout stage</h1> <p>2021-06-25, Achim Zeileis, <a href="https://www.zeileis.org/news/euro2020knockout/">https://www.zeileis.org/news/euro2020knockout/</a></p> <p>After all group stage matches at the UEFA Euro 2020 we have updated the knockout stage forecasts by re-training our hybrid random forest model on the extended data. This shows that England profits most from the realized tournament draw.</p> <h2 id="updates">Updates</h2> <p>After the 36 matches of the group stage were completed earlier this week, we decided to update our <a href="https://www.zeileis.org/news/euro2020/">probabilistic forecast for the UEFA Euro 2020</a>. 
As the <a href="https://www.zeileis.org/news/euro2020group/">evaluation of the group stage</a> showed that, by and large, the forecasts worked reasonably well up to this point, we kept our general strategy and just made a few updates:</p> <ul> <li>The <em>historic match abilities</em> for all teams were updated to incorporate the results from the 36 additional matches from the group stage. Given that the estimates are weighted such that the most recent results have a higher influence, this changed the estimates of the team abilities somewhat.</li> <li>The <em>average plus-minus player ratings</em> for all teams were also updated but these changed to a lesser degree given that each team only played three additional matches.</li> <li>All other covariates (bookmaker consensus, market value, etc.) were left unchanged.</li> <li>The learning data set for the hybrid random forest that combines all the predictors was extended: In addition to all the matches from the UEFA Euro 2004-2016 it now includes the group stage results from this year’s Euro.</li> <li>The resulting predicted number of goals for each team can then be used to simulate the entire knockout stage 100,000 times.</li> </ul> <p>While all the changes above have a certain influence, the biggest effect arguably comes from the last item: Because the match-ups for the round of 16 are fixed now, there is a lot less variation in the potential courses of the tournament. Specifically, it is now clear that there are more top favorites in the upper half of the tournament tableau (namely France, Spain, Italy, Belgium, Portugal) than in the lower half of the tableau (England, Germany, Netherlands). 
The consequences of this are shown in more detail below.</p> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The updated results show that England has now become the top favorite for the title with a winning probability of 17.4% because they are more likely to face weaker opponents provided they beat Germany in the round of 16. Our top favorite from the pre-tournament forecast was France, who now rank second with an almost unchanged winning probability of about 15.0%. The winning probabilities for all teams are shown in the barchart below with more information linked in the interactive full-width version.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_win.html"><img src="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>Somewhat surprisingly, Italy still has a rather low winning probability of only 7.3% whereas they are now among the top three teams according to most bookmaker odds. This is most likely due to the tournament draw: If they beat Austria in the round of 16, they meet either the FIFA top-ranked team Belgium or defending champion Portugal in the quarter final. In a potential semi-final they would have a high chance of facing either France or Spain.</p> <h2 id="match-probabilities">Match probabilities</h2> <p>Using the hybrid random forest an expected number of goals is obtained for both teams in each possible match. Using these, we can compute the probability that a certain match ends in a <em>win</em>, a <em>draw</em>, or a <em>loss</em> in normal time. The same can be repeated in overtime, if necessary, and a coin flip is used to decide penalties, if needed.</p> <p>The resulting probability that one team beats the other in a knockout match is depicted in the heatmap below. 
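The computation just described (win/draw/loss in normal time, then overtime, then a coin flip for penalties) can be sketched in R. This is a simplified toy illustration, not the authors' actual code: the expected-goals values are made up, and scaling the overtime rate to one third of the normal-time rate is an assumption for illustration only.

```r
## Toy sketch (made-up numbers): probability that team A beats team B in a
## knockout match, combining normal time, overtime, and a penalty coin flip.
p_beat <- function(lambda_a, lambda_b, max_goals = 10) {
  ## joint goal distribution after normal time (independence assumption)
  res <- outer(dpois(0:max_goals, lambda_a), dpois(0:max_goals, lambda_b))
  p_win  <- sum(res[lower.tri(res)])  # A scores more goals
  p_draw <- sum(diag(res))            # equal goals -> overtime
  ## overtime: same logic with one third of the expected goals (assumed);
  ## a remaining draw is decided by a 50:50 coin flip for penalties
  ot <- outer(dpois(0:max_goals, lambda_a / 3), dpois(0:max_goals, lambda_b / 3))
  p_win + p_draw * (sum(ot[lower.tri(ot)]) + 0.5 * sum(diag(ot)))
}
p_beat(1.6, 1.1)  # hypothetical expected goals for a slight favorite A
```

By construction the function is symmetric, so `p_beat(x, x)` is (up to truncation at `max_goals`) exactly 0.5.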
The color scheme uses green vs. brown to signal probabilities above vs. below 50%, respectively. The tooltips for each match in the interactive version of the graphic also print the probabilities for the match results after normal time.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_match.html"><img src="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>As every single match can be simulated with the pairwise probabilities above, we are able to simulate the entire knockout stage 100,000 times to provide “survival” probabilities for each team across the remaining stages. Teams in the upper half of the tournament tableau are shown in orange while the lower half teams are shown in blue.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_surv.html"><img src="https://www.zeileis.org/assets/posts/2021-06-25-euro2020knockout/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <p>This shows that England has relatively low chances of surviving the round of 16, at least compared to other top teams like France, Italy, or the Netherlands who play against weaker opponents. However, provided England proceeds to the quarter final, they have a very high probability of prevailing up to the final match.</p> <p>In summary, the forecasts changed compared to the pre-tournament version, but maybe not as much as expected. 
The most important change in information is that the remaining course of the tournament is rather clear now while the knowledge from the 36 group stage matches themselves has only moderate effects. Thus, the most exciting part of the UEFA Euro 2020 is only starting now and we can all be curious what is going to happen. Everything is still possible! (Recall that in the 2016 tournament Portugal eventually took the championship despite not winning a single group stage match and ranking third in their group.)</p> <h1 id="euro2020group">Evaluation of the UEFA Euro 2020 group stage forecast</h1> <p>2021-06-24, Achim Zeileis, <a href="https://www.zeileis.org/news/euro2020group/">https://www.zeileis.org/news/euro2020group/</a></p> <p>A look back on the group stage of the UEFA Euro 2020 to check whether our hybrid machine learning forecasts were any good...</p> <h2 id="how-surprising-was-the-group-stage">How surprising was the group stage?</h2> <p>Yesterday the group stage of the UEFA Euro 2020 was concluded with the final matches in Groups E and F so that all pairings for the round of 16 are fixed now. Therefore, today we want to address two questions regarding our own <a href="https://www.zeileis.org/news/euro2020/">probabilistic forecast for the UEFA Euro 2020</a> based on a hybrid machine learning model that we published prior to the tournament:</p> <ol> <li>How good were the predictions for the group stage? Were the actual outcomes surprising?</li> <li>How can we update the forecasts for the knockout stage starting with the round of 16 on the weekend?</li> </ol> <p>The first of these questions is answered in this post while the second question will be deferred to tomorrow’s post.</p> <p><strong>TL;DR</strong> All of our predictions worked quite well and most results were within the expected range of random variation. 
All tournament favorites proceeded to the round of 16 and mostly the weakest teams dropped out of the tournament. Only in Group E the final ranking was a bit more surprising with Spain ending up second behind Sweden and Poland finishing last and dropping out. At the individual match level there were a couple of games where the clearly stronger team failed to take the win; especially Hungary’s two draws in the “killer group” F were a bit surprising. But other than that the more exciting part of the tournament is still ahead of us!</p> <h2 id="group-stage-results">Group stage results</h2> <p>First, we look at the results in terms of which teams successfully proceeded from the group stage to the round of 16. The barplot below shows all teams along with their predicted winning probability for the entire tournament, with the color highlighting elimination from the tournament prior to the knockout stage.</p> <p><img src="https://www.zeileis.org/assets/posts/2021-06-24-euro2020group/barplot.png" alt="Probabilities to win the tournament with highlighting of teams advancing to the knockout stage" /></p> <p>Clearly, only teams from the lower half were eliminated with the most unexpected drop-out being Poland. Also, it may seem somewhat surprising that both the Czech Republic and Ukraine “survived” the group stage but with four out of six third-ranked teams advancing to the round of 16 this is not very unexpected.</p> <p>Looking at the rankings in each group in a bit more detail we see that most group results are as expected. Only in Group E the ranking is really a surprise with Sweden playing stronger than expected and even winning the group. 
On the other hand, Poland’s performance was somewhat disappointing (as already mentioned above) and Spain waited until the third game (a 5-0 win against Slovakia) to show their full potential.</p> <p>The tables below list, for each group, the final ranking along with each team’s predicted probability (in percent) to advance to the round of 16; teams that actually advanced are printed in bold.</p> <div class="row"> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">A <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> <strong>3</strong> <br /> 4</td> <td style="text-align: left"><strong>ITA</strong> <br /> <strong>WAL</strong> <br /> <strong>SUI</strong> <br /> TUR</td> <td style="text-align: right"><strong>88.8</strong> <br /> <strong>53.7</strong> <br /> <strong>72.3</strong> <br /> 53.3</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">B <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> 3 <br /> 4</td> <td style="text-align: left"><strong>BEL</strong> <br /> <strong>DEN</strong> <br /> FIN <br /> RUS</td> <td style="text-align: right"><strong>91.5</strong> <br /> <strong>84.5</strong> <br /> 37.1 <br /> 52.0</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">C <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> <strong>3</strong> <br /> 4</td> <td style="text-align: left"><strong>NED</strong> <br /> <strong>AUT</strong> <br /> <strong>UKR</strong> <br /> MKD</td> <td style="text-align: right"><strong>93.4</strong> <br /> 
<strong>80.9</strong> <br /> <strong>57.4</strong> <br /> 32.9</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">D <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> <strong>3</strong> <br /> 4</td> <td style="text-align: left"><strong>ENG</strong> <br /> <strong>CRO</strong> <br /> <strong>CZE</strong> <br /> SCO</td> <td style="text-align: right"><strong>94.6</strong> <br /> <strong>78.0</strong> <br /> <strong>40.8</strong> <br /> 49.8</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">E <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> 3 <br /> 4</td> <td style="text-align: left"><strong>SWE</strong> <br /> <strong>ESP</strong> <br /> SVK <br /> POL</td> <td style="text-align: right"><strong>59.8</strong> <br /> <strong>94.0</strong> <br /> 44.9 <br /> 66.2</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 large-2 columns"> <table> <thead> <tr> <th style="text-align: left">F <br /> Rank</th> <th style="text-align: left"> <br /> Team</th> <th style="text-align: right"> <br /> Prob.</th> </tr> </thead> <tbody> <tr> <td style="text-align: left"><strong>1</strong> <br /> <strong>2</strong> <br /> <strong>3</strong> <br /> 4</td> <td style="text-align: left"><strong>FRA</strong> <br /> <strong>GER</strong> <br /> <strong>POR</strong> <br /> HUN</td> <td style="text-align: right"><strong>89.7</strong> <br /> <strong>85.3</strong> <br /> <strong>85.3</strong> <br /> 13.9</td> </tr> </tbody> </table> </div> <div class="t20 small-6 medium-3 
large-2 columns"> </div> <div class="t20 small-6 medium-3 large-2 columns"> </div> </div> <h2 id="match-results">Match results</h2> <p>After seeing that all the favorites prevailed and only relatively weak teams dropped out of the tournament, we take a closer look at the 36 individual group-stage matches to check whether we had any major surprises. The stacked bar plot below groups all match results into four categories by their expected goal difference for the stronger vs. the weaker team.</p> <p><img src="https://www.zeileis.org/assets/posts/2021-06-24-euro2020group/match.png" alt="Observed match outcome vs. expected goal difference" /></p> <p>In the first bar the stronger team was expected to be only marginally better, with 0 to 0.25 more predicted goals on average. In this bar we see that the stronger team won half of the matches (4 out of 8) while the other half was either lost (3 matches) or ended in a draw (1 match). In short, the distribution of match outcomes conforms essentially exactly with the predictions.</p> <p>The same is true for the second and third bar where the expected goal difference for the stronger team was between 0.26 and 0.6 or between 0.6 and 1, respectively. The stronger team won 7 out of 10 matches (70.0%) and 7 out of 9 matches (77.8%), respectively, thus conforming closely with the predictions.</p> <p>Only in the last bar with the highest expected goal differences (between 1 and 2 goals) is the picture somewhat unexpected.</p> <ol> <li>There were three draws (out of nine matches), two of which were achieved by underdog Hungary against the much stronger teams France and Germany. Ultimately, Hungary nevertheless finished last in Group F.</li> <li>One of these nine matches was even lost by the clear favorite but this match was the 0-1 of Denmark vs. Finland. During this match, Danish key player Christian Eriksen suffered a cardiac arrest and had to be resuscitated in the stadium before being brought to the hospital. 
Denmark then had to continue playing the match later that evening and were clearly still under shock. Needless to say, no forecasting model (that we are aware of) would incorporate such extreme and rare events.</li> </ol> <p>As a final evaluation we check whether the observed number of goals per team in each match conforms with the expected distribution based on the Poisson model employed. This is brought out graphically by a so-called <a href="https://dx.doi.org/10.1080/00031305.2016.1173590">hanging rootogram</a>.</p> <p><img src="https://www.zeileis.org/assets/posts/2021-06-24-euro2020group/goals.png" alt="Hanging rootogram with observed and expected frequencies of number of goals" /></p> <p>The red line shows the square root of the expected frequencies while the “hanging” gray bars represent the square root of the observed frequencies. This shows that the predictions conform closely with the actual observations. There were only a few more occurrences of three goals (ten times) than expected (6.1 times) but this deviation is also within the bounds of random variation.</p> <h1 id="euro2020paper">Working paper for the UEFA Euro 2020 forecast</h1> <p>2021-06-10, Achim Zeileis, <a href="https://www.zeileis.org/news/euro2020paper/">https://www.zeileis.org/news/euro2020paper/</a></p> <p>A working paper describing the data and methods used for our probabilistic UEFA Euro 2020 forecast, published earlier this week, is available now. 
Additionally, details on the predicted performance of all teams during the group stage are provided.</p> <h2 id="overview">Overview</h2> <p>Earlier this week we published our <a href="https://www.zeileis.org/news/euro2020/">probabilistic UEFA Euro 2020 forecast</a> that combines the expertise of football modelers from four different research teams with the flexibility of machine learning. To explain which data and methods were used exactly, we have also written a <a href="https://arxiv.org/abs/2106.05799">working paper</a>, now published in the <a href="https://arxiv.org/">arXiv.org</a> e-Print archive.</p> <p>Moreover, we take the opportunity to provide further insights that can be obtained from our forecast for the results of the group stage, which starts at the end of this week with the opening match between Italy and Turkey in Rome in Group A. More precisely, predicted probabilities for a <em>win</em>, <em>draw</em>, or <em>loss</em> in each of the 36 group stage matches are provided in interactive heatmaps for all groups.</p> <h2 id="working-paper">Working paper</h2> <p><em>Citation:</em><br /> Groll A, Hvattum LM, Ley C, Popp F, Schauberger G, Van Eetvelde H, Zeileis A (2021). “Hybrid Machine Learning Forecasts for the UEFA EURO 2020.” arXiv:2106.05799, arXiv.org e-Print archive. <a href="https://arxiv.org/abs/2106.05799">https://arxiv.org/abs/2106.05799</a></p> <p><em>Abstract:</em><br /> Three state-of-the-art statistical ranking methods for forecasting football matches are combined with several other predictors in a hybrid machine learning model. Namely an ability estimate for every team based on historic matches; an ability estimate for every team based on bookmaker consensus; average plus-minus player ratings based on their individual performances in their home clubs and national teams; and further team covariates (e.g., market value, team structure) and country-specific socio-economic factors (population, GDP). 
The proposed combined approach is used for learning the number of goals scored in the matches from the four previous UEFA EUROs 2004-2016 and then applied to current information to forecast the upcoming UEFA EURO 2020. Based on the resulting estimates, the tournament is simulated repeatedly and winning probabilities are obtained for all teams. A random forest model favors the current World Champion France with a winning probability of 14.8% before England (13.5%) and Spain (12.3%). Additionally, we provide survival probabilities for all teams and at all tournament stages.</p> <h2 id="predicted-match-probabilities-for-the-group-stage">Predicted match probabilities for the group stage</h2> <p>Using the hybrid random forest an expected number of goals is obtained for both teams in each possible match in the group stage. As there are typically more goals in the group stage compared to the knockout stage, a different expected number of goals is fitted for the two stages by including a corresponding binary dummy variable in the regression model. While the heatmap shown in our previous blog post contained the probabilities for all possible matches in the knockout stage, we complement this information here by showing different heatmaps for all groups.</p> <p>The color scheme visualizes the winning probability of the team in the row over the team in the column. Light red or orange vs. dark green or blue signals low vs. high winning probabilities. 
The tooltips for each match in the interactive version of the graphic also print the probabilities for the match to end in a win, draw, or loss.</p> <p>Interactive full-width graphics: <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_a.html">Group A</a>, <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_b.html">Group B</a>, <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_c.html">Group C</a>, <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_d.html">Group D</a>, <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_e.html">Group E</a>, <a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_f.html">Group F</a>.</p> <table> <thead> <tr> <th style="text-align: center">Group A</th> <th style="text-align: center">Group B</th> <th style="text-align: center">Group C</th> </tr> </thead> <tbody> <tr> <td style="text-align: center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_a.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_a.png" alt="Heatmap: Match probabilities for Group A" /></a></td> <td style="text-align: center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_b.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_b.png" alt="Heatmap: Match probabilities for Group B" /></a></td> <td style="text-align: center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_c.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_c.png" alt="Heatmap: Match probabilities for Group C" /></a></td> </tr> </tbody> </table> <table> <thead> <tr> <th style="text-align: center">Group D</th> <th style="text-align: center">Group E</th> <th style="text-align: center">Group F</th> </tr> </thead> <tbody> <tr> <td style="text-align: 
center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_d.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_d.png" alt="Heatmap: Match probabilities for Group D" /></a></td> <td style="text-align: center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_e.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_e.png" alt="Heatmap: Match probabilities for Group E" /></a></td> <td style="text-align: center"><a href="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_f.html"><img src="https://www.zeileis.org/assets/posts/2021-06-10-euro2020paper/p_match_f.png" alt="Heatmap: Match probabilities for Group F" /></a></td> </tr> </tbody> </table>2021-06-10T00:00:00+02:00https://www.zeileis.org/news/euro2020/Hybrid machine learning forecasts for the UEFA Euro 20202021-06-07T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Probabilistic forecasts for the UEFA Euro 2020 are obtained by using a hybrid model that combines data from four advanced statistical models through random forests. The favorite is France, followed by England and Spain.<p>Probabilistic forecasts for the UEFA Euro 2020 are obtained by using a hybrid model that combines data from four advanced statistical models through random forests. The favorite is France, followed by England and Spain.</p> <div class="row t20 b20"> <div class="small-8 medium-9 large-10 columns"> The UEFA Euro 2020 will finally take place across Europe from 11 June to 11 July 2021 (after a year of delay due to the Covid-19 pandemic). 24 of the best European teams compete to determine the new European Champion. Football fans worldwide are curious what the most likely outcome of the tournament is. 
Hence, we employ a machine learning approach yielding probabilistic forecasts for all possible matches which can then be used to explore the likely course of the tournament by simulation. </div> <div class="small-4 medium-3 large-2 columns"> <a href="https://www.uefa.com/uefaeuro-2020/" alt="UEFA Euro 2020 web page"><img src="https://upload.wikimedia.org/wikipedia/en/9/96/UEFA_Euro_2020_Logo.svg" alt="UEFA Euro 2020 logo" /></a> </div> </div> <h2 id="winning-probabilities">Winning probabilities</h2> <p>The forecast is based on a conditional inference random forest learner that combines four main sources of information: An ability estimate for every team based on historic matches; an ability estimate for every team based on odds from 19 bookmakers; average ratings of the players in each team based on their individual performances in their home clubs and national teams; further team covariates (e.g., market value, team structure) and country-specific socio-economic factors (population, GDP). The random forest model is learned using the UEFA Euro tournaments from 2004 to 2016 as training data and then applied to current information to obtain a forecast for the UEFA Euro 2020. The random forest forecasts actually provide the predicted number of goals for each team in all possible matches in the tournament so that a bivariate Poisson distribution can be used to compute the probabilities for a <em>win</em>, <em>draw</em>, or <em>loss</em> in such a match. Based on these match probabilities the entire tournament can be simulated 100,000 times yielding winning probabilities for each team. The results show that the current World Champion France is also the favorite for the European title with a winning probability of 14.8%, followed by England with 13.5%, and Spain with 12.3%. 
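</p> <p>To illustrate the last step, the following sketch derives <em>win</em>/<em>draw</em>/<em>loss</em> probabilities from given expected numbers of goals. It uses independent Poisson margins and made-up expected goals purely for illustration; the actual forecast employs a bivariate Poisson distribution with the expected goals predicted by the random forest.</p> <pre><code class="language-{r}">## illustrative expected goals for a single match (made-up values, not model output)
lambda_a <- 1.6
lambda_b <- 1.1
goals <- 0:10  ## truncating the Poisson support at 10 goals loses negligible mass
## joint probability that team A scores i goals and team B scores j goals,
## assuming independent Poisson margins (a simplification of the bivariate Poisson)
p <- outer(dpois(goals, lambda_a), dpois(goals, lambda_b))
c(win = sum(p[lower.tri(p)]), draw = sum(diag(p)), loss = sum(p[upper.tri(p)]))
</code></pre> <p>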
The winning probabilities for all teams are shown in the barchart below with more information linked in the interactive full-width version.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_win.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_win.html"><img src="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_win.png" alt="Barchart: Winning probabilities" /></a></p> <p>The full study has been conducted by an international team of researchers: <a href="https://www.statistik.tu-dortmund.de/groll.html">Andreas Groll</a>, <a href="https://home.himolde.no/hvattum/">Lars Magnus Hvattum</a>, <a href="https://users.ugent.be/~chley/">Christophe Ley</a>, <a href="https://www.xing.com/profile/Franziska_Popp20">Franziska Popp</a>, <a href="https://www.sg.tum.de/epidemiologie/team/schauberger/">Gunther Schauberger</a>, <a href="https://biblio.ugent.be/person/2C617710-F0EE-11E1-A9DE-61C894A0A6B4">Hans Van Eetvelde</a>, <a href="https://www.zeileis.org/">Achim Zeileis</a>. The corresponding working paper will be published on arXiv in the next couple of days. The core of the contribution is a hybrid approach that starts out from four state-of-the-art forecasting methods, based on disparate sets of information, and lets an adaptive machine learning model decide how to best combine these forecasts.</p> <ul> <li> <p><em>Historic match abilities:</em><br /> An ability estimate is obtained for every team based on “retrospective” data, namely all historic national matches over the last 8 years. A <em>bivariate Poisson model</em> with team-specific fixed effects is fitted to the number of goals scored by both teams in each match. However, rather than equally weighting all matches to obtain <em>average</em> team abilities (or team strengths) over the entire history period, an exponential weighting scheme is employed. 
This assigns more weight to more recent results and thus yields an estimate of <em>current</em> team abilities. More details can be found in <a href="https://doi.org/10.1177%2F1471082X18817650">Ley, Van de Wiele, Van Eetvelde (2019)</a>.</p> </li> <li> <p><em>Bookmaker consensus abilities:</em><br /> Another ability estimate for every team is obtained based on “prospective” data, namely the odds of 19 international bookmakers that reflect their expert expectations for the tournament. Using the <em>bookmaker consensus model</em> of <a href="https://dx.doi.org/10.1016/j.ijforecast.2009.10.001">Leitner, Zeileis, Hornik (2010)</a>, the bookmaker odds are first adjusted for the bookmakers’ profit margins (“overround”) and then averaged (on a logit scale) to obtain a consensus for the winning probability of each team. To adjust for the effects of the tournament draw (which might have led to easier or harder groups for some teams), an “inverse” simulation approach is used to infer which team abilities are most likely to result in these winning probabilities.</p> </li> <li> <p><em>Average player ratings:</em><br /> To infer the contributions of individual players in a match, the <em>plus-minus player ratings</em> of <a href="https://doi.org/10.2478/ijcss-2019-0001">Hvattum (2019)</a> dissect all matches with a certain player (at both club and national level) into segments, e.g., between substitutions. Subsequently, the goal difference achieved in these segments is linked to the presence of the individual players during that segment. This yields individual ratings for all players that can be aggregated to average player ratings for each team.</p> </li> <li> <p><em>Hybrid random forests:</em><br /> Finally, machine learning is used to combine the three highly aggregated and informative variables above with a broad range of further relevant covariates, yielding refined probabilistic forecasts for each match. 
Such a hybrid approach was first suggested by <a href="https://arXiv.org/abs/1806.03208">Groll, Ley, Schauberger, Van Eetvelde (2019)</a>. The task the random forest learner has to accomplish is to combine the three highly-informative team variables above with further team-specific information that may or may not be relevant to the team’s performance. The covariates considered comprise team-specific details (e.g., market value, FIFA rank, team structure) as well as country-specific socio-economic factors (population and GDP per capita). By combining a large ensemble of rather weakly informative regression trees in a random forest, the relative importances of all the covariates can be inferred automatically. The resulting predicted number of goals for each team can then finally be used to simulate the entire tournament 100,000 times.</p> </li> </ul> <h2 id="match-probabilities">Match probabilities</h2> <p>Using the hybrid random forest, an expected number of goals is obtained for both teams in each possible match. The covariate information used for this is the difference between the two teams in each of the variables listed above, i.e., the difference in historic match abilities (on a log scale), the difference in bookmaker consensus abilities (on a log scale), the difference in average player ratings of the teams, etc. Assuming a bivariate Poisson distribution with the expected numbers of goals for both teams, we can compute the probability that a certain match ends in a <em>win</em>, a <em>draw</em>, or a <em>loss</em>. The same can be repeated in overtime, if necessary, and a coin flip is used to decide penalties, if needed.</p> <p>The following heatmap shows for each possible combination of teams the probability that one team beats the other team in a knockout match. The color scheme uses green vs. brown to signal probabilities above vs. below 50%, respectively. 
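</p> <p>The pairwise probabilities behind such a knockout heatmap can be sketched directly from the three match outcomes: a team proceeds if it wins in normal time, or draws and wins in overtime, or draws twice and wins the coin flip for the penalties. The numbers below are purely hypothetical and only illustrate the arithmetic, not actual model output.</p> <pre><code class="language-{r}">## hypothetical win/draw/loss probabilities for normal time and for overtime
## (illustrative values only, not output of the actual model)
p90 <- c(win = 0.45, draw = 0.27, loss = 0.28)
pot <- c(win = 0.40, draw = 0.35, loss = 0.25)  ## draws are more likely in only 30 minutes
## probability that team A proceeds: win in 90 minutes, or draw and win in
## overtime, or two draws and win the coin flip deciding the penalties
p_beat <- p90[["win"]] + p90[["draw"]] * (pot[["win"]] + pot[["draw"]] * 0.5)
p_beat  ## 0.60525
</code></pre> <p>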
The tooltips for each match in the interactive version of the graphic also print the probabilities for the match to end in a <em>win</em>, <em>draw</em>, or <em>loss</em> after normal time.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_match.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_match.html"><img src="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_match.png" alt="Heatmap: Match probabilities" /></a></p> <h2 id="performance-throughout-the-tournament">Performance throughout the tournament</h2> <p>As every single match can be simulated with the pairwise probabilities above, it is also straightforward to simulate the entire tournament (here: 100,000 times), providing “survival” probabilities for each team across the different stages.</p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_surv.html">Interactive full-width graphic</a></p> <p><a href="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_surv.html"><img src="https://www.zeileis.org/assets/posts/2021-06-07-euro2020/p_surv.png" alt="Line plot: Survival probabilities" /></a></p> <h2 id="odds-and-ends">Odds and ends</h2> <p>All our forecasts are probabilistic, clearly below 100%, and thus by no means certain. Especially the results in group F are hard to predict but may play a crucial role for the tournament. The reason is that this group comprises three very strong teams: current World Champion France, defending European Champion Portugal, and Germany, which generally has an excellent record at international tournaments. Moreover, the runner-up in this group will play against the winner of group D, which features favorite England. 
Hence, it is likely that this will lead to a very tough knockout match in the round of 16, possibly even between the two top favorites France and England, but it is hard to predict the exact pair of teams that will face each other in this match.</p> <p>Another interesting observation is that the winning probability for Belgium is only moderately high at 8.3%. This is notable as Belgium currently leads the FIFA/Coca-Cola World Ranking and is also judged to have a much higher winning probability by the bookmaker consensus model, namely 12.1%.</p> <p>In any case, all of this means that even though we can quantify in terms of probabilities what is likely to happen during the UEFA Euro 2020, it is far from being predetermined. Hence, we can all look forward to finally watching this exciting tournament and hope it will bring a little bit of the joy that we have been missing over this difficult last year.</p>2021-06-07T00:00:00+02:00https://www.zeileis.org/news/ivreg/ivreg: Two-stage least-squares regression with diagnostics2021-05-31T00:00:00+02:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/The ivreg function for instrumental variables regression was first introduced in the AER package but is now developed and extended in its own package of the same name. This post provides a short overview and illustration.<p>The ivreg function for instrumental variables regression was first introduced in the AER package but is now developed and extended in its own package of the same name. This post provides a short overview and illustration.</p> <h2 id="package-overview">Package overview</h2> <p>The <strong>ivreg</strong> package (by <a href="https://socialsciences.mcmaster.ca/jfox/">John Fox</a>, <a href="https://wwz.unibas.ch/en/kleiber/">Christian Kleiber</a>, and <a href="https://www.zeileis.org">Achim Zeileis</a>) provides a comprehensive implementation of instrumental variables regression using two-stage least-squares (2SLS) estimation. 
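</p> <p>The idea behind 2SLS can be sketched “by hand” on simulated data: the endogenous regressor is first projected onto the instrument, and the outcome is then regressed on the fitted values from that first stage. This is only an illustration of the estimator (all names and numbers below are made up); <code class="language-plaintext highlighter-rouge">ivreg()</code> should be used in practice because it additionally provides correct standard errors and diagnostics.</p> <pre><code class="language-{r}">## simulated data: u confounds x and y, z is a valid instrument for x
set.seed(1)
n <- 500
z <- rnorm(n)              ## instrument, independent of the confounder
u <- rnorm(n)              ## unobserved confounder
x <- z + u + rnorm(n)      ## endogenous regressor
y <- 2 * x + u + rnorm(n)  ## outcome, true coefficient 2
stage1 <- lm(x ~ z)                ## first stage: project x onto the instrument
stage2 <- lm(y ~ fitted(stage1))   ## second stage: regress y on the fitted values
coef(stage2)  ## slope close to 2, while coef(lm(y ~ x)) is biased upward
</code></pre> <p>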
The standard regression functionality (parameter estimation, inference, robust covariances, predictions, etc.) is derived from and supersedes the <code class="language-plaintext highlighter-rouge">ivreg()</code> function in the <a href="https://CRAN.R-project.org/package=AER"><strong>AER</strong></a> package. Additionally, various regression diagnostics are supported, including hat values, deletion diagnostics such as studentized residuals and Cook’s distances; graphical diagnostics such as component-plus-residual plots and added-variable plots; and effect plots with partial residuals.</p> <p>An overview of the package along with vignettes and detailed documentation etc. is available on its web site at <a href="https://john-d-fox.github.io/ivreg/">https://john-d-fox.github.io/ivreg/</a>. This post is an abbreviated version of the “Getting started” vignette.</p> <p>The <strong>ivreg</strong> package integrates seamlessly with other packages by providing suitable S3 methods, specifically for generic functions in the <a href="https://www.R-project.org/">base-R</a> <strong>stats</strong> package, and in the <a href="https://CRAN.R-project.org/package=car"><strong>car</strong></a>, <a href="https://CRAN.R-project.org/package=effects"><strong>effects</strong></a>, <a href="https://CRAN.R-project.org/package=lmtest"><strong>lmtest</strong></a>, and <a href="https://CRAN.R-project.org/package=sandwich"><strong>sandwich</strong></a> packages, among others. Moreover, it cooperates well with other object-oriented packages for regression modeling such as <a href="https://CRAN.R-project.org/package=broom"><strong>broom</strong></a> and <a href="https://CRAN.R-project.org/package=modelsummary"><strong>modelsummary</strong></a>.</p> <h2 id="illustration-returns-to-schooling">Illustration: Returns to schooling</h2> <p>For demonstrating the <strong>ivreg</strong> package in practice, we investigate the effect of schooling on earnings in a classical model for wage determination. 
The data are from the United States, and are provided in the package as <code class="language-plaintext highlighter-rouge">SchoolingReturns</code>. This data set was originally studied by David Card, and was subsequently employed, as here, to illustrate 2SLS estimation in introductory econometrics textbooks. The relevant variables for this illustration are:</p> <pre><code class="language-{r}">data("SchoolingReturns", package = "ivreg")
summary(SchoolingReturns[, 1:8])
##       wage          education       experience      ethnicity    smsa     
##  Min.   : 100.0   Min.   : 1.00   Min.   : 0.000   other:2307   no : 864  
##  1st Qu.: 394.2   1st Qu.:12.00   1st Qu.: 6.000   afam : 703   yes:2146  
##  Median : 537.5   Median :13.00   Median : 8.000                          
##  Mean   : 577.3   Mean   :13.26   Mean   : 8.856                          
##  3rd Qu.: 708.8   3rd Qu.:16.00   3rd Qu.:11.000                          
##  Max.   :2404.0   Max.   :18.00   Max.   :23.000                          
##  south          age         nearcollege
##  no :1795   Min.   :24.00   no : 957   
##  yes:1215   1st Qu.:25.00   yes:2053   
##             Median :28.00              
##             Mean   :28.12              
##             3rd Qu.:31.00              
##             Max.   :34.00  
</code></pre> <p>A standard wage equation uses a semi-logarithmic linear regression for <code class="language-plaintext highlighter-rouge">wage</code>, estimated by ordinary least squares (OLS), with years of <code class="language-plaintext highlighter-rouge">education</code> as the primary explanatory variable, adjusting for a quadratic term in labor-market <code class="language-plaintext highlighter-rouge">experience</code>, as well as for factors coding <code class="language-plaintext highlighter-rouge">ethnicity</code>, residence in a city (<code class="language-plaintext highlighter-rouge">smsa</code>), and residence in the U.S. 
<code class="language-plaintext highlighter-rouge">south</code>:</p> <pre><code class="language-{r}">m_ols <- lm(log(wage) ~ education + poly(experience, 2) + ethnicity + smsa + south,
  data = SchoolingReturns)
summary(m_ols)
## Call:
## lm(formula = log(wage) ~ education + poly(experience, 2) + ethnicity + 
##     smsa + south, data = SchoolingReturns)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.59297 -0.22315  0.01893  0.24223  1.33190 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           5.259820   0.048871 107.626  < 2e-16 ***
## education             0.074009   0.003505  21.113  < 2e-16 ***
## poly(experience, 2)1  8.931699   0.494804  18.051  < 2e-16 ***
## poly(experience, 2)2 -2.642043   0.374739  -7.050 2.21e-12 ***
## ethnicityafam        -0.189632   0.017627 -10.758  < 2e-16 ***
## smsayes               0.161423   0.015573  10.365  < 2e-16 ***
## southyes             -0.124862   0.015118  -8.259  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3742 on 3003 degrees of freedom
## Multiple R-squared: 0.2905, Adjusted R-squared: 0.2891 
## F-statistic: 204.9 on 6 and 3003 DF, p-value: < 2.2e-16 
</code></pre> <p>Thus, OLS estimation yields an estimate of 7.4% per year for returns to schooling. This estimate is problematic, however, because it can be argued that <code class="language-plaintext highlighter-rouge">education</code> is endogenous (and hence also <code class="language-plaintext highlighter-rouge">experience</code>, which is taken to be <code class="language-plaintext highlighter-rouge">age</code> minus <code class="language-plaintext highlighter-rouge">education</code> minus 6). We therefore use geographical proximity to a college when growing up as an exogenous instrument for <code class="language-plaintext highlighter-rouge">education</code>. 
Additionally, <code class="language-plaintext highlighter-rouge">age</code> is the natural exogenous instrument for <code class="language-plaintext highlighter-rouge">experience</code>, while the remaining explanatory variables can be considered exogenous and are thus used as instruments for themselves. Although it’s a useful strategy to select an effective instrument or instruments for each endogenous explanatory variable, in 2SLS regression all of the instrumental variables are used to estimate all of the regression coefficients in the model.</p> <p>To fit this model with <code class="language-plaintext highlighter-rouge">ivreg()</code> we can simply extend the formula from <code class="language-plaintext highlighter-rouge">lm()</code> above, adding a second part after the <code class="language-plaintext highlighter-rouge">|</code> separator to specify the instrumental variables:</p> <pre><code class="language-{r}">library("ivreg")
m_iv <- ivreg(log(wage) ~ education + poly(experience, 2) + ethnicity + smsa + south |
  nearcollege + poly(age, 2) + ethnicity + smsa + south, data = SchoolingReturns)
</code></pre> <p>Equivalently, the same model can also be specified slightly more concisely using three parts on the right-hand side indicating the exogenous variables, the endogenous variables, and the additional instrumental variables only (in addition to the exogenous variables).</p> <pre><code class="language-{r}">m_iv <- ivreg(log(wage) ~ ethnicity + smsa + south | education + poly(experience, 2) |
  nearcollege + poly(age, 2), data = SchoolingReturns)
</code></pre> <p>Both models yield the following results:</p> <pre><code class="language-{r}">summary(m_iv)
## Call:
## ivreg(formula = log(wage) ~ education + poly(experience, 2) + 
##     ethnicity + smsa + south | nearcollege + poly(age, 2) + ethnicity + 
##     smsa + south, data = SchoolingReturns)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.82400 -0.25248  0.02286  0.26349  1.31561 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)           4.48522    0.67538   6.641 3.68e-11 ***
## education             0.13295    0.05138   2.588 0.009712 ** 
## poly(experience, 2)1  9.14172    0.56350  16.223  < 2e-16 ***
## poly(experience, 2)2 -0.93810    1.58024  -0.594 0.552797    
## ethnicityafam        -0.10314    0.07737  -1.333 0.182624    
## smsayes               0.10798    0.04974   2.171 0.030010 *  
## southyes             -0.09818    0.02876  -3.413 0.000651 ***
## 
## Diagnostic tests:
##                                         df1  df2 statistic  p-value    
## Weak instruments (education)              3 3003     8.008 2.58e-05 ***
## Weak instruments (poly(experience, 2)1)   3 3003  1612.707  < 2e-16 ***
## Weak instruments (poly(experience, 2)2)   3 3003   174.166  < 2e-16 ***
## Wu-Hausman                                2 3001     0.841    0.432    
## Sargan                                    0   NA        NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4032 on 3003 degrees of freedom
## Multiple R-Squared: 0.1764, Adjusted R-squared: 0.1747 
## Wald test: 148.1 on 6 and 3003 DF, p-value: < 2.2e-16 
</code></pre> <p>Thus, using two-stage least squares to estimate the regression yields a much larger coefficient for the returns to schooling, namely 13.3% per year. Notice as well that the standard errors of the coefficients are larger for 2SLS estimation than for OLS, and that, partly as a consequence, evidence for the effects of <code class="language-plaintext highlighter-rouge">ethnicity</code> and the quadratic component of <code class="language-plaintext highlighter-rouge">experience</code> is now weak. 
These differences are brought out more clearly when showing coefficients and standard errors side by side, e.g., using the <code class="language-plaintext highlighter-rouge">compareCoefs()</code> function from the <strong>car</strong> package or the <code class="language-plaintext highlighter-rouge">msummary()</code> function from the <strong>modelsummary</strong> package:</p> <pre><code class="language-{r}">library("modelsummary") m_list <- list(OLS = m_ols, IV = m_iv) msummary(m_list) </code></pre> <table> <thead> <tr> <th style="text-align: left"> </th> <th style="text-align: right">OLS</th> <th style="text-align: right">IV</th> </tr> </thead> <tbody> <tr> <td style="text-align: left">(Intercept)</td> <td style="text-align: right">5.260 <br />(0.049)</td> <td style="text-align: right">4.485 <br />(0.675)</td> </tr> <tr> <td style="text-align: left">education</td> <td style="text-align: right">0.074 <br />(0.004)</td> <td style="text-align: right">0.133 <br />(0.051)</td> </tr> <tr> <td style="text-align: left">poly(experience, 2)1</td> <td style="text-align: right">8.932 <br />(0.495)</td> <td style="text-align: right">9.142 <br />(0.564)</td> </tr> <tr> <td style="text-align: left">poly(experience, 2)2</td> <td style="text-align: right">-2.642 <br />(0.375)</td> <td style="text-align: right">-0.938 <br />(1.580)</td> </tr> <tr> <td style="text-align: left">ethnicityafam</td> <td style="text-align: right">-0.190 <br />(0.018)</td> <td style="text-align: right">-0.103 <br />(0.077)</td> </tr> <tr> <td style="text-align: left">smsayes</td> <td style="text-align: right">0.161 <br />(0.016)</td> <td style="text-align: right">0.108 <br />(0.050)</td> </tr> <tr> <td style="text-align: left">southyes</td> <td style="text-align: right">-0.125 <br />(0.015)</td> <td style="text-align: right">-0.098 <br />(0.029)</td> </tr> <tr> <td style="text-align: left">Num.Obs.</td> <td style="text-align: right">3010</td> <td style="text-align: right">3010</td> </tr> <tr> <td 
style="text-align: left">R2</td> <td style="text-align: right">0.291</td> <td style="text-align: right">0.176</td> </tr> <tr> <td style="text-align: left">R2 Adj.</td> <td style="text-align: right">0.289</td> <td style="text-align: right">0.175</td> </tr> <tr> <td style="text-align: left">AIC</td> <td style="text-align: right">2633.4</td> <td style="text-align: right"> </td> </tr> <tr> <td style="text-align: left">BIC</td> <td style="text-align: right">2681.5</td> <td style="text-align: right"> </td> </tr> <tr> <td style="text-align: left">Log.Lik.</td> <td style="text-align: right">-1308.702</td> <td style="text-align: right"> </td> </tr> <tr> <td style="text-align: left">F</td> <td style="text-align: right">204.932</td> <td style="text-align: right"> </td> </tr> </tbody> </table> <p>The change in coefficients and associated standard errors can also be brought out graphically using the <code class="language-plaintext highlighter-rouge">modelplot()</code> function from <strong>modelsummary</strong> which shows the coefficient estimates along with their 95% confidence intervals. 
Below we omit the intercept and experience terms as these are on a different scale than the other coefficients.</p> <pre><code class="language-{r}">modelplot(m_list, coef_omit = "Intercept|experience") </code></pre> <p><a href="https://www.zeileis.org/assets/posts/2021-05-31-ivreg/modelplot.png"><img src="https://www.zeileis.org/assets/posts/2021-05-31-ivreg/modelplot.png" alt="Model plot of coefficients and confidence intervals" /></a></p>2021-05-31T00:00:00+02:00https://www.zeileis.org/news/networktree100/Network trees: networktree 1.0.0, web page, and Psychometrika paper2021-02-04T00:00:00+01:00Achim ZeileisAchim.Zeileis@R-project.orghttps://www.zeileis.org/Version 1.0.0 (and actually 1.0.1) of the R package 'networktree' with tools for recursively partitioning covariance structures is now available from CRAN, accompanied by a paper in Psychometrika, and a dedicated software web page.<p>Version 1.0.0 (and actually 1.0.1) of the R package 'networktree' with tools for recursively partitioning covariance structures is now available from CRAN, accompanied by a paper in Psychometrika, and a dedicated software web page.</p> <h2 id="psychometrika-paper">Psychometrika paper</h2> <ul> <li><em>Citation:</em> Jones PJ, Mair P, Simon T, Zeileis A (2020). “Network Trees: A Method for Recursively Partitioning Covariance Structures.” <em>Psychometrika</em>, <strong>85</strong>(4), 926-945. <a href="https://doi.org/10.1007/s11336-020-09731-4">doi:10.1007/s11336-020-09731-4</a>.</li> <li><em>Preprint version:</em> <a href="https://www.zeileis.org/papers/Jones+Mair+Simon-2020.pdf">https://www.zeileis.org/papers/Jones+Mair+Simon-2020.pdf</a></li> <li><em>OSF replication materials:</em> <a href="https://osf.io/ykq2a/">https://osf.io/ykq2a/</a></li> </ul> <h2 id="abstract">Abstract</h2> <p>In many areas of psychology, correlation-based network approaches (i.e., psychometric networks) have become a popular tool. 
In this paper, we propose an approach that recursively splits the sample based on covariates in order to detect significant differences in the structure of the covariance or correlation matrix. Psychometric networks or other correlation-based models (e.g., factor models) can be subsequently estimated from the resultant splits. We adapt model-based recursive partitioning and conditional inference tree approaches for finding covariate splits in a recursive manner. The empirical power of these approaches is studied in several simulation conditions. Examples are given using real-life data from personality and clinical research.</p> <h2 id="software--web-page">Software & web page</h2> <p>All methods discussed are implemented in the R package <code class="language-plaintext highlighter-rouge">networktree</code>, which is developed on GitHub, with stable versions released on CRAN (Comprehensive R Archive Network). Version 1.0.0 accompanies the publication in Psychometrika and version 1.0.1 adds a few small enhancements and bug fixes, specifically for the plotting infrastructure. Furthermore, a nice web page with introductory examples, documentation, release notes, etc. has been produced with the wonderful <code class="language-plaintext highlighter-rouge">pkgdown</code>.</p> <ul> <li><em>CRAN release:</em> <a href="https://CRAN.R-project.org/package=networktree">https://CRAN.R-project.org/package=networktree</a></li> <li><em>Web page:</em> <a href="https://paytonjjones.github.io/networktree/">https://paytonjjones.github.io/networktree/</a></li> </ul> <h2 id="illustration">Illustration</h2> <p>The idea of psychometric networks is to provide information about the statistical relationships between observed variables. Network trees aim to reveal heterogeneities in these relationships based on observed covariates. 
This strategy is implemented in the R package <code class="language-plaintext highlighter-rouge">networktree</code> building on the general tree algorithms in the <code class="language-plaintext highlighter-rouge">partykit</code> package.</p> <p>For illustration, we consider a depression network - where the nodes represent different symptoms - and detect heterogeneities with respect to age and race. The data used below is provided by <a href="https://openpsychometrics.org/">https://openpsychometrics.org/</a> and was obtained using the Depression Anxiety and Stress Scale (DASS), a self-report instrument for measuring depression, anxiety, and tension or stress. It is available in the <code class="language-plaintext highlighter-rouge">networktree</code> package as <code class="language-plaintext highlighter-rouge">dass</code>. To make resulting graphics and summaries easier to interpret we use the following variable names for the depression symptoms that are measured with certain questions from the DASS:</p> <ul> <li><code class="language-plaintext highlighter-rouge">anhedonia</code> (Question 3: I couldn’t seem to experience any positive feeling at all.)</li> <li><code class="language-plaintext highlighter-rouge">initiative</code> (Question 42: I found it difficult to work up the initiative to do things.)</li> <li><code class="language-plaintext highlighter-rouge">lookforward</code> (Question 10: I felt that I had nothing to look forward to.)</li> <li><code class="language-plaintext highlighter-rouge">sad</code> (Question 13: I felt sad and depressed.)</li> <li><code class="language-plaintext highlighter-rouge">unenthused</code> (Question 31: I was unable to become enthusiastic about anything.)</li> <li><code class="language-plaintext highlighter-rouge">worthless</code> (Question 17: I felt I wasn’t worth much as a person.)</li> <li><code class="language-plaintext highlighter-rouge">meaningless</code> (Question 38: I felt that life was meaningless.)</li> </ul> 
<p>First, we load the data and relabel the variables for the depression symptoms:</p> <pre><code class="language-{r}">library("networktree")
data("dass", package = "networktree")
names(dass)[c(3, 42, 10, 13, 31, 17, 38)] &lt;- c("anhedonia",
  "initiative", "lookforward", "sad", "unenthused", "worthless",
  "meaningless")
</code></pre> <p>Subsequently, we fit a <code class="language-plaintext highlighter-rouge">networktree()</code> where the relationship between the symptoms (<code class="language-plaintext highlighter-rouge">anhedonia + initiative + lookforward + sad + unenthused + worthless + meaningless</code>) is “explained by” (<code class="language-plaintext highlighter-rouge">~</code>) the covariates (<code class="language-plaintext highlighter-rouge">age + race</code>). (As an alternative to this formula-based interface it is also possible to specify groups of dependent and split variables, respectively, through separate data frames.) The significance level for detecting differences in correlations is set to 1% (plus Bonferroni adjustment for testing two covariates at each step).</p> <pre><code class="language-{r}">tr &lt;- networktree(anhedonia + initiative + lookforward + sad +
  unenthused + worthless + meaningless ~ age + race,
  data = dass, alpha = 0.01)
</code></pre> <p>The resulting network tree can be easily visualized with <code class="language-plaintext highlighter-rouge">plot(tr)</code>, which would display the raw correlations. As these are generally high between all depression symptoms, we use a display with partial correlations (<code class="language-plaintext highlighter-rouge">transform = "pcor"</code>) instead. This brings out differences between the detected subgroups somewhat more clearly. 
<em>(Note that version 1.0.1 of networktree is needed for this to work correctly.)</em></p> <pre><code class="language-{r}">plot(tr, transform = "pcor")
</code></pre> <p><a href="https://www.zeileis.org/assets/posts/2021-02-04-networktree100/dasstree.png"><img src="https://www.zeileis.org/assets/posts/2021-02-04-networktree100/dasstree.png" alt="Depression network tree" /></a></p> <p>This shows that the network tree detects three subgroups. First, the correlations of the depression symptoms change across <code class="language-plaintext highlighter-rouge">age</code>, with the largest difference between “younger” and “older” persons in the sample at a split point of 30 years. Second, the correlations differ with respect to <code class="language-plaintext highlighter-rouge">race</code> for the older persons in the sample, with the largest difference between Arab/Black/Native American/White and Asian/Other. The differences in the symptom correlations affect various pairs of symptoms as brought out in the network display produced by the <a href="https://CRAN.R-project.org/package=qgraph">qgraph</a> package in the terminal nodes. For example, the “centrality” of <code class="language-plaintext highlighter-rouge">anhedonia</code> changes across the three detected subgroups: For the older Asian/Other persons it is partially correlated with most other symptoms, while this is less pronounced for the other two subgroups.</p> <p>The networks visualized in the tree can also be extracted easily using the <code class="language-plaintext highlighter-rouge">getnetwork()</code> function. 
For example, the partial correlation matrix corresponding to the older Asian/Other group (node 5) can be obtained by:</p> <pre><code class="language-{r}">getnetwork(tr, id = 5, transform = "pcor")
</code></pre> <p>To explore the returned object <code class="language-plaintext highlighter-rouge">tr</code> in some more detail, the <code class="language-plaintext highlighter-rouge">print()</code> method gives a printed version of the tree structure but does not display the associated parameters.</p> <pre><code class="language-{r}">tr
## Network tree object
## 
## Model formula:
## anhedonia + initiative + lookforward + sad + unenthused + worthless + 
##     meaningless ~ age + race
## 
## Fitted party:
## [1] root
## |   [2] age &lt;= 30
## |   [3] age &gt; 30
## |   |   [4] race in Arab, Black, Native American, White
## |   |   [5] race in Asian, Other
## 
## Number of inner nodes:    2
## Number of terminal nodes: 3
## Number of parameters per node: 21
## Objective function: 42301.84
</code></pre> <p>The estimated correlation parameters in the subgroups can be extracted with <code class="language-plaintext highlighter-rouge">coef(tr)</code>, here returning a 3 x 21 matrix for the 21 pairs of symptom correlations and the 3 subgroups. To show two symptom pairs with larger correlation differences we extract the correlations of <code class="language-plaintext highlighter-rouge">anhedonia</code> with <code class="language-plaintext highlighter-rouge">worthless</code> and <code class="language-plaintext highlighter-rouge">meaningless</code>, respectively. Note that these are the raw correlations and not the partial correlations displayed in the tree above.</p> <pre><code class="language-{r}">coef(tr)[, 5:6]
##   rho_anhedonia_worthless rho_anhedonia_meaningless
## 2               0.5595725                 0.5994682
## 4               0.6741686                 0.6339481
## 5               0.6639088                 0.7178744
</code></pre> <p>Finally, we extract the p-values of the underlying parameter instability tests to gain some insight into how the tree was constructed. 
In each step we assess whether the correlation parameters are stable across each of the two covariates <code class="language-plaintext highlighter-rouge">age</code> and <code class="language-plaintext highlighter-rouge">race</code> or whether there are significant changes. The corresponding test statistics and Bonferroni-adjusted p-values can be extracted with the <code class="language-plaintext highlighter-rouge">sctest()</code> function (for “structural change test”). For example, in Node 1 there are significant instabilities with respect to both variables, but <code class="language-plaintext highlighter-rouge">age</code> has the lower p-value and is hence selected for partitioning the data:</p> <pre><code class="language-{r}">library("strucchange")
sctest(tr, node = 1)
##                    age         race
## statistic 7.151935e+01 1.781216e+02
## p.value   1.787983e-05 3.108049e-03
</code></pre> <p>In Node 3 only <code class="language-plaintext highlighter-rouge">race</code> is significant and hence used for splitting:</p> <pre><code class="language-{r}">sctest(tr, node = 3)
##                  age         race
## statistic 42.9352852 1.728898e+02
## p.value    0.1447818 6.766197e-05
</code></pre> <p>And in Node 5 neither variable is significant and hence the splitting stops:</p> <pre><code class="language-{r}">sctest(tr, node = 5)
##                  age     race
## statistic 35.1919522 22.09555
## p.value    0.5514142  0.63279
</code></pre> <p>For more details regarding the method and the software see the Psychometrika paper and the software web page, respectively.</p>2021-02-04T00:00:00+01:00