Young European Statisticians Workshop (YES-III)
Workshop "Paradigms of Model Choice"
October 5-6-7, 2009
Summary
This is the third workshop in the series of YES (Young European
Statisticians) workshops. The first was held in October 2007 on Shape
Restricted Inference with seminars given by Lutz Dümbgen (Bern) and Jon
Wellner (Seattle) together with shorter talks by Laurie Davies (Duisburg-Essen)
and Geurt Jongbloed (Delft). The second workshop was held in October 2008 on
High Dimensional Statistics with seminars given by Sara van de Geer
(Zürich), Nicolai Meinshausen (Oxford) and Gilles Blanchard (Berlin).
The present workshop is directed at young statisticians (mainly Ph.D.
students and postdocs) who are interested in the problem of model choice.
Short seminars, each consisting of three 45-minute talks on various
aspects of model choice, will be given by
Professor Laurie Davies, Duisburg-Essen, Professor Peter Grünwald,
Amsterdam, Professor Nils Hjort, Oslo, and Professor Christian Robert,
Paris. The participants will also have the opportunity to give short talks
of 25 minutes, followed by 5 minutes of discussion, on their own research.
Model choice has for many years been a point of disagreement and research
in statistics. The applications include the choice between several
low-dimensional models, all of which are reasonable models for the data, the
choice of variables to be included in a linear regression, and the choice of
smoothing parameter in nonparametric regression, inverse and other ill-posed
problems. Over the years several techniques have been developed, such as AIC,
BIC, MDL (Minimum Description Length), cross-validation, Lasso (more
generally L_1-penalization) and FIC (focused information criterion). Many of
these techniques have proved successful for certain types of problem but
there is still a need for a discussion of the principles (if any) involved
as well as the advantages and disadvantages of these approaches. It is the
aim of the workshop to inform the participants of the state-of-the-art in
each of these several paradigms and to encourage discussion between the
different schools. It is also intended that each of the paradigms of model
choice provide examples of their use in real problems to demonstrate their
applicability to the analysis of data.
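As a small illustration of how two of the criteria named above are computed, the following Python sketch compares a zero-mean Gaussian model with a free-mean Gaussian model. The simulated data, the two candidate models and all function names are assumptions made for this example and are not taken from the workshop material. It uses the standard definitions AIC = 2k - 2 log L and BIC = k log n - 2 log L, where L is the maximized likelihood, k the number of fitted parameters and n the sample size.

```python
import math
import random


def gaussian_max_loglik(data, mean):
    """Gaussian log-likelihood at the given mean, maximized over the variance."""
    n = len(data)
    var = sum((x - mean) ** 2 for x in data) / n  # MLE of the variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def aic(loglik, k):
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    return k * math.log(n) - 2 * loglik


# Hypothetical data: 200 draws from N(0.3, 1), fixed seed for reproducibility.
random.seed(0)
data = [random.gauss(0.3, 1.0) for _ in range(200)]
n = len(data)

# Each candidate: (maximized log-likelihood, number of fitted parameters).
candidates = {
    "zero-mean": (gaussian_max_loglik(data, 0.0), 1),          # variance only
    "free-mean": (gaussian_max_loglik(data, sum(data) / n), 2),  # mean and variance
}

scores = {
    name: {"AIC": aic(ll, k), "BIC": bic(ll, k, n)}
    for name, (ll, k) in candidates.items()
}

for name, s in scores.items():
    print(f"{name}: AIC={s['AIC']:.2f}  BIC={s['BIC']:.2f}")
```

Under either criterion the model with the smaller score is preferred. Because log n exceeds 2 once n is at least 8, BIC penalizes the extra parameter more heavily than AIC, so the two criteria can disagree.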
Each of the speakers will concentrate on the problem of model choice from
their own perspective.
Approximate models and regularization (Davies)
This approach to model choice is based on the idea of approximate models. A
model is regarded as an adequate approximation to a data set if `typical'
data generated under the model `look like' the real data. The word `typical'
is made precise by specifying a real number α, 0 < α < 1, which determines
what proportion of the data sets generated under the model are to be
regarded as typical. The words `look like' must be operationalized (in
practice often in the form of a computer program) so that for any model and
any data set it is possible to decide whether the model is an adequate
approximation to the data. The precise nature of this operationalization
will depend on the problem at hand; there is no general principle which can
be used. Typically
there will be many adequate models and interest will centre on certain
simplest ones where simplicity can be defined in terms of shape (e.g. the
minimum number of local extreme values) or smoothness (minimum total
variation of a derivative) or the absence of `free lunches' (minimum Fisher
information). The ideas and the applications will be illustrated by several
examples, amongst others, from the area of nonparametric regression.
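The adequacy check described above can be sketched in Python as follows. This is only one possible operationalization, chosen for illustration: `look like' is taken to mean that the Kolmogorov distance between the empirical distribution and the model distribution is no larger than for a proportion α of data sets simulated under the model. The choice of the Kolmogorov distance, the Gaussian model family and all function names are assumptions of this sketch, not a prescription from the seminar.

```python
import math
import random
from statistics import NormalDist


def kolmogorov_distance(data, mu, sigma):
    """Sup distance between the empirical cdf of data and the N(mu, sigma^2) cdf."""
    model = NormalDist(mu, sigma)
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = model.cdf(x)
        # The empirical cdf jumps from i/n to (i+1)/n at x.
        d = max(d, f - i / n, (i + 1) / n - f)
    return d


def is_adequate(data, mu, sigma, alpha=0.95, n_sim=1000, seed=0):
    """Decide whether N(mu, sigma^2) is an adequate approximation to data.

    'Typical' data sets are the proportion alpha of simulated data sets
    with the smallest Kolmogorov distance to the model.
    """
    rng = random.Random(seed)
    n = len(data)
    sims = sorted(
        kolmogorov_distance([rng.gauss(mu, sigma) for _ in range(n)], mu, sigma)
        for _ in range(n_sim)
    )
    index = min(n_sim - 1, max(0, math.ceil(alpha * n_sim) - 1))
    threshold = sims[index]  # empirical alpha-quantile of simulated distances
    return kolmogorov_distance(data, mu, sigma) <= threshold


# Hypothetical data set and two candidate models.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(100)]
for mu, sigma in [(0.0, 1.0), (2.0, 1.0)]:
    print(f"N({mu}, {sigma}^2) adequate: {is_adequate(data, mu, sigma)}")
```

In general many parameter values pass such a check, which is why the approach goes on to single out the simplest adequate models.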
References
Davies, P. L. (1995) Data features. Statistica Neerlandica, 49, 185-245.
Davies, P. L. (2008) Approximating data (with discussion). Journal of the
Korean Statistical Society, 37, 191-240.
Tukey, J. W. (1993) Issues relevant to an honest account of data-based
inference, partially in the light of Laurie Davies's paper. Princeton
University, Princeton.
http://www.stat-math.uni-essen.de/tukey/tukey.php
The Minimum Description Length Principle (Grünwald)
We give a self-contained introduction to the Minimum Description Length
(MDL) Principle, introduced by J. Rissanen in 1978. MDL is a theory of
inductive inference, based on the idea that the more one is able to compress
a given set of data, the more one can be said to have learned about the
data. This idea can be applied to general statistical problems, and in
particular to problems of model choice. In its simplest form, for a given
class of probability models M and sample D, it tells us to pick the model
H ∈ M that minimizes the sum of the number of bits needed to describe first
the model H and then the data D, where D is encoded
`with the help of H'. This is a special case of the general formulation of
MDL, which is based on the information-theoretic concept of a `universal
model' that embodies an automatic trade-off between goodness-of-fit and
complexity.
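The simplest, two-part form of MDL described above can be sketched in Python for a binary sequence. The setup is an assumption made for illustration: the models H are Bernoulli distributions whose parameter lies on a grid of 2^k points, describing H costs k bits, and describing D with the help of H costs -log2 P(D | H) bits. The cost of stating k itself is taken to be the same constant for every k considered, so it is omitted. The function names and example data are hypothetical.

```python
import math


def two_part_codelength(data, precision_bits):
    """Shortest two-part code length (in bits) at a given parameter precision.

    Returns (total_bits, theta), where theta is the best grid point.
    """
    n, ones = len(data), sum(data)
    grid_size = 2 ** precision_bits
    best_theta, best_data_bits = None, math.inf
    for j in range(grid_size):
        theta = (j + 0.5) / grid_size  # midpoints keep theta strictly in (0, 1)
        data_bits = -(ones * math.log2(theta) + (n - ones) * math.log2(1 - theta))
        if data_bits < best_data_bits:
            best_theta, best_data_bits = theta, data_bits
    model_bits = precision_bits  # bits needed to name one of 2**k grid points
    return model_bits + best_data_bits, best_theta


def mdl_select(data, max_precision=8):
    """Pick the precision and parameter that minimize the total code length."""
    results = []
    for k in range(max_precision + 1):
        total_bits, theta = two_part_codelength(data, k)
        results.append((total_bits, k, theta))
    return min(results)  # tuple comparison: smallest total code length first


# Hypothetical sample: 90 ones and 10 zeros.
sample = [1] * 90 + [0] * 10
total_bits, k, theta = mdl_select(sample)
print(f"precision={k} bits, theta={theta}, total={total_bits:.2f} bits")
```

A finer grid fits the data better but costs more bits to describe, which is the trade-off between goodness-of-fit and complexity in its crudest form. For a perfectly balanced sequence no grid point beats θ = 0.5, so the sketch selects precision 0 and spends one bit per observation.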
In these lectures, we focus on three aspects:
* Frequentist Considerations - Consistency and Minimax Convergence Rates:
MDL model choice and prediction are statistically consistent under a wide
variety of conditions. We review A. Barron's surprisingly simple proofs of
these results, which provide a direct link between data compression and
statistical convergence rates: each estimator can be interpreted as a code,
and the better this code compresses the data in expectation, the faster the
estimator's risk converges.
* Bayesian Considerations: since prior distributions may be interpreted as
codes, practical MDL implementations are often quite similar to Bayes factor
model selection and model averaging, but there are important differences.
For example, the Bayes predictive distribution reappears in MDL, but the
Bayes posterior does not. Also, MDL avoids the Bayesian inconsistency
results of Diaconis and Freedman, since these are based on priors that
provably do not lead to data compression.
* AIC/BIC-dilemma: standard MDL does not achieve the optimal minimax
convergence rates in some nonparametric settings. We explain this phenomenon
and describe the switch distribution as a potential remedy.
References
A. Barron, J. Rissanen and B. Yu. The Minimum Description Length Principle
in Coding and Modeling. IEEE Transactions on Information Theory 44(6),
2743-2760, 1998.
P. Grünwald. A Tutorial Introduction to the MDL Principle. Chapters 1 and 2
of 'Advances in Minimum Description Length: Theory and Applications', MIT
Press, 2005.
P. Grünwald. The Minimum Description Length Principle. MIT Press, 2007.
T. van Erven, P. Grünwald and S. de Rooij. Catching up Faster by Switching
Sooner: a prequential solution to the AIC-BIC dilemma. Preprint, arXiv:0807.1005,
November 2008.
Focused Information Criterion (Hjort)
The FIC was developed by Gerda Claeskens and Nils Hjort in two articles
in the Journal of the American Statistical Association in 2003. They have
since become two of the most cited articles on the problem of model choice.
Whereas criteria such as AIC or BIC choose a model without reference to its
intended use, the FIC explicitly demands that the use to which the
model is to be put be made precise. If, for example, a quantile is of interest,
one may choose a different model from the one chosen when the mean is the
quantity of interest. If both are of interest for the same data set, then one could
choose one model for the quantile and a different one for the mean.
Claeskens and Hjort have made this precise in an asymptotic setting and
shown how their approach can be validated. They are also able to prove the
advantage of model averaging if the results for different models are close
together. A further result which comes from their analysis is the
calculation of confidence intervals. Many statisticians choose a model on
the basis of some criterion and then, having chosen it, calculate confidence
intervals neglecting the process by which the model was chosen. This is
known to lead to over-optimistic confidence intervals. Claeskens and Hjort
have shown how this problem can be overcome within their paradigm so that
the confidence intervals have at least asymptotically the correct coverage
probability.
References
Hjort, N.L. and Claeskens, G. (2003) Frequentist model average estimators.
Journal of the American Statistical Association, 98, 879-899.
Claeskens, G. and Hjort, N.L. (2003) The focused information criterion.
Journal of the American Statistical Association, 98, 900-916.
Computational approaches to Bayesian model choice (Robert)
The seminar will cover recent developments in the computation of
marginal distributions for the comparison of statistical models in a
Bayesian framework. Although the introduction of reversible jump MCMC by
Green in 1995 is rightly perceived as the `second MCMC revolution', its
implementation is often too complex for the problems at hand. When the
number of models under consideration is of a reasonable magnitude there
exist computational alternatives such as bridge sampling, nested sampling
and ABC (Approximate Bayesian Computation) which avoid model exploration with
reasonable efficiency. The seminar will be devoted to discussing the
advantages and disadvantages of these alternatives.
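To give a flavour of one of these alternatives, the following Python sketch applies ABC rejection sampling to a toy model choice problem. Everything specific here is an assumption made for illustration: the two models (a fair coin versus a coin with unknown bias under a uniform prior), equal prior model probabilities, the number of ones as summary statistic, and a tolerance of zero. The example is chosen because the exact posterior model probability is known in closed form and can be compared with the ABC estimate.

```python
import math
import random


def abc_model_choice(observed_ones, n, n_sim=20000, tol=0, seed=0):
    """Estimate posterior model probabilities by ABC rejection sampling."""
    rng = random.Random(seed)
    accepted = {"fair": 0, "free": 0}
    for _ in range(n_sim):
        model = rng.choice(["fair", "free"])              # prior over models
        theta = 0.5 if model == "fair" else rng.random()  # prior on the bias
        ones = sum(rng.random() < theta for _ in range(n))  # simulate n tosses
        if abs(ones - observed_ones) <= tol:              # compare summaries
            accepted[model] += 1
    total = sum(accepted.values())
    if total == 0:
        return None  # nothing accepted: increase n_sim or tol
    return {m: count / total for m, count in accepted.items()}


# Hypothetical observation: 10 ones in 20 tosses.
n, s = 20, 10
posterior = abc_model_choice(s, n)

# Exact marginal likelihoods of the summary under each model.
marginal_fair = math.comb(n, s) / 2 ** n
marginal_free = 1 / (n + 1)  # uniform prior on the bias
exact_fair = marginal_fair / (marginal_fair + marginal_free)

print(f"ABC estimate  P(fair | data) = {posterior['fair']:.3f}")
print(f"Exact value   P(fair | data) = {exact_fair:.3f}")
```

In this toy case the number of ones is sufficient for both models, so ABC with zero tolerance samples from the exact posterior. In realistic problems the summary statistics are rarely sufficient across models, which is one of the disadvantages such a comparison of methods has to weigh.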
http://fr.arxiv.org/abs/0807.2767
http://www.arxiv.org/abs/0801.3887v2
Registration
Registration is closed.
Practical information
Conference Location
The workshop location is EURANDOM, Den Dolech 2, 5612 AZ
Eindhoven, Laplace Building, 1st floor, LG 1.105.
EURANDOM is located on the campus of Eindhoven University of Technology, in the
'Laplacegebouw' building (LG on the map). The university is a
10-minute walk from Eindhoven railway station (take the exit on the
north side and walk towards the tall building on the right with the sign TU/e).
For all information on how to come to Eindhoven, please check
http://www.eurandom.tue.nl/contact.htm
Hotel
Hotel Queen is reserved for keynote speakers. Please indicate arrival and
departure dates on the registration form.
For accepted contributed speakers, a reservation will be made in the
Sandton Hotel Eindhoven City Centre, free of charge. The room will be shared.
If you prefer a single room (45 euro per night), mark the box on
the registration form.
Participants can book a room through the organisation at
the reduced price of 89 euro per night plus 3,50 euro tourist tax in the
Sandton Hotel Eindhoven City Centre.
Breakfast is included. Please indicate arrival and departure dates on the
registration form.
For private bookings we suggest consulting the web pages of the
Tourist Information Eindhoven, Postbus
7, 5600 AA Eindhoven.
Lunches/dinner
On October 5 and 6 lunches are organised, free of charge for all
participants, if ordered on the registration form.
The conference dinner will be held on Tuesday, October 6. Non-invitees are
asked to pay 35 euro in cash on arrival (preferably the exact amount).
Please indicate your attendance on the registration form.
Contact
For more information please contact Mrs. Lucienne Coolen, workshop officer of
EURANDOM, at coolen'at'eurandom.tue.nl
Organisers
• Prof. P. L. Davies, University of Duisburg-Essen, Germany / Eindhoven
University of Technology, Eindhoven / EURANDOM, Eindhoven
(laurie.davies'at'uni-duisburg-essen.de)
• Prof. G. Jongbloed, University of Technology, Delft
(G.jongbloed'at'ewi.tudelft.nl)
YES-II on "High dimensional statistics", October 6-7-8, 2008
With special
thanks and acknowledgement for the contributions of the following
sponsors:
