
Informal Meetings Statisticians and Probabilists

From September 2011, the Informal Meetings will be integrated into the new STO(chastics) Seminar Series.

Contact: A. Di Bucchianico, P. de Andrade Serra & R. Castro

Name and Affiliation, Title
Kees van Hee (TU/e)
Discovering Characteristics of Stochastic Collections of Process Models
Process models in organizational collections are
typically modelled by the same team and using the same conventions. As such,
these models share many characteristic features like size range, type and
frequency of errors. In most cases merely small samples of these collections
are available due to e.g. the sensitive information they contain. Because of
their sizes, these samples may not provide an accurate representation of the
characteristics of the originating collection. This paper deals with the
problem of constructing collections of process models, in the form of Petri
nets, from small samples of a collection for accurate estimations of the
characteristics of this collection. Given a small sample of process models
drawn from a real-life collection, we mine a set of generation parameters
that we use to generate arbitrarily large collections that feature the same
characteristics as the original collection. In this way we can estimate the
characteristics of the original collection on the generated collections. We
extensively evaluate the quality of our technique on various sample datasets
drawn from both research and industry.

Rui Castro (TU/e)
Sudoku and Sinkhorn Balancing
In the informal spirit of the meetings I am going to give a
(rather informal) talk on Sudoku puzzles: most people are quite familiar with
these popular puzzles, which are similar to the classical Latin squares problem.
There are many computational methods to find the solution of these puzzles (many
involve combinatorial searches). In this talk I'll describe a very simple method
that is based on relaxation of the discrete constraints, and uses a simple
adaptation of Sinkhorn balancing to find a solution. Sinkhorn balancing is an
iterative technique for transforming a matrix with positive entries into a
doubly stochastic matrix (one whose rows and columns sum to one). In this talk I
will discuss some properties of Sinkhorn balancing and show how it can be used
to solve (almost) any Sudoku puzzle in a very simple way.

Ludwina Hobma (DAF Trucks)
Test Time Reduction of DAF Engines Using Six Sigma
After a short introduction to Six Sigma as a project management methodology, the challenging problem statement will be introduced. It will be shown how mathematical statistics helps a production company establish results. Furthermore, it will be shown where the theory of the statistician fails with respect to daily practice. Daily practice involves instability and outliers, as well as convincing management and ensuring implementability. Please support me with new insights to meet the challenge!

Alessandro Di Bucchianico (TU/e)
An estimation problem in software reliability
Stochastic models are being used in software testing to support decision making, e.g. by predicting the additional testing effort required to achieve a certain quality level. Most models used in practice can be described as an order statistics process or a nonhomogeneous Poisson process. In this talk we will give a brief introduction to these models and point out some issues in using them. In particular, we will state an estimation problem that arises from the practical interpretation of applying these models to software testing. This problem is either ignored or given an ad-hoc solution. The audience is invited to contribute to this estimation problem.

Francesca Nardi (TU/e)
Metastability for Kawasaki dynamics at low temperature with two types of particles
We study a two-dimensional lattice gas consisting of
two types of particles subject to Kawasaki dynamics at low temperature in a
large finite box with an open boundary. Each pair of particles occupying
neighboring sites has a negative binding energy provided their types are
different, while each particle has a positive activation energy that depends
on its type. There is no binding energy between neighboring particles of the
same type. We start the dynamics from the empty box and compute the
transition time to the full box. This transition is triggered by a critical
droplet appearing somewhere in the box.

Carlo Lancia (Universita Tor Vergata (Rome))
Entropy-driven cutoff phenomena
Birth-and-death Markov chains exhibit a sharp cutoff in their convergence to equilibrium if suitable drift conditions are imposed on the transition rates. The cutoff behavior appears to be closely related to the fact that the stationary distribution is mostly concentrated on a region A whose diameter is much smaller than the size of the state space. The cutoff time is then understood to be the effective amount of time necessary to reach A. The aim of this work is to extend this picture to the apparently dissimilar case of Markov chains with a highly symmetric state space, for which the equilibrium measure is uniform. In fact, if it is possible to project the state space onto equivalence classes such that the entropy of the system is highly concentrated on a few of them, the behavior of the lumped chain will be analogous to that of a birth-and-death process, with the role of the stationary distribution played by the entropy. I will review some applications of this result.

Botond Szabo (TU/e - EURANDOM)
Empirical Bayes Method
The Bernstein-von Mises Theorem says that in parametric models, under
some regularity conditions, the posterior mass contracts around the
true parameter θ0 at the optimal frequentist rate,
independently of the choice of the prior distribution. In the
nonparametric case it has been shown that, with a bad choice of the prior
distribution, the posterior distribution does not contract at all, or,
even if it contracts around the true θ0, the contraction
rate is slower than the optimal frequentist rate. An attempt to
solve this problem is to work with a family of prior distributions
instead of a single one. The question then arises of how to choose the
optimal prior distribution out of the family of distributions in the
Bayes method. One solution is to put a hyperprior on the family of prior
distributions and work with this two-level, hierarchical prior
distribution. A more practical approach is to choose the optimal prior
distribution by an empirical method; this is called the
empirical Bayes method.

Johan Lukkien (TU/e, Computer Science, System Architecture and Networks)
Vehicle to vehicle communication
We will present work we did in the area of vehicle to vehicle communication for the purpose of early warnings. This communication comprises periodic broadcasting using the WAVE small messaging protocol as part of the IEEE 802.11p wireless communication standard. This standard is based on CSMA/CA. Simulations show unbalanced loss behavior when vehicle density increases.

Bart Janssen (TU/e)
Fitting linear regression models with Zernike polynomials
Zernike polynomials are an important tool in optics. Recently they appeared in various industrial projects carried out by Bart and colleagues. Bart will briefly mention the industrial projects in which Zernike polynomials appeared, discuss the background of Zernike polynomials and show R packages developed by him to fit linear regression models with Zernike polynomials.
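
As an illustration of what fitting a linear regression model with Zernike polynomials can look like, here is a minimal Python sketch. It is not the speaker's R package: the function names (`zernike_radial`, `zernike`, `zernike_indices`, `design_matrix`, `fit_zernike`), the unnormalized convention for the polynomials and the ordering of the terms are all assumptions made for this example. The idea is simply to evaluate the polynomials up to a chosen order at the measurement points on the unit disk and to estimate their coefficients by ordinary least squares.

```python
import math
import numpy as np


def zernike_radial(n, m, r):
    """Radial polynomial R_n^m(r), for 0 <= m <= n with n - m even."""
    r = np.asarray(r, dtype=float)
    total = np.zeros_like(r)
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * math.factorial(n - k)
                 / (math.factorial(k)
                    * math.factorial((n + m) // 2 - k)
                    * math.factorial((n - m) // 2 - k)))
        total += coeff * r ** (n - 2 * k)
    return total


def zernike(n, m, r, theta):
    """Unnormalized Zernike polynomial Z_n^m in polar coordinates.

    Assumed convention: cosine terms for m >= 0, sine terms for m < 0.
    """
    radial = zernike_radial(n, abs(m), r)
    theta = np.asarray(theta, dtype=float)
    if m >= 0:
        return radial * np.cos(m * theta)
    return radial * np.sin(-m * theta)


def zernike_indices(max_order):
    """All (n, m) pairs with n <= max_order (ordering is an assumption)."""
    return [(n, m) for n in range(max_order + 1) for m in range(-n, n + 1, 2)]


def design_matrix(r, theta, max_order):
    """One column per Zernike polynomial, one row per observation."""
    columns = [zernike(n, m, r, theta) for n, m in zernike_indices(max_order)]
    return np.column_stack(columns)


def fit_zernike(r, theta, y, max_order):
    """Ordinary least-squares estimate of the Zernike coefficients."""
    X = design_matrix(r, theta, max_order)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Simulated data on the unit disk (hypothetical, for illustration only).
    rng = np.random.default_rng(1)
    r = np.sqrt(rng.uniform(0.0, 1.0, 500))
    theta = rng.uniform(0.0, 2.0 * np.pi, 500)
    y = 0.5 * zernike(2, 0, r, theta) + rng.normal(0.0, 0.05, 500)
    for (n, m), c in zip(zernike_indices(2), fit_zernike(r, theta, y, 2)):
        print(f"Z_{n}^{m}: {c:+.3f}")
```

Because the model is linear in the coefficients, the usual regression machinery (standard errors, tests for dropping terms) applies to the design matrix built here.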

Paulo de Andrade Serra (TU/e - EURANDOM), January 18, 2011
M-Estimation of the Period of a Cyclical Non-homogeneous Poisson Process
We present the construction of a (semi-parametric) M-estimator for the period of a non-homogeneous Poisson process with a periodic intensity function. We address the identifiability of this parameter and the consistency and rate of convergence of the estimator, and discuss the severity of the conditions under which these results hold, making some connections with ergodic theory. Some simulations will be shown to illustrate the workings of the estimator. We also make a short comparison with estimators for the period proposed in the past. Further, we take a brief look at ongoing work to develop an iterative procedure for improving the rate of convergence of the estimator based on estimating large multiples of the period; if successful, this procedure will allow us to obtain convergence rates arbitrarily close to the optimal rate, which is known to be n^{3/2}.

Subhasis Ghoshal (NCSU), December 21, 2010
Reference Prior for Large Parameter Spaces
The idea of a reference prior is a key concept in objective Bayesian
analysis, originally introduced by Bernardo and further developed by Berger,
Bernardo and many others. A reference prior is the result of an asymptotic
maximization of the expected relative entropy distance between the posterior
and the prior. In the absence of nuisance parameters, the procedure leads to
Jeffreys' prior, but other priors emerge if nuisance parameters are present.
Posterior asymptotic normality plays a key role in the asymptotic expansion
of the expected relative entropy. In this talk, we investigate to what
extent the asymptotic expansion of relative entropy remains valid when the
dimension of the parameter space in an exponential family increases to
infinity with the sample size. We quantify the allowable rate of growth of
the dimension in terms of certain characteristics of the model and the
prior. We specifically discuss three examples: the independent normal
location model, the multinomial model and the Dirichlet model. We find explicit
growth rates in each model. We further explore ideas for extending the
notion of reference prior beyond parametric models. A popular approach is to
consider a finite series approximation of a function of interest and induce
a prior through the coefficients. Our results can be potentially applied in
this setting. We shall discuss some partial results for density estimation
using a spline basis.

Rui Castro (TU/e)
Signal processing, learning theory and statistics
My research interests are on the borderline of signal processing, learning theory and statistics. One of my main research focuses is active learning techniques, also known as sequential experimental design. These include learning/sampling procedures that are able to use information gleaned from previous samples to adapt the sampling procedure. Applications include, among others, network monitoring and measurement and effective spectrum analysis methods for opportunistic transmission in cognitive radio.

Kees van Hee and Natalia Sidorova (TU/e, Computer Science), September 21, 2010
How large should a log file be to discover its process model?
Petri nets are a formalism for
describing processes with concurrency and synchronization. They are used in
workflow management systems to control business processes. These systems
store all events in a so-called log file. Now suppose we have a log file;
the question is then whether we can reconstruct the Petri net that produced it. This
is a hot topic in computer science called process mining, a
special branch of data mining. Actually this can be seen as a statistical
estimation problem where the log is the set of observations and the Petri
net the parameter to be estimated. In particular it is interesting to know
how large the log file (i.e. the set of observations) should be in order to
limit the probability of making a wrong decision. The existing process
mining techniques do not cover these statistical aspects yet.

Yvo Pokern (University College London), June 29, 2010
Nonparametric Drift Estimation for Stochastic Differential Equations
For scalar stochastic differential equations on the
circle and the real line, a Bayesian estimator for the drift function based
on observing a sample path over a finite time interval is constructed using
Gaussian priors. We specify the Gaussian priors through their precision
operators which are assumed to be given by differential operators.

Marios Pavlides (Frederick University, Nicosia, Cyprus), June 29, 2010
Two Statistical Vignettes: Simpson's Paradox and Shaved Dice

Marie-Colette van Lieshout (CWI/TUe), June 15, 2010
Image Segmentation by Polygonal Markov Fields
We discuss the use of polygonal Markov fields for model-based image segmentation. The formal construction of consistent multi-coloured polygonal Markov fields by Arak-Clifford-Surgailis and its dynamic representation are recalled and adapted. We then formulate image segmentation as a statistical estimation problem for a Gibbsian modification of an underlying polygonal Markov field, and discuss the choice of Hamiltonian. Monte Carlo techniques for estimating the model parameters and for finding the optimal partition of the image are developed. We shall also discuss a class of Markov random fields that can be understood as discrete versions of polygonal fields. The analogy with continuum polygonal Markov fields is exploited to define Hamiltonians such that desirable properties of these processes can be carried over to the discrete context. Moreover, the analogy gives rise to new attractive sampling schemes complementing the usual local Gibbs and Metropolis methods employed for Gibbs fields on finite graphs.

Alessandro Di Bucchianico (TU/e), June 8, 2010
A problem on performing ROC (Receiver Operating Characteristic) analyses for three instead of two outcomes

Birgit Witte (Delft University of Technology), April 20, 2010
Consistent Estimators in the Current Status Continuous Mark Model
We consider the problem of estimating the joint
distribution function of the event time and a continuous mark variable.
However, the event time is not observed directly but is subject to interval
censoring case 1, and the continuous mark variable is observed only if
the event occurred before the time of inspection. A natural estimator for the
distribution function is the nonparametric maximum likelihood estimator (MLE).

Last updated 14-10-11
P.O. Box 513, 5600 MB Eindhoven, The Netherlands