
European Institute for Statistics, Probability, Stochastic Operations Research
and their Applications



October 24-25-26, 2016

Workshop on

Data Driven Operations Management

 

part of

STOCHASTIC ACTIVITY MONTH

Data Driven Operations Management

 


 

SUMMARY


We identify a noticeable paradigm shift in operations management towards the integration of statistics and operations management. This shift can be attributed mainly to technological developments such as the Internet of Things, sensor technology, and social networks. Thanks to these developments, all kinds of data become widely available and can be exploited to support better strategic, tactical, and operational decisions. The workshop will focus on the following practical application areas of Data-Driven Operations Management: behavioral operations management, production, maintenance and inventory management, retail, health-care operations, road traffic and communication networks, logistics, and revenue management.
The aim of the workshop is to identify the relevant methodological needs within the interface of statistics and operations management and to bridge the corresponding scientific communities. Our focus will be on answering questions such as:
-What are the new performance evaluation and decision problems that arise from the use of the extra data?
-How can these problems be solved?
-What is the value of the extra data?
-Do we need new methodologies to study these problems, e.g., methods that combine statistics and operations research?
-How do we interpret decisions based on statistical data, e.g., how do we deal with the risk involved, and is there a need for behavioral operations research/management?
 



 

ORGANISERS

Alp Akcay TU Eindhoven
Jasper Goseling University of Twente
Bernd Heidergott VU Amsterdam
Stella Kapodistria TU Eindhoven

 

 

SPONSORS

The organizers acknowledge the financial support/sponsorship of:

 

 

 

4TU | STAR | Data Science Center, TU/e | NETWORKS | TKI Dinalog

 

 

 

LIST OF SPEAKERS

KEYNOTE SPEAKERS

Alexander Goldenshluger University of Haifa
Georg Pflug Universität Wien
George Shanthikumar Purdue University

INVITED SPEAKERS

Ivo Adan TU Eindhoven
Gah-Yi Ban London Business School
Mohsen Bayati Stanford University
Arnoud den Boer University of Amsterdam
Richard Boucherie University of Twente
Rommert Dekker Erasmus University Rotterdam
Moshe Haviv Hebrew University of Jerusalem
Nathan Kallus Cornell Tech
Michael Katehakis Rutgers University
Ger Koole VU Amsterdam
Dimitrios Mavroeidis Philips
Nazanin Nooraee TU Eindhoven
Ohad Perry Northwestern University
Kalyan Talluri Imperial College London
Evrim Ursavas Groningen University
Jelle de Vries VU Amsterdam
Spyros Zoumpoulis INSEAD

 

CALL FOR POSTERS

We are pleased to announce a Call for Posters for the Data-Driven Operations Management Workshop, jointly organized by Eurandom and the Beta Research School. We invite PhD students, postdocs and, in general, young researchers to submit an abstract for the poster session before October 15, 2016, by sending an email to ddom@tue.nl with the header “Submission of poster abstract for the Data-Driven Operations Management Workshop”. A poster abstract must be in plain text and no longer than 500 words, not including bibliographic references. At least one author of each accepted poster is required to register and attend the workshop. Posters should preferably be of size A0 in portrait orientation (84.1 x 118.9 cm, or 33.1 x 46.8 inches).

 

PROGRAMME 

MONDAY OCTOBER 24

09.45 - 10.30 Registration and coffee/tea    
10.30 - 10.45 Welcome / opening Geert-Jan van Houtum  
10.45 - 11.15   Arnoud den Boer Decision-based model selection
11.15 - 11.45   Spyros Zoumpoulis Customizing Marketing Decisions Using Field Experiments
11.45 - 12.15   Dimitrios Mavroeidis Predictive Maintenance
12.15 - 13.45 Lunch    
13.45 - 15.15   George Shanthikumar A Framework for Prescriptive Empirical Operations Management
15.15 - 15.45 Break    
15.45 - 16.15   Ohad Perry Service Systems with Dependent Service and Patience Times
16.15 - 16.45   Evrim Ursavas Locating LNG Bunkering Stations
16.45 - 17.15   Jelle de Vries Determinants of Safe and Productive Driving: Empirical Evidence from Long-haul Cargo Transport

 

TUESDAY OCTOBER 25

09.00 - 09.30   Ger Koole Data analysis and validation of call center models
09.30 - 10.00   Gah-Yi Ban The data-driven (s, S) policy: why you can have confidence in censored demand data
10.00 - 10.30   Mohsen Bayati Online Decision-making with High Dimensional Covariates
10.30 - 11.00 Break    
11.00 - 12.30   Alexander Goldenshluger Statistical inference for the M/G/infinity queue
12.30 - 14.00 Lunch    
14.00 - 14.30   Richard Boucherie Operations research solutions to improve the quality of healthcare
14.30 - 15.00   Moshe Haviv Queueing paradoxes
15.00 - 15.30   Michael Katehakis Simple Data Driven Policies for MDPs
15.30 - 17.30 Break    
15.30 - 17.30   Poster session  
18.30 - Conference dinner  

 

WEDNESDAY OCTOBER 26

09.30 - 10.00   Ivo Adan Big data in daily manufacturing operations
10.00 - 10.30   Nazanin Nooraee S-curve Prediction Models for High-Frequent Wind Turbine Data
10.30 - 11.00 Break    
11.00 - 12.30   Georg Pflug Decision making under uncertainty: data-driven modeling
12.30 - 14.00 Lunch    
14.00 - 14.30   Rommert Dekker Big Data in Shipping
14.30 - 15.00   Nathan Kallus Dynamic Assortment Personalization in High Dimensions
15.00 - 15.30   Kalyan Talluri Facility Location Decisions from Public Data
15.30 - 17.00 Drinks    
15.30 - 17.00   Panel Discussion & Closing  

 



ABSTRACTS

Ivo Adan

Big data in daily manufacturing operations

In this talk we present a project in semiconductor manufacturing, showing how real-time production data can be exploited to improve operational performance.

PRESENTATION


Gah-Yi Ban

The data-driven (s, S) policy: why you can have confidence in censored demand data

We revisit the classical dynamic inventory management problem of Scarf (1959) from a distribution-free, data-driven perspective. We propose a nonparametric estimation procedure for the optimal (s, S) policy that is asymptotically optimal and derive asymptotic confidence intervals around the estimated (s, S) levels. We further consider the setting in which at least some of the demand data are censored due to the absence of backlogging. We show that the intuitive procedure of correcting for censoring in the demand data directly yields an inconsistent estimate. We then show how to correctly use the censored data to obtain consistent decisions and derive confidence intervals for this policy. Surprisingly, under some conditions, estimated ordering decisions with censored demand data may have smaller variability and mean squared error (MSE) than with fully uncensored data. We thus arrive at the remarkable result that a decision maker with fully uncensored data can add artificial demand data to improve the estimation of the (s, S) policy. We provide a prescription for the optimal amount of artificial data to add.
(link to paper)
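
As a rough illustration of what a data-driven (s, S) computation can look like, the Python sketch below bootstraps demand paths from a fully observed (uncensored) demand sample and grid-searches for the (s, S) pair with the lowest simulated average cost. The cost parameters and the synthetic demand sample are placeholders, and this naive plug-in, sample-average approach is not the estimator (nor the confidence-interval construction) of the talk.

import numpy as np

rng = np.random.default_rng(0)

def average_sS_cost(s, S, demand_paths, K=100.0, h=1.0, p=9.0):
    # Average cost per period of an (s, S) policy over sampled demand paths:
    # zero lead time, full backlogging, fixed order cost K, holding h, backorder p.
    costs = []
    for path in demand_paths:
        inv, total = S, 0.0
        for d in path:
            if inv <= s:            # review: order up to S
                total += K
                inv = S
            inv -= d                # demand is realised
            total += h * max(inv, 0.0) + p * max(-inv, 0.0)
        costs.append(total / len(path))
    return float(np.mean(costs))

# Placeholder for the decision maker's fully observed demand sample.
demand_sample = rng.gamma(shape=2.0, scale=5.0, size=200)

# Bootstrap demand paths from the empirical distribution (plug-in step).
paths = rng.choice(demand_sample, size=(100, 30))

# Grid search over candidate (s, S) pairs with s < S.
grid = [(s, S) for s in range(0, 40, 4) for S in range(s + 4, 80, 4)]
best = min(grid, key=lambda sS: average_sS_cost(sS[0], sS[1], paths))
print("plug-in (s, S) estimate:", best)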


Mohsen Bayati

Online Decision-making with High Dimensional Covariates

Growing availability of data has enabled decision-makers to tailor choices at the individual level. This involves learning a model of decision rewards conditional on individual-specific covariates or features. Recently, "contextual bandits" have been introduced as a framework to study these online decision-making problems. However, when the space of features is high-dimensional, the existing literature only considers situations where features are generated in an adversarial fashion, which leads to highly conservative performance guarantees -- regret bounds that scale with the square root of the number of samples.
Motivated by medical decision-making problems where stochastic features are more realistic, we introduce a new algorithm that relies on two sequentially updated LASSO estimators. One estimator (with low bias) is used to select a candidate subset of the decisions; then a more biased (but potentially more accurate) estimator is used to select the optimal decision. We prove that our algorithm achieves a regret that scales poly-logarithmically in the number of samples and features. The key step in our analysis is proving a new oracle inequality that guarantees the convergence of the LASSO estimator despite the non-i.i.d. data induced by the bandit policy.
We illustrate the practical relevance of the proposed algorithm by evaluating it on a warfarin dosing problem. A patient's optimal warfarin dosage depends on the patient's genetic profile and medical records; an incorrect initial dosage may result in adverse consequences such as stroke or bleeding. We show that our algorithm outperforms existing bandit methods, as well as physicians, in correctly dosing the majority of patients.
(joint work with Hamsa Bastani)
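
A heavily simplified Python sketch of the two-estimator idea is given below: a low-bias estimator fitted on forced (random) samples screens a candidate set of arms, and an all-sample estimator then selects within that set. The dimensions, regularization levels, forced-sampling schedule and synthetic rewards are arbitrary placeholders; the actual algorithm and its regret analysis are as described in the talk.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, K, T = 30, 4, 600                         # feature dimension, arms, horizon (toy sizes)
beta = np.zeros((K, d))
beta[:, :4] = rng.normal(size=(K, 4))        # sparse, unknown true arm parameters

forced_X = [[] for _ in range(K)]; forced_y = [[] for _ in range(K)]
all_X = [[] for _ in range(K)];    all_y = [[] for _ in range(K)]

def lasso_coef(X, y, alpha):
    # Lasso coefficient vector, or zeros while there is too little data to fit.
    if len(y) < 5:
        return np.zeros(d)
    return Lasso(alpha=alpha, max_iter=5000).fit(np.asarray(X), np.asarray(y)).coef_

for t in range(T):
    x = rng.normal(size=d)                   # covariates of the newly arriving individual
    if t < 10 * K or t % 25 == 0:            # forced (random) exploration rounds
        arm, forced = t % K, True
    else:
        # Stage 1: low-bias forced-sample estimates screen a candidate set of arms.
        scr = np.array([lasso_coef(forced_X[k], forced_y[k], 0.10) @ x for k in range(K)])
        cand = np.flatnonzero(scr >= scr.max() - 0.5)
        # Stage 2: all-sample estimates (biased by adaptive sampling, but using more data)
        # pick the decision within the candidate set.
        sel = np.array([lasso_coef(all_X[k], all_y[k], 0.05) @ x for k in cand])
        arm, forced = int(cand[np.argmax(sel)]), False
    reward = beta[arm] @ x + rng.normal(scale=0.5)
    if forced:
        forced_X[arm].append(x); forced_y[arm].append(reward)
    all_X[arm].append(x); all_y[arm].append(reward)

print("pulls per arm:", [len(y) for y in all_y])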


Arnoud den Boer

Decision-based model selection

In optimization problems, simple mathematical models that discard important factors may sometimes be preferred to more realistic models. This may occur if the parameters of the simple model are easier to estimate than the parameters of the complex model, or if the optimization problem corresponding to the simple model can be solved exactly whereas the optimization problem corresponding to the "realistic model" is intractable. This trade-off between three sources of errors (modeling, estimation, and optimization errors) is encountered in many stochastic optimization problems.
The question we address is: how can one determine if it is better to use a simplified model, rather than a more realistic model? In other words: given a data set and a particular optimization problem, how do we know whether the model-misspecification error induced by a simple model is dominated by estimation and optimization errors corresponding to more realistic models? In this research we propose a generic decision-based model selection method that determines when simplicity is preferred to realism. We explain the theoretical framework of our method, and illustrate the potential performance improvement in assortment optimization and newsvendor problems.

PRESENTATION
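
The principle of selecting a model by the out-of-sample performance of the decision it induces can be illustrated with a toy newsvendor example in Python. The simple model below is a one-parameter exponential fit and the "realistic" alternative is the empirical distribution; the cross-validation scheme and the synthetic demand data are only meant to convey the idea and are not the selection method proposed in the talk.

import numpy as np

rng = np.random.default_rng(2)

# Newsvendor with unit cost c and price r; the optimal order quantity is the
# demand quantile at level (r - c) / r.
c, r = 3.0, 10.0
level = (r - c) / r

def profit(q, demand):
    return r * np.minimum(q, demand) - c * q

# Historical demand sample (the true distribution is unknown to the decision maker).
data = rng.lognormal(mean=2.0, sigma=0.6, size=60)

# Candidate "models": a simple one-parameter exponential fit versus the
# empirical distribution; each induces an order-quantity decision.
def decision_simple(sample):
    return -np.mean(sample) * np.log(1.0 - level)        # exponential quantile

def decision_empirical(sample):
    return np.quantile(sample, level)

# Decision-based selection: keep the model whose decision earns the higher
# out-of-sample profit under cross-validation.
def cv_profit(decide, sample, folds=5):
    idx = np.array_split(rng.permutation(len(sample)), folds)
    scores = []
    for hold in idx:
        train = np.setdiff1d(np.arange(len(sample)), hold)
        scores.append(profit(decide(sample[train]), sample[hold]).mean())
    return float(np.mean(scores))

for name, decide in [("simple (exponential)", decision_simple),
                     ("realistic (empirical)", decision_empirical)]:
    print(name, round(cv_profit(decide, data), 2))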


Richard Boucherie

Operations research solutions to improve the quality of healthcare

CHOIR: Center for Healthcare Operations Improvement & Research

Healthcare expenditures are increasing in many countries. Delivering adequate quality of healthcare requires efficient utilization of resources. Operations Research allows us to maintain or increase the current quality of healthcare for a growing number of patients without increasing the required work force. In this talk, I will describe a series of mathematical results obtained in the Center for Healthcare Operations Improvement and Research of the University of Twente, and I will indicate how these results were implemented in Dutch hospitals. 

Efficient planning of operating theatres reduces wasted staff hours, balancing the number of patients in wards reduces peaks and therefore increases the efficiency of nursing care, and efficient rostering of staff allows more work to be done by the same number of people. While employing operations research techniques may seem to be solely about improving efficiency, improved efficiency also leads to increased job satisfaction, since experienced workload is often dominated by those moments at which the work pressure is very high, and it improves patient safety, since errors due to peak workload are avoided.

PRESENTATION


Rommert Dekker

Big Data in Shipping

Quite recently, ships have become obliged to send information on their status (position, speed) to a general platform. Public access to this so-called AIS data has subsequently been ensured, so that everybody can see which ships are around (except for small ships). One would expect the availability of such an amount of real-time data to generate a wealth of applications, but to date there are only a few applications of these data. In this talk we will analyse the role of these data in facilitating applications, and which essential data are still lacking. We will first focus on arrivals and departures of ships in ports, next we will consider the analysis of ship delays and their use in timetables, and finally we will present some AIS applications to date.


Alexander Goldenshluger

Statistical inference for the M/G/infinity queue

The subject of this talk is statistical inference on the service time distribution and its functionals in the M/G/infinity queue. In particular, we will discuss three different observation schemes with incomplete data on the queue: observations of arrivals and departures without identification of customers, observations of the superposed arrival-departure point process, and observations of the queue-length (number-of-busy-servers) process. In these settings we derive some probabilistic results on the processes involved and construct estimators of the service time distribution with provable accuracy guarantees. The problems of estimating the service time expectation and the arrival rate are discussed as well. We will also present some results on the comparison of different estimators.

PRESENTATION
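
For the simplest of these functionals, the service time expectation, a plug-in estimate from queue-length observations follows from the fact that the stationary number of busy servers in an M/G/infinity queue is Poisson with mean λE[S]. The Python sketch below simulates such a queue with a hypothetical lognormal service distribution and recovers E[S] from sampled queue lengths; the distribution estimators of the talk are considerably more involved.

import numpy as np

rng = np.random.default_rng(3)

# Simulate an M/G/infinity queue: Poisson arrivals with rate lam and
# (hypothetical) lognormal service times.
lam, horizon = 2.0, 5000.0
n = rng.poisson(lam * horizon)
arrivals = rng.uniform(0.0, horizon, size=n)
departures = arrivals + rng.lognormal(mean=0.0, sigma=1.0, size=n)

# Observe only the queue-length (number-of-busy-servers) process at regular
# epochs, discarding a warm-up period.
epochs = np.arange(100.0, horizon, 1.0)
L = np.array([np.sum((arrivals <= t) & (departures > t)) for t in epochs])

# In steady state the number of busy servers is Poisson with mean lam * E[S],
# so a plug-in estimate of the mean service time is mean(L) / lam.
print("estimated E[S]:", round(L.mean() / lam, 3))
print("true E[S]:     ", round(float(np.exp(0.5)), 3))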


Moshe Haviv

Queueing paradoxes

The talk will survey three paradoxes and one anti-paradox emerging in queueing systems. The paradoxes are Braess, Downs-Thomson and Jevons. The anti-paradox is that the other line is not that short after all.

PRESENTATION


Nathan Kallus

Dynamic Assortment Personalization in High Dimensions

We demonstrate the importance of structural priors for effective, efficient large scale dynamic assortment personalization. Assortment personalization is the problem of choosing a best assortment of products, ads, or other offerings (items) to specifically target a particular individual or consumer segment (type). This is a central problem in revenue management for e-commerce, online advertising, and multi-location brick-and-mortar retail, where both types and items number in the thousands to millions. Efficient use of data is critical in this large-scale setting, as the number of interactions with customers is limited – definitely not in the trillions – so it is infeasible to learn each one's preferences independently. Furthermore, learning preferences is not enough: the goal of personalization is revenue, not mere knowledge. In dynamic assortment personalization, the retailer chooses assortments to learn preferences and optimize revenue simultaneously.
We formulate the dynamic assortment personalization problem as a discrete-contextual bandit with m contexts, where type (individual or segment identity) is the context and assortments of n items are the arms. We assume that each type's preferences follow a simple parametric model with n parameters (multinomial logit); in all, there are mn parameters. Existing literature suggests that, over T interactions, order optimal regret is mnlog(T), which is infeasibly large in most e-commerce settings.
In this paper, we impose natural structure – a small latent dimension for the parameter space, or low rank on the matrix of mn parameters – on the problem. This structure reduces the number of parameters to estimate and the regret incurred in estimating them. In the static setting, we show that this model can be efficiently learned from surprisingly few interactions. We propose an efficient algorithm to learn the model that converges globally whenever the model is learnable. In the dynamic setting, we show that structure-aware dynamic assortment personalization can have regret that is an order of magnitude smaller than structure-ignorant approaches.
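
The role of the low-rank structure can be illustrated with a small Python sketch: utilities of m types for n items are parameterized as Theta = U V^T with a latent dimension r much smaller than min(m, n), and the expected revenue of an assortment follows from the multinomial logit choice probabilities. The prices, dimensions and the revenue-ordered search below are illustrative placeholders, not the learning algorithm of the paper.

import numpy as np

rng = np.random.default_rng(4)

m, n, r = 1000, 200, 5                   # types, items, latent dimension (toy sizes)
U = rng.normal(size=(m, r))
V = rng.normal(size=(n, r))
Theta = U @ V.T                          # low-rank matrix of MNL utilities: m*n entries,
                                         # but only (m + n) * r free parameters
prices = rng.uniform(1.0, 10.0, size=n)

def expected_revenue(i, assortment):
    # Expected revenue of offering `assortment` (array of item indices) to type i
    # under a multinomial logit model; the no-purchase option has weight 1.
    w = np.exp(Theta[i, assortment])
    probs = w / (1.0 + w.sum())
    return float(probs @ prices[assortment])

def revenue_ordered_assortment(i, max_size=10):
    # Try assortments consisting of the k highest-priced items, k = 1..max_size.
    order = np.argsort(-prices)
    return max((order[:k] for k in range(1, max_size + 1)),
               key=lambda a: expected_revenue(i, a))

a = revenue_ordered_assortment(0)
print("assortment for type 0:", a, "expected revenue:", round(expected_revenue(0, a), 2))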


Michael Katehakis

Simple Data Driven Policies for MDPs

Markov decision processes (MDPs) have a variety of applications, not just in the classical OR fields but also in other areas such as computer science, engineering, and the biological/medical sciences. We first give a brief survey of the state of the art in computing optimal data-driven (adaptive) policies for MDPs with unknown rewards and/or transition probabilities.
Then, we present three simple algorithms for adaptively optimizing the average reward in an unknown irreducible MDP. The first algorithm uses estimates for the MDP and chooses actions by maximizing an inflation of the estimated right-hand side of the average reward optimality equations. The second is based on estimating the optimal rates at which actions should be taken, and the third is based on generalized Thompson sampling ideas. For the first, we show that the total expected reward obtained by this algorithm up to time n is within O(ln n) of the reward of the optimal policy, and in fact achieves asymptotically minimal regret. Various computational challenges and simplifications are discussed.
(joint work with Wesley Cowan, PhD, Rutgers University) 
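
As a toy illustration of the Thompson-sampling flavour of the third approach, the Python sketch below maintains Dirichlet counts over transitions and running reward means, repeatedly samples a plausible MDP from these statistics, and acts greedily with respect to it. It uses discounted value iteration as a stand-in solver and makes no claim to the average-reward optimality equations or the regret guarantees discussed in the talk; all parameters are placeholders.

import numpy as np

rng = np.random.default_rng(5)
S, A, gamma = 5, 3, 0.95                 # toy state/action space, discount factor

# Unknown true MDP, used here only to generate observations.
P_true = rng.dirichlet(np.ones(S), size=(S, A))
R_true = rng.uniform(0.0, 1.0, size=(S, A))

# Posterior-style statistics: Dirichlet counts for transitions, running reward means.
trans_counts = np.ones((S, A, S))
rew_sum, rew_n = np.zeros((S, A)), np.ones((S, A))

def greedy_policy(P, R, iters=200):
    # Discounted value iteration (a stand-in solver for the sampled MDP).
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * (P @ V)
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

state = 0
for episode in range(50):
    # Thompson step: sample a plausible MDP from the current statistics and solve it.
    P_hat = np.array([[rng.dirichlet(trans_counts[s, a]) for a in range(A)] for s in range(S)])
    R_hat = rng.normal(rew_sum / rew_n, 1.0 / np.sqrt(rew_n))
    policy = greedy_policy(P_hat, R_hat)
    # Act greedily with respect to the sampled MDP for a short episode, then update.
    for _ in range(20):
        a = int(policy[state])
        nxt = rng.choice(S, p=P_true[state, a])
        reward = R_true[state, a] + rng.normal(scale=0.1)
        trans_counts[state, a, nxt] += 1
        rew_sum[state, a] += reward
        rew_n[state, a] += 1
        state = nxt

P_mean = trans_counts / trans_counts.sum(axis=-1, keepdims=True)
print("learned greedy policy:", greedy_policy(P_mean, rew_sum / rew_n))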


Ger Koole

Data analysis and validation of call center models

Queueing models are widely used for scheduling purposes, but they are never validated.
We analyzed data from a Dutch call center and validated some common modeling assumptions, such as Poisson arrivals and exponential talk times.
We also compared the overall performance with models of different levels of modelling precision.
(joint work with Sihan Ding, April Li, Rob vd Mei and Raik Stolletz)

PRESENTATION
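
Two of the checks mentioned above can be done in a few lines of Python: an index-of-dispersion check for Poisson arrivals and a Kolmogorov-Smirnov comparison of the talk times with a fitted exponential distribution. The data below are synthetic placeholders for the call-center logs, and the KS p-value is only approximate because the exponential mean is estimated from the same data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Synthetic placeholders for one hour of call-center logs:
# arrival epochs (seconds) and talk times (seconds).
arrivals = np.sort(rng.uniform(0.0, 3600.0, size=500))
talk_times = rng.gamma(shape=1.2, scale=150.0, size=500)

# Check 1 -- Poisson arrivals: counts in disjoint one-minute intervals should
# have variance roughly equal to their mean (index of dispersion close to 1).
counts, _ = np.histogram(arrivals, bins=60, range=(0.0, 3600.0))
print("index of dispersion:", round(counts.var(ddof=1) / counts.mean(), 2))

# Check 2 -- exponential talk times: KS distance to an exponential with the
# fitted mean; the p-value is approximate because the mean is estimated from
# the same data (a Lilliefors correction or bootstrap would refine it).
ks = stats.kstest(talk_times, "expon", args=(0.0, talk_times.mean()))
print("KS statistic:", round(ks.statistic, 3), "approx. p-value:", round(ks.pvalue, 3))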


Dimitrios Mavroeidis

Predictive Maintenance

Predictive maintenance algorithms aim to identify early signs of deteriorating equipment condition, allowing for timely scheduling of maintenance visits and thus preventing unplanned downtime and customer inconvenience. The importance of this problem for several industries has led to the development of various predictive maintenance techniques in the fields of statistics, operations research and machine learning.
A big obstacle commonly faced by researchers is related to the limited availability of real-world use-cases and data to validate the existing techniques and to develop novel approaches. In this talk we will present a predictive maintenance use-case related to Philips medical devices, highlighting the specific use-case challenges as well as the solutions that have been developed.


Nazanin Nooraee

S-curve Prediction Models for High-Frequent Wind Turbine Data

Technological developments support the collection and storage of huge amounts of data. Given these data, scientists can investigate what information they contain and try to explain what has happened or predict what will happen in the future. However, drawing inferences and making reliable conclusions, even from large data sets, often requires advanced statistical models.
In this talk, we discuss and compare various parametric and non-parametric statistical models for an industrial application with high-frequency data. More precisely, we are interested in modeling the relation between wind turbine power and wind speed. In order to model the wind turbine power curve, which has an S-curve shape, we applied different kinds of 5-parameter logistic functions and smoothing functions to one season of data collected at ten-minute intervals. The models were evaluated with the mean square error and compared to a theoretical minimum. Additionally, the model predictions were validated on different time windows.
(joint work with Stella Kapodistria and Sándor Kolumbán (TU/e))
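
One common 5-parameter logistic parameterization and a least-squares fit can be sketched in Python as follows; the functional form, the synthetic ten-minute data and the in-sample MSE are illustrative assumptions rather than the exact models compared in the talk.

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(7)

def logistic5(v, a, d, c, b, g):
    # One common 5-parameter logistic form: lower asymptote a, upper asymptote d,
    # scale c, slope b and asymmetry g.
    return d + (a - d) / (1.0 + (v / c) ** b) ** g

# Synthetic placeholder for one season of ten-minute wind data:
# wind speed (m/s) versus produced power (kW).
v = rng.uniform(0.5, 25.0, size=2000)
power = logistic5(v, 0.0, 2000.0, 9.0, 6.0, 1.0) + rng.normal(scale=50.0, size=v.size)

# Least-squares fit of the S-curve and in-sample mean squared error.
p0 = [0.0, power.max(), np.median(v), 5.0, 1.0]
params, _ = curve_fit(logistic5, v, power, p0=p0, maxfev=20000)
mse = np.mean((power - logistic5(v, *params)) ** 2)
print("fitted parameters:", np.round(params, 2))
print("in-sample MSE:", round(float(mse), 1))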


Ohad Perry

Service Systems with Dependent Service and Patience Times

Most queueing models for service systems in the literature are assumed to have independent primitive processes (arrival, service, abandonment, etc.). However, data shows that the patience of a customer may depend on that customer’s service requirement. In this talk we study the impacts that such a dependency has on key performance measures (waiting times, queue length, proportion of abandonment and throughput), and on optimal capacity decisions. In particular, we consider a system with a single pool of many statistically-homogeneous agents serving one class of statistically-identical customers whose service requirements and patience times are dependent. Since the assumed dependence structure renders exact analysis intractable, we propose a fluid approximation which is characterized via the entire joint distribution of the service and patience times. To evaluate the effects of the dependence, we employ bivariate dependence orders and copulas, and provide structural results which facilitate revenue optimization when a staffing cost is incurred. Simulation experiments demonstrate that our fluid approximation is accurate and effective.
(joint work with Allen Wu and Achal Bassamboo, Northwestern University)
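
A minimal way to generate dependent service and patience times is via a Gaussian copula with exponential marginals, as in the Python sketch below. The marginal means, the correlation and the fixed virtual waiting time are placeholder values; the point of the example is that dependence leaves the marginal abandonment probability of this crude check unchanged but shifts the service-time distribution of the customers who end up being served, which hints at why the performance measures above are affected.

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)

def dependent_service_patience(n, rho, mean_service=5.0, mean_patience=8.0):
    # Draw (service, patience) pairs with exponential marginals linked by a
    # Gaussian copula with correlation parameter rho.
    z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
    u = stats.norm.cdf(z)                      # dependent uniform marginals
    service = stats.expon.ppf(u[:, 0], scale=mean_service)
    patience = stats.expon.ppf(u[:, 1], scale=mean_patience)
    return service, patience

# Crude check for a fixed virtual waiting time w: the abandonment probability
# depends only on the patience marginal, but the service times of the customers
# who are actually served shift with the dependence.
w = 4.0
for rho in (0.0, 0.6):
    service, patience = dependent_service_patience(100_000, rho)
    print(f"rho={rho}: P(abandon)={np.mean(patience < w):.3f}, "
          f"mean service of served={service[patience >= w].mean():.2f}")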


Georg Pflug

Decision making under uncertainty: data-driven modeling

As we understand it, data-driven modeling means that a minimum of assumptions (or no assumptions at all) are made a priori and we let the observed data "speak for themselves". To put it differently, the data-driven approach should be a nonparametric one.
In this talk, we consider a multistage stochastic decision model as it appears in portfolio optimization, option pricing, hydrostorage management, energy trading etc. The uncertain parameters of such a model form a (multivariate) stochastic process, for which typically a parametric form is assumed (in the simplest case such as in the Black-Scholes formula just with one single parameter).
For the numerical calculation of optimal decisions, a discretization (a scenario tree or a scenario lattice) is needed and can be found by approximation techniques. We briefly review these techniques, but put the emphasis on nonparametric scenario generation. While large data sets are of course desirable for the model quality, the technique can also handle quite small data sets. The same nonparametric technique also allows large scenario trees to be reduced in size while keeping much of their stochastic properties, as we demonstrate with examples.
In addition, we consider the problem of model misspecification. By using a nonparametric notion of model distance, we are able to define decision strategies that are robust against small model misspecifications. A relation between the size of the data set and the model ambiguity can be given by recent inequalities due to Villani and others. An illustrative example from hydrostorage management is given.

PRESENTATION
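
A simple nonparametric reduction of a large set of observed scenarios, in the spirit of (but much cruder than) the techniques reviewed in the talk, is to cluster the historical paths and weight each representative by the share of paths it absorbs. The Python sketch below does this with k-means on synthetic paths; building a proper scenario tree or lattice additionally requires stagewise (conditional) clustering, which is omitted here.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)

# Placeholder for a large set of observed scenario paths (e.g. price or inflow
# trajectories over 12 stages).
scenarios = rng.lognormal(mean=0.0, sigma=0.3, size=(5000, 12)).cumsum(axis=1)

# Nonparametric reduction: cluster the paths and keep one representative per
# cluster, weighted by the fraction of historical paths it represents.
k = 25
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scenarios)
representatives = km.cluster_centers_
probabilities = np.bincount(km.labels_, minlength=k) / len(scenarios)

print("reduced scenario set:", representatives.shape)
print("probabilities sum to:", probabilities.sum())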


George Shanthikumar

A Framework for Prescriptive Empirical Operations Management

We provide a framework for prescriptive empirical modeling with specific attention to overcoming structural and statistical errors. This is achieved through operational statistics and objective operational learning, which are built on the basis of data integration and cross validation. We will illustrate how regularization in sample approximation approaches and data-driven robust optimization with cross validation relate to operational objectives and operational statistics. We will also illustrate how data-driven modeling using data mining and econometric modeling with machine learning can be used here.
(joint work with F. Qi)


Kalyan Talluri

Facility Location Decisions from Public Data

There has been a tremendous increase in the size and availability of public data, yet it is not clear how a firm might put it to good use. Analytical studies rarely seem to go beyond summary statistics and attractive visualizations. In this paper we present an application based only on publicly available data, in which a restaurant chain makes a facility-location decision: where to locate a new restaurant, of what type, and in which price range. We combine Yelp review data sets with demographic, geographic and restaurant inspection data to build a model of demand, and use it to formulate an optimization problem that recommends the top k locations.
(joint work with Muge Tekin (Universitat Pompeu Fabra, Barcelona))


Evrim Ursavas

Locating LNG Bunkering Stations

The growing awareness of the environment and new regulations of the International Maritime Organization and the European Union have forced ship-owners to reduce pollution. Liquefied natural gas (LNG) is one of the most promising options for accomplishing this reduction for inland waterways and short sea shipping. However, an extensive LNG infrastructure has not yet been developed due to low commercial demand, and demand is still low due to the absence of LNG fuel facilities. To overcome this problem, refuelling facilities need to be located at strategic locations, allowing for alternative investment structures. To this end, we develop mathematical models that determine locations for refuelling stations, where terminal-to-ship and truck-to-ship bunkering alternatives are analyzed. We consider the characteristics of an LNG network, such as the boil-off effect (from loading and storage) and dynamic demand, by accounting for factors like capacity. We consider cases where capacity expansion and reduction is beneficial. We develop a lower-bound heuristic based on a Lagrangian relaxation technique, and derive bounds by employing a heuristic to obtain feasible solutions from the Lagrangian solution. We perform our experiments for the waterway network in the Arnhem-Nijmegen region in the Netherlands. Extensive experiments are provided to study possible future scenarios.


Jelle de Vries

Determinants of Safe and Productive Driving: Empirical Evidence from Long-haul Cargo Transport

Using GPS data of 370 long-haul trips in India, survey data of 49 drivers, and ERP data, this study examines the role of driver personality characteristics in predicting risky and productive driving. The results show that more conscientious drivers display more risky driving behavior. More extravert drivers are less productive, whereas driver safety consciousness positively relates to productivity. These results can serve as a starting point for further studies into how long-haul transport companies may use individual driver characteristics in their training and selection procedures to meet operational safety and productivity objectives.
(joint work with René de Koster, Debjit Roy, Serge Rijsdijk)

PRESENTATION


Spyros Zoumpoulis

Customizing Marketing Decisions Using Field Experiments

We investigate how firms can use the results of field experiments to optimize promotion assignments across customers. We evaluate seven widely used segmentation methods using a series of two large-scale field experiments. The first field experiment is used to generate a common pool of training data for each of the seven methods. We then validate the seven optimized policies provided by each method, together with uniform benchmark policies, in a second field experiment. The findings reveal that model-driven methods (Lasso regression and Finite Mixture Models) performed the best. Some distance-driven methods also performed well (particularly k-Nearest Neighbors). However, the classification methods we tested performed relatively poorly. The precision of the data varied with the level of aggregation, and the model-driven methods made the best use of the information when it was more precise. The model-driven methods also performed best in parts of the parameter space that are well represented in the training data. The relative performance of the methods is robust to modest differences in the settings used to create the training and validation data. We explain the poor performance of the classification methods in our setting and describe how it could be improved.

PRESENTATION

 



 

 

 

PRACTICAL INFORMATION

      Venue

Eurandom, Mathematics and Computer Science Dept, TU Eindhoven,

De Groene Loper 5, 5612 AZ  EINDHOVEN,  The Netherlands

Eurandom is located on the campus of Eindhoven University of Technology, in the Metaforum building (4th floor) (about the building). The university is located at 10 minutes walking distance from Eindhoven main railway station (take the exit north side and walk towards the tall building on the right with the sign TU/e).
Accessibility TU/e campus and map.

 

      Registration


Registration is free, but compulsory for ALL speakers and participants. This link will redirect you to the Eventbrite website, where you will find the REGISTRATION FORM.

 

 

      Accommodation / Funding

Hotel accommodation will be booked for all keynote and invited speakers. Please give your arrival and departure dates on the registration form.

Other participants have to make their own arrangements.

For hotels around the university, please see: Hotels (please note: prices listed are "best available"). 

More hotel options can be found on the webpages of the Tourist Information Eindhoven, Postbus 7, 5600 AA Eindhoven.

 

      Travel

For those arriving by plane, there is a convenient direct train connection between Amsterdam Schiphol Airport and Eindhoven. The trip takes about one and a half hours. For more detailed information, please consult the NS travel information pages or see the Eurandom web page on the location.

Many low-cost carriers also fly to Eindhoven Airport. There is a bus connection (bus route 401) from the airport to Eindhoven central railway station. For details on departure times, consult http://www.9292ov.nl.

The university can be reached easily by car from the highways leading to Eindhoven (for details, see our route descriptions or consult our map with highway connections).

 

      Conference facilities : Conference room, Metaforum Building  MF11&12

The meeting-room is equipped with a data projector, an overhead projector, a projection screen and a blackboard. Please note that speakers and participants making an oral presentation are kindly requested to bring their own laptop or their presentation on a memory stick.

 

      Conference Secretariat

Upon arrival, participants should register with the workshop officer, and collect their name badges. The workshop officer will be present for the duration of the conference, taking care of the administrative aspects and the day-to-day running of the conference: registration, issuing certificates and receipts, etc.

 

      Cancellation

Should you need to cancel your participation, please contact Patty Koorn, the Workshop Officer.

 

      Contact

Mrs. Patty Koorn, Workshop Officer, Eurandom/TU Eindhoven, koorn@eurandom.tue.nl

 

 

 

 


 

 

        


 P.O. Box 513, 5600 MB Eindhoven, The Netherlands
tel. +31 40 2478100  
  e-mail: info@eurandom.tue.nl