ENBIS-17 in Naples

9–14 September 2017, Naples (Italy). Abstract submission: 21 November 2016 – 10 May 2017

My abstracts


The following abstracts have been accepted for this event:

  • Uncertainty and Sensitivity Analysis of Functional Risk Curves Based on Gaussian Processes

    Authors: Bertrand Iooss (EDF R&D), Loïc Le Gratiet (EDF R&D)
    Primary area of focus / application: Other: French SFdS session on Computer experiments and energy
    Keywords: Computer experiments, Gaussian process, Uncertainty, Metamodel, Probability of detection, Non destructive testing
    Submitted at 22-Feb-2017 18:47 by Bertrand Iooss
    12-Sep-2017 11:40 Uncertainty and Sensitivity Analysis of Functional Risk Curves Based on Gaussian Processes
    In industrial practice, the estimation of a functional risk curve (FRC) is often required as a quantitative measure of system safety. An FRC gives the probability of an undesirable event as a function of the value of a critical parameter of the physical system under consideration. Our work addresses the qualification of non-destructive examination processes, where the FRC corresponds to the probability of flaw detection curve. FRCs are also used in many other engineering frameworks, e.g. in seismic fragility assessment.

    The estimation of the FRC sometimes relies on deterministic phenomenological computer models which simulate complex physical phenomena. The uncertain input parameters of such a computer code are modeled as random variables. Standard uncertainty treatment techniques require many model evaluations, and a major algorithmic difficulty arises when the computer code under study is too CPU-time expensive to be used directly. For such models, one solution consists in replacing the numerical model by a mathematical approximation, called a response surface or a metamodel. In this communication, Gaussian process regression is used in the particular context where the FRC is the quantity of interest. We focus on the Gaussian process metamodel to build FRCs from numerical experiments, which also makes it possible to obtain confidence bands.

    In association with the estimation of this quantity of interest, a sensitivity analysis step is performed to determine the input parameters that most influence the model response. We propose new global sensitivity indices attached to the whole FRC and show how to compute them with a Gaussian process model.
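
    To make this concrete, here is a minimal Python sketch of the idea (not the authors' code): a Gaussian process is fitted to a small design of runs of a hypothetical, cheap stand-in simulator, and the FRC is then estimated by Monte Carlo on the metamodel. The toy simulator, the detection threshold and the input distribution are illustrative assumptions.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(0)

    def simulator(a, x):
        # Stand-in for the expensive computer code: the response grows with the
        # critical parameter a and is perturbed by the uncertain input x.
        return a + 0.5 * x + 0.1 * np.sin(3 * a)

    # Small design of experiments on (a, x); in practice a space-filling design.
    A = rng.uniform(0.0, 2.0, size=40)
    X = rng.normal(0.0, 1.0, size=40)
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(np.column_stack([A, X]), simulator(A, X))

    threshold = 1.5                      # undesirable-event threshold
    a_grid = np.linspace(0.0, 2.0, 50)   # values of the critical parameter
    x_mc = rng.normal(0.0, 1.0, 2000)    # Monte Carlo sample of the uncertain input

    # Plug-in FRC estimate: P(code output > threshold) as a function of a.
    # Confidence bands could be obtained by repeating this on GP trajectories
    # drawn with gp.sample_y instead of the predictive mean.
    frc = [np.mean(gp.predict(np.column_stack([np.full_like(x_mc, a), x_mc])) > threshold)
           for a in a_grid]
    print(np.round(frc[:5], 3))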
  • Probabilistic Modelling and Forecasting of the Wind Energy Resource at the Monthly to Seasonal Scale

    Authors: Peter Tankov (ENSAE ParisTech), Bastien Alonzo (Laboratoire de Météorologie Dynamique, CNRS/Ecole Polytechnique), Philippe Drobinski (Laboratoire de Météorologie Dynamique, CNRS/Ecole Polytechnique), Riwal Plougonven (Laboratoire de Météorologie Dynamique, CNRS/Ecole Polytechnique)
    Primary area of focus / application: Other: Modeling, forecasting and risk evaluation of wind energy production
    Keywords: Probabilistic forecasting of wind, Kernel regression, Single-index models
    Submitted at 22-Feb-2017 20:24 by Peter Tankov
    12-Sep-2017 10:50 Probabilistic Modelling and Forecasting of the Wind Energy Resource at the Monthly to Seasonal Scale
    We build and evaluate a probabilistic model designed for forecasting the distribution of the daily mean wind speed at the seasonal timescale. On such long-term timescales, the variability of the surface wind speed is strongly influenced by the large-scale situation of the atmosphere. Our aim is to predict the daily mean wind speed distribution at a specific location using information on the large-scale situation of the atmosphere, summarised by a single index. To this end, we first estimate the conditional probability density function of the wind speed given the index by Gaussian kernel regression over 20 years of daily data. We next use the ECMWF seasonal forecast ensemble to predict the index at the seasonal timescale. The ensemble forecast displays growing uncertainty with time, leading to wider confidence intervals predicted by the probabilistic model. We show that the model is sharper than the climatology at the horizon of one month, even though it displays a strong loss of precision after 15 days. Using the statistical post-processing method EMOS (ensemble model output statistics) to recalibrate the ensemble forecast leads to a further improvement of our probabilistic forecast, which then remains sharper than the climatology at the seasonal horizon.
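
    As an illustration of the first step, the following Python sketch estimates the conditional density of the daily mean wind speed given a single large-scale index by Gaussian kernel regression. The synthetic data, the bandwidths and the index value are placeholders, not the quantities used in the study.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    # Synthetic stand-in for 20 years of daily data: index z and wind speed w (m/s).
    n = 7300
    z = rng.normal(0.0, 1.0, n)
    w = np.abs(4.0 + 2.0 * z + rng.normal(0.0, 1.5, n))

    def conditional_density(w_grid, z0, h_z=0.3, h_w=0.8):
        """Nadaraya-Watson style estimate of the density of w given index = z0."""
        kz = norm.pdf((z - z0) / h_z)                        # weights from the index
        kw = norm.pdf((w_grid[:, None] - w[None, :]) / h_w) / h_w
        return (kw @ kz) / kz.sum()

    w_grid = np.linspace(0.0, 15.0, 151)
    dens = conditional_density(w_grid, z0=1.0)
    print(w_grid[np.argmax(dens)])   # most likely wind speed for that index value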
  • Bayesian Thinking in the Cosmetic Industry

    Authors: Philippe Bastien (L'Oréal R&D), Charles Gomes (L'Oréal R&D)
    Primary area of focus / application: Other: Statistics for cosmetics
    Keywords: Bayesian, Clinical, p-value, Prior, Power
    Submitted at 27-Feb-2017 10:55 by Philippe Bastien
    Accepted
    12-Sep-2017 18:20 Bayesian Thinking in the Cosmetic Industry
    Many scientific institutions have recently raised concerns about the reproducibility of scientific results. They have stressed the misuse of the p-value, which contributes to the number of research findings that cannot be reproduced. To address this issue, the American Statistical Association recommends supplementing p-values with other approaches, including Bayesian methods. This is currently in practice at L'Oréal for proof-of-concept studies and will be described based on results from a whitening study.
    This approach gives a greater role to expertise: the clinicians define performance classes for the parameters of interest, which takes better account of the effect size in the interpretation of the results. Determining the probability of belonging to each performance class provides much more information than the binary rule associated with the p-value. It also makes it possible to define efficiency profiles for the actives and to rank them accordingly. Moreover, the reproducibility issue can be addressed using Bayesian power to determine the probability that a future study will be successful, based on the results of a previous one. If success is associated with a significant comparison, this corresponds to the notion of Assurance, but success can be any other decision rule in a fully Bayesian approach. In the example shown, success will be defined by the probability of belonging to a performance class being greater than or equal to a specified threshold. The incorporation of prior information, whether from expert opinion or from historical data, will also be discussed.
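
    The sketch below illustrates, under a simple normal approximation, how performance-class probabilities and a Bayesian power (assurance) calculation can be obtained by simulation in Python. The effect sizes, class cut-offs and study parameters are invented for the example and are not taken from the study described above.

    import numpy as np

    rng = np.random.default_rng(2)

    # Posterior for the treatment effect after the first study (normal approximation).
    post_mean, post_sd = 1.2, 0.5
    draws = rng.normal(post_mean, post_sd, 100_000)

    # Performance classes defined with the clinicians (hypothetical cut-offs).
    classes = {"none": (-np.inf, 0.0), "moderate": (0.0, 1.0), "high": (1.0, np.inf)}
    for name, (lo, hi) in classes.items():
        print(name, np.mean((draws > lo) & (draws <= hi)))

    # Assurance: probability that a future study of n subjects with noise sigma
    # yields a significant result, averaging over the posterior of the true effect.
    n, sigma = 30, 1.5
    true_effect = rng.normal(post_mean, post_sd, 100_000)
    future_estimate = rng.normal(true_effect, sigma / np.sqrt(n))
    print("assurance:", np.mean(future_estimate - 1.96 * sigma / np.sqrt(n) > 0.0))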
  • Fast Clustering of Streaming Time Series Summarized by Histograms

    Authors: Antonio Balzanella (Università della Campania Luigi Vanvitelli), Rosanna Verde (Università della Campania Luigi Vanvitelli), Antonio Irpino (Università della Campania Luigi Vanvitelli)
    Primary area of focus / application: Other: Italian SIS session on Statistics in Data Science
    Keywords: Clustering, Data stream mining, Big Data, Distribution data analysis
    Submitted at 27-Feb-2017 12:54 by Antonio Balzanella
    12-Sep-2017 11:40 Fast Clustering of Streaming Time Series Summarized by Histograms
    This paper deals with the online clustering of multiple data streams. We assume that a sensor network is used to monitor a physical phenomenon over time. Each sensor performs repeated measurements at a very high frequency, so that it is not possible to store the whole data set on easily accessible media. We further assume that the monitored phenomenon is highly evolving. One can think, for instance, of temperature monitoring, seismic activity monitoring, or pollution monitoring.
    Our aim is to find groups of sensors which behave similarly over time.
    The proposed strategy consists of two phases: the online phase summarizes the incoming data, while the offline phase partitions the streams into clusters. In the online phase, the incoming observations are split into batches, and each subsequence in a batch is summarized by a histogram. A fast clustering algorithm is then run on the histograms to obtain a local partition of the data. The offline step finds a consensus partition starting from the local partitions of the data streams.
    Through an application on real data, we show the effectiveness of our strategy in finding homogeneous groups of data streams.
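
    A minimal Python sketch of the online step, under simplifying assumptions: each sensor's current batch is reduced to a vector of quantiles (a histogram-like summary) and the summaries are clustered with k-means, the Euclidean distance between quantile vectors playing the role of a distance between distributions. The data and all parameters are toy placeholders.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(3)

    n_sensors, batch_len = 12, 500
    # Toy batch: half the sensors are shifted to mimic two different behaviours.
    batch = rng.normal(0.0, 1.0, (n_sensors, batch_len))
    batch[6:] += 3.0

    probs = np.linspace(0.05, 0.95, 19)
    summaries = np.quantile(batch, probs, axis=1).T   # one quantile vector per sensor

    local_partition = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(summaries)
    print(local_partition)   # local clusters to be merged later by the consensus step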
  • Consistent Testing for Pairwise Dependence in Time Series

    Authors: Konstantinos Fokianos (University of Cyprus)
    Primary area of focus / application: Other: ASQ international journal session
    Keywords: Distance covariance, Empirical characteristic function, Generalized spectral density, Kernel, U-statistic, V-statistic, White noise
    Submitted at 1-Mar-2017 11:57 by Konstantinos Fokianos
    11-Sep-2017 12:30 Consistent Testing for Pairwise Dependence in Time Series
    We consider the problem of testing pairwise dependence for stationary time series. For this, we suggest the use of a Box-Ljung type test statistic formed by calculating the distance covariance function among pairs of observations. The distance covariance function is a suitable measure for detecting dependencies between observations, as it is based on the distance between the characteristic function of the joint distribution of the random variables and the product of the marginals. We show that, under the null hypothesis of independence and under mild regularity conditions, the test statistic converges to a normal random variable. The results are complemented by several examples. This is joint work with M. Pitsillou.
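
    For readers unfamiliar with the distance covariance, here is a self-contained Python sketch (not the authors' implementation) of the sample distance covariance between a series and its lagged values, accumulated over lags in a Ljung-Box style; the lag weights and the normalization required for the asymptotic distribution are omitted.

    import numpy as np

    def distance_covariance(x, y):
        """Squared sample distance covariance between two univariate samples."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        a = np.abs(x[:, None] - x[None, :])
        b = np.abs(y[:, None] - y[None, :])
        A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
        B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
        return (A * B).mean()

    rng = np.random.default_rng(4)
    x = rng.standard_normal(300)          # replace with the observed series

    n, max_lag = len(x), 10
    stat = sum((n - h) * distance_covariance(x[:-h], x[h:]) for h in range(1, max_lag + 1))
    print(stat)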
  • VAM, an R Package to Take into Account the Effect of Maintenance and Ageing

    Authors: Doyen Laurent (Univ. Grenoble Alpes)
    Primary area of focus / application: Other: Statistical analysis of industrial reliability and maintenance data
    Keywords: Imperfect maintenance, Imperfect repair, Repairable system reliability, Virtual age, Software
    Submitted at 1-Mar-2017 13:20 by Doyen Laurent
    12-Sep-2017 09:00 VAM, an R Package to Take into Account the Effect of Maintenance and Ageing
    Recurrent event data arise in many application fields, such as epidemiology (e.g. relapse times of a disease) and industry (e.g. repair times of a system), among others. To analyze such data, we must be able to take into account the effect of events on successive occurrence times. The presentation focuses on maintenance efficiency in a reliability context, but the approach applies equally to intervention and treatment efficiency in an epidemiological context. The basic assumptions are known as perfect maintenance or As Good As New (the system is renewed) and minimal maintenance or As Bad As Old (maintenance has no effect on future occurrence times). Obviously, reality falls between these two extreme cases, and an intermediate effect can be described with imperfect maintenance models.

    We will present, with practical examples, an introductory tutorial to our VAM software. VAM, for Virtual Age Models, is an open-source R package that implements the principal imperfect maintenance models. VAM usage is based on a formula which specifies the characteristics of the data set to be analyzed and the model used to analyze it. Thanks to this formula description, the package is adaptive: the formula is defined by the user and characterizes the behavior of the new unmaintained system; the types, effects and number of the different preventive and corrective maintenances; and how preventive maintenance times are planned. The package functionalities then make it possible to simulate new data sets, to estimate the model parameters by maximum likelihood, and to calculate and plot various indicators. These functions can in particular be used to implement Monte Carlo and bootstrap methods. A weakness of classical R code is that such computations can become quite long, because R is an interpreted rather than a compiled language. This is not the case for the VAM package, since it is mainly implemented in C++ thanks to the Rcpp package.
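
    To give a flavour of what virtual age models describe, the following Python sketch (deliberately not the VAM package, whose R interface is formula-based) simulates failure times under a Kijima type-I imperfect repair rule with a Weibull baseline; all parameter values are arbitrary.

    import numpy as np

    rng = np.random.default_rng(5)

    def simulate_failures(horizon, eta=1.0, beta=2.5, q=0.4):
        """Failure times on [0, horizon] under a Kijima type-I virtual age model."""
        cum_hazard = lambda t: (t / eta) ** beta
        inv_cum_hazard = lambda h: eta * h ** (1.0 / beta)
        t, v, times = 0.0, 0.0, []
        while True:
            u = rng.uniform()
            # Next inter-failure time, drawn from the Weibull baseline conditioned
            # on the current virtual age v (inverse cumulative hazard method).
            x = inv_cum_hazard(cum_hazard(v) - np.log(u)) - v
            t += x
            if t > horizon:
                return np.array(times)
            times.append(t)
            v += q * x   # imperfect repair: only a fraction q of the last age increment remains

    times = simulate_failures(5.0)
    print(len(times), np.round(times[:5], 3))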