ENBIS-14 in Linz
21 – 25 September 2014; Johannes Kepler University, Linz, Austria
Abstract submission: 23 January – 22 June 2014
The following abstracts have been accepted for this event:
Optimizing the Number of Tested Hypotheses Under Limited Computational Resources
Authors: Andreas Futschik (Johannes Kepler University)
Primary area of focus / application: Reliability
Keywords: Simulation, Hypothesis testing, Optimization
Submitted at 28-May-2014 15:26 by Andreas Futschik
We consider a multiple hypothesis testing framework when the overall number of observations that can be collected is large but limited by computational constraints. A natural question in this context is whether the number of hypotheses to be tested should be limited in favor of additional observations per considered hypothesis. We provide guidelines concerning the choice of an optimum number of considered hypotheses in common testing situations. Thinking of correctly rejected null hypotheses as interesting findings, our optimization is with respect to the expected number of correct rejections while controlling for the multiple testing error. We also briefly discuss the classification setting, where a linear combination of true and false positives is considered. We demonstrate that considering an appropriate number of hypotheses in this context can lead to a substantial increase in the expected number of correct rejections.
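The trade-off described in this abstract can be illustrated with a minimal sketch. It is not the author's procedure: it assumes one-sided z-tests, a Bonferroni correction, a common standardized effect size for all false nulls, a fixed fraction of false nulls, and an equal split of the observation budget across hypotheses. The function names and all parameter values (`effect`, `frac_alt`, `alpha`) are hypothetical.

```python
from statistics import NormalDist


def expected_correct_rejections(m, budget, effect=0.5, frac_alt=0.1, alpha=0.05):
    """Expected number of correctly rejected nulls when testing m hypotheses.

    Assumptions (illustrative only): the budget is split equally, each test is
    a one-sided z-test at Bonferroni level alpha / m, and a fraction frac_alt
    of the hypotheses are false nulls with the same standardized effect.
    """
    n = budget // m  # observations available per hypothesis
    if n < 1:
        return 0.0
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / m)
    power = 1 - std_normal.cdf(critical - effect * n ** 0.5)
    return m * frac_alt * power


def best_number_of_hypotheses(budget, max_m, **kwargs):
    """Grid search for the m that maximizes the expected correct rejections."""
    return max(range(1, max_m + 1),
               key=lambda m: expected_correct_rejections(m, budget, **kwargs))


if __name__ == "__main__":
    budget = 10_000
    m_best = best_number_of_hypotheses(budget, max_m=2_000)
    print(m_best, expected_correct_rejections(m_best, budget))
```

Under these assumptions the maximizer lies strictly between the two extremes: testing a single hypothesis wastes observations on power that is already close to one, while testing as many hypotheses as possible leaves too few observations per test to survive the multiplicity correction.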
Two-Stage Sampling Plans for Auditing and High-Quality Product Inspection
Authors: Rainer Göb (University of Würzburg)
Primary area of focus / application: Quality
Secondary area of focus / application: Business
Keywords: Two-stage sampling, Proportion nonconforming, Prior information, Confidence interval
Submitted at 29-May-2014 01:40 by Rainer Göb
DoE Approached Differently: Making Every Experiment Count
Authors: Peter van Bruggen (Unilever R&D)
Primary area of focus / application: Design and analysis of experiments
Secondary area of focus / application: Education & Thinking
Keywords: Design of Experiments, JMP, Plyos, Unilever, Research and development, Software tool
Submitted at 29-May-2014 14:55 by P.C. van Bruggen
There are many tools available that can help in setting up and analysing DoEs. However, these tools have some major drawbacks: 1) there is an overwhelming number of possible DoEs from which the user needs to choose, 2) the level of detail needed to fill in the necessary information is high, and 3) the results of the statistical analysis of the measurements can be difficult to understand. Without proper guidance, an incorrect design could easily be chosen or results could be interpreted erroneously, and either issue could make the difference between success and failure.
Instead of training all scientists in the basics of statistics and DoE, Unilever R&D has chosen to deploy a tool in which these difficulties have been overcome. This tool – called “Plyos” – was developed in-house and runs in JMP. First it creates an experimental design based on information provided by the user (responses, factors and some other common details). The user is guided through this process with extended help. Then it analyses the data from the DoE-based experiments. At the same time the tool gives detailed explanations of how the statistical techniques and the results should be interpreted.
The presentation highlights the ways of working, the advantages and the limitations of the tool.
Two Layers Statistical Meta-Classifier: A Python Package
Authors: Delio Panaro (University of Genoa), Andrea Olivari (University of Genoa), Eva Riccomagno (University of Genoa)
Primary area of focus / application: Mining
Keywords: Statistical meta-classifier, Supervised learning, Skewed data, Python
Submitted at 29-May-2014 15:54 by Delio Panaro
The Python package contains two modules: train and predict. The main module is train; it requires as input a dataset organized in an array, a vector of corresponding labels and a floating point number t in the open interval ]0.5,1[. It returns a trained simple classifier for each feature of the dataset, a trained Adaboost based on more than one feature, and the meta-classifier architecture. In the current beta version, the simple classifiers are all from the same family, chosen by the user among standard classifiers such as Support Vector Machine, Decision Tree or Stochastic Gradient Descent. The architecture is based on the performance of the simple classifiers: weak classifiers performing better than the threshold t are allocated to both the first layer and the Adaboost, whereas the others are used only in the Adaboost, or discarded if they perform poorly. The output of train can optionally be hidden. It consists of two vectors: the first has length equal to the number of features, and each of its entries takes one of three values according to how the corresponding feature has been allocated in the architecture. The second vector contains the estimates of the Adaboost weights for the relevant features.
The module predict requires as input a datapoint in the form of a vector and returns the predicted label. The datapoint is processed by the first layer which, according to a majority rule, either returns a final hypothesis (predicted label) or sends the datapoint to the second layer, where it is processed by the Adaboost, which returns the final hypothesis.
The meta-classifier has been tested on datasets from three very different classification problems and results will be presented together with performance analyses.
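The two-layer architecture described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the package's code or API: single-feature decision stumps stand in for the user-chosen simple classifiers, performance is measured as training accuracy, features are discarded when accuracy does not exceed 0.5, and the first layer decides only when at least a share `min_agreement` of its votes agree. The helper names, the `rounds` and `min_agreement` parameters, and both cut-offs are assumptions; the abstract does not specify them.

```python
import math


def fit_stump(X, y, j, w):
    """Best (weighted error, threshold, polarity) on feature j; labels are -1/+1."""
    best = (float("inf"), 0.0, 1)
    for thr in sorted({row[j] for row in X}):
        for pol in (1, -1):
            err = sum(wi for row, yi, wi in zip(X, y, w)
                      if (pol if row[j] >= thr else -pol) != yi)
            if err < best[0]:
                best = (err, thr, pol)
    return best


def stump_predict(stump, j, x):
    _, thr, pol = stump
    return pol if x[j] >= thr else -pol


def train(X, y, t, rounds=10):
    n, d = len(X), len(X[0])
    uniform = [1.0 / n] * n
    # One simple classifier per feature; with uniform weights the weighted
    # error equals the training error rate.
    stumps = [fit_stump(X, y, j, uniform) for j in range(d)]
    allocation = []
    for err, _, _ in stumps:
        acc = 1.0 - err
        allocation.append("both" if acc > t else "boost" if acc > 0.5 else "discarded")
    first_layer = [j for j in range(d) if allocation[j] == "both"]
    pool = [j for j in range(d) if allocation[j] != "discarded"]
    # Second layer: AdaBoost over stumps restricted to the retained features.
    w, ensemble = uniform[:], []
    for _ in range(rounds):
        if not pool:
            break
        j, stump = min(((j, fit_stump(X, y, j, w)) for j in pool),
                       key=lambda pair: pair[1][0])
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        if err >= 0.5:
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, j, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return {"stumps": stumps, "allocation": allocation,
            "first_layer": first_layer, "ensemble": ensemble}


def predict(model, x, min_agreement=0.75):
    votes = [stump_predict(model["stumps"][j], j, x) for j in model["first_layer"]]
    if votes:
        share = votes.count(1) / len(votes)
        if share >= min_agreement:
            return 1
        if 1 - share >= min_agreement:
            return -1
    # No sufficiently clear majority: defer to the boosted second layer.
    score = sum(alpha * stump_predict(stump, j, x)
                for alpha, j, stump in model["ensemble"])
    return 1 if score >= 0 else -1
```

In this sketch, `model["allocation"]` plays the role of the first output vector (one of three values per feature) and the `alpha` values stored in `model["ensemble"]` play the role of the boosting weights.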
A Comparative Study of Different Methodologies for Supervised Fault Diagnosis in Multivariate Statistical Process Control
Authors: Santiago Vidal-Puig (UPV (Valencia)), Alberto Ferrer (UPV (Valencia)), Raffaele Vitale (UPV (Valencia))
Primary area of focus / application: Process
Keywords: Fault diagnosis, Statistical process control, Supervised methods, Latent-Variable based methods
Submitted at 29-May-2014 16:03 by Santiago Vidal
An Enhanced Procedure for Kriging-Based Adaptive Sampling
Authors: Daniele Romano (Dipartimento di Ingegneria Meccanica, Chimica e dei Materiali, Università di Cagliari), Rocco Ascione (ENEA)
Primary area of focus / application: Design and analysis of experiments
Secondary area of focus / application: Modelling
Keywords: Adaptive sampling, Sequential experiments, Kriging models, Optimization, Engineering
Submitted at 29-May-2014 19:41 by Daniele Romano
However, a different option would be to start with a random or a Latin hypercube (LH) sample (sample one), followed by an adaptive sample (sample two) in which units are taken sequentially with the aim of optimizing an objective function. This is a composite sampling scheme which can significantly improve the trade-off between sample size and the information collected.
The core of the method is to drive the next-site selection in sample two by a sequence of kriging models, namely stationary Gaussian stochastic processes with a given autocorrelation structure. The distinctive merit of such models is their ability to promptly reconfigure themselves, changing the pattern of predictions and prediction uncertainty each time a new measurement comes in. The next sampling site can be selected via a number of model-based criteria, inspired by the principles of reducing prediction uncertainty or optimizing an objective function, or a combination of the two. Needless to say, adaptive kriging sampling can be regarded as a model-based optimizer.
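The composite scheme described in this abstract can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the authors' enhanced procedure: the kriging model uses a Gaussian correlation function with a fixed parameter `theta`, unit process variance and a plug-in sample mean (no hyperparameter estimation), and the next site minimizes a lower confidence bound, which is one possible criterion combining predicted value and prediction uncertainty. All function names and the `kappa`, `theta` and `nugget` values are hypothetical.

```python
import math


def corr(a, b, theta):
    return math.exp(-theta * (a - b) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def krige(xs, ys, x_new, theta=10.0, nugget=1e-6):
    """Kriging prediction (mean, variance) at x_new, unit process variance."""
    K = [[corr(a, b, theta) + (nugget if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [corr(a, x_new, theta) for a in xs]
    weights = solve(K, k)  # K^{-1} k
    mu = sum(ys) / len(ys)  # plug-in constant mean (a simplification)
    mean = mu + sum(wi * (yi - mu) for wi, yi in zip(weights, ys))
    var = max(0.0, 1.0 - sum(wi * ki for wi, ki in zip(weights, k)))
    return mean, var


def adaptive_sample(f, initial, n_adaptive, grid, kappa=2.0):
    """Sample one is `initial`; sample two adds n_adaptive sites sequentially."""
    xs = list(initial)
    ys = [f(x) for x in xs]
    for _ in range(n_adaptive):
        # Skip sites already measured so the correlation matrix stays regular.
        candidates = [g for g in grid if all(abs(g - x) > 1e-9 for x in xs)]

        def lower_bound(g):
            mean, var = krige(xs, ys, g)  # model is refitted with all data so far
            return mean - kappa * math.sqrt(var)

        x_next = min(candidates, key=lower_bound)
        xs.append(x_next)
        ys.append(f(x_next))
    return xs, ys


if __name__ == "__main__":
    grid = [i / 100 for i in range(101)]
    xs, ys = adaptive_sample(lambda x: (x - 0.3) ** 2, [0.0, 0.5, 1.0], 8, grid)
    print(sorted(xs), min(ys))
```

Setting `kappa` to a large value makes the selection lean toward reducing prediction uncertainty, while `kappa = 0` makes it purely objective-driven; this is only one way to combine the two principles mentioned in the abstract.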