ENBIS-18 in Nancy

2 – 6 September 2018; École des Mines, Nancy (France). Abstract submission: 20 December 2017 – 4 June 2018

My abstracts


The following abstracts have been accepted for this event:

  • Tracking Changes on Insurance Industry of Saudi Arabia along the Past Decade. Is it Expanding?

    Authors: Igor Barahona (Autonomous University of Mexico, Institute of Mathematics), Tarifa Almulhim (School of Business, King Faisal University)
    Primary area of focus / application: Finance
    Secondary area of focus / application: Economics
    Keywords: Insurance industry, Multivariate statistics, Deep learning
    Submitted at 23-Mar-2018 19:14 by Igor Barahona
    Accepted
    3-Sep-2018 14:00 Tracking Changes on Insurance Industry of Saudi Arabia along the Past Decade. Is it Expanding?
    The insurance industry has shown remarkable growth during the last two decades, yet relatively little attention has been paid to studies that describe in detail how this industry has expanded in developing countries in recent years. Whereas most studies available in the literature investigate variations in the insurance industry in an aggregated way, our approach focuses on providing a more detailed description of this growth. Specifically, we investigate six indicators measured for nine market niches over the last twelve years. Multivariate statistical methods, such as principal components analysis, correspondence analysis and correlation matrices, allow us to disclose more precisely how the insurance industry in Saudi Arabia, an oil-rich country, evolved over this period. Subsequently, Keras, a framework for defining and training deep-learning models, is applied to estimate the industry's values for coming years. Both descriptive and forecasting analyses are performed on annual data from the Saudi Arabian Monetary Authority (SAMA) for the period 2005-2016.

    Our preliminary results show an important difference between the retention ratio and the rest of the indicators. Penetration and density, on the other hand, show a strong correlation with each other.

    Finally, we propose that this work may be helpful for decision makers, government, customers and others connected with the insurance industry in Saudi Arabia. Given that acquiring accurate data in the country is currently a complex task, this work intends to make a significant contribution in that direction.
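
    As a hedged illustration of the pipeline this abstract describes (descriptive multivariate analysis followed by Keras-based forecasting), the sketch below assumes a hypothetical file indicators.csv with one row per year (2005-2016) and one column per indicator; it is not the authors' code.

    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler
    from tensorflow import keras

    # Hypothetical input: rows = years 2005-2016, columns = indicators/niches.
    df = pd.read_csv("indicators.csv", index_col="year")

    # Descriptive stage: correlation matrix and principal components analysis.
    corr = df.corr()                               # e.g. penetration vs. density
    z = StandardScaler().fit_transform(df)
    pc_scores = PCA(n_components=2).fit_transform(z)

    # Forecasting stage: a deliberately small Keras model predicting next
    # year's indicator values from the current year's values.
    X, y = df.values[:-1], df.values[1:]
    model = keras.Sequential([
        keras.layers.Input(shape=(X.shape[1],)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(y.shape[1]),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=200, verbose=0)
    forecast = model.predict(df.values[-1:])       # estimate for the next year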
  • Curve Linear Regression with clr

    Authors: Amandine Pierrot (EDF R&D), Yannig Goude (EDF R&D), Qiwei Yao (London School of Economics)
    Primary area of focus / application: Modelling
    Keywords: Curve regression, Functional time series, Dimension reduction, Load forecasting, R software
    Submitted at 26-Mar-2018 11:30 by Amandine Pierrot
    Accepted
    4-Sep-2018 15:20 Curve Linear Regression with clr
    We present a new R package for curve linear regression: the clr package.
    This package implements a new methodology for linear regression with both curve response and curve regressors, which is described in Cho et al. (2013), "Modelling and forecasting electricity load curves: a hybrid approach", and in Cho et al. (2015), "Modelling and forecasting electricity load via curve linear regression".
    The key idea behind this methodology is dimension reduction based on a singular value decomposition in a Hilbert space, which reduces the curve regression problem to several scalar linear regression problems.
    We apply curve linear regression with clr to model and forecast daily electricity loads from Great Britain.
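
    A minimal sketch of the key idea on synthetic data (it illustrates the SVD-based dimension reduction, not the clr package's actual interface):

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, q, K = 200, 48, 48, 3        # n curve pairs; p, q grid points; K components
    X = rng.normal(size=(n, p))        # discretised regressor curves (synthetic)
    Y = X @ (rng.normal(size=(p, q)) / p) + 0.1 * rng.normal(size=(n, q))

    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc / n)   # SVD of the sample cross-covariance

    # Project both sets of curves onto the K leading singular directions...
    u = Xc @ U[:, :K]                  # scalar regressor scores (n x K)
    v = Yc @ Vt[:K].T                  # scalar response scores (n x K)

    # ...then fit K simple scalar regressions through the origin, one per pair.
    beta = (u * v).sum(0) / (u ** 2).sum(0)

    # Rebuild the predicted response curves from the predicted scores.
    Y_hat = (u * beta) @ Vt[:K] + Y.mean(0)
    print(np.mean((Y - Y_hat) ** 2))   # in-sample reconstruction error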
  • Bayesian Models for Evaluation of Risks in Conformity Assessment of Multicomponent Materials or Objects

    Authors: Francesca Pennecchi (INRiM, Torino), Ilya Kuselman (Independent Consultant on Metrology, Modiin), Ricardo J. N. B. da Silva (Centro de Química Estrutural, University of Lisboa), Brynn Hibbert (School of Chemistry, Sydney)
    Primary area of focus / application: Other: Special session on Metrology
    Keywords: Conformity assessment, Risk of false decision, Measurement uncertainty, Multicomponent material, Bayesian model
    Submitted at 26-Mar-2018 14:53 by Francesca Pennecchi
    Accepted
    4-Sep-2018 10:10 Bayesian Models for Evaluation of Risks in Conformity Assessment of Multicomponent Materials or Objects
    Documents providing guidance for assessing conformity of an item (entity, object or system) with respect to specified requirements have been published in recent years. The widely known document JCGM 106 [1] offers a Bayesian approach for evaluating risks of false decisions in conformity assessment, taking into account measurement uncertainty. The probability of accepting the item when it should have been rejected is named the 'consumer's risk', whereas the probability of falsely rejecting the item is the 'producer's risk'. For a given tested item, such risks are referred to as the 'specific consumer's risk' and 'specific producer's risk', respectively. When the item is considered as randomly drawn from a statistical population of such items, the corresponding risks are named the 'global consumer's risk' and 'global producer's risk', since they characterize the item production globally.

    When multicomponent materials, such as medications, alloys, food and clinical samples, or environmental compartments (e.g. ambient air), undergo conformity assessment, even if the assessment is successful for each component of the material batch or lot, the total probability of a false decision (total consumer’s risk or total producer’s risk), concerning the batch or lot as a whole, might still be significant. Modelling of such scenarios is important for understanding conformity assessment risks in customs control, clinical analysis, pharmaceutical industry, environmental control and other fields.

    In the IUPAC Project [2], Bayesian models of total risk evaluation are formulated for both cases of independence and correlation of the involved variables (component concentrations and the corresponding test results). In the former case, based on the law of total probability, it was shown that total risks can be evaluated as appropriate combinations of the particular risks (i.e. those related to the particular/separate components). In the latter case, evaluation of the risks requires modelling the variables by a multivariate prior probability density and likelihood function in order to obtain the corresponding multivariate posterior distribution from which the risks can be calculated. In such situations, correlation can have a considerable influence on the risks.

    Analytical results of examples treated in the Project were calculated in the R programming environment. In parallel, a user-friendly MS-Excel program was developed, based on the same Bayesian approach, but implementing Markov Chain Monte Carlo for quantification of specific risks.
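
    For the independence case, the total risks can be quantified with a direct Monte Carlo simulation; the sketch below (not the Project's R or MS-Excel code) uses invented priors, measurement uncertainties and tolerance limits for a material with two independent components.

    import numpy as np

    rng = np.random.default_rng(1)
    N = 1_000_000
    mu, sigma = np.array([10.0, 5.0]), np.array([0.5, 0.3])  # priors per component
    u = np.array([0.2, 0.15])          # standard measurement uncertainties
    upper = np.array([11.0, 5.6])      # upper tolerance limits

    theta = rng.normal(mu, sigma, size=(N, 2))   # true concentrations
    x = rng.normal(theta, u)                     # test results

    accepted = (x <= upper).all(axis=1)          # lot accepted on all components
    nonconforming = (theta > upper).any(axis=1)  # >= 1 component out of spec

    total_consumer_risk = np.mean(accepted & nonconforming)
    total_producer_risk = np.mean(~accepted & ~nonconforming)
    print(total_consumer_risk, total_producer_risk)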

    [1] BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML (2012) JCGM 106:2012, Evaluation of Measurement Data – The Role of Measurement Uncertainty in Conformity Assessment. https://www.bipm.org/en/publications/guides/#gum.
    [2] IUPAC Project 2016-007-1-500 (2016) Risk of conformity assessment of a multicomponent material or object in relation to measurement uncertainty of its test results. https://iupac.org/projects/project-details/?project_nr=2016-007-1-500.
  • Analysis of Designed Experiments with Functional Responses

    Authors: Chris Gotwalt (JMP Division of SAS Institute), Phil Kay (JMP Division of SAS Institute)
    Primary area of focus / application: Design and analysis of experiments
    Secondary area of focus / application: Modelling
    Keywords: Designed experiment, Mixture design, Functional principal components, B-splines, Mixed model, Repeated measures
    Submitted at 26-Mar-2018 16:13 by Chris Gotwalt
    Accepted
    4-Sep-2018 14:10 Analysis of Designed Experiments with Functional Responses
    Designed experiments with functional responses are now quite common in science and industry. Examples are easy to come by: repeated measurements in time, experiments that study behavior over a range of temperature settings, machines that output shear/viscosity or spectral curves, and dissolution profiles. A variety of techniques have been developed to model such data, such as partial least squares, linear mixed models with time series errors, and many ad hoc approaches where one extracts features from the response curves, like the overall mean or the time at which a peak occurs. All of these methods encounter substantial difficulty when one wants a simple prediction equation for the response curve as a function of the longitudinal variable (time, shear, temperature, etc.) and the other experimental factors. We will introduce and demonstrate a two-step approach that simplifies the process substantially. First, the functional responses are fit with splines and a functional principal components analysis extracts the principal eigenfunctions and scalar scores for each function. This dimension reduction step removes the longitudinal variable from the problem. Then the leading functional principal component scores, which are just scalars, are modeled with least squares using the factors as inputs, together with a variable selection procedure such as forward selection. Combining the results leads to a single expression that shows how the functional response changes as a function of the longitudinal variable and the experimental factors. We will review the statistical methodology and then use case studies to demonstrate the simplicity and effectiveness of the approach as implemented in JMP Pro 14.
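
    A hedged sketch of the two-step approach on synthetic data (spline smoothing via SciPy's make_smoothing_spline, available in SciPy >= 1.10; this is an illustration, not JMP Pro 14's implementation, and the variable selection step is omitted):

    import numpy as np
    from scipy.interpolate import make_smoothing_spline
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    n_runs, n_t = 24, 100
    t = np.linspace(0, 1, n_t)
    factors = rng.uniform(-1, 1, size=(n_runs, 3))     # experimental factors
    curves = (factors[:, [0]] * np.sin(2 * np.pi * t)  # functional responses
              + factors[:, [1]] * t
              + 0.05 * rng.normal(size=(n_runs, n_t)))

    # Step 1: smooth each response curve, then FPCA on the fitted curves.
    smoothed = np.array([make_smoothing_spline(t, y)(t) for y in curves])
    fpca = PCA(n_components=2).fit(smoothed)
    scores = fpca.transform(smoothed)                  # scalar FPC scores

    # Step 2: model each leading score by least squares on the factors.
    models = [LinearRegression().fit(factors, scores[:, k]) for k in range(2)]

    # Combined prediction: curve(t) = mean(t) + sum_k score_k(factors) * eigenfunction_k(t)
    new_x = np.array([[0.5, -0.2, 0.0]])
    pred_scores = np.array([m.predict(new_x) for m in models]).T
    pred_curve = fpca.mean_ + pred_scores @ fpca.components_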
  • Big Data Strategies for Online Monitoring of Processes

    Authors: Flavia Dalia Frumosu (Technical University of Denmark), Murat Kulahci (Technical University of Denmark)
    Primary area of focus / application: Modelling
    Secondary area of focus / application: Quality
    Keywords: Manufacturing, Big data, Industry 4.0, Latent methods, Multivariate data
    Submitted at 27-Mar-2018 14:45 by Flavia Dalia Frumosu
    Accepted
    5-Sep-2018 11:10 Big Data Strategies for Online Monitoring of Processes
    More and more high-frequency and high-dimensional data are becoming available, particularly from production processes. In manufacturing, this phenomenon is often a direct result of the digitalization of production systems, as in Industry 4.0. Latent structures based methods are often employed in the analysis of multivariate and complex data. In processes with fast production rates, data on the quality characteristics of the process output tend to be scarcer than the available process data, which are generated through multiple sensors and automated data collection schemes. The research question addressed in this work is how to use all available process data in the pursuit of better process monitoring and control by means of new strategies and latent structure based methods. More precisely, a well-defined strategy for data collection is expected to improve the prediction of quality characteristics and ultimately the performance of online process monitoring. In this work, we will discuss our proposed approach in the pursuit of such a strategy and provide some examples of its execution.
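
    One possible latent-structure setup consistent with this abstract (a hedged sketch, not necessarily the authors' method): process data are recorded for every unit, quality measurements only for a small lab-tested subset, and a partial least squares model trained on that subset predicts quality for all units online. Data, dimensions and the 50-unit labelling scheme below are invented.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(3)
    n, p = 500, 40                     # many produced units, 40 process sensors
    X = rng.normal(size=(n, p))        # process data, recorded for every unit
    Y = X @ rng.normal(size=(p, 2)) + 0.1 * rng.normal(size=(n, 2))

    # Quality characteristics are scarce: only 50 units get lab-tested.
    labelled = rng.choice(n, size=50, replace=False)
    pls = PLSRegression(n_components=4).fit(X[labelled], Y[labelled])

    Y_hat = pls.predict(X)             # predicted quality for every unit, online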
  • Fault Detection for Batch Processes Using k Nearest Neighbours and Dynamic Time Warping

    Authors: Max Spooner (Technical University of Denmark), Murat Kulahci (Technical University of Denmark)
    Primary area of focus / application: Process
    Secondary area of focus / application: Mining
    Keywords: Batch process, Fault detection, Process monitoring, Dynamic time warping, Nearest neighbours, Pensim
    Submitted at 28-Mar-2018 10:07 by Max Spooner
    Accepted
    3-Sep-2018 14:20 Fault Detection for Batch Processes Using k Nearest Neighbours and Dynamic Time Warping
    Online statistical process monitoring of batch processes is challenging due to the three-way structure of the data. Typically, J variables (pH, temperature, concentrations, etc.) are measured at K times points (e.g. every minute) throughout each of I batches. To monitor a new batch, one established approach is to first model the variation of batches from normal operating conditions (NOC) with multiway PCA. As the new batch progresses, it is compared to this model and an alarm is given if it deviates too greatly from the model. This is especially suited to processes where batch-to-batch variation of the variables at each time-point is approximately normally distributed, but less so if there is clustering of batches due to, e.g., changes in suppliers of raw materials. A new data-driven k-nearest neighbour method for online monitoring of batch processes is presented. This method uses the dynamic time warping (DTW) distance between an ongoing batch, and past NOC batches and signals an alarm if the distance becomes too great. The DTW distance has the advantage that it is not sensitive to minor differences in rates of progress between the ongoing and past NOC batches. The method is demonstrated using an extensive dataset of NOC and faulty batches from a simulated penicillin batch process, and shown to be flexible to local structures in NOC batches, such as clustering.