# ENBIS: European Network for Business and Industrial Statistics

## ENBIS-18 in Nancy

2 – 25 September 2018; Ecoles des Mines, Nancy (France)

Abstract submission: 20 December 2017 – 4 June 2018

The following abstracts have been accepted for this event:

• A Review of Multiband Image Fusion Methods With a Specific Attention to Bayesian Methods

Authors: Jean-Yves Tourneret (University of Toulouse)
Primary area of focus / application: Other: Keynote Lecture
Keywords: Image fusion, Pansharpening, Bayesian inference, Sparse estimation, Sylvester matrix equation
Submitted at 4-Jun-2018 17:49 by Jean-Yves Tourneret
Accepted
3-Sep-2018 09:30 Opening Keynote: Jean-Yves Tourneret on: "A Review of Multiband Image Fusion Methods With a Specific Attention to Bayesian Methods"
This talk will discuss several methods for fusing high spectral resolution images (such as hyperspectral images) and high spatial resolution images (such as panchromatic or multispectral images) in order to provide images with improved spectral and spatial resolutions. The first part will be devoted to summarizing the main image fusion methods based on component substitution, multiresolution analysis, Bayesian inference and matrix factorization. The second part will present recent Bayesian fusion strategies exploiting prior information about the target image to be recovered, constructed by interpolation or by using dictionary learning techniques. The resulting Bayesian estimators can be computed by using samples generated by Markov chain Monte Carlo algorithms, by exploiting the efficiency of alternating optimization methods or by solving Sylvester matrix equations.
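For readers unfamiliar with the last of these routes, a Sylvester matrix equation has the generic form below. The symbols $A$ ($n \times n$), $B$ ($m \times m$), $C$ and $X$ (both $n \times m$) are placeholders chosen for this illustration and are not the talk's notation.

$$A X + X B = C \quad\Longleftrightarrow\quad \left(I_m \otimes A + B^\top \otimes I_n\right)\operatorname{vec}(X) = \operatorname{vec}(C)$$

The equation has a unique solution $X$ exactly when $A$ and $-B$ share no eigenvalue. Dedicated solvers work on the left-hand form directly and avoid building the much larger Kronecker system on the right.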
• Disaggregated Electricity Forecasting Using Clustering of Individual Consumers

Authors: Jairo Cugliari (Université de Lyon), Benjamin Auder (Université Paris Sud), Yannig Goude (Université Paris Sud / EDF R&D), Jean-Michel Poggi (Université Paris Sud, Université Paris Descartes)
Primary area of focus / application: Other: R
Secondary area of focus / application: Other: R
Keywords: Time series, Wavelets, Electricity demand, Clustering, R
Submitted at 4-Jun-2018 17:50 by Jairo Cugliari
Accepted (view paper)
4-Sep-2018 16:20 Disaggregated Electricity Forecasting Using Clustering of Individual Consumers
We propose to build clustering tools useful for forecasting electricity load. The idea is to disaggregate the global signal in such a way that the sum of the disaggregated forecasts significantly improves the prediction of the global signal. The strategy has three steps: first, we cluster the curves into numerous super-consumers; second, we build a hierarchy of partitions; third, we select the best partition with respect to a disaggregated forecast criterion.
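The selection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the candidate partitions are written by hand rather than produced by a clustering algorithm, `median_forecast` is a toy stand-in for a real forecasting model, and all function names and data are hypothetical.

```python
from statistics import median


def aggregate(curves, members):
    """Sum the load curves of the consumers listed in `members`, point by point."""
    horizon = len(curves[0])
    return [sum(curves[i][t] for i in members) for t in range(horizon)]


def median_forecast(history):
    """Toy forecaster (an assumption): predict the median of the past values."""
    return median(history)


def disaggregated_forecast(curves, partition, forecaster=median_forecast):
    """Forecast each cluster's aggregated curve, then sum the cluster forecasts."""
    return sum(forecaster(aggregate(curves, members)) for members in partition)


def select_partition(curves, partitions, actual_total, forecaster=median_forecast):
    """Keep the partition whose summed forecast is closest to the observed total."""
    return min(
        partitions,
        key=lambda p: abs(disaggregated_forecast(curves, p, forecaster) - actual_total),
    )


# Hypothetical data: four consumers observed over three past time steps.
curves = [
    [1.0, 1.0, 5.0],
    [1.0, 1.0, 5.0],
    [4.0, 0.0, 0.0],
    [4.0, 0.0, 0.0],
]
one_cluster = [[0, 1, 2, 3]]      # the global signal, forecast as a whole
two_clusters = [[0, 1], [2, 3]]   # two "super-consumers"

# Hypothetical observed total load at the next time step.
best = select_partition(curves, [one_cluster, two_clusters], actual_total=3.0)
```

Because the toy forecaster is nonlinear, forecasting the clusters separately and summing gives a different answer from forecasting the global signal, which is what makes the choice of partition matter.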

The shape of the curves exhibits rich information about the calendar day type, the meteorological conditions or the existence of special electricity tariffs. Using the information contained in the shape of the load curves, [1] proposed a flexible nonparametric function-valued forecast model called KWF (Kernel+Wavelet+Functional) well suited to handle nonstationary series.

In [2] we applied this strategy to a dataset of individual consumers from the French electricity provider EDF. Disaggregation yields a substantial gain of 16% in forecast accuracy compared to the 1-cluster approach while preserving meaningful classes of consumers.

This project aims to evaluate how well the strategy developed in [2] scales up to cope with the growing volume of data. To this end, we explore different strategies with simulated datasets ranging from thousands to tens of millions of consumers. Our experiments show that no sophisticated computing technology is needed to solve this problem.

An R package implementing our strategies is under development (available on GitHub: github.com/cugliari/iecclust).

[1] A. Antoniadis, X. Brossat, J. Cugliari, and J.-M. Poggi. Prévision d'un processus à valeurs fonctionnelles en présence de non stationnarités. Application à la consommation d'électricité. Journal de la Société Française de Statistique, 153(2):52–78, 2012.
[2] J. Cugliari, Y. Goude, and J.-M. Poggi. Disaggregated electricity forecasting using wavelet-based clustering of individual consumers. In Energy Conference, IEEE International, 2016.
• Reliability Engineering - Challenges and Opportunities

Authors: Anan Halabi (KLA-Tencor)
Primary area of focus / application: Reliability
Keywords: Reliability engineering, Development cycles, Reliability problems, Reliability methods
Submitted at 4-Jun-2018 19:01 by Anan Halabi
Accepted
4-Sep-2018 12:20 Reliability Engineering - Challenges and Opportunities
This article presents today's reliability engineering challenges in the face of increasingly complex systems and the trend toward faster development cycles and dynamic environments. The paper will illustrate gaps in the methods and tools available to overcome these challenges. It will summarize old problems that have been discussed, along with their status today, as well as current new challenges. Drawing on our own experience, we will share real problems and how we handle them in practice. This is a good opportunity to share with the reliability engineering community how to cope with real industrial problems in light of the theoretical methodology and, on the other hand, a good basis for the research community to develop new methods and definitions of reliability.
• Generative adversarial nets and Cerema AWP dataset

Authors: Seck Ismaila (Insa Rouen Normandie), Loosli Gaëlle (Université Clermont-Auvergne)
Primary area of focus / application: Other: invited session
Keywords: Deep learning, Computer vision, Generative models, Generative adversarial nets, Cerema AWP dataset
Submitted at 4-Jun-2018 19:41 by SECK Ismaila
Accepted (view paper)
5-Sep-2018 10:00 Generative adversarial nets and Cerema AWP dataset
This talk is about Generative Adversarial Networks (GANs) and a recently introduced dataset, the Cerema AWP (Adverse Weather Pedestrian) dataset. We want to assess the capacity of GANs to generate a particular element, in our case a pedestrian, at a specified place. The Cerema AWP database suits that task because each image comes with the bounding box of the pedestrian.

The Cerema AWP dataset is an image dataset produced in a special installation, a tunnel in which different weather conditions can be created artificially. Since the database was originally created for pedestrian detection, every image contains a pedestrian. The dataset is annotated according to the weather (10 different weather conditions), the pedestrian (5 different pedestrians) and their clothes (each pedestrian appears in two different outfits). Additional information, such as the pedestrian’s direction and bounding box, is also available. The controlled environment and this detailed information make the database attractive for our purpose. Indeed, because the background is fixed, the task is a simpler version of the problem we would face with different backgrounds, perspectives or other uncontrolled variations. Since most of the variation in the Cerema AWP database is controlled and associated with labels, we can study generation under all conditions or under a subset of weather or other conditions.

In a previous study using a standard GAN, the generated images showed a mixture of weather conditions within a single output, indicating that the generative network had trouble matching the dataset distribution. This problem was solved by conditioning on the weather. The generated images now have uniform weather, but a problem persists: they contain no pedestrians. We will present the architectures, the ways of conditioning and other tricks that help the generator focus on generating pedestrians while producing realistic images.
• Handling Error in Variables in Linear and Quadratic Regression Using a Stochastic Gradient Method: Application to State Estimation in Power Grids

Authors: Stephane Chretien (National Physical Laboratory), Paul Clarkson (National Physical Laboratory)
Primary area of focus / application: Modelling
Secondary area of focus / application: Metrology & measurement systems analysis
Keywords: Error in variables, Stochastic gradient, Composite estimation, Power grids, Semi-definite programming relaxation
Submitted at 4-Jun-2018 20:01 by Stephane Chretien
Accepted
4-Sep-2018 14:10 Handling Error in Variables in Linear and Quadratic Regression Using a Stochastic Gradient Method: Application to State Estimation in Power Grids
Linear regression is one of the most basic models in multivariate statistics. Another problem of great importance is quadratic regression, i.e. the estimation problem for the model

$$y_i = b_0^\top X_i b_0 + \epsilon_i$$

where $X_i$, $i=1,\dots,n$, are matrices of order $p$. Quadratic measurements of this type are of paramount relevance in many industrial problems, such as power grid monitoring. Such problems can sometimes be studied efficiently via a convex relaxation based on semi-definite programming (SDP), which can be formulated as the following optimisation problem

$$\min_{B} \sum_{i=1}^n \left(y_i - \text{trace}(X_i B)\right)^2$$

under the constraint that B is a positive semi-definite matrix of order p. One of the standard ways to look at this problem is to perform the estimation conditionally on the covariates and derive finite sample or asymptotic properties of the estimator.
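The relaxation above can be motivated by a standard lifting argument, sketched here in the abstract's notation. The rank-one matrix $B_0$ is a symbol introduced for this illustration only.

$$b_0^\top X_i b_0 = \text{trace}\left(b_0^\top X_i b_0\right) = \text{trace}\left(X_i b_0 b_0^\top\right) = \text{trace}(X_i B_0), \qquad B_0 = b_0 b_0^\top$$

The matrix $B_0$ is positive semi-definite with rank one, so the quadratic model is linear in $B_0$. Dropping the rank constraint and keeping only positive semi-definiteness gives the convex least-squares problem stated above.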

In many statistical studies, however, practitioners have to take into account the variability of the covariates and provide a consistent estimator of $b_0$ without prior information about the variance of these covariates (Zellner 1970). The corresponding setting is often known as regression with "errors in variables". Various approaches have been proposed for this problem based on the idea of total least squares minimisation; see van Huffel (2013) for an exhaustive overview of the problem. The Bayesian approach has also been studied by Florens (1974), for instance.

The goal of our work is to address the problem of regression with error in variables using an efficient and scalable stochastic gradient method. In the case of quadratic measurements, we will consider a Semi-Definite Programming relaxation of the quadratic least-squares problem. These problems are reformulated as estimation in a model where expectations are composed with non-linear functionals. We will follow a methodology developed by Wang (2014) based on a new stochastic gradient approach. One of the main advantages of the methods developed by Wang (2014) is their inherent scalability for very large problems, a feature which is not shared by most standard generalised eigenvalue based methods.

Our main contribution is an improved algorithm which extends the method of Wang (2014). Our method is able to handle both the linear setting and the Semi-Definite relaxation of the quadratic setting. Extensive simulation experiments illustrate the efficiency of our algorithm.
• Are the Ashtabula Fish Still Sick? – A Bayesian Bioequivalence Answer

Authors: Tim Robinson (University of Wyoming)
Primary area of focus / application: Other: US Invited Session
Secondary area of focus / application: Consulting
Keywords: Bayes, Bioequivalence testing, Logistic regression, Environmental, Contamination, Remediation
Submitted at 4-Jun-2018 20:03 by Tim Robinson
Accepted
5-Sep-2018 10:00 Are the Ashtabula Fish Still Sick? – A Bayesian Bioequivalence Answer
The Ashtabula River lies in northeast Ohio, US, and flows into Lake Erie's central basin at the city of Ashtabula; its drainage covers an area of 355 km². Native American inhabitants referred to the river as the Hash-tah-buh-lah or “river of many fish.” Beginning in the early 1800s, the lower Ashtabula River was widened and deepened into a deep draft harbor to accommodate commercial shipping and shipbuilding enterprises. In the mid-1900s several chemical production companies began operation along tributaries of the Ashtabula and, over time, discharges from these facilities left the lower Ashtabula River heavily contaminated with polychlorinated biphenyls (PCBs), heavy metals and a suite of other agents. A tributary of the lower Ashtabula was named a Superfund site in 1983 and, under the 1987 Great Lakes Water Quality Agreement, the lower 3.2 km of the Ashtabula River was designated as a Great Lakes Area of Concern (AOC). A variety of private and governmental agencies contributed to a cleanup of the lower Ashtabula and associated industrial sites from 1999 to 2013. To assess the efficacy of the clean-up efforts, several beneficial use impairments (BUIs) were defined and, by 2017, all but two of the BUIs had achieved pre-defined remediation thresholds. One of the remaining BUIs involves liver tumor rates and liver concentrations of PCBs in Brown Bullhead catfish, a pre-identified indicator species. Statutes indicate that a BUI can be removed if it can be demonstrated that an impairment is not limited to the local geographic extent of the AOC, but rather is typical of area-wide conditions. In this talk, I describe a Bayesian approach to bioequivalence testing where the results are being used to support the removal of a BUI related to tumor rates and liver concentrations of PCBs in the chosen indicator species.