ENBIS-16 in Sheffield

11 – 15 September 2016; Sheffield
Abstract submission: 20 March – 4 July 2016

Clustering Variables Based on a Dynamic Mixed Criteria: Application to the Energy Management

13 September 2016, 11:40 – 12:10


Submitted by
Christian Derquenne
Christian Derquenne (EDF R&D)
Searching for structure in data is an essential aid to understanding the phenomena under study before any further processing. Unsupervised learning and visualization techniques are the main tools for this search. We propose a set of methods for clustering numeric variables. They rely on a mixed criterion that combines the correlation between the initial variables with the one-dimensionality of the resulting groups, so that a typology is built dynamically while controlling both the number of classes and their quality. The main benefit is that an "optimal" number of classes can be "discovered" without being fixed a priori. We also introduce an approach to the distribution of the mixed criterion. We evaluate our approach on simulated data sets and compare it with a method based on PCA with oblique rotation (the VARCLUS procedure in SAS) and with Ward's dissimilarity criterion applied to the correlation matrix. The results show a significant gain for our approach over VARCLUS and Ward, both in detecting the number of classes and in how well the content of the groups matches the observed typology. Then, in the context of energy management, we build typologies of time series for market prices and electricity consumption. Characterizing each group of curves makes it possible to identify and understand the joint evolution of the phenomena studied and to detect differences in behavior between clusters. Finally, we discuss the contributions and limitations of our approach and propose improvements and future directions, including the problems of non-linearity between variables, missing data, outliers, and large numbers of individuals and variables.
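To make the Ward baseline mentioned in the abstract concrete, the following is a minimal sketch of clustering variables with Ward linkage on a correlation-based dissimilarity, together with one possible one-dimensionality measure for each group. It is not the mixed-criterion method of the talk, whose details the abstract does not give. The dissimilarity `sqrt(2 * (1 - r))`, the function names `ward_variable_clusters` and `first_eigen_share`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def ward_variable_clusters(X, n_clusters):
    """Cluster the columns (variables) of X with Ward linkage.

    Assumption: dissimilarity d_ij = sqrt(2 * (1 - r_ij)), which is the
    Euclidean distance between centred, unit-norm variables, so
    negatively correlated variables are treated as far apart.
    """
    r = np.corrcoef(X, rowvar=False)
    d = np.sqrt(np.clip(2.0 * (1.0 - r), 0.0, None))
    np.fill_diagonal(d, 0.0)
    condensed = squareform(d, checks=False)  # condensed form for linkage
    tree = linkage(condensed, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def first_eigen_share(X, columns):
    """Share of a group's variance carried by its first principal component.

    Illustrative one-dimensionality measure (an assumption, not the
    criterion of the talk): values close to 1 suggest the group is
    well summarised by a single component.
    """
    if len(columns) == 1:
        return 1.0
    r = np.corrcoef(X[:, columns], rowvar=False)
    eigenvalues = np.linalg.eigvalsh(r)  # ascending order
    return float(eigenvalues[-1] / eigenvalues.sum())


# Simulated data: two latent factors, three noisy variables for each.
rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack(
    [f1 + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.3 * rng.normal(size=n) for _ in range(3)]
)

labels = ward_variable_clusters(X, n_clusters=2)
shares = {
    int(k): first_eigen_share(X, np.flatnonzero(labels == k))
    for k in np.unique(labels)
}
```

Here the number of clusters is fixed in advance, which is exactly the constraint the abstract's approach aims to remove; a dynamic procedure would instead use a quality measure such as the per-group share in `shares` to decide when to stop merging or splitting.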
