ENBIS-11 in Coimbra

4 – 8 September 2011

Abstract submission: 1 January – 25 June 2011

Detecting Influent Observations using CART Classification Trees. Application to the classification of the cities of Paris area

5 September 2011, 09:15 – 10:00


Submitted by
Jean-Michel Poggi
Universite Orsay, Universite Paris Descartes
This plenary talk is an introduction to the session "Business and Industrial Statistics in France".

It is divided into two parts: the first 15 minutes are devoted to a general presentation of the French Statistical Society (SFdS), followed by a 35-minute talk on "Detecting Influent Observations using CART Classification Trees. Application to the classification of the cities of Paris area", which is joint work with Avner Bar-Hen and Servane Gey (MAP5, Univ. Paris Descartes, France).

The first part of the talk presents the French Statistical Society (SFdS), its activities, projects and organisation, with a special emphasis on business and industrial statistics.

The second part of the talk deals with the detection of influential observations in a classification context, more precisely with measuring the influence of individual observations on the results obtained with CART classification trees.
A real dataset relating the administrative classification of cities surrounding Paris, France, to the characteristics of their tax revenue distributions is first presented and used to illustrate the notions developed throughout the talk.
To define the influence of individual observations on the analysis, we use influence measures to propose criteria that quantify the sensitivity of the CART classification tree analysis. The proposals, based on jackknife trees (trees grown with one observation removed at a time), follow two lines: influence on predictions and influence on partitions. In addition, the analysis is extended to the pruned sequences of CART trees to produce a CART-specific notion of influence.
Finally, returning to the real Paris-area dataset, we analyze it using the new influence-based tools.
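
As a rough illustration of the jackknife idea described above, the following Python sketch grows a small Gini-based classification tree on the full sample, regrows it with each observation left out in turn, and records two simple scores per observation: how many predictions change (influence on predictions) and how much the number of leaves changes (a crude stand-in for influence on partitions). This is a minimal sketch under stated assumptions, not the criteria presented in the talk: the tree grower is a simplified depth-limited CART-like procedure without pruning, and all function names (`gini`, `best_split`, `grow`, `predict`, `count_leaves`, `jackknife_influence`) are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    n = len(y)
    best, best_score = None, gini(y)  # a split must strictly improve on the parent
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold: midpoint of adjacent values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def grow(X, y, depth=0, max_depth=2):
    """Grow a tree: internal node = (feature, threshold, left, right), leaf = label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    return (f, t,
            grow([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            grow([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, x):
    """Route one observation down the tree (labels must not be tuples)."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])

def jackknife_influence(X, y, max_depth=2):
    """For each observation i (needs at least 2 observations), return
    (number of changed predictions, absolute change in leaf count)
    between the reference tree and the tree grown without observation i."""
    ref = grow(X, y, max_depth=max_depth)
    ref_pred = [predict(ref, x) for x in X]
    ref_leaves = count_leaves(ref)
    scores = []
    for i in range(len(y)):
        tree_i = grow(X[:i] + X[i + 1:], y[:i] + y[i + 1:], max_depth=max_depth)
        changed = sum(predict(tree_i, x) != p for x, p in zip(X, ref_pred))
        scores.append((changed, abs(count_leaves(tree_i) - ref_leaves)))
    return scores

if __name__ == "__main__":
    # Toy data (made up for illustration): one feature, two classes.
    X = [[1.0], [2.0], [4.0]]
    y = ["a", "b", "b"]
    for i, (n_changed, leaf_diff) in enumerate(jackknife_influence(X, y, max_depth=1)):
        print(i, n_changed, leaf_diff)
```

Observations with large scores are those whose removal alters the fitted tree the most. Extending the comparison to pruned subtree sequences, as mentioned above, would require a cost-complexity pruning step that this sketch does not include.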
