Chargement...
 

Algorithmics-Graphs-Combinatorics

Domaine
Algorithmics-Graphs-Combinatorics
Domain - extra
Computational Geometry
Année
2011
Starting
September
État
Closed
Sujet
Topological and Geometric Inference from Measures
Thesis advisor
CHAZAL Frédéric
Co-advisors
OUDOT Steve, Geometrica group, INRIA
Laboratory
Collaborations
CoMeT — Associated Team with the geometric computation group at Stanford University
Abstract
A new data analysis paradigm emerged recently, where point clouds are no longer treated as mere compact sets but rather as empirical measures. A notion of distance to such measures has been defined and shown to be stable with respect to perturbations of the measure, thus making it possible to deal with data sets corrupted with high-amplitude noise. Although this distance can easily be computed pointwise in the case of a point cloud, its sublevel-sets, which carry the interesting topological and geometric information about the measure, remain hard to compute or approximate.The goal of this Ph.D. will be to turn the theory of distances to measures into effective algorithms for point cloud data analysis in arbitrary dimensions, in the same spirit as what has been done in the past for distances to compact sets. Such algorithms are likely to find applications in data analysis in the presence of significant noise and outliers.
Context
The wide availability of measurement devices and simulation tools has
led to an explosion in the amount of available geometric data, both in
academia and in industry. Often this data is in the form of point
clouds sampled from some unknown geometric spaces or entities. Before
such data can be effectively exploited, its underlying structures must
be identified, extracted, and analyzed. In the last decade or so,
Computational Topology was a major contributor to the understanding of
geometric structures in point cloud data, with the emergence of new
theories like topological persistence or distances to compact
sets. These theories rely on the assumption that the data lies on or
close to some geometric structures and that the level of noise remains
controlled.
Objectives
A new paradigm emerged recently, where point clouds are
no longer treated as mere compact sets but rather as empirical
measures. A notion of distance to such measures has been defined and
shown to be stable with respect to perturbations of the measure, thus
making it possible to deal with data sets corrupted with
high-amplitude noise. Although this distance can easily be computed
pointwise in the case of a point cloud (simply average the squared
distances to the k nearest neighbors), its sublevel-sets, which carry
the interesting geometric information about the measure, remain hard
to compute or approximate. The current challenge is to turn the theory of
distances to measures into effective algorithms for point cloud data
analysis in arbitrary dimensions, in the same spirit as what has been
done in the past for distances to compact sets. Such algorithms would find many applications in data analysis in the presence of significant noise and outliers.
Work program
On the mathematical front, the task of the Ph.D. candidate will be to work out equivalents of the Cech filtrations in the context of distances to measures. This means reinterpreting the sublevel-sets of distances to measures as unions of balls, from which Cech complexes are derived through nerve constructions.

On the algorithmic front, the task will be to devise tractable algorithms for estimating the topological and geometric properties of distances to measures, including their persistence diagrams. Although algebraic constructions based on \v Cech complexes can be naturally turned into theoretical algorithms, the latter have prohibitive complexities. It is then necessary to use approximation and show that simpler structures based on distance comparisons, like the Rips complexes, lead to tractable algorithms while keeping similar algebraic properties.

On the applications front, the designed data structures and algorithms will be implemented and tested against real-life data.
Extra information
Link to the full description of the subject.
Prerequisite
A good background in theoretical computer science and in algebraic topology.
Détails
Expected funding
Institutional funding
Status of funding
Confirmed
Candidates
BUCHET Mickaël - AMX
Utilisateur
frederic.chazal
Créé
Mercredi 04 mai 2011 11:48:50 CEST
dernière modif.
Mardi 17 mai 2011 15:11:09 CEST

Fichiers joints

 filenamecrééhitsfilesize 
Aucun fichier joint à cette fiche


Ecole Doctorale Informatique Paris-Sud


Directrice
Nicole Bidoit
Assistante
Stéphanie Druetta
Conseiller aux thèses
Dominique Gouyou-Beauchamps

ED 427 - Université Paris-Sud
UFR Sciences Orsay
Bat 650 - aile nord - 417
Tel : 01 69 15 63 19
Fax : 01 69 15 63 87
courriel: ed-info à lri.fr