Physics and Astronomy Building Room 1434A

Andrea Bertozzi

University of California, Los Angeles

I will present semi-supervised and unsupervised graph models for classification using similarity graphs and for community detection in networks. The methods stem from graph-based variational models built on graph cut metrics. The equivalence between the graph mincut problem and total variation minimization on the graph for an assignment function allows one to cast graph-cut variational problems in the language of total variation minimization, thus creating a parallel between low dimensional data science problems in Euclidean space (e.g. image segmentation) and high dimensional clustering. The connection paves the way for new algorithms for data science that have a similar structure to well-known computational methods for nonlinear partial differential equations. This paper focuses on a class of methods build around diffuse interface models (e.g. the Ginzburg–Landau functional and the Allen–Cahn equation) and threshold dynamics, developed by the speaker and collaborators. Semi-supervised learning with a small amount of training data can be carried out in this framework with diverse applications ranging from hyperspectral pixel classification to identifying activity in police body worn video. It can also be extended to the context of uncertainty quantification with Gaussian noise models. The problem of community detection in networks also has a graph-cut structure and algorithms are presented for the use of threshold dynamics for modularity optimization. With efficient methods, this allows for the use of network modularity for unsupervised machine learning problems with unknown number of classes.