Category Archives: Graph Theory

Clustering in hypergraphs

P. Purkait, T. J. Chin, A. Sadri and D. Suter, Clustering with Hypergraphs: The Case for Large Hyperedges, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 9, pp. 1697-1711, DOI: 10.1109/TPAMI.2016.2614980.

The extension of conventional clustering to hypergraph clustering, which involves higher order similarities instead of pairwise similarities, is increasingly gaining attention in computer vision. This is due to the fact that many clustering problems require an affinity measure that must involve a subset of data of size more than two. In the context of hypergraph clustering, the calculation of such higher order similarities on data subsets gives rise to hyperedges. Almost all previous work on hypergraph clustering in computer vision, however, has considered the smallest possible hyperedge size, due to a lack of study into the potential benefits of large hyperedges and effective algorithms to generate them. In this paper, we show that large hyperedges are better from both a theoretical and an empirical standpoint. We then propose a novel guided sampling strategy for large hyperedges, based on the concept of random cluster models. Our method can generate large pure hyperedges that significantly improve grouping accuracy without exponential increases in sampling costs. We demonstrate the efficacy of our technique on various higher-order grouping problems. In particular, we show that our approach improves the accuracy and efficiency of motion segmentation from dense, long-term, trajectories.
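The sampling-cost remark in the abstract can be illustrated with a small calculation. The sketch below is not the paper's guided sampling method; it only computes, under the assumption of uniform random sampling without replacement, the probability that all k points of a hyperedge fall inside one given cluster. The function name and the example sizes are hypothetical.

```python
from math import comb


def pure_hyperedge_probability(cluster_size: int, total_points: int, k: int) -> float:
    """Probability that k points drawn uniformly without replacement
    all belong to one given cluster of size `cluster_size`."""
    if not 1 <= k <= total_points:
        raise ValueError("k must be between 1 and total_points")
    # Number of pure k-subsets divided by the number of all k-subsets.
    return comb(cluster_size, k) / comb(total_points, k)


if __name__ == "__main__":
    # Hypothetical setting: one cluster holding 50 of 100 points.
    for k in (2, 4, 8, 16):
        p = pure_hyperedge_probability(50, 100, k)
        print(f"hyperedge size {k:2d}: P(pure) = {p:.3e}")
```

The probability drops roughly geometrically in k, which is why purely random sampling becomes impractical for large hyperedges and some form of guidance is needed.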

Subgraph matching (isomorphism) using GPUs for managing commonsense knowledge, and a short list of other graph problems that have benefited from multiprocessing

Ha-Nguyen Tran, Erik Cambria, Amir Hussain, Towards GPU-Based Common-Sense Reasoning: Using Fast Subgraph Matching, Cognitive Computation, December 2016, Volume 8, Issue 6, pp 1074–1086, DOI: 10.1007/s12559-016-9418-4.

Common-sense reasoning is concerned with simulating cognitive human ability to make presumptions about the type and essence of ordinary situations encountered every day. The most popular way to represent common-sense knowledge is in the form of a semantic graph. Such type of knowledge, however, is known to be rather extensive: the more concepts added in the graph, the harder and slower it becomes to apply standard graph mining techniques. In this work, we propose a new fast subgraph matching approach to overcome these issues. Subgraph matching is the task of finding all matches of a query graph in a large data graph, which is known to be a non-deterministic polynomial time-complete problem. Many algorithms have been previously proposed to solve this problem using central processing units. Here, we present a new graphics processing unit-friendly method for common-sense subgraph matching, termed GpSense, which is designed for scalable massively parallel architectures, to enable next-generation Big Data sentiment analysis and natural language processing applications. We show that GpSense outperforms state-of-the-art algorithms and efficiently answers subgraph queries on large common-sense graphs.
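To make the task definition concrete, here is a minimal CPU-only backtracking sketch that enumerates all matches of a small query graph in a data graph. It is not GpSense and has none of its GPU-oriented design; it assumes unlabeled, undirected graphs stored as dictionaries of neighbour sets, and the function name is hypothetical.

```python
def find_subgraph_matches(query, data):
    """Return every injective mapping from query vertices to data vertices
    such that each query edge is also an edge between the mapped vertices.

    Both graphs are dicts: vertex -> set of neighbouring vertices (undirected).
    """
    order = list(query)  # fixed order in which query vertices are assigned
    matches = []

    def extend(mapping):
        if len(mapping) == len(order):
            matches.append(dict(mapping))
            return
        u = order[len(mapping)]
        used = set(mapping.values())
        for v in data:
            if v in used:
                continue  # keep the mapping injective
            # Every already-mapped neighbour of u must map to a neighbour of v.
            if all(mapping[w] in data[v] for w in query[u] if w in mapping):
                mapping[u] = v
                extend(mapping)
                del mapping[u]  # backtrack

    extend({})
    return matches


if __name__ == "__main__":
    triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
    data_graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
    for match in find_subgraph_matches(triangle, data_graph):
        print(match)
```

The search is exponential in the worst case, consistent with the NP-completeness noted in the abstract; practical systems add label filtering and candidate pruning before any such enumeration.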

Performing filtering on graphs instead of individual signals

E. Isufi, A. Loukas, A. Simonetto and G. Leus, “Autoregressive Moving Average Graph Filtering,” in IEEE Transactions on Signal Processing, vol. 65, no. 2, pp. 274-288, Jan. 15, 2017. DOI: 10.1109/TSP.2016.2614793.

One of the cornerstones of the field of signal processing on graphs is the graph filter, a direct analog of the classical filter, but intended for signals defined on graphs. This paper brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which are able to approximate any desired graph frequency response, and give exact solutions for specific graph signal denoising and interpolation problems. The philosophy to design the ARMA coefficients independently from the underlying graph renders the ARMA graph filters suitable in static and, particularly, time-varying settings. The latter occur when the graph signal and/or graph topology are changing over time. We show that in case of a time-varying graph signal, our approach extends naturally to a two-dimensional filter, operating concurrently in the graph and regular time domain. We also derive the graph filter behavior, as well as sufficient conditions for filter stability when the graph and signal are time varying. The analytical and numerical results presented in this paper illustrate that ARMA graph filters are practically appealing for static and time-varying settings, as predicted by theoretical derivations.
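A first-order recursion of this kind can be sketched as y_{t+1} = psi * M * y_t + phi * x, where M is a symmetric graph matrix and x is the input graph signal. The code below is an illustrative sketch, not the paper's exact parametrization: the choice of M (the adjacency matrix of a 3-node path graph), the coefficient values, and the function names are all assumptions. When |psi| times the spectral radius of M is below 1, the iteration converges, and a signal component along an eigenvector of M with eigenvalue mu is scaled by phi / (1 - psi * mu).

```python
def mat_vec(M, v):
    """Multiply a dense matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def arma1_filter(M, x, psi, phi, iterations=200):
    """Iterate y <- psi * M y + phi * x starting from y = 0.

    Each step only needs values from graph neighbours (the nonzeros of M),
    which is what makes such recursions suitable for distributed use.
    """
    y = [0.0] * len(x)
    for _ in range(iterations):
        My = mat_vec(M, y)
        y = [psi * my_i + phi * x_i for my_i, x_i in zip(My, x)]
    return y


if __name__ == "__main__":
    # Assumed example: adjacency matrix of the path graph 0 - 1 - 2.
    M = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
    x = [1.0, 0.0, -1.0]  # eigenvector of M with eigenvalue 0
    print(arma1_filter(M, x, psi=0.5, phi=1.0))
```

Because the coefficients psi and phi enter only through the scalar response phi / (1 - psi * mu), they can be chosen without knowing the particular graph, as long as the eigenvalues of M stay in a range where the recursion is stable.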

Good review of similarity measures between elements with semantics

Mohammad Taher Pilehvar, Roberto Navigli, From senses to texts: An all-in-one graph-based approach for measuring semantic similarity, Artificial Intelligence, Volume 228, November 2015, Pages 95-128, ISSN 0004-3702, DOI: 10.1016/j.artint.2015.07.005.

Quantifying semantic similarity between linguistic items lies at the core of many applications in Natural Language Processing and Artificial Intelligence. It has therefore received a considerable amount of research interest, which in its turn has led to a wide range of approaches for measuring semantic similarity. However, these measures are usually limited to handling specific types of linguistic item, e.g., single word senses or entire sentences. Hence, for a downstream application to handle various types of input, multiple measures of semantic similarity are needed, measures that often use different internal representations or have different output scales. In this article we present a unified graph-based approach for measuring semantic similarity which enables effective comparison of linguistic items at multiple levels, from word senses to full texts. Our method first leverages the structural properties of a semantic network in order to model arbitrary linguistic items through a unified probabilistic representation, and then compares the linguistic items in terms of their representations. We report state-of-the-art performance on multiple datasets pertaining to three different levels: senses, words, and texts.

A quick, formal explanation of the PageRank algorithm and its existing variants

Lei, J.; Chen, H., Distributed Randomized PageRank Algorithm Based on Stochastic Approximation, Automatic Control, IEEE Transactions on, vol. 60, no. 6, pp. 1641-1646, June 2015. DOI: 10.1109/TAC.2014.2359311.

A distributed randomized PageRank algorithm based on stochastic approximation (SA) is proposed to estimate the importance scores of web pages. Compared with the existing methods, the algorithm given here has wider applications in the sense that it can deal with a larger class of randomizations. The strong consistency of the estimates is proved, and the robustness of the PageRank value is analyzed as well. Numerical examples are given to verify the obtained theoretic results.
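For reference, the importance scores being estimated can be computed centrally with the classic power iteration. The sketch below is that textbook baseline, not the distributed randomized algorithm of the paper; the damping value 0.85, the uniform treatment of dangling pages, and the function name are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Centralized PageRank by power iteration.

    `links` maps each page to the list of pages it links to; every linked
    page must itself appear as a key. Pages without outgoing links are
    treated as linking to all pages uniformly (an assumed convention).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Teleportation term: every page receives (1 - damping) / n.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p] or pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(web).items()):
        print(page, round(score, 4))
```

Each update needs the full rank vector, which is the centralization that distributed and randomized variants aim to avoid.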

Novel algorithm for inexact graph matching of moderate-size graphs based on Gaussian process regression

Serradell, E.; Pinheiro, M.A.; Sznitman, R.; Kybic, J.; Moreno-Noguer, F.; Fua, P., (2015), Non-Rigid Graph Registration Using Active Testing Search, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 37, no. 3, pp. 625-638. DOI:


We present a new approach for matching sets of branching curvilinear structures that form graphs embedded in R^2 or R^3 and may be subject to deformations. Unlike earlier methods, ours does not rely on local appearance similarity, nor does it require a good initial alignment. Furthermore, it can cope with non-linear deformations, topological differences, and partial graphs. To handle arbitrary non-linear deformations, we use Gaussian process regressions to represent the geometrical mapping relating the two graphs. In the absence of appearance information, we iteratively establish correspondences between points, update the mapping accordingly, and use it to estimate where to find the most likely correspondences that will be used in the next step. To make the computation tractable for large graphs, the set of new potential matches considered at each iteration is not selected at random as with many RANSAC-based algorithms. Instead, we introduce a so-called Active Testing Search strategy that performs a priority search to favor the most likely matches and speed up the process. We demonstrate the effectiveness of our approach first on synthetic cases and then on angiography data, retinal fundus images, and microscopy image stacks acquired at very different resolutions.