I am interested in solving real-world problems on large graph or network data using statistically sound reasoning and mathematically rigorous methods. Traditional statistics and machine learning operate on data that are collections of real vectors. We define and use the analogous concepts for data that consist of connections between entities and therefore do not live in such a convenient space. We solve real problems using this graph representation and make sure that the solutions are grounded in a solid theoretical framework, which allows us to reason about them more effectively. I am also interested in understanding the complexity of streaming computations. Of particular interest are online learning, one-pass algorithms, and the W-stream model of computation.
The major problems of interest to me are:
Prediction
Clustering
Anomaly Detection
I am particularly interested in solving these problems under the Streaming Analysis Model. When the underlying graph is changing over time, many problems become more challenging and interesting.
If we want to apply linear algebra methods to graph analysis in streams, it is important to understand the connection between the numerical accuracy of these computations and the analysis accuracy of the methods that they support. For example, in spectral partitioning, how much accuracy do we need in the eigenvectors in order to find good partitions?
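As a small illustration of the spectral partitioning question, the sketch below perturbs the Fiedler vector of a toy graph and checks how the resulting sign-based partition changes. Everything in it is an assumption made for illustration: the test graph (two cliques joined by one bridge edge), the sign cut as the partitioning rule, random noise as a stand-in for numerical error, and the helper names `two_clique_graph`, `fiedler_vector`, `sign_partition`, and `disagreement`. It is not a result from my work, only a way to make the question concrete.

```python
import numpy as np

def two_clique_graph(k):
    """Adjacency matrix of two k-cliques joined by a single bridge edge (toy example)."""
    n = 2 * k
    A = np.zeros((n, n))
    A[:k, :k] = 1.0
    A[k:, k:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[k - 1, k] = A[k, k - 1] = 1.0  # bridge edge between the cliques
    return A

def fiedler_vector(A):
    """Unit eigenvector for the second-smallest eigenvalue of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vecs[:, 1]

def sign_partition(v):
    """Assign each vertex to a side according to the sign of its entry."""
    return v >= 0

def disagreement(p, q):
    """Fraction of vertices assigned differently, up to swapping the two sides."""
    d = float(np.mean(p != q))
    return min(d, 1.0 - d)

A = two_clique_graph(10)
v = fiedler_vector(A)
exact = sign_partition(v)
print("smallest |entry| of the Fiedler vector:", np.abs(v).min())

# Noise of a prescribed 2-norm stands in for eigenvector error (an assumption).
rng = np.random.default_rng(0)
results = {}
for eps in (1e-6, 1e-3, 1e-1, 1.0):
    noise = rng.standard_normal(v.shape)
    noise *= eps / np.linalg.norm(noise)
    results[eps] = disagreement(exact, sign_partition(v + noise))
    print(f"error norm {eps:g}: disagreement {results[eps]:.2f}")
```

For a sign cut, a perturbation whose norm is smaller than the smallest entry magnitude of the Fiedler vector cannot flip any vertex, so on a well-separated graph like this one the partition tolerates fairly coarse eigenvectors. Graphs with many near-zero Fiedler entries would be far less forgiving, which is where the question becomes interesting.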