Statistical Inference on Networks

This project develops rigorous statistical frameworks for analyzing network structures and relationships in online social systems. We advance fundamental methods for link prediction, community detection, and structural inference while addressing key challenges of model selection, overfitting, and generalization in network analysis.

Our work establishes principled approaches to network inference that achieve near-optimal link prediction while guarding against overfitting and model misspecification.

Key methodological contributions include:

Ensemble link prediction combining multiple network models through stacking to achieve near-optimal performance
Model selection frameworks for community detection that properly balance fit and complexity
Overfitting and underfitting detection methods that identify when network models fail to generalize
Cross-validation techniques adapted for network data with dependencies between observations
Benchmark evaluation protocols establishing rigorous standards for comparing network inference methods
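
To illustrate why cross-validation needs adapting for networks: the observations are edges of a single graph, so a common approach is to hold out edges rather than independent rows. The sketch below is a minimal example of k-fold edge holdout, not the project's actual protocol. The function name `edge_kfold` and the toy graph are assumptions made for illustration.

```python
import random


def edge_kfold(edges, k=5, seed=0):
    """Yield (observed_edges, held_out_edges) pairs for k-fold edge holdout.

    Hypothetical helper: each undirected edge is held out in exactly one fold.
    """
    # Canonicalise undirected edges so (u, v) and (v, u) count once.
    unique = sorted({tuple(sorted(e)) for e in edges})
    rng = random.Random(seed)
    rng.shuffle(unique)
    for i in range(k):
        held_out = unique[i::k]          # every k-th edge, offset by i
        held = set(held_out)
        observed = [e for e in unique if e not in held]
        yield observed, held_out


# Toy graph (assumed for illustration): a 6-node ring with two chords.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
folds = list(edge_kfold(edges, k=4))
```

Any model or feature computed within a fold should see only the observed edges. Removing edges also changes the structure of the remaining graph, which is one reason scores from this kind of split need careful interpretation.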

Our statistical methods have broad applications across computational social science, enabling more reliable inference about relationship formation, group structure, and information diffusion in complex online networks. This work provides the methodological foundation for empirical studies of platform dynamics and social behavior.

Featured Research

Stacking models for nearly optimal link prediction in complex networks
Ghasemian, A., Hosseinmardi, H., Galstyan, A., Airoldi, E. M., & Clauset, A. (2020). Proceedings of the National Academy of Sciences, 117(38).

Link prediction is a fundamental problem in network analysis with applications ranging from recommender systems to identifying missing interactions in biological networks. This work develops a stacking ensemble approach that systematically combines diverse network models—including graph embeddings, similarity indices, and probabilistic models—to achieve near-optimal link prediction performance across a wide range of network types. Our framework demonstrates that no single method dominates across all networks, but principled ensemble methods can consistently approach optimal performance.
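
As a rough illustration of the stacking idea, the sketch below holds out some observed edges, computes a few base predictors on the reduced graph, and fits a meta-learner that weights those predictors. It is not the paper's implementation: the three base predictors (common neighbors, Jaccard, preferential attachment), the small logistic-regression meta-learner, the toy graph, and all function names are assumptions chosen to keep the example short and dependency-free.

```python
import math
import random
from itertools import combinations


def adjacency(nodes, edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def base_scores(adj, u, v):
    """Base predictors for the pair (u, v); the leading 1.0 is a bias term."""
    common = len(adj[u] & adj[v])
    union = len(adj[u] | adj[v])
    jaccard = common / union if union else 0.0
    pref_attach = math.log1p(len(adj[u]) * len(adj[v]))
    return [1.0, float(common), jaccard, pref_attach]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_stack(nodes, edges, frac=0.2, lr=0.1, epochs=500, seed=0):
    """Learn meta-learner weights from a single edge-holdout split."""
    rng = random.Random(seed)
    observed = sorted({tuple(sorted(e)) for e in edges})
    observed_set = set(observed)
    non_edges = [p for p in combinations(sorted(nodes), 2)
                 if p not in observed_set]
    shuffled = observed[:]
    rng.shuffle(shuffled)
    n_hold = max(1, int(frac * len(shuffled)))
    positives, kept = shuffled[:n_hold], shuffled[n_hold:]
    negatives = rng.sample(non_edges, min(n_hold, len(non_edges)))
    adj = adjacency(nodes, kept)  # features see only the reduced graph
    X = [base_scores(adj, u, v) for u, v in positives + negatives]
    y = [1.0] * len(positives) + [0.0] * len(negatives)
    w = [0.0] * len(X[0])
    for _ in range(epochs):  # plain batch gradient descent on log-loss
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - t
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def predict(w, adj, u, v):
    """Stacked probability that the pair (u, v) is a missing link."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, base_scores(adj, u, v))))


# Toy graph (assumed): two 6-cliques joined by one bridge, with edge (0, 1) missing.
nodes = list(range(12))
cliques = list(combinations(range(6), 2)) + list(combinations(range(6, 12), 2))
edges = [e for e in cliques if e != (0, 1)] + [(5, 6)]
w = fit_stack(nodes, edges)
adj = adjacency(nodes, edges)
p_within = predict(w, adj, 0, 1)    # candidate pair inside a clique
p_across = predict(w, adj, 0, 11)   # candidate pair across the two cliques
```

On a graph like this one would typically expect the within-clique pair to score higher than the cross-clique pair, but the learned weights depend on which edges the random split holds out. A realistic stack would combine many more base predictors and a more flexible meta-learner.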
