Autonomous graph mining algorithm search with best performance trade-off

Yoon, Minji; Gervet, Théophile; Hooi, Bryan; Faloutsos, Christos

doi:10.1007/s10115-022-01683-8

Autonomous graph mining algorithm search with best performance trade-off

Regular paper
Published: 21 May 2022

Volume 64, pages 1571–1602, (2022)
Cite this article

Knowledge and Information Systems Aims and scope Submit manuscript

Minji Yoon ORCID: orcid.org/0000-0003-3693-9127¹,
Théophile Gervet¹,
Bryan Hooi² &
…
Christos Faloutsos¹

354 Accesses
2 Citations
1 Altmetric
Explore all metrics

Abstract

The pervasiveness of graphs today has raised the demand for algorithms to answer various questions: Which products would a user like to purchase given her order list? Which users are buying fake followers? Myriads of new graph algorithms are proposed every year to answer such questions—each with a distinct problem formulation, computational time, and memory footprint. This lack of unity makes it difficult for practitioners to compare different algorithms and pick the most suitable one for their application. These challenges create a gap in which state-of-the-art techniques developed in academia fail to be optimally deployed in real-world applications. To bridge this gap, we propose AutoGM, an automated system for graph mining algorithm development. We first define a unified framework UnifiedGM for message-passing-based graph algorithms. UnifiedGM defines a search space in which five parameters are required to determine a graph algorithm. Under this search space, AutoGM explicitly optimizes for the optimal parameter set of UnifiedGM using Bayesian Optimization. AutoGM defines a novel budget-aware objective function for the optimization to find the best speed-accuracy trade-off in algorithms under a computation budget. On various real-world datasets, AutoGM generates novel graph algorithms with the best speed/accuracy trade-off compared to existing models with heuristic parameters.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

The Various Graphs in Graph Computing

The ubiquity of large graphs and surprising challenges of graph processing: extended survey

Article 29 June 2019

Siddhartha Sahu, Amine Mhedhbi, … M. Tamer Özsu

Predicting Graph Operator Output over Multiple Graphs

Notes

This paper is an extended version of [46].
https://github.com/minjiyoon/ICDM20-AutoGM.
https://github.com/fmfn/BayesianOptimization.

References

Bahmani B, Chowdhury A, Goel A (2010) Fast incremental and personalized pagerank. Proc VLDB Endow 4(3):173–184
Article Google Scholar
Bennett J, Lanning S (2007) August. The netflix prize. In Proceedings of KDD cup and workshop (Vol. 2007, p. 35)
Brochu E, Cora VM, De Freitas N (2010) A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599
Brohee S, Van Helden J (2006) Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinf 7(1):1–19
Article Google Scholar
Dreżewski R, Sepielak J, Filipkowski W (2015) The application of social network analysis algorithms in a system supporting money laundering detection. Inf Sci 295:18–32
Article MathSciNet Google Scholar
Eksombatchai C, Jindal P, Liu JZ, Liu Y, Sharma R, Sugnet C, Ulrich M, Leskovec J (2018) April. Pixie: a system for recommending 3+ billion items to 200+ million users in real-time. In: Proceedings of the 2018 world wide web conference. pp. 1775–1784
Floreano D, Dürr P, Mattiussi C (2008) Neuroevolution: from architectures to learning. Evol Intel 1(1):47–62
Article Google Scholar
Gao Y, Yang H, Zhang P, Zhou C, Hu Y (2020) July. Graph neural architecture search. IJCAI 20:1403–1409
Google Scholar
Guo M, Yi T, Zhu Y, Bao Y (2021) Jitune: Just-in-time hyperparameter tuning for network embedding algorithms. arXiv preprint arXiv:2101.06427
Hamilton W, Ying Z, Leskovec J (2017) Inductive representation learning on large graphs. Advances in neural information processing systems, 30
Hooi B, Song HA, Beutel A, Shah N, Shin K, Faloutsos C (2016) August. Fraudar: Bounding graph fraud in the face of camouflage. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. pp. 895–904
Hu Z, Dong Y, Wang K, Sun Y (2020) April. Heterogeneous graph transformer. In: Proceedings of The Web Conference 2020. pp. 2704–2710
Kandasamy K, Dasarathy G, Oliva JB, Schneider J, Póczos B (2016) Gaussian process bandit optimisation with multi-fidelity evaluations. Advances in neural information processing systems, 29
Kandasamy K, Dasarathy G, Schneider J, Póczos B (2017) July. Multi-fidelity bayesian optimisation with continuous approximations. In: International Conference on Machine Learning. PMLR. pp. 1799–1808
Kandasamy K, Neiswanger W, Schneider J, Poczos B, Xing EP (2018) Neural architecture search with bayesian optimisation and optimal transport. Advances in neural information processing systems, 31
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations 2014
Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: 5th International Conference on Learning Representations 2017
Kitano H (1990) Designing neural networks using genetic algorithms with graph generation system. Complex Syst 4:461–476
MATH Google Scholar
Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. J ACM (JACM) 46(5):604–632
Article MathSciNet Google Scholar
Koutra D, Ke TY, Kang U, Chau DHP, Pao HKK, Faloutsos C (2011) September. Unifying guilt-by-association approaches: theorems and fast algorithms. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 245–260). Springer, Berlin, Heidelberg
Lemaire C, Achkar A, Jodoin PM (2019) Structured pruning of neural networks with budget-aware regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9108–9116
Li X, Zhou Y, Pan Z, Feng J (2019) Partial order pruning: for best speed/accuracy trade-off in neural architecture search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 9145–9153
Liu H, Simonyan K, Yang Y (2019) Darts: differentiable architecture search. International Conference on Learning Representations, ICLR 2019
Liu H, Simonyan K, Vinyals O, Fernando C, Kavukcuoglu K (2017) Hierarchical representations for efficient architecture search. International Conference on Learning Representations, ICLR 2018
Metropolis N, Ulam S (1949) The Monte Carlo method. J Am Stat Assoc 44(247):335–341
Article Google Scholar
Michalak K, Korczak J (2011) September. Graph mining approach to suspicious transaction detection. In 2011 Federated conference on computer science and information systems (FedCSIS) (pp. 69–75). IEEE
Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Stanford InfoLab
Pearl J (2022) Reverend Bayes on inference engines: a distributed hierarchical approach. In Probabilistic and Causal Inference: The Works of Judea Pearl. pp. 129–138
Robert JV (2021) Linear programming: foundations and extensions. Springer, Berlin
Google Scholar
Sen P, Namata G, Bilgic M, Getoor L, Galligher B, Eliassi-Rad T (2008) Collective classification in network data. AI Mag 29(3):93–93
Google Scholar
Shchur O, Mumme M, Bojchevski A, Günnemann S (2018) 2018. Pitfalls of graph neural network evaluation, Relational Representation Learning Workshop, NeurIPS
Shin K, Eliassi-Rad T, Faloutsos C (2016) December. Corescope: Graph mining using k-core analysis-patterns, anomalies and algorithms. In: 2016 IEEE 16th international conference on data mining (ICDM) (pp. 469–478). IEEE
Strehl A, Ghosh J (2000) December. A scalable approach to balanced, high-dimensional clustering of market-baskets. In International Conference on High-Performance Computing (pp. 525–536). Springer, Berlin, Heidelberg
Szekeres G, Wilf HS (1968) An inequality for the chromatic number of a graph. J Comb Theory 4(1):1–3
Article MathSciNet Google Scholar
Tu K, Ma J, Cui P, Pei J, Zhu W (2019) July. Autone: Hyperparameter optimization for massive network embedding. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 216–225
Vazquez A, Flammini A, Maritan A, Vespignani A (2003) Global protein function prediction from protein-protein interaction networks. Nat Biotechnol 21(6):697–700
Article Google Scholar
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2018) 6th International Conference on Learning Representations , ICLR 2018
Wang Y, Liu S, Yoon M, Lamba H, Wang W, Faloutsos C, Hooi B (2020) November. Provably robust node classification via low-pass message passing. In: 2020 IEEE International Conference on Data Mining (ICDM) (pp. 621–630). IEEE
Wright S, Nocedal J (1999) Numerical optimization. Springer Sci 35(67–68):7
MATH Google Scholar
Wu F, Souza A, Zhang T, Fifty C, Yu T, Weinberger K (2019) May. Simplifying graph convolutional networks. In International conference on machine learning (pp. 6861–6871). PMLR
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020) A comprehensive survey on graph neural networks. IEEE Trans ctions Neural Networks Learn Systems 32(1):4–24
Article MathSciNet Google Scholar
Yao L, Mao C, Luo Y (2019) July. Graph convolutional networks for text classification. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, No. 01, pp. 7370-7377)
Yoon M, Jung J, Kang U (2018) April. Tpa: fast, scalable, and accurate method for approximate random walk with restart on billion scale graphs. In: 2018 IEEE 34th International Conference on Data Engineering (ICDE) (pp. 1132–1143). IEEE
Yoon M, Jin W, Kang U (2018) April. Fast and accurate random walk with restart on dynamic graphs with guarantees. In: Proceedings of the 2018 World Wide Web Conference (pp. 409–418)
Yoon M, Hooi B, Shin K, Faloutsos C (2019) July. Fast and accurate anomaly detection in dynamic graphs with a two-pronged approach. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 647–657)
Yoon M, Gervet T, Hooi B, Faloutsos C (2020) November. Autonomous graph mining algorithm search with best speed/accuracy trade-off. In: 2020 IEEE International Conference on Data Mining (ICDM) (pp. 751–760). IEEE
You J, Ying Z, Leskovec J (2020) Design space for graph neural networks. Adv Neural Inf Process Syst 33:17009–17021
Google Scholar
Yuan Y, Wang W, Coghill GM, Pang W (2021) A novel genetic algorithm with hierarchical evaluation strategy for hyperparameter optimisation of graph neural networks. arXiv preprint arXiv:2101.09300
Zamir O, Etzioni O (1998) August. Web document clustering: a feasibility demonstration. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval (pp. 46–54)
Zhang Z, Wang X, Zhu W (2021) Automated machine learning on graphs: a survey. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Survey Track
Zhong Z, Yan J, Liu CL (2017) Practical network blocks design with q-learning. arXiv preprint arXiv:1708.05552, 6
Zoph B, Le QV (2017) Neural architecture search with reinforcement learning. International Conference on Learning Representations, ICLR 2017

Download references

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, USA
Minji Yoon, Théophile Gervet & Christos Faloutsos
National University of Singapore, Queenstown, Singapore
Bryan Hooi

Authors

Minji Yoon
View author publications
You can also search for this author in PubMed Google Scholar
Théophile Gervet
View author publications
You can also search for this author in PubMed Google Scholar
Bryan Hooi
View author publications
You can also search for this author in PubMed Google Scholar
Christos Faloutsos
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Minji Yoon.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Yoon, M., Gervet, T., Hooi, B. et al. Autonomous graph mining algorithm search with best performance trade-off. Knowl Inf Syst 64, 1571–1602 (2022). https://doi.org/10.1007/s10115-022-01683-8

Download citation

Received: 29 January 2021
Revised: 21 April 2022
Accepted: 23 April 2022
Published: 21 May 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s10115-022-01683-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Autonomous graph mining algorithm search with best performance trade-off

Abstract

Access this article

Similar content being viewed by others

The Various Graphs in Graph Computing

The ubiquity of large graphs and surprising challenges of graph processing: extended survey

Predicting Graph Operator Output over Multiple Graphs

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Autonomous graph mining algorithm search with best performance trade-off

Abstract

Access this article

Similar content being viewed by others

The Various Graphs in Graph Computing

The ubiquity of large graphs and surprising challenges of graph processing: extended survey

Predicting Graph Operator Output over Multiple Graphs

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation