Finding Dense Components in Large-Scale Network Using Randomized Binary Search Tree

Anh Phuc Trinh1, , Dang Hai Pham1, Thi Thuy Dung Phan2
1 Hanoi University of Science and Technology, No. 1, Dai Co Viet, Hai Ba Trung, Hanoi, Vietnam
2 Viettel High Technology Industries Corporation, 40th floor, 72 Landmark Keangnam, Hanoi, Vietnam

Main Article Content

Abstract

Given a simple undirected graph G=(V, E), the density of a subgraph on vertex set S is defined as a ratio between the number of edges |E(S)| and the number of vertices |S|, where E(S) is the set of edges induced by vertices in S. Finding the maximum density subgraph has become an intense study in recent years, especially in the social network era. Being based on a greedy algorithm that connects with a suitable graph data structure, we have reduced its time complexity by using a randomized binary search tree, also called treap. We make the complexity analysis in both time and memory requirements, including computational experiments in large scale real networks.

Article Details

References

[1] Yuichi Asahiro and Kazuo Iwama. Finding dense subgraphs. In John Staples, Peter Eades, Naoki Katoh and Alistair Moffat, editors, Algorithms and Computations, Springer Berlin Heidelberg (1995) 102-111.
[2] Yuichi Asahiro, Kazuo Iwama, Hisao Tamaki, and Takeshi Tokuyama. Greedily finding a dense subgraph. J Algorithms, volume 34 (2000) 203-221.
[3] Uriel Feige and Michael Seltser. On the densest k-subgraph problem. Algorithmica, 29:2001 (1997).
[4] Yon Dourisboure, Filippo Geraci, and Marco Pellegrini. Extraction and classification of dense communities in the web. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, New York, NY, USA (2007) 461-470.
[5] David Gibson, Ravi Kumar, and Andrew Tomkins. Discovering large dense subgraphs in massive graphs. In Proceedings of the 31st International Conference on Very Large Data Bases, VLDB '05, VLDB Endowment (2005) 721-732.
[6] Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, volume 46 (1999) 604-632.
[7] Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, and Andrew Tomkins. Trawling the web for emerging cyber-communities. In Proceedings of the 8th International Conference on WWW, New York, NY, Elsevier North-Holland, Inc (1999) 1481-1493.
[8] V. Vinay Ravi Kannan. Analyzing the structure of large graphs. In manuscript, NY, NewYork,USA, Elsevier North-Holland, Inc (1999).
[9] Moses Charikar. Greedy approximation algorithms for finding dense components in a graph. In Proceedings of the Third International Workshop on Approximation Algorithms for Combinatorial Optimization, APPROX '00, London, UK, UK, Springer-Verlag (2000) 84-95.
[10] George B. Dantzig. A history of scientific computing. Chapter Origins of the Simplex Method. ACM, New York, NY, USA (1990) 141-151.
[11] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection (2014).
[12] David Eppstein and Darren Strash. Listing all maximal cliques in large sparse real-world graphs. Lecture Notes in Computer Science, Springer, volume 6630 (2011) 364-375.
[13] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 3rd edition (2009).
[14] Cecilia R. Seidel, Raimund Aragon. Randomized search trees. Algorithmica, (4/5) volume 16 (1996) 464-497.
[15] Guy E. Blelloch and Margaret Reid-Miller. Fast set operations using treaps. In Proceeding of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA '98, New York, NY, USA (1998) 16-26.
[16] J. W. J. Williams. Algorithm 232: Heapsort. Communications of the ACM, volume 7 (6) (1964) 347-348.
[17] Evgenii Adelson-Velsky, Gevorg Landis. An algorithm for the organization of information. In Proceedings of the USSR Academy of Sciences, volume 146 (1962) 263-266.