HiGraph: A Large-Scale Hierarchical Graph Dataset
Hierarchical Graph Dataset for Malware Analysis with Function Call Graphs and Control Flow Graphs
Interactive Graph Visualization
Explore the hierarchical structure of malware samples through our interactive visualization tool.
Dataset Overview
Hierarchical Graph Structure
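HiGraph models each application as a hierarchical graph: a Function Call Graph (FCG) captures the global structure of the application, and a Control Flow Graph (CFG) for each function captures its local structure.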
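The two-level structure can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names, field names, and the example functions below are hypothetical and do not reflect the dataset's actual file format or schema, which this page does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class ControlFlowGraph:
    """Local level: the basic blocks of one function and the jumps between them."""
    blocks: list[str] = field(default_factory=list)
    # Each edge is (source block index, destination block index).
    edges: list[tuple[int, int]] = field(default_factory=list)


@dataclass
class FunctionCallGraph:
    """Global level: the functions of one application and the calls between them."""
    # Hypothetical layout: function name -> the CFG nested inside that FCG node.
    cfgs: dict[str, ControlFlowGraph] = field(default_factory=dict)
    # Each call is (caller name, callee name).
    calls: list[tuple[str, str]] = field(default_factory=list)

    def add_function(self, name: str, cfg: ControlFlowGraph) -> None:
        self.cfgs[name] = cfg

    def add_call(self, caller: str, callee: str) -> None:
        # Both endpoints must already exist as FCG nodes.
        for name in (caller, callee):
            if name not in self.cfgs:
                raise KeyError(f"unknown function: {name}")
        self.calls.append((caller, callee))

    def total_blocks(self) -> int:
        """Count basic blocks across every nested CFG."""
        return sum(len(cfg.blocks) for cfg in self.cfgs.values())


# A toy application with two functions (made-up data, not from the dataset).
app = FunctionCallGraph()
app.add_function(
    "main",
    ControlFlowGraph(blocks=["entry", "loop", "exit"], edges=[(0, 1), (1, 1), (1, 2)]),
)
app.add_function("helper", ControlFlowGraph(blocks=["entry"]))
app.add_call("main", "helper")
```

Nesting each CFG inside its FCG node keeps both views available at once: a model can read call relationships from `calls` and per-function control flow from `cfgs`.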
Download Dataset
Access the complete HiGraph dataset through Hugging Face.
Compressed dataset size
11 years of samples
Creative Commons
Updates
Changelog
Latest updates and improvements to the HiGraph dataset.
Aug 17, 2025
Dataset update with new malware threshold.
- Raised the malware detection threshold from 10 to 15, which reduces the dataset to 499K Function Call Graphs (previously 595K+).
- Removed abstract section from the project page.
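The effect of raising the threshold can be sketched as a simple filter. The page does not define the threshold, so this sketch assumes it is a minimum per-sample detection count and that the comparison is inclusive; the `detections` field, the `filter_by_threshold` helper, and the sample records are all hypothetical.

```python
def filter_by_threshold(samples, threshold):
    """Keep samples whose detection count is at least `threshold` (assumed inclusive)."""
    return [s for s in samples if s["detections"] >= threshold]


# Made-up records for illustration only; not taken from the dataset.
samples = [
    {"id": "a", "detections": 9},
    {"id": "b", "detections": 12},
    {"id": "c", "detections": 15},
    {"id": "d", "detections": 30},
]

old_kept = filter_by_threshold(samples, 10)  # earlier threshold
new_kept = filter_by_threshold(samples, 15)  # updated threshold
```

Under this assumption, a stricter threshold can only shrink the kept set, which is consistent with the drop from 595K+ to 499K Function Call Graphs reported above.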
May 16, 2025
Initial release of the HiGraph dataset.
HiGraph, a novel, large-scale dataset that models each application as a hierarchical graph, is made publicly available. This initial version includes over 200 million Control Flow Graphs (CFGs) and over 595,000 Function Call Graphs (FCGs).
Future Plans
Continued development and expansion of the HiGraph dataset.
- Regular updates with new samples and features.
- Integration of more advanced graph analysis tools.
- Community contributions and collaborations.