Resumen
With the blooming of blockchain-based smart contracts in decentralized applications, the security problem of smart contracts has become a critical issue, as vulnerable contracts have resulted in severe financial losses. Existing research works have explored vulnerability detection methods based on fuzzing, symbolic execution, formal verification, and static analysis. In this paper, we propose two static analysis approaches called ASGVulDetector and BASGVulDetector for detecting vulnerabilities in Ethereum smart contacts from source-code and bytecode perspectives, respectively. First, we design a novel intermediate representation called abstract semantic graph (ASG) to capture both syntactic and semantic features from the program. ASG is based on syntax information but enriched by code structures, such as control flow and data flow. Then, we apply two different training models, i.e., graph neural network (GNN) and graph matching network (GMN), to learn the embedding of ASG and measure the similarity of the contract pairs. In this way, vulnerable smart contracts can be identified by calculating the similarity to labeled ones. We conduct extensive experiments to evaluate the superiority of our approaches to state-of-the-art competitors. Specifically, ASGVulDetector improves the best of three source-code-only static analysis tools (i.e., SmartCheck, Slither, and DR-GCN) regarding the F1 score by 12.6% on average, while BASGVulDetector improves that of the three detection tools supporting bytecode (i.e., ContractFuzzer, Oyente, and Securify) regarding the F1 score by 25.6% on average. We also investigate the effectiveness and advantages of the GMN model for detecting vulnerabilities in smart contracts.