This technology provides a dynamic algorithm that can detect exact- and near-match code clones while optimizing computation time.
Code clone detection is a central tool for software update technology and code plagiarism detection. Because detection is complex and computationally expensive, many current methods rely on simplified algorithms and simplified data representations, and as a result fail to identify all exact-match and near-match clones. No currently available method effectively and automatically identifies syntactically similar code fragments, or processes and programs that behave similarly even when their code is not alike.
This technology, termed DyCLINK, is a system for detecting code relatives: code segments with dynamically similar execution features. Code relative detection can also be used to detect code clones, which are syntactically similar programs, and supports tasks such as implementation-agnostic code search and classification of code with similar behavior for human understanding. The method generates instruction dependency graphs that represent the behavior of code segments, then uses these graphs to compare the similarity of the various processes. Because the graph structure records dependency relationships, the technology is more robust and allows users to find groups of program fragments that contain similar code idioms or patterns in data reuse, control flow, and context.
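The idea of comparing code by its instruction dependency graphs can be illustrated with a minimal Python sketch. It is not DyCLINK's implementation: the trace format, the function names build_dependency_graph and graph_similarity, and the use of a weighted Jaccard score over dependency edges are all assumptions chosen for brevity, standing in for whatever graph comparison the actual system performs.

```python
from collections import Counter

def build_dependency_graph(trace):
    """Summarize an executed instruction trace as a multiset of dependency edges.

    `trace` is a hypothetical format: a list of (opcode, deps) pairs, where
    `deps` lists the indices of earlier instructions whose results are used.
    Each edge is recorded as (producer_opcode, consumer_opcode).
    """
    edges = Counter()
    for opcode, deps in trace:
        for producer_index in deps:
            producer_opcode = trace[producer_index][0]
            edges[(producer_opcode, opcode)] += 1
    return edges

def graph_similarity(graph_a, graph_b):
    """Weighted Jaccard similarity over edge multisets (an illustrative stand-in)."""
    keys = set(graph_a) | set(graph_b)
    if not keys:
        return 1.0  # two empty graphs are treated as identical
    shared = sum(min(graph_a[k], graph_b[k]) for k in keys)
    total = sum(max(graph_a[k], graph_b[k]) for k in keys)
    return shared / total

# Three toy traces: a and b both add two values and store the result,
# while c squares a single value.
trace_a = [("load", []), ("load", []), ("add", [0, 1]), ("store", [2])]
trace_b = [("load", []), ("const", []), ("add", [0, 1]), ("store", [2])]
trace_c = [("load", []), ("mul", [0, 0]), ("store", [1])]

graph_a = build_dependency_graph(trace_a)
graph_b = build_dependency_graph(trace_b)
graph_c = build_dependency_graph(trace_c)

print(graph_similarity(graph_a, graph_b))  # 0.5
print(graph_similarity(graph_a, graph_c))  # 0.0
```

In this toy setting, the two addition traces score higher than the addition and multiplication pair because they share dependency edges even though one operand comes from a different instruction. A realistic detector would also need to capture control flow and calling context, which this sketch omits.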
This technology has been validated and shown to be robust in identifying exact- and near-match clones.
IR CU15182
Licensing Contact: Greg Maskel