Hybrid graphs for code smells: a multi-level model for anti-pattern detection in software components
Abstract
The paper proposes a hybrid, multi-level method for detecting code smells and anti-patterns in software components, where structure, semantics, metrics, and evolution are treated as first-class signals. A heterogeneous Code Property Graph (Abstract Syntax Tree + Control-flow Graph + Program Dependence Graph) is constructed and enriched with textual embeddings from a pretrained code language model, classical quality metrics (Chidamber–Kemerer, Halstead), and version-control history (churn, co-change, recency). Local idioms are summarized via a sequence–graph encoder at the method/block level, component structure is aggregated by a relation-aware Graph Neural Network at the class/module level, and project context is propagated over a component-interaction graph. To support deployment in evolving codebases, an open-set head is introduced: energy, entropy, and stochastic variance are combined to enable calibrated abstention on unfamiliar patterns. The approach is evaluated on polyglot Java Virtual Machine corpora using time-aware, cross-project splits with multi-label targets (Long Method, God Class, Feature Envy, Data Class, Shotgun-Surgery–like, No-smell). Improvements in macro Area Under the Precision–Recall Curve and F1 over rule/metric baselines, Abstract Syntax Tree-only models, and text-only models are observed, while FPR@95TPR is maintained or reduced. Withheld-class experiments show that open-set gating increases Area Under the ROC Curve for Open-Set Recognition and TNR@TPR and lowers calibration error, yielding probabilities suitable for thresholded automation and human triage. Cross-language transfer (train on Java, test on Kotlin/Scala) is shown to be stronger than with single-view models, aided by language-agnostic typing and per-project normalization. Incremental graph maintenance confines computation to changed regions, aligning inference time with CI/CD budgets.
Exposing hierarchical attention and channel gates yields explanations that align with practitioner reasoning. It is concluded that hybrid graphs with hierarchical reasoning and selective prediction deliver detectors that are more accurate, transferable, and operationally safer for evolving software systems.
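
As an illustration of the open-set head described in the abstract, the following minimal sketch combines an energy score, predictive entropy, and the variance across repeated stochastic forward passes into a single novelty score used for abstention. The abstract does not state how the three signals are combined, so the weighted sum, the softmax over labels, the function names (`novelty_score`, `should_abstain`), the weights, and the threshold are all assumptions made for illustration, not the authors' implementation.

```python
import math
from statistics import mean, pvariance


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def energy(logits, temperature=1.0):
    """Energy score E(x) = -T * logsumexp(logits / T).

    Lower (more negative) energy indicates a more confident, familiar input.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(z - m) for z in scaled)))


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def stochastic_variance(prob_passes):
    """Mean per-class variance across stochastic forward passes."""
    n_classes = len(prob_passes[0])
    return mean([pvariance([p[k] for p in prob_passes]) for k in range(n_classes)])


def novelty_score(logit_passes, weights=(1.0, 1.0, 1.0), temperature=1.0):
    """Hypothetical combination: weighted sum of energy, entropy, and variance.

    logit_passes holds one logit vector per stochastic pass (e.g. with dropout
    left on). The weights are illustrative, not values from the paper.
    """
    prob_passes = [softmax(logits) for logits in logit_passes]
    mean_logits = [mean(col) for col in zip(*logit_passes)]
    mean_probs = [mean(col) for col in zip(*prob_passes)]
    w_energy, w_entropy, w_var = weights
    return (w_energy * energy(mean_logits, temperature)
            + w_entropy * entropy(mean_probs)
            + w_var * stochastic_variance(prob_passes))


def should_abstain(logit_passes, threshold, **kwargs):
    """Abstain (defer to human triage) when the novelty score exceeds threshold."""
    return novelty_score(logit_passes, **kwargs) > threshold


# Illustrative inputs: three stochastic passes over three labels.
confident = [[8.0, 0.0, 0.0]] * 3
uncertain = [[0.5, 0.0, 0.2], [0.0, 0.6, 0.1], [0.1, 0.0, 0.7]]
```

In such a scheme the threshold would typically be tuned on held-out data, for example to hit a target true-positive rate on known classes, and the three signals would usually be normalized so that no single one dominates the sum.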