Abstract: |
Understanding software complexity is fundamental to improving software quality, yet traditional complexity metrics often fail to capture the cognitive effort required for code comprehension. This dissertation addresses the limitations of existing metrics by introducing a neuroscience-based, human-centric approach to complexity assessment, integrating eye tracking and electroencephalography (EEG) to measure programmers’ cognitive load during code comprehension.
Through controlled experiments, this research evaluates and compares traditional complexity metrics (e.g., McCabe’s V(g), Halstead metrics), cognitive complexity from SonarSource tools, and behavioral metrics derived from eye-tracking data, including reading time and revisit patterns (NRevisit). Cognitive load measured via EEG serves as the ground truth for assessing metric accuracy. Results show that traditional static metrics, particularly V(g) and CC Sonar, often misrepresent perceived complexity, whereas gaze-based metrics, especially NRevisit, correlate strongly with EEG-measured cognitive load (correlations ranging from 0.906 to 0.950). Furthermore, a data-driven hybrid model integrating multiple complexity metrics significantly enhances predictive accuracy (R² = 0.8742), demonstrating the necessity of combining complementary metrics. Extending this work, a larger-scale experiment involving 62 participants and seven Java programs was conducted to assess cognitive complexity at a finer granularity. By combining region-level gaze behavior (NRevisit) with EEG-based cognitive load measurements, the study confirmed that dynamic behavioral metrics are especially effective for novice programmers; they also remain informative for advanced programmers, although for this population traditional static metrics retain predictive power as well. These findings reinforce the need for personalized, expertise-aware models of code comprehension.
This dissertation advances the field by bridging the gap between software complexity assessment and cognitive neuroscience, advocating for a shift from purely static code metrics to dynamic, programmer-specific evaluation methods. The findings provide a foundation for improving complexity measurement techniques, optimizing software development workflows, and enhancing AI-assisted coding tools, ultimately fostering more maintainable and comprehensible software.
|