Chapter 6: Limitations
To select programmer populations and sort them by experience, I used a conventional industry classification based on years of experience (Carlson 2014) as a barometer of expertise. This measure is widely panned in popular literature (Atwood 2008) and is not well established in the research literature. Many studies instead rely on self-selected experts (Mastaglio and Rieman 1991) or distinguish participants by number of courses taken and skill level (Soloway and Ehrlich 1984). After this study began, new tools that gamify programming, such as HackerRank and CodeFights, emerged that provide a more granular analysis of expertise than years of experience. Such tools, if they measure skill and expertise more distinctly, may be better suited to targeting experts and novices in subsequent studies. Absent such tools, it would be helpful to calibrate participants with one or two practice problems before conducting the study and to have a panel of experts review their responses in order to assign them to groups more accurately.
The survey used to measure perceived cognitive load used the same language in the experimental and control conditions, but not the same number of questions. It is therefore impossible to rule out that the results of the longer refactored survey were affected by respondent fatigue. Although its mean cognitive load was still computed to be lower than that of the control, future studies should hold the number of questions constant across both conditions.