Your results might seem spurious because of the small sample size, but once the results of all participants are aggregated, they will have enough data to tell how many people showed apparent preferences like yours purely by chance, and how many actually were "biased" in some way.
Fair point. If they look at all respondents who answered the 13 questions the same exact way I did, then they'd be able to see if the doctor/criminal, fat/thin, female/male distributions are noisy or correlated.
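To put a rough number on the "due to chance" part, here is a minimal sketch. It assumes, purely for illustration, that each of the 13 questions is an independent 50/50 choice for a respondent with no preference; the actual questionnaire may not work that way, and the function names and the threshold of 10 are hypothetical.

```python
from math import comb


def chance_of_lopsided_split(n_questions: int, threshold: int) -> float:
    """Probability that a respondent with NO preference still picks the same
    side on at least `threshold` of `n_questions` fair 50/50 questions
    (counting either side as the favoured one)."""
    if not (n_questions / 2 < threshold <= n_questions):
        # Doubling the one-sided tail is only valid when the two tails
        # cannot overlap, i.e. when threshold is more than half of n.
        raise ValueError("threshold must be > n_questions / 2 and <= n_questions")
    one_sided = sum(comb(n_questions, k) for k in range(threshold, n_questions + 1))
    return 2 * one_sided / 2 ** n_questions


def expected_chance_respondents(n_respondents: int, n_questions: int, threshold: int) -> float:
    """Expected number of unbiased respondents who look 'biased' by chance alone."""
    return n_respondents * chance_of_lopsided_split(n_questions, threshold)


if __name__ == "__main__":
    p = chance_of_lopsided_split(13, 10)
    print(f"Chance of a 10+/13 split with no real preference: {p:.4f}")
    print(f"Expected such respondents out of 10,000: {expected_chance_respondents(10_000, 13, 10):.0f}")
```

Under those assumptions, an aggregate count of lopsided respondents well above the expected chance level would point to something other than noise, which a single person's 13 answers cannot show.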