In your comment, you said:
bq. I’d love to see some statistics on repeated tests for the same user, using different tables each time. Perhaps you would find that some users consistently benefit a great deal from zebra striping, while others do not.
Both the first and the second studies were exactly that: repeated tests for the same user, using different tables each time (unless I’ve misunderstood you).
What I haven’t done is look at the results at an individual level to see whether some users show a clear bias towards zebra striping. With only 6 and 8 questions (and 2 and 3 treatments respectively) in the two studies, I dare say we wouldn’t have enough data at the individual level to really see a difference.
For example, the very first participant in the second study had 2 questions asked with the plain version of the table, 4 with the lined version and 2 with the striped version. They got all of the lined questions right, but only one each of the striped and plain questions right.
Can we deduce something from this? What if the differences we see are somehow related to which questions were presented with which treatments?
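As a rough illustration of why not (a sketch of my own, not part of the original analysis): if we pool that participant’s plain and striped questions (2 of 4 correct) against their lined questions (4 of 4 correct) and run Fisher’s exact test on the resulting 2×2 table, the sample is far too small to show anything:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins whose
    hypergeometric probability is <= that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def p_table(x):
        # Hypergeometric probability of a table with x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo = max(0, col1 - row2)   # smallest feasible top-left cell
    hi = min(row1, col1)       # largest feasible top-left cell
    # Small tolerance guards against floating-point ties.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs + 1e-12)

# This one participant's counts: lined 4/4 correct; plain+striped 2/4 correct.
p = fisher_exact_two_sided(4, 0, 2, 2)
print(round(p, 3))  # ≈ 0.429 — nowhere near conventional significance
```

Even a perfect score on the lined questions versus 50% elsewhere gives a p-value of roughly 0.43, so at the individual level this data simply can’t distinguish treatment effects from chance.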
I don’t mean to suggest that such studies and analyses are not worthwhile: they’re just tricky and resource-intensive! Having said that, I do feel a niggling question about why preference for zebra striping continues to be so strong when the task performance data is comparatively weak.