Could the responses we have seen so far be due to what we call "response bias"? In other words, what if people don't actually hear the note as longer or shorter, but just say it was longer or shorter because the video was long or short? Maybe they don't hear the notes differently at all! In that case, the differences in perceived note length that participants report would not mean what we think they mean. We think they mean that participants really perceive the note differently when different videos go with it. Maybe, though, they just say they do.
Ultimately, the "timbre" experiment provides strong evidence against this "response bias" hypothesis (see the "Shiver me Timbres!" link above). If the video just caused people to say notes were longer even though it did not change how the notes sounded, we would expect the video to influence ratings of any note that goes with it, including trumpet, clarinet, and sung notes. This was not the case. We found that only notes that could plausibly be created by an impact were affected, which suggests that the video really does change how we perceive those notes.
We set out to confirm this idea with a "text" experiment. In this experiment, instead of showing a person hitting a marimba note along with a sound, we just had the word "long" or "short" appear along with a note. Since these words could not affect how a person perceives a note, any influence of the text would have to be response bias.
As expected, the words "long" and "short" had much less influence on perceived note length than the "long" and "short" marimba videos did. We attribute the small influence they did have to response bias.