
Influence of Reading Speed on Pupil Size as a Measure of Perceived Relevance

Oswald Barral, Ilkka Kosunen, and Giulio Jacucci

Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki
}@helsinki.fi

Abstract. Depending on the task or the environment, we read texts at different speeds. Recently, a substantial body of literature has emerged on predicting the relevance of text documents through eye-derived metrics to improve the personalization of information retrieval systems. Nevertheless, no academic work has yet addressed the possibility that such measures behave differently when reading at different speeds. This study focuses on pupil size as a measure of perceived relevance and analyses its dependence on reading speed. Our results are followed by a discussion of the need to take reading speed into account when using eye-derived measures for implicit relevance feedback.

Keywords: pupillometry, perceived relevance, reading behavior

1 Background

1.1 Reading behavior

When using information retrieval systems to seek information, users adopt different reading behaviors depending on several factors, such as the task to achieve, the environment, or time pressure. The main component of reading behavior addressed in this study is reading speed.

Different reading speeds are usually associated with different reading tasks. Skimming can be helpful when there is a need to address a large amount of information and retain the most relevant parts of it. However, reading at fast rates involves less comprehension [1, 2]. If the goal of the reading process is to comprehensively understand the text, a normal reading speed will be adopted. On the other hand, if the available time is limited and the amount of information is large, a faster reading speed will be more adequate, in order to focus just on the relevant parts of the text.
The information seeker will therefore always adopt an optimal reading speed for every situation.

Having said that, as the amount of information available increases, users tend to adopt faster reading rates, especially when seeking information. Liu conducted an extensive survey addressing the changes in reading behavior of people between 30 and 45 years old [3]. The participants in the study were asked to answer a set of questions regarding how their reading characteristics had changed over the past ten years. One of the outcomes of the survey was that 80% of the participants reported having increased the time spent scanning and browsing, which are reading behaviors that imply high reading rates.

1.2 Pupil size as a measure of perceived relevance

Eye tracking technologies have been used in the field of information retrieval and personalized access over the past years, as eye-derived metrics have proven useful for indicating users' subjective perception of relevance [4–6]. For the goal of personalizing results, these implicit metrics are highly valuable as they provide intrinsically individualized feedback.

Studies have shown a relationship between pupil size and user attention [7, 8]. It is well known that pupil size and cognitive load are highly correlated, and several studies have approached the matter. Experiments have ranged from mathematical operations to search tasks [9]. Interestingly, Oliveira et al. [10] showed how pupil size could be of special interest when analyzing relevance in web search results. They studied the relevance of both images and documents. Focusing on changes in pupil diameter, they were able to claim that pupil size is a carrier of interest-related information. Their experiments were carried out under highly controlled conditions, leaving the demonstration of similar conclusions in less controlled experiments as future research.

2 The present study

Given the above-mentioned reading behaviors, especially the increasing trend to read at fast rates, we consider it highly relevant to study eye-derived implicit measures of relevance under the influence of different factors. In the present study, we focus on pupil size under the influence of reading speed.
We designed an experiment in order to study whether reading speed has a direct impact on the ability of pupil size to indicate perceived relevance in documents.

2.1 Apparatus

The machine used to run the experiment had a 64-bit Intel Core i7-3930K processor at 3.20 GHz, 16 GB of RAM, and an NVIDIA GeForce GTX 580 GPU, running Windows 7 Enterprise SP1. The display device was a Dell 1703FPt 17” LCD monitor at a resolution of 1280x1024. The experiment was developed using the ePrime software. The texts were displayed in an 85% window (i.e., 1088x870.4 pixels) with a 22-point font size. The subject was asked to sit approximately 40-50 cm away from the screen and to take a comfortable position. A Mirametrix S2 eye tracker operating at 60 Hz was placed under the screen and moved slightly to best fit the subject's eyes according to his or her natural and most comfortable position. The number of clock ticks since the booting of the operating system was used as the reference for synchronization between the Mirametrix S2 eye tracker and the ePrime software.

A first eye tracking calibration procedure was carried out at the beginning of the experiment and another one in the middle of the experiment. Each calibration procedure lasted about 5 minutes, depending on the subject. The process was repeated up to five times to ensure optimal calibration (average error below 40 pixels). If the threshold was not reached within these first attempts, the average error margin was increased by 10 pixels. The subject was rejected if, after 5 additional attempts, the average error was not below 50 pixels. Two subjects out of ten were rejected because calibration was not possible.

2.2 Participants and Procedure

Ten students (four undergraduate and six master's) participated in the experiment. Two of them were women. Eight participants reported an advanced English reading level, and two reported a medium English reading level. None of them was a native English speaker. All of them had normal or corrected-to-normal vision. As already pointed out, two of the participants did not pass the calibration procedure due to technical difficulties and their data were rejected. At the beginning of the experiment the participants were asked to sign a consent form and to provide basic information about themselves. The data were saved anonymously in order to preserve the participants' privacy.

The participants were first taken through a training session. The training consisted of two parts. The first one was intended to familiarize the users with the three different speeds. As reading speed is relative to the user's expertise or abilities, among other factors, instead of using an absolute words-per-minute rate for each of the speeds, an approach similar to the one by Dyson and Haselgrove was implemented [2].
The participants were first asked to read a document at a comfortable reading speed that allowed them to understand everything. They were instructed to reproduce that speed whenever they were asked to read at a normal speed. They were then presented with another text and asked to read it twice as fast as the first text. If the time spent reading was higher than 70% of the previous one, they were presented with a new text and asked to read faster, until they managed to spend less than 70% of the original time reading the text. They were then instructed to reproduce that speed every time they were asked to read at a fast speed. An analogous procedure was used to train the skimming speed. Different texts were used in each of the phases so that familiarity with the text could not influence the reading speed. The participants were told explicitly to do their best to reproduce each of those speeds during the experiment. The second part of the training consisted of using the actual system until the participants explicitly reported having fully understood how they were supposed to interact with the system.

We decided to split the recording session into two parts, as the participants of a pilot study reported feeling tired after having gone through the whole sequence of abstracts. This also allowed the recalibration of the eye-tracking device, avoiding the accumulation of systematic error [11]. Each of the two parts consisted of three topics. For each of the topics, the participants were asked to read a sequence of abstracts at a given speed. For each abstract, they were asked to assess as soon as possible, using the left and right arrow keys, whether the text was relevant to the topic (binary-rating). The participants were asked to keep reading until the end of the text at that given speed and to press space when done. Then, they were asked to grade, on a scale from 0 to 9, how relevant the abstract was to the topic (scale-rating) and how confident they felt about their answer (confidence-rating).

For each of the six topics, six abstracts were shown, half of them relevant and the other half non-relevant. The participants had to read two of the abstracts at a normal speed, two at a fast speed and two at a skimming speed. The order of the topics and the abstracts, as well as the reading speeds, was randomized. The topics were selected to be of common understanding, and the participants were allowed to ask the experimenter any question regarding their understanding of them. The topics were also selected so that their semantic meanings would not overlap. The relevant abstracts were selected so that their relevance was not too obvious in the first lines. The non-relevant abstracts were selected to be completely non-relevant to any of the topics.

3 Analysis and Results

For each abstract we took a time window of 10 seconds (i.e., five seconds before and five seconds after the binary-rating) and averaged the pupil size values over every 500 milliseconds. We normalized the pupil data in each text by subtracting the mean of the pupil size over the entire text. Only the data of texts where the binary-rating and the scale-rating were congruent, and where the confidence-rating was higher than 6, were taken into account (i.e., valid-trials). In these cases we observed a clear spike in the pupil size about 1 to 1.5 seconds after the binary-rating.
This was not surprising, as the maximal pupil dilation has been reported between the event attracting attention and 1.3 seconds after [8].

In order to test for statistical significance between the spikes when assessing texts as relevant and when assessing texts as non-relevant, we first took, for every abstract, the average value of the normalized pupil size in the time window of 0 to 1.3 seconds after the response time. Then, for the overall texts, as well as for each speed and each condition (the user answered relevant or answered non-relevant), we averaged the values within subjects. Finally, we performed a Wilcoxon signed-rank test on the resulting paired samples.

Overall, pupil size was significantly higher when assessing texts as relevant (Mdn = 0.8) than when assessing texts as non-relevant (Mdn = 0.66), z = 2.366, p < 0.05, r = 0.63. When analyzing the texts read at normal speed, pupil size was also found to be significantly higher when assessing relevant (Mdn = 0.93) than when assessing non-relevant (Mdn = 0.8), z = 2.197, p < 0.05, r = 0.59. However, no statistical significance was found when analyzing the texts read at fast speed (relevant Mdn = 0.91, non-relevant Mdn = 0.7, z = 1.690, r = 0.45) or at skimming speed (relevant Mdn = 66, non-relevant Mdn = 0.59, z = 0.676, r = 0.18).

Fig. 1. Beginning from top-left: Pupillary response when confidence-rating is below 6; Pupillary response when binary-rating and scale-rating are not congruent; Pupillary response in the valid-trials. Beginning from bottom-left: Pupillary response for valid-trials read at normal speed, fast speed and skimming speed. The red line indicates the moment of binary-rating. The blue line represents the non-relevant and the green line represents the relevant texts. The plotted values are normalized within trials and averaged across participants.

4 Discussion

The results showed a clear relationship between pupil dilation and the participants' subjective judgments. On top of that, the analysis of pupil size confirmed our hypothesis that its behavior would differ when reading documents at different speeds. When looking at the data without taking into account the speed at which the document was read, statistical analysis showed a significantly bigger response-related spike when the user perceived the document as relevant than when perceiving it as irrelevant. Nevertheless, when looking at the same data but splitting the analysis by reading speed, the data showed statistical significance only when the user was reading at normal speed. That is, when the subject was instructed to read at faster rates than the comfortable normal reading speed, the response-related spike in the pupil size did not carry statistically significant information regarding the judgement of the participant.

With this study we aim to raise a discussion around the fact that, when dealing with documents, different reading behaviors might have a direct impact on the reliability of our eye-derived measures. Thus, reading behaviors should be controlled and studied in order to have more accurate implicit feedback and, consequently, better personalization. As with pupil size, we believe that fixation-derived features used to infer relevance in documents will also behave differently when reading at different speeds and, therefore, need a closer look when the aim is to build realistic personalized search engines based on implicit feedback [12]. We encourage