By contrast with the two previous sections, we start with some background information (up to Section 8.1) on the state of distraction.
The globally accepted definition of driver distraction is the following: a diversion of attention away from activities critical for safe driving (the primary task) and toward a competing activity (178; 180).
Inattention, sometimes incorrectly used as a synonym of distraction, is defined as diminished attention to activities that are critical for accomplishing a primary task, but not necessarily in the presence of a competing activity (180). Driver distraction is therefore one particular form of driver inattention (181). Inattention is the broader term, as it can also be caused, e.g., by drowsiness. It indeed occurs in a wide range of situations in which the driver fails to attend to the demands of driving, such as when the desire to sleep overcomes a drowsy driver.
Driver distraction can be caused by any cognitive process such as daydreaming, mind wandering, logical and mathematical problem solving, decision making, using any kind of in-vehicle system, e.g., for entertainment, navigation, communication (including a cell phone), and any other activity that may affect the driver’s attention to driving (7). It is helpful to distinguish between four types of distractions (44; 67): (1) manual distraction (e.g., manually adjusting the volume of the radio), (2) visual distraction (e.g., looking away from the road), (3) auditory distraction (e.g., answering a ringing cell phone), and (4) cognitive distraction (e.g., being lost in thought). Several distracting activities may, however, involve more than one type of distraction (e.g., talking on the phone while driving creates at least an auditory distraction and a cognitive distraction, under the assumption that a hands-free system is used, thereby avoiding manual distraction).
When distracted, the driver loses awareness of the current driving situation. Being aware of a situation (whether for driving or for some other activity) is often called situational awareness (SA). A loss of SA while driving results in reduced vigilance and in an increased risk of accident. In driving, a major aspect of SA is the ability to scan the driving environment and to sense dangers, challenges, and opportunities, in order to maintain the ability to drive safely. As a driver moves through the environment, he/she must, to avoid getting into an accident, identify the relevant information in rapidly changing traffic conditions (e.g., distance to other vehicles, closing speed), and be prepared to react to suddenly-appearing events (e.g., braking because of an obstacle, obeying a road sign). To achieve SA, a driver must thus correctly perceive his/her driving environment (46), be attentive, and rely on his/her working memory (232). It follows that any distraction that harms the driver's attention may adversely impact SA (97).
Kircher and Ahlström (103) argue that existing definitions of distraction have limitations because they are difficult to operationalize, and because they are either unreasonably strict and inflexible or suffer from hindsight bias, the latter meaning that one needs to know the outcome of the situation to be able (1) to tell what the driver should have paid attention to and, then, (2) to judge whether he/she was distracted or not. The authors are also concerned that distraction-detection algorithms (1) do not take into account the complexity of a situation, and (2) generally cover only eyes-off-road (EOR) and engagement in non-driving related activities (NDRA). They thus developed a theory, named MiRA (minimum required attention), that defines the attention of a driver in his/her driving environment, based on the notion of SA. Instead of trying to assess distraction directly, one does it indirectly, by first trying to assess attention. Recall that distraction is a form of inattention.
Based on the MiRA theory, a driver is considered attentive at any time when he/she samples sufficient information to meet the demands of the driving environment. This means that a driver should be classified as distracted only if he/she does not fulfill the minimum attentional requirements to have sufficient SA. This occurs when the driver does not sample enough information, whether or not simultaneously performing an additional task. This theory thus acknowledges (1) that a driver has some spare capacity at his/her disposal in the less complex driving environments, and (2) that some glances toward targets other than the roadway in front of him/her may, in some situations, be needed for the driving task (like looking at, or for, a vehicle coming from each of the branches at a crossroad). This means that EOR and engagement in NDRA do not necessarily lead to driver distraction.
The MiRA theory does not conform to the traditional types of distraction (manual, visual, auditory, cognitive) as it does not prescribe what sensory channel a certain piece of information must be acquired through.
In an attempt to operationalize the MiRA theory, Ahlström et al. (3) present an algorithm for detecting driver distraction that is context dependent and uses (1) eye-tracking data registered in the same coordinate system as an accompanying model of the surrounding environment and (2) multiple buffers. Each buffer is linked to a corresponding glance target of relevance. Such targets include: the windshield, the left and right windows, the (rear-view) mirrors, and the instrument cluster. Some targets and their buffers are always present (like the roadway ahead, via the windshield, and behind, via the mirrors), while other targets and their buffers appear as a function of encountered traffic-regulation indications and infrastructural features. Each buffer is periodically updated, and its update rate can vary in time according to requirements that are “static” (e.g., the presence of a specific on-ramp that requires one to monitor the sides and mirrors) or “dynamic” (e.g., a reduced speed that lessens the need to monitor the speedometer). At each scheduled update time, a buffer is incremented if the driver looks at the corresponding target, and decremented otherwise; this is a way of quantifying the “sampling” (of the environment) performed by the driver. A buffer running empty is an indication that the driver is not sampling the corresponding target often enough; he/she is then considered to be inattentive (independently of which buffer has run empty). Until declared inattentive, he/she is considered attentive.
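For concreteness, the buffer mechanism described above can be sketched as follows. The target names, capacities, and update rates are hypothetical placeholders chosen for illustration, not the values used by Ahlström et al. (3).

```python
from dataclasses import dataclass


@dataclass
class GlanceBuffer:
    """Tracks how recently the driver sampled one glance target."""
    capacity: float  # maximum buffer level (hypothetical units)
    level: float     # current level; an empty buffer signals inattention
    rate: float      # amount added or removed at each scheduled update

    def update(self, looked_at_target: bool) -> None:
        if looked_at_target:
            self.level = min(self.capacity, self.level + self.rate)
        else:
            self.level = max(0.0, self.level - self.rate)

    @property
    def empty(self) -> bool:
        return self.level <= 0.0


class MiRAMonitor:
    """Declares the driver inattentive as soon as any buffer runs empty."""

    def __init__(self, buffers: dict[str, GlanceBuffer]) -> None:
        self.buffers = buffers

    def update(self, current_glance_target: str) -> bool:
        """Processes one scheduled update; returns True while the driver
        is still considered attentive."""
        for name, buf in self.buffers.items():
            buf.update(looked_at_target=(name == current_glance_target))
        return not any(buf.empty for buf in self.buffers.values())
```

A “dynamic” requirement (e.g., lowering the speedometer update rate at reduced speed) would amount to changing a buffer's `rate` between updates, and context-dependent targets would be added to or removed from the dictionary as the environment model dictates.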
This completes the background information on the state of distraction. We now successively consider the four types of distraction and, for each, its indicators and the related sensors.
Description Manual distraction, also called biomechanical distraction, occurs when the driver is taking one or both of his/her hands off the steering wheel. The driver may do so to answer a call or send a text message, grab food and eat, or grab a beverage and drink, all while driving. According to the National Highway Traffic Safety Administration (NHTSA), texting while driving is the most alarming distraction. It is mainly due to manual distraction, but, inevitably, it also includes both visual distraction and cognitive distraction.
Indicators Unsurprisingly, the best indicator used to detect manual distraction is the behavior of the driver’s hands, mainly through their positions and movements. For safe driving, these hands are expected to be, most of the time, exclusively on the steering wheel, the gearshift, or the turn-signal lever. In contrast, a hand using a phone, adjusting the radio, or trying to grab something on the passenger seat indicates a manual distraction (218).
Vehicle-based indicators can also be used, as shown in (118). Using naturalistic driving data, the authors studied the correlation between (1) performance metrics linked to the steering wheel behavior and to the vehicle speed, and (2) manual and visual driver distractions induced, e.g., by texting. They found a good correlation between the steering movements and the manual-visual distraction of the driver.
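One steering metric widely used in such studies is the steering-wheel reversal rate, i.e., the number of direction changes of the steering angle, larger than a gap threshold, per unit of time. The sketch below is our own illustration (the 2-degree threshold and the function signature are assumptions, not taken from (118)).

```python
def steering_reversal_rate(angles, threshold_deg=2.0, duration_s=60.0):
    """Counts reversals of the steering-angle signal whose amplitude
    exceeds `threshold_deg`, and returns a rate per minute.
    The default threshold is illustrative, not a calibrated value."""
    reversals = 0
    direction = 0         # +1 rising, -1 falling, 0 unknown yet
    extremum = angles[0]  # running extremum in the current direction
    for a in angles[1:]:
        if a > extremum and direction >= 0:
            extremum, direction = a, 1
        elif a < extremum and direction <= 0:
            extremum, direction = a, -1
        elif direction == 1 and a < extremum - threshold_deg:
            reversals += 1                 # turned downward by > threshold
            extremum, direction = a, -1
        elif direction == -1 and a > extremum + threshold_deg:
            reversals += 1                 # turned upward by > threshold
            extremum, direction = a, 1
    return reversals * 60.0 / duration_s
```

An elevated reversal rate reflects the corrective micro-steering that distracted drivers tend to produce.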
The above information allows one to fill, in Table 4, the relevant cells of the “Manual distraction” column.
Sensors The most common solution to analyze the behavior of the driver’s hands is to use a camera placed inside the vehicle, usually near the central mirror, looking down in the direction of the driver.
Le et al. (111, 112) propose an approach to detect (111) and classify (112) human hand regions in a vehicle using CNNs. Their technique for hands detection is robust in difficult conditions caused, e.g., by occlusions, low resolution, and/or variations of illumination.
Using deep CNNs, Yan et al. (240) classify six actions involving the driver’s hands, i.e., calling, eating, smoking, keeping hands on the steering wheel, operating the gearshift, and playing on the phone. Similarly, both Baheti et al. (16) and Masood et al. (136) use ten classes to detect when the driver is engaged in activities other than safe driving, and to identify the cause of distraction.
Vehicle-based indicators can be obtained from the CAN bus of the vehicle (60; 116).
The above information allows one to fill the relevant cells of Table 5.
Description Visual distraction occurs when the driver is looking away from the road scene, even for a split second. It is often called EOR, and is one of the most common distractions for a driver. Examples of activities causing EOR are (1) adjusting devices in the vehicle (like a radio or navigation system), (2) looking towards other seats, (3) reading a new message on the phone or glancing at the phone to see who is calling, and (4) looking outside when there is a distraction by the roadside. All generally result in the driver not looking straight ahead, which is what he/she needs to be doing for safe driving.
Indicators The gaze is the main indicator used to detect a visual distraction of a driver. The duration of EOR is probably the most-used metric. The longer the EOR duration is, the lower the SA of the driver is, and the higher the visual distraction of the driver is (242). The glance pattern and the mean glance duration are other metrics (178).
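Given a per-frame classification of the gaze as on-road or off-road, the EOR-duration metrics can be computed as follows; the frame rate and the window of samples are illustrative assumptions.

```python
def eor_metrics(on_road_flags, fps=30.0):
    """on_road_flags: per-frame booleans, True when the gaze is on the road.
    Returns (total EOR time, longest single EOR glance), both in seconds."""
    total_off = 0     # off-road frames over the whole window
    longest_run = 0   # longest run of consecutive off-road frames
    current_run = 0
    for on_road in on_road_flags:
        if on_road:
            current_run = 0
        else:
            total_off += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
    return total_off / fps, longest_run / fps
```

The first value supports the total-EOR-duration metric discussed above; the second captures single long glances, which are the most dangerous ones.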
Sometimes, the head direction is used to approximate the gaze direction in order to characterize the driver's visual distraction (57; 58). For example, Fridman et al. (57) classify driver gaze regions on the sole basis of the head pose of the driver. Fridman et al. (58) compare classifications of driver gaze using either head pose alone or both head pose and eye gaze. They classify, based on facial images, the focus of the attention of the driver using 6 gaze regions (road, center stack, instrument cluster, rear-view mirror, left, and right). To do so, they consecutively perform face detection, face alignment, pupil detection, feature extraction and normalization, classification, and decision pruning. Vicente et al. (223) similarly classify the driver gaze, but use 18 regions instead of 6.
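In its simplest form, a head-pose-only classifier reduces to a mapping from yaw and pitch angles to gaze regions. The angular boundaries below are invented for illustration; in practice (e.g., in (57)), the mapping is learned from data rather than hand-coded.

```python
def gaze_region_from_head_pose(yaw_deg: float, pitch_deg: float) -> str:
    """Maps head yaw/pitch (degrees; yaw > 0 to the right, pitch > 0 up)
    to one of six gaze regions. All thresholds are illustrative."""
    if yaw_deg < -30:
        return "left"
    if yaw_deg > 30:
        return "right"
    if pitch_deg < -20:
        # looking down: straight down is the cluster, down-right the stack
        return "instrument cluster" if abs(yaw_deg) < 10 else "center stack"
    if pitch_deg > 15 and yaw_deg > 10:
        return "rear-view mirror"
    return "road"
```

A real system must also handle the ambiguity that motivates (58): for small eye movements, the head barely moves, so head pose alone cannot separate, e.g., “road” from a glance at the rear-view mirror.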
Visual distraction can also be inferred using vehicle-based indicators such as wheel steering, braking behavior, and speed. Indeed, a driver generally slows down when distracted by a visual stimulus (52; 244), and visual distraction impairs lateral control because the driver needs to compensate for errors made when taking his/her eyes off the road, which leads to larger deviations in lane positioning (119; 244). Such deviations have various causes, including drowsiness and visual distraction. This re-emphasizes the need to use as many indicators as possible. This also explains why more and more vehicles are equipped with systems that keep the vehicle within its lane whenever possible.
The above information allows one to fill, in Table 4, the relevant cells of the “Visual Distraction” column.
Sensors In order to monitor driver visual distraction, one mainly uses at least one camera facing the driver, as for manual distraction. The camera can be placed in various positions as long as the head pose and/or gaze of the driver can be obtained.
Naqvi et al. (154) use a near-infrared (NIR) camera (with wavelengths of 0.75–1.4 μm) placed on the dashboard in conjunction with a deep-learning-based gaze detection system, classifying driver gaze into 17 gaze zones.
Mukherjee and Robertson (148), similarly to Fridman et al. (57), present a CNN-based model to estimate human head pose and to classify human gaze direction. They use, however, low-resolution RGB-depth (RGB-D) images, i.e., a camera that also provides depth information.
The above information allows one to fill the relevant cells of Table 5.
Description Auditory distraction occurs when some sound prevents the driver from making the best use of his/her hearing, because his/her attention is drawn to the source of the sound. Hearing a phone ringing, listening to a passenger, listening to music, and following navigation instructions can all lead to auditory distraction.
This component of driver distraction is the least studied in the literature, likely because (1) it is often accompanied by at least one other more-easily detectable source of distraction falling among the other three types, and (2) it poses lower safety risks in comparison to the other types of distraction, in particular visual distraction (207).
The literature does not appear to introduce the concept of “auditory indicators”, which would characterize (1) the sounds captured both inside and outside of the vehicle, and, preferably, (2) the distraction they create. By using several microphones (including arrays thereof), and techniques for separating audio sources (226), one could imagine breaking down and localizing the various sources of sounds both inside and outside the vehicle.
Indicators When the driver appears to be auditorily distracted, changes occur in pupil diameter (67; 93) and in blink frequency (67; 73). Brain activity (EEG) (194) can also be used as an indicator of auditory distraction. Sonnleitner et al. (209) describe the impact of an auditory secondary task on a driver during a primary driving task, and show changes in braking reaction and brain activity.
The above information allows one to fill, in Table 4, the relevant cells of the “Auditory distraction” column.
Sensors As already indicated, obtaining the pupil diameter is challenging in real conditions due to illumination conditions and/or camera resolution, among others. Furthermore, brain activity cannot, at this time, be measured both in real time and in a non-intrusive, reliable way. Blink frequency can, however, be monitored via a camera, and braking behavior via the CAN bus.
Although microphones and, even better, arrays thereof, both inside and outside the vehicle, would be natural sensors to provide values for auditory indicators, we did not find any references considering such sensors for characterizing auditory distraction. One can also envision using the microphone(s) of a smartphone linked to a DMS.
The above information did not lead to the addition of any reference to Table 5.
Description In the context of driving, cognitive distraction is defined by NHTSA (158) as the mental workload associated with a task that involves thinking about something other than the (primary) driving task. A driver who is cognitively distracted due to a secondary task, such as mind wandering, experiences an increase in his/her mental workload (the state discussed in Section 7). The characterization of his/her cognitive distraction could therefore be achieved (1) by examining how his/her mental workload evolves over time and (2) by finding characteristics of this evolution allowing one to decide whether or not it is caused by cognitive distraction. The monitoring of cognitive distraction is thus, above all, a monitoring of the mental workload and/or its time variations. Section 7 shows that there are (1) many ways to characterize mental workload, and (2) many indicators thereof. The challenge is to be able to pinpoint the components of, or changes in, the mental workload that are due to distraction.
Cognitive distraction occurs when a driver is thinking about something that is not related to the driving task. In the driving context, while visual distraction can be summarized by EOR, cognitive distraction can similarly be viewed as “mind-off-road” (MOR). While it is relatively easy to monitor EOR (with a camera facing the driver), it is difficult to monitor MOR. It has, however, been shown that, when a driver is cognitively distracted, his/her visual behavior is impacted. Mind-wandering and daydreaming are two causes of cognitive distraction.
Indicators As cognitive distraction induces mental workload, the indicators allowing one to detect and characterize these two states are similar, if not identical. It is therefore difficult, if not impossible, to distinguish between these two states in the driving context (as well as in others), since they have nearly the same influence on the indicators.
Among the four types of distractions, cognitive distraction has proven to be the most difficult to detect and characterize. This is because it happens inside the brain, and, obviously, “observing” the brain of a driver is more challenging than observing his/her hands and eye(s).
As for visual distraction, cognitive distraction can be characterized by indicators of both driving performance and eye movements (122), including (1) vehicle-based indicators, such as speed (177), wheel steering (119), lane discipline (119; 177; 211), and braking behavior (72), and (2) driver-based, behavioral indicators, such as gaze parameters (e.g., fixation duration, glance frequency, and gaze distribution) (72; 120; 208; 212) and head orientation. As the complexity of the secondary task(s) increases, a driver makes significantly fewer high-speed saccadic eye movements and spends less time looking to the relevant periphery for impending hazards. He/She also spends less time checking his/her instruments and mirrors (72).
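One gaze-distribution metric often associated with cognitive distraction is the concentration of gaze toward the road center, since gaze tends to cluster there under cognitive load. A minimal percent-road-center sketch follows; the circular road-center area and its 8-degree radius are illustrative assumptions.

```python
def percent_road_center(yaw_deg, pitch_deg, radius_deg=8.0):
    """Percentage of gaze samples (yaw/pitch angle pairs, in degrees)
    falling within a circular 'road center' area around straight ahead.
    The 8-degree radius is illustrative, not a calibrated value."""
    inside = sum(
        1 for y, p in zip(yaw_deg, pitch_deg)
        if (y * y + p * p) ** 0.5 <= radius_deg
    )
    return 100.0 * inside / len(yaw_deg)
```

An abnormally high value suggests the reduced peripheral and mirror scanning described above, whereas an abnormally low value rather suggests visual distraction.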
Cognitive distraction can also be measured through a variety of driver-based, physiological indicators. Among these, brain activity (210) and pupil diameter may be the most convincing. Studies of EDA and HR show only weak relationships between these indicators and cognitive distraction (244).
Among the subjective measures, the NASA-TLX (76) is commonly used in driving-distraction studies even though it is a subjective measure of mental workload and, thus, not a measure specific to cognitive distraction.
The above information allows one to fill, in Table 4, the relevant cells of the “Cognitive distraction” column.
Sensors Since the main indicators of cognitive distraction are driving performance and gaze parameters, the main sensors to characterize it are vehicle-centric sensors and cameras.
The above information did not lead to the addition of any reference to Table 5.