3 Survey of literature on driver monitoring

This section describes our survey of the literature on DM and DMSs. The subsections below successively describe (1) our strategy for building an initial set of references, (2) some conclusions drawn from these references, (3) the design of a table for organizing them, (4) comments about the content of this table, and (5, 6) trends observable in the table or in individual references. The analysis performed here guides the developments in subsequent sections.

3.1 Strategy for building an initial set of references, and number of these

To build an initial set of relevant references, we used an approach inspired by Gutiérrez et al. (70). The flow diagram of Figure 1 describes it.



Figure 1: The flow diagram (1) illustrates the strategy used for our survey of the literature on driver monitoring (DM) and driver-monitoring systems (DMSs), and (2) shows the number of publications at each stage of the process.


Our search focused on surveys, reviews, and similar studies about DM and DMSs. We independently performed two searches during February 2021. The first focused on publications from IEEE, Sensors, and ScienceDirect, and the second on publications from ResearchGate; these four databases appeared well suited for providing a useful set of initial references. We used the search engine specific to each database and a Boolean query equivalent to (“survey” OR “review”) AND (“driver” OR “driving”) AND (“detection” OR “detecting” OR “behavior” OR “state” OR “monitoring”). We limited the search to publications in English and placed no constraint on publication dates. The two searches yielded 124 and 30 items, respectively, i.e., 154 in total. After removing 16 duplicates, we obtained a set of 138 references. We manually screened these and kept only those satisfying two criteria: (1) being published in a scientific journal or conference proceedings, and (2) providing a survey, review, or similar study of one or more aspects of the domain of interest. This screening led to 56 references, which appear in the first column of Table 2 and in the References section, which also contains additional references cited later.
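As an illustration only, the Boolean query and the record counts above can be sketched in Python. The function names and the whole-word matching rule are assumptions made for this sketch; the search engines of the four databases apply their own matching rules, which are not described here.

```python
import re

# Term groups of the Boolean query quoted in the text:
# (survey OR review) AND (driver OR driving) AND (detection OR ...).
QUERY_GROUPS = [
    ("survey", "review"),
    ("driver", "driving"),
    ("detection", "detecting", "behavior", "state", "monitoring"),
]


def matches_query(text: str) -> bool:
    """Return True if `text` contains at least one whole word from every group.

    Whole-word, case-insensitive matching is an assumption of this sketch.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    return all(any(term in words for term in group) for group in QUERY_GROUPS)


def screening_counts(first: int, second: int, duplicates: int) -> dict:
    """Recompute the number of records identified and left after deduplication."""
    identified = first + second
    return {
        "identified": identified,
        "after_deduplication": identified - duplicates,
    }


counts = screening_counts(first=124, second=30, duplicates=16)
print(counts)

# Two hypothetical titles, used only to exercise the predicate.
print(matches_query("A survey of driver drowsiness detection"))
print(matches_query("Lane detection for autonomous driving"))
```

The counts reproduce the 154 identified items and the 138 references left after removing duplicates. The final reduction to 56 references results from manual screening and cannot be computed from the query alone.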

3.2 Conclusions from preliminary analysis of 56 initial references

The preliminary analysis of the 56 initial references led to the following high-level conclusions:

1. To characterize the (global) state of a driver, one should consider the five main substates of drowsiness, mental workload, distraction, emotions, and under the influence.
2. A wide variety of parameters, which we call “indicators”, are used to characterize each of these substates, and some indicators are applicable to more than one substate.
3. Ideally, a DMS should monitor, not only the driver, but also the (driven) vehicle and the (driving) environment.
4. A value for each indicator is obtained by processing data (mainly signals and images) obtained from sensors “observing” the driver, the vehicle, and the environment.
5. A DMS generally involves one or more types and/or instances of each of the following: substate, indicator, and sensor.

These conclusions guided the structuring and writing of the bulk of the paper.

When the context is clear, we use “state” for the global state and each of the five substates. The plural “states” and the phrase “state i” each imply that one is talking about one or more substates.

3.3 Design of the structure of the table organizing the initial references

We used the above conclusions to design the structure of a table, namely Table 2, that organizes the 56 initial references in a useful way, in particular for the later synthesis in this paper.


Table 2: The first column of the table lists, in alphabetical order of first author, the 56 references that resulted from our survey on driver monitoring (DM) and related systems (DMSs). The next three megacolumns and the last column briefly describe, for each reference, the states, indicators, sensors, and test conditions considered therein.

Column layout: References | States (Drowsiness; Mental workload; Distraction; Emotions; Under the influence) | Indicators (Driver: Physiological, Behavioral, Subjective; Vehicle; Environment) | Sensors (Driver; Vehicle; Environment) | Tests. In each cell below, semicolons separate the non-empty entries of successive (sub)columns, in column order.

| References | States | Indicators and sensors | Tests |
|---|---|---|---|
| Ahir and Gohokar (2) | V | HR, brain; gaze, blink, PERCLOS, facial, body; wheel, lane, speed; cam, elec; ext cam | real, sim |
| Alluhaibi et al. (6) | V; V; ang | speech; wheel, lane, brake, speed; cam*, mic*; V* | |
| Arun et al. (13) | vis, cog | HR, brain, EDA, pupil; gaze, blink, body; V; wheel, lane, brake, speed; cam, wea d, eye t; V | sim |
| Balandong et al. (18) | V | HR, brain; gaze, blink, PERCLOS, body; V; wheel, lane, brake, speed; elec | sim |
| Begum (20) | V; V; stress | HR, brain; seat, ste w, saf b, wea d | real, sim |
| Chacon-Murguia and Prieto-Resendiz (28) | V | HR, brain, EDA; gaze, blink, body; wheel, lane, brake, speed; ste w, cam; radar | real |
| Chan et al. (29) | V | HR, brain; blink, PERCLOS, facial, body; wheel, brake, speed; cam*, mic* | real |
| Chhabra et al. (31) | V; V; alc | breath; gaze, PERCLOS, facial, body; wheel; road; seat, cam*, mic*; V* | real, sim |
| Chowdhury et al. (32) | V | HR, brain, EDA; blink, PERCLOS | sim |
| Chung et al. (34) | stress | HR, breath, brain, EDA, pupil; speech; V; wheel, lane, brake, speed; cam, wea d; V | real, sim |
| Coetzer and Hancke (35) | V | brain; gaze, PERCLOS, facial, body; wheel, lane, speed; cam; V | real, sim |
| Dababneh and El-Gindy (37) | V | brain, EDA, pupil; blink, PERCLOS, body; wheel, lane, speed; road; cam, wea d; radar | real, sim |
| Dahiphale and Rao (38) | V; V | gaze, blink, facial, body; wheel; cam | real |
| Dong et al. (44) | V; V | HR, brain, pupil; gaze, blink, PERCLOS, facial, body; V; wheel, lane, speed; road, wea; cam; V | real |
| El Khatib et al. (51) | V; man, vis, cog | HR, breath, brain, EDA, pupil; gaze, blink, PERCLOS, facial, body, hands; wheel, lane, speed; cam; V*; ext cam, radar | real, sim |
| Ghandour et al. (65) | man, vis, aud, cog; stress | HR, breath, brain, EDA; gaze, facial, body, speech; V; wheel, brake, speed; cam, wea d | real, sim |
| Hecht et al. (78) | V; V; V | HR, brain, EDA, pupil; gaze, blink, PERCLOS, facial, body; V; elec, eye t | real, sim |
| Kang (95) | V; V | HR, breath, brain, EDA; gaze, blink, facial, body; wheel, lane, brake, speed; seat, ste w, cam; V | real, sim |
| Kaplan et al. (96) | V; V | HR, brain; gaze, blink, PERCLOS, facial, body, speech; wheel, lane, brake, speed; ste w, cam*, mic*, wea d; V | real, sim |
| Kaye et al. (98) | V; stress | HR, breath, brain, EDA; V | real, sim |
| Khan and Lee (99) | V; man, vis, aud, cog | HR, brain, EDA; gaze, PERCLOS, body; wheel, lane, brake, speed; wea d | real |
| Kumari and Kumar (107) | V | HR, brain; gaze, blink, PERCLOS, body; V; wheel, lane; cam | |
| Lal and Craig (108) | V | HR, brain, EDA; PERCLOS, facial; cam | sim |
| Laouz et al. (109) | V | HR, brain, EDA; blink, PERCLOS, facial, body; V; wheel, speed; seat, cam, wea d; ext cam | real |
| Leonhardt et al. (114) | | HR, breath; seat, ste w, saf b, cam | real |
| Liu et al. (126) | V | HR, brain, pupil; gaze, blink, PERCLOS, body; wheel, lane, speed; cam; V | real |
| Marquart et al. (132) | V | pupil; gaze, blink, PERCLOS; V; eye t | real, sim |
| Marina Martinez et al. (130) | ang | brake, speed; V* | |
| Mashko (134) | V | HR, brain, EDA; gaze, blink, body; wheel, lane, brake, speed; cam, wea d; V; ext cam, radar | real, sim |
| Mashru and Gandhi (135) | V | HR, breath; blink, PERCLOS, facial, body; V; wheel, lane; seat, ste w, cam, wea d | sim |
| Melnicuk et al. (141) | V; V; cog; stress, ang | HR, brain; blink, PERCLOS, facial; wheel, brake, speed; road, traf, wea; seat, ste w, saf b, cam*, wea d; V* | real |
| Mittal et al. (146) | V | HR, brain, pupil; blink, PERCLOS, body; V; wheel, lane, brake, speed; cam, elec; V; ext cam | real |
| Murugan et al. (151) | V | HR, breath, brain, EDA, pupil; blink, PERCLOS, body; V; wheel, lane, speed; cam, elec; V | sim |
| Nair et al. (153) | V; V; alc | gaze, PERCLOS, facial, body; lane; seat, cam*; V; radar | |
| Němcová et al. (160) | V; stress | HR, breath, brain, EDA; gaze, blink, PERCLOS, facial, body; wheel, lane, brake, speed; seat, ste w, cam, wea d, eye t; V | real, sim |
| Ngxande et al. (156) | V | blink, PERCLOS, facial, body; cam | |
| Oviedo-Trespalacios et al. (164) | V; V | gaze; wheel, lane, brake, speed | real, sim |
| Papantoniou et al. (167) | V; V | HR, breath, brain; gaze, blink, speech; V; wheel, lane, speed; cam; ext cam, radar | real, sim |
| Pratama et al. (174) | V | HR, brain, EDA; gaze, blink, PERCLOS, facial, body, hands; V; wheel, lane; cam, wea d, elec; ext cam | real, sim |
| Ramzan et al. (176) | V | HR, breath, brain; blink, PERCLOS, facial, body; wheel, lane, speed; cam, wea d, elec; V | real, sim |
| Sahayadhas et al. (186) | V | HR, brain, pupil; gaze, blink, PERCLOS, body; V; wheel, lane; seat, ste w, cam, wea d; V | real, sim |
| Scott-Parker (195) | stress, ang | HR, brain, EDA; gaze, facial; V; wheel, lane, brake, speed; traf; eye t; ext cam | real, sim |
| Seth (196) | V | cam; V | real |
| Shameen et al. (198) | V | brain; gaze, blink; elec | sim |
| Sigari et al. (202) | V | gaze, blink, PERCLOS, facial, body; cam | real |
| Sikander and Anwar (203) | V | HR, brain, pupil; gaze, blink, PERCLOS, body; V; wheel, lane; seat, ste w, saf b, cam, wea d, elec | real |
| Singh and Kathuria (205) | V; V; V; V | pupil; gaze, blink, PERCLOS, facial; wheel, brake, speed; road, traf; cam, wea d; V; ext cam, radar | real |
| Subbaiah et al. (213) | V | HR, brain, pupil; blink, PERCLOS, facial, body; cam | real, sim |
| Tu et al. (219) | V | HR, brain; blink, PERCLOS, facial, body; wheel, lane, speed; cam*, wea d, elec; V | real, sim |
| Ukwuoma and Bo (220) | V | HR, breath, brain; blink, PERCLOS, facial, body; wheel, lane, brake; cam, wea d, elec | real |
| Vilaca et al. (225) | V; V | brain; gaze, body; wheel, lane, brake, speed; cam, mic; V; ext cam | |
| Vismaya and Saritha (227) | V | gaze, blink, PERCLOS, body; cam, eye t | real, sim |
| Wang et al. (229) | V | brain, pupil; gaze, blink, PERCLOS, body; lane; cam, wea d | real, sim |
| Welch et al. (230) | stress, ang | HR, breath, brain, EDA; blink, facial, speech; wheel, brake, speed; seat, ste w, cam, wea d; V | real, sim |
| Yusoff et al. (244) | vis, cog | HR, brain, EDA, pupil; gaze, body; V; lane, speed; eye t | |
| Zhang et al. (248) | V | HR, brain; gaze, blink, PERCLOS, body; lane, speed; cam; ext cam | real, sim |


The 56 references are listed in the first column, labelled “References”, in alphabetical order of first author. The three megacolumns following the first column correspond, in order, to the three key items above (states, indicators, and sensors), and are labelled accordingly. The last column, labelled “Tests”, indicates whether the technique or system described in a reference was tested in simulated conditions (“sim”, i.e., in the laboratory), in real conditions (“real”, i.e., “in the wild”), or both.

The “States” megacolumn is divided into 5 columns corresponding to the 5 (sub)states of interest. Each of the “Indicators” and “Sensors” megacolumns is divided into 3 columns corresponding to the 3 previously-listed items that a DMS should ideally monitor, i.e., the driver, vehicle, and environment. The column corresponding to the indicators for the driver is further divided into 3 subcolumns corresponding to the qualifiers “physiological”, “behavioral”, and “subjective”. Some other columns could be further subdivided, such as for “Distraction”, but the table deals with such additional subdivisions in a different way.
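The column hierarchy just described can be written down as a nested structure. The following Python sketch is illustrative only: the names `TABLE_LAYOUT` and `leaf_columns` are hypothetical, and the nested-dictionary representation is merely one convenient choice.

```python
# Nested layout of Table 2: a value of None marks a leaf column, and a
# dictionary marks a (mega)column that is divided further.
TABLE_LAYOUT = {
    "References": None,
    "States": {
        "Drowsiness": None,
        "Mental workload": None,
        "Distraction": None,
        "Emotions": None,
        "Under the influence": None,
    },
    "Indicators": {
        "Driver": {"Physiological": None, "Behavioral": None, "Subjective": None},
        "Vehicle": None,
        "Environment": None,
    },
    "Sensors": {"Driver": None, "Vehicle": None, "Environment": None},
    "Tests": None,
}


def leaf_columns(layout, prefix=()):
    """Yield the path of every leaf column, from left to right."""
    for name, children in layout.items():
        path = prefix + (name,)
        if children is None:
            yield path
        else:
            yield from leaf_columns(children, path)


columns = list(leaf_columns(TABLE_LAYOUT))
for path in columns:
    print(" > ".join(path))
print(len(columns))
```

Counting the leaves gives 15 columns: 1 for the references, 5 for the states, 5 for the indicators (3 for the driver, 1 for the vehicle, and 1 for the environment), 3 for the sensors, and 1 for the tests.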

3.4 Description of the content of the table of references

We successively describe the three megacolumns of Table 2.

States For each of the 56 papers, we indicate which particular (sub)state(s) it addresses. If a paper addresses drowsiness, we place the checkmark “V” in the corresponding column, and similarly for mental workload. For the three other states, we either use a general “V” or give more specific information, often via an abbreviation. There are four types of distraction, i.e., manual, visual, auditory, and cognitive, respectively abbreviated via man, vis, aud, and cog. These types are self-explanatory, but they are addressed later. For emotions, we indicate the type, i.e., stress or anger (ang). For under the influence, we also indicate the type; in all cases, it turns out to be alcohol (alc).

As an example, the second paper, by Alluhaibi et al. (6), addresses drowsiness, distraction, and the emotion of anger.

All abbreviations used in Table 2, for this and other (mega)columns, are defined in Table 3.

Indicators The indicator(s) used by a paper is (are) indicated in the same way as above.

Sensors The sensor(s) used by a paper is (are) indicated in a similar, but not identical, way. If a sensor is embedded in a mobile device (typically a smartphone), rather than in the vehicle, we add a “*”, leading to cam*/mic* for a camera/microphone of a mobile device. In the vehicle column, “V” indicates that the sensor is integrated in the vehicle, whereas “V*” indicates that it is part of a mobile device. For example, the vehicle speed can be obtained via the controller-area-network (CAN) system/bus or a mobile device.
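The “*” convention lends itself to a simple parsing rule, sketched below in Python. The names `SensorTag`, `parse_sensor_tag`, and `parse_sensor_cell` are hypothetical, and the sketch assumes that the entries of a cell are separated by commas.

```python
from typing import NamedTuple


class SensorTag(NamedTuple):
    name: str               # abbreviation without the trailing "*"
    on_mobile_device: bool  # True if the entry carries a trailing "*"


def parse_sensor_tag(tag: str) -> SensorTag:
    """Split one entry such as "cam*" into its name and its mobile-device flag."""
    tag = tag.strip()
    on_mobile = tag.endswith("*")
    return SensorTag(tag.rstrip("*").strip(), on_mobile)


def parse_sensor_cell(cell: str) -> list:
    """Parse a comma-separated cell such as "cam*, mic*" into a list of tags."""
    return [parse_sensor_tag(t) for t in cell.split(",") if t.strip()]


print(parse_sensor_cell("cam*, mic*"))
print(parse_sensor_tag("V*"))
print(parse_sensor_tag("V"))
```

With this rule, “cam*” and “mic*” are read as a camera and a microphone of a mobile device, “V*” as a vehicle-related sensor that is part of a mobile device, and a plain “V” as a sensor integrated in the vehicle.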


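Table 3: The table defines the abbreviations used in Table 2. They are organized according to the megacolumns and columns of Table 2, and are listed in alphabetical order.

| Megacolumn | Column | Abbreviation | Meaning |
|---|---|---|---|
| States | Distraction | aud | auditory |
| States | Distraction | cog | cognitive |
| States | Distraction | man | manual |
| States | Distraction | vis | visual |
| States | Emotions | ang | anger |
| States | Under the influence | alc | alcohol |
| Indicators | Driver | blink | blink dynamics |
| Indicators | Driver | body | body posture |
| Indicators | Driver | brain | brain activity |
| Indicators | Driver | breath | breathing activity |
| Indicators | Driver | EDA | electrodermal activity |
| Indicators | Driver | facial | facial expressions |
| Indicators | Driver | hands | hands parameters |
| Indicators | Driver | HR | heart rate/activity |
| Indicators | Driver | pupil | pupil diameter |
| Indicators | Vehicle | brake | braking behavior |
| Indicators | Vehicle | lane | lane discipline |
| Indicators | Vehicle | wheel | wheel steering |
| Indicators | Environment | road | road geometry |
| Indicators | Environment | traf | traffic density |
| Indicators | Environment | wea | weather |
| Sensors | Driver | cam | camera |
| Sensors | Driver | elec | electrode(s) |
| Sensors | Driver | eye t | eye tracker |
| Sensors | Driver | mic | microphone |
| Sensors | Driver | saf b | safety belt |
| Sensors | Driver | ste w | steering wheel |
| Sensors | Environment | ext cam | external camera |
| Tests | | real | real conditions |
| Tests | | sim | simulated conditions |
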

3.5 Trends observable in the table

Table 2 reveals the following trends.

Drowsiness is the most covered state (with 44 references among the total of 56), distraction is the second most covered (with 20 references), and more than one (sub)state is considered in only 19 references.

Indicators are widely used in most references, in various numbers and combinations. Subjective indicators are not frequent (which is to be expected given the constraints of real-time operation). While several authors, such as Dong et al. (44) and Sahayadhas et al. (186), emphasize the importance of the environment and of its various characteristics (e.g., road type, weather conditions, and traffic density), few references (and, specifically, only 6) take it into account.
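To put the counts quoted above in proportion, the short Python sketch below converts them into shares of the 56 references. The counts are taken from the text; the variable names are arbitrary.

```python
TOTAL_REFERENCES = 56

# Counts quoted in the text for Table 2.
counts = {
    "drowsiness": 44,
    "distraction": 20,
    "more than one (sub)state": 19,
    "environment taken into account": 6,
}

# Share of the 56 references, in percent, rounded to one decimal place.
shares = {
    label: round(100 * n / TOTAL_REFERENCES, 1) for label, n in counts.items()
}

for label, share in shares.items():
    print(f"{label}: {counts[label]}/{TOTAL_REFERENCES} = {share}%")
```

Drowsiness thus appears in about 79% of the references and distraction in about 36%. About 34% of the references consider more than one (sub)state, and about 11% take the environment into account.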

While the three “Sensors” columns seem well filled, several references either neglect to talk about the sensor(s) they use, or cover them in an incomplete way. Some references give a list of indicators, but do not say which sensor(s) to use to get access to them. References simply saying that, e.g., drowsiness can be measured via a camera or an eye tracker do not help the reader. Indeed, these devices can be head- or dashboard-mounted, and they can provide access to a variety of indicators such as blink dynamics, PERCLOS, and gaze parameters.

Many systems are tested in real conditions, perhaps after initial development and validation in a simulator. Many papers do not, however, systematically document the test conditions for each method that they describe.

3.6 Other trends observable in references

Other trends are not directly observable in Table 2, but can be identified in some individual references.

Experts agree that there is no globally accepted definition for any of the first four states that we decided to consider. For example, even though many authors try to give a proper definition of drowsiness, considerable confusion and inconsistency remain about the concepts of drowsiness and fatigue, and about the difference between them. There is thus a need to define, as precisely as possible, what the first four states are, and this is done in the sequel.

In the more recent references, one sees a trend, growing with time, in the use of mobile devices, and in particular of smartphones (6, 29, 31, 51, 95, 96, 130, 141, 153, 219). A smartphone is relatively low-cost, and one can easily link it to a DMS. This DMS can then use the data provided by the smartphone’s many sensors, such as its inertial devices, microphones, cameras, and navigation system(s). A smartphone can also receive data from wearable sensors (e.g., from a smartwatch), which can provide information such as heart rate (HR), skin temperature, and electrodermal activity (EDA). A smartphone can also be used for its processing unit.