Using body sensors for evaluating the impact of smart cycling technologies on cycling experiences: a systematic literature review and conceptual framework

Digital technologies in, on, and around bicycles and cyclists are gaining ground. It is important to evaluate the impact of these technologies, collectively called Smart Cycling Technologies (SCTs), on subjective cycling experiences. Future evaluations can inform the design of SCTs, which in turn can help to realize the abundant benefits of cycling. Wearable body sensors and advanced driver assistance systems are increasingly studied in other domains; however, evaluation methods integrating such sensors and systems in the field of cycling research have been under-reviewed and under-conceptualized. This paper therefore presents a systematic literature review and conceptual framework to support the use of body sensors in evaluations of the impact of SCTs on perceptions, emotions, feelings, affect, and more, during outdoor bicycle rides. The literature review (n = 40) showed that research on this specific use of body sensors is scarce. Moreover, existing research designs are typically not tailored to determine the impact of SCTs on cycling experience at large scale. Most studies had small sample sizes and explored a limited set of sensors in chest belts and wristbands for evaluating stress response. The evaluation framework helps to overcome these limitations by synthesizing crucial factors and methods for future evaluations in four categories: (1) experiences with SCTs, (2) experience measurements, (3) causal analysis, and (4) confounding variables. The framework also identifies which types of sensors fit which types of experiences and SCTs. The seven directions for future research include, for example, experiences of psychological flow, sensors in e-textiles, and cycling with biofeedback. Future interactions between cyclists and SCTs will likely resemble a collaboration between humans and artificial intelligence. Altogether, this paper helps to understand whether future support systems for cyclists truly make cycling safer and more attractive.

frequently and for longer distances while feelings of unsafety discourage people from cycling [14,74].
In the transportation literature, there is a significant body of research on the relevance and impact of cycling infrastructure and the built environment on cycling experiences [54,56]. However, much less is known about the impact of Smart Cycling Technologies (SCTs) on cycling experiences. Research on SCTs is growing, and these technologies are increasingly impacting cycling experiences [11,64,86]. SCTs can be considered the equivalent of Advanced Driver Assistance Systems (ADAS) for motorized vehicles [10,92]. SCTs often use Artificial Intelligence and the Internet of Things to provide "smart" and "connected" features. Examples of SCTs are systems for route planning, speed adaptation, collision avoidance, and e-bike charging. This paper adopts the term "smart cycling technologies" without delving into terminological debate. Several conceptualizations of SCTs exist already [11,64,86]. Evaluating (that is, measuring and understanding) the impact of SCTs on cycling experiences is essential because these evaluations inform the design of SCTs. Improving the design of SCTs can improve cycling experiences, which in turn can help to make cycling safer and more attractive.

Measuring cycling experiences
In various academic and commercial domains including tourism, marketing, and videogaming, there is an increasing interest in using data from wearable sensors for analysing subjective experiences [6,50,60,98]. For example, theme park visitors have been equipped with smartwatches to collect physiological data during rollercoaster rides [8], and the effects of various types of marketing messages on EEG readings have been explored [95]. Such research is motivated by insights from a research domain called affective computing, which deals with digital systems that can recognize and adapt to human emotions [118,119,135].
Also in cycling experience research, there is an increasing interest in sensor systems for understanding what people experience on a perceptual, affective, and emotional level while riding bicycles [14,74]. The key argument for using such sensors is that they can offer more objective, continuous, and real-time insights than traditional methods such as surveys, interviews, or crash statistics [107]. The use of wearable sensors on the human body is a natural extension of using sensors within instrumented bicycles, which already occurs frequently [47].
To distinguish between wearable sensors that do and do not measure aspects of the human body, this review adopts the term "body sensor". Body sensors are placed directly on or in the human body and are increasingly integrated into body sensor networks [72], which are illustrated in Fig. 1. Body sensors can measure processes related to, for example, physiology, neurology, cognition, or movement. Examples of measurable variables are heart rate, electrical activity in the brain, and movement of body parts. Despite advancements in measuring cycling experiences with body sensors, there are knowledge gaps, which will be explained next.

Knowledge gaps, aims, and scope
In this context, the knowledge gaps are as follows. Existing and relevant reviews focused on sensors in bicycles to analyse cycling behaviour [47], links between body sensor data and stress response during cycling [14,74], and the usage of textual and visual methods to examine cycling in urban environments [76]. However, these reviews do not cover cycling with SCTs, nor the use of body sensors in evaluations of SCTs. Context- and subject-level variables that influence both the cycling experience and the functioning of an SCT (called confounding variables in this paper) are acknowledged to be important, but how studies control for them is not well reviewed. Altogether, body sensors are used to evaluate cycling experiences in general, but knowledge is missing about the specific use of body sensors in evaluations of the impact of SCTs on cycling experience.
Fig. 1 Overview of a potential architecture of a body sensor network. Reprinted from [72]. Abbreviations are explained in Table 1
This paper addresses this knowledge gap by pursuing two primary aims. First, the paper aims to conduct a comprehensive and systematic literature review on the use of body sensors in evaluations of cycling experiences, providing valuable insights into the existing state of research. Second, the paper aims to develop and present a conceptual framework that shows which factors and methods are important in future studies that use body sensor data to evaluate the impact of SCTs on cycling experiences. This framework will be based on an integration of review results, focusing on research on SCTs, experiences, and sensor systems.
The scope for these two aims includes experiences of private individuals who ride fully or partially human-powered bicycles. This paper focuses specifically on in-the-moment experiences perceived while cycling, acknowledging that experiences as a whole also include antecedents and consequences [50]. The scope is limited to outdoor and naturalistic cycling. Stronger investigations of outdoor cycling are necessary because many other studies with body sensors were conducted under lab conditions [118]. Also, outdoor investigations are necessary to complement simulations, lab tests, and virtual reality setups [70,100,103]. Additionally, this review focuses on factors relating to Human-Computer Interaction during experiences with SCTs.

Scientific contribution and paper outline
This paper contributes significantly to existing literature. Firstly, the evaluation framework accommodates evaluations of diverse experiences and SCTs. Accommodating a diversity of experiences and SCTs is significant and valuable, because earlier evaluations of cycling experience typically focused on a single SCT [9,26,38,139] or on negative experiences of stress and discomfort [14,74]. Secondly, recent studies show that the research domains of transportation and human-computer interaction are not well-connected, meaning that there is little interdisciplinary research that combines knowledge from both these fields [28,102]. This paper contributes to tightening the integration of these research domains. Closer integration helps to move beyond measurement and mapping of cycling stress [71], towards a better understanding of using body sensors for understanding the experiences that result specifically from cycling with SCTs. Furthermore, this paper reviews and reinforces recent findings in the emerging research area of using body sensor data for real-time Human-Computer Interaction with SCTs that increasingly use Artificial Intelligence. This research area holds both academic and commercial potential [3]. Finally, the paper provides recommendations for future research, needed to implement the conceptual framework in practice.
The paper is structured as follows: Sect. 2 outlines the approach for the systematic literature review. Section 3 presents a summary of the review results. In Sect. 4, a conceptual framework for evaluations is proposed based on the integration of review findings and domain knowledge. Section 5 presents a research agenda, and Sect. 6 concludes the paper.

Literature review methodology
The approach for the systematic literature review is based on the PRISMA method (Preferred Reporting Items for Systematic reviews and Meta-Analyses [91]) for searching, screening, and selecting the literature that is to be included in the review. This method ensures reproducibility, transparency, and extended reach into the literature, which overcomes omissions in many previous reviews that did not describe the methodology used [136]. Figure 2 visualizes the PRISMA search and selection process. The next sections describe the search query, databases, selection criteria, and search results.

Search query and databases
The query used for the definitive search consists of three parts: (1) keywords related to the bicycle, (2) keywords related to experience, and (3) terms related to evaluation of experience. These keywords were chosen based on a trade-off between broadness and specificity. Keywords should be broad enough to capture diverse types of quantitative measurements of subjective experiences while riding bicycles. Keywords should also be focused enough to exclude most irrelevant literature. For example, many irrelevant results concerned studies about cyclical processes in chemistry. Furthermore, the choice of broad keywords was motivated by initial literature reading and searching, which revealed a lack of standardization in definitions for experiences and SCTs [11,63]. The query was employed in December 2022 in the Scopus, Web of Science (WoS), Transportation Research International Documentation (TRID), and Google Scholar (GS) databases. These databases were chosen because they offer extensive access to the literature. The query was slightly adjusted to accommodate the different search engines and has the following structure. Attachment 1 displays the full query for all four databases.
• ("bike" OR bicycl* OR "biking" OR "cycling" OR "cyclist") AND ("experience" OR "emotion" OR "perception") AND (evaluat* OR measur* OR quantif* OR determin* OR "assess" OR "impact")
The results for the Scopus search engine were limited to relevant research domains and to the publication years 2005 and later. The choice for 2005 and later aligns with [64], who found no relevant studies on SCTs before 2005. For Google Scholar, the first 300 hits were screened as recommended in [53]. Snowballing through publication lists of the "Cycling@CHI" and "Cycling@MobileHCI" communities was conducted [25,109]. These communities are working groups associated with academic conferences (CHI being the Conference on Human Factors in Computing Systems), with participants active in the field of Human-Computer Interaction (commonly abbreviated as HCI) design for cyclists.
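The three-part structure of the query can be illustrated with a minimal sketch that assembles the Boolean string from its keyword groups. The keyword lists are copied from the query above; the function names `or_group` and `build_query` are hypothetical helpers introduced only for illustration, and the database-specific adjustments documented in Attachment 1 are not reproduced here.

```python
# Keyword groups copied from the query structure given in the text.
BICYCLE_TERMS = ['"bike"', 'bicycl*', '"biking"', '"cycling"', '"cyclist"']
EXPERIENCE_TERMS = ['"experience"', '"emotion"', '"perception"']
EVALUATION_TERMS = ['evaluat*', 'measur*', 'quantif*', 'determin*',
                    '"assess"', '"impact"']


def or_group(terms):
    """Join the terms of one keyword group with OR, wrapped in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Combine several keyword groups with AND (hypothetical helper)."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(BICYCLE_TERMS, EXPERIENCE_TERMS, EVALUATION_TERMS)
print(query)
```

Keeping the groups as separate lists makes it easier to see how broadening or narrowing one group changes the query without touching the other two.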

Inclusion and exclusion criteria
The following inclusion criteria were developed and used to screen the results from the search engines. All studies that did not meet these criteria were excluded:

Selection and data extraction
Employing the search query in Scopus, WoS, and TRID resulted in a total of 7133 results. Since each search engine differs in which topics are covered most, the number of results per search engine is different [19,94]. After deduplication, 5216 articles were screened. The first author of this review conducted all screening and selection. The selection included only papers that met the selection criteria. The first screening was based on the title and abstract. Then, a full-text review was conducted.
Fig. 2 Flowchart depicting the systematic search and screening process
Snowballing through reference lists of selected articles was conducted. One paper was added manually to the selection, since this paper is relevant to the review yet does not mention "bicycling" or any similar terms in the title, abstract, and keywords [71].
For all selected papers, the following data points were extracted: type of experience, type of sensor, position of sensor, number of sensors, route choice, number of participants, location of the study, whether an SCT was evaluated, approach for determining causes for experiences if applicable, statistical analysis approach, contextual variables, and subject-level variables. Approaches for determining causation were extracted by scanning the papers for words such as "effect", "cause", "influence", "impact", "related", "correlated", "coefficient", and "linked".
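The review does not describe how deduplication was carried out. As an illustration only, the following sketch shows one common way to merge records exported from several databases: two records are treated as duplicates when they share a DOI or a normalized title. The record fields (`title`, `doi`, `source`) and the example records are hypothetical.

```python
import re


def normalize_title(title):
    """Lowercase the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each DOI or normalized title.

    Assumption: each record is a dict with a 'title' key and an
    optional 'doi' key (hypothetical export format).
    """
    seen = set()
    unique = []
    for rec in records:
        keys = {("title", normalize_title(rec["title"]))}
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        is_duplicate = bool(keys & seen)
        seen |= keys  # remember every key, also those of duplicates
        if not is_duplicate:
            unique.append(rec)
    return unique


# Hypothetical records; titles and DOIs are made up for illustration.
records = [
    {"title": "Cycling stress: a review", "doi": "10.1000/x1", "source": "Scopus"},
    {"title": "Cycling Stress - A Review", "doi": None, "source": "TRID"},
    {"title": "Another paper", "doi": "10.1000/x2", "source": "WoS"},
    {"title": "Another paper (reprint)", "doi": "10.1000/X2", "source": "Scopus"},
]
unique_records = deduplicate(records)
print(len(unique_records))
```

Matching on either key matters because some databases export records without a DOI, so the same paper may be identifiable only by its title in one export and by its DOI in another.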
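The keyword scan described above can be sketched as a simple text search. This is a minimal illustration, not the procedure actually used in the review: the term list is taken from the text, whereas the matching rule (word-initial match with arbitrary suffixes, case-insensitive) and the function name `count_causal_terms` are assumptions.

```python
import re
from collections import Counter

# Terms listed in the text as indicators of causal analysis.
CAUSAL_TERMS = ["effect", "cause", "influence", "impact",
                "related", "correlated", "coefficient", "linked"]

# \b anchors the match at the start of a word, so "because" does not
# count as "cause"; \w* accepts suffixes such as "caused" or "effects".
PATTERN = re.compile(r"\b(" + "|".join(CAUSAL_TERMS) + r")\w*", re.IGNORECASE)


def count_causal_terms(text):
    """Count occurrences of each causal term in the text (hypothetical helper)."""
    return Counter(match.group(1).lower() for match in PATTERN.finditer(text))


sample = ("Traffic volume caused stress; the effect was correlated "
          "with speed because of noise.")
counts = count_causal_terms(sample)
print(counts)
```

Such a scan only flags candidate passages; whether a flagged passage reflects a causal analysis still requires reading it in context.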

Search results
The search process led to a selection of 40 papers that explicitly and specifically cover the use of body sensors in evaluations of bicycling experience. Twenty-two of the selected studies focused on statistical analysis of causes for changes in cycling experiences; 18 studies did not. Causes originated mostly from the built environment and traffic system, for example road crowdedness, landscape type, overtaking distance, road layout, and intersection type. Seven of the 40 papers evaluate experiences with SCTs.
The selected papers are of the following types: 30 journal papers, nine conference papers, and one grey literature article. The geographical spread over global continents is strongly biased towards Western European countries. Twenty studies were conducted in Europe, 13 in North America, one in South America, two in Asia, and three in Australia. One study collected data in both North America and Europe.

Results from analysis of selected literature
This chapter presents the results from the literature review, addressing the first aim of the paper. The results will be presented in terms of experiences and sensors, route choice and participant samples, sensor-based evaluations of experiences with SCTs, data validation and analysis approaches, and confounding variables. Figures 3, 4, 5, 6, 7 and 8 distinguish between evaluations with and without SCTs. The SCTs that were evaluated are described in Sect. 3.3. Table 1 summarizes the data that was extracted during the review process. To keep the table readable, only selected data is included in this table. A supplementary file provided as attachment to this paper contains an extensive table with all data extractions from all selected papers.

Experiences and sensors
Figure 3 shows a categorization of types of experiences that were evaluated in selected studies. Negative types of experiences are undesirable for cyclists, for example, stress and anxiety. Mixed types of experiences can be negative, neutral, or positive for cyclists. Attention, arousal, and risk perception are examples of mixed experiences. Positive types of experiences are desirable for cyclists, for example, happiness and comfort. It is remarkable that studies without an SCT focus mostly on negative experiences, whereas those with an SCT focus more on mixed experiences. Two studies did not specify exactly which type of cycling experience was evaluated [35,106], indicating a drawback in the description of the study design.
Figure 4 shows the types of sensors that have been used in the selected literature. It is remarkable that some studies use one sensor to study multiple types of experiences, while other studies use multiple sensors to study one type of experience. Some types of experiences have been studied with multiple types of sensors, e.g., attention was studied with both EEG and ET. The following sensors have not yet been investigated in existing reviews: EEG, ET, EMG, gyroscope, and Hall effect sensors [14,74]. Some studies did not specify which exact type of sensor was used. For example, Pejhan et al. [93] mentioned that HR was measured but did not specify whether an ECG or PPG sensor was used. This lack of specification indicates a drawback in the reporting because it limits comparisons across studies.
Nearly half of the studies utilized ECG, ST, and PPG sensors to measure stress and risk perception. Notably, this finding remains consistent even after considering studies that were not included in previous reviews [14,74]. The Hall effect sensor used in one study is a sensor that measures movement via movement of a magnet. This sensor will not be discussed further in this paper, because it was not encountered anywhere else in cycling experience literature, and body posture can be detected with gyroscopes in more detailed ways.
Regarding the positioning of sensors, Fig. 5 shows that sensor positioning correlates with the type of sensors used. The Empatica E4 wristband and Polar H7 chest belt are relatively often used to capture PPG, EDA, ST, and ECG signals. A minority of studies used sensors at multiple places on the body.

Route choice and participant samples
The research design of the reviewed studies will be described in terms of route choice, sample sizes, and recruitment strategies. In most studies, participants followed a predefined route in a controlled study design (Fig. 6). Variation was created by choosing a route that includes route segments with multiple characteristics. The studies that used a so-called "in the wild" approach enabled participants to choose freely wherever they wanted to cycle.
Figure 7 visualizes the number of participants per study. The number of study participants is relatively low across all studies, which limits statistical power in quantitative analysis. The column "unspecified" includes studies that did not describe the sample size. It is a drawback that the sample size was not described, because it limits comparisons in meta reviews. It is not clear why these studies did not describe the sample size. These studies were, nevertheless, selected for the literature review because they met the inclusion criterion of having one or more human participants.
The participant recruitment strategies reveal more drawbacks. None of the studies used probability sampling for selecting participants, limiting representativeness of the population at large. Twenty-one studies did not describe the recruitment strategy, complicating comparisons across studies. It is not clear why these studies did not describe the recruitment strategy. Twelve studies used a convenience sample, meaning that participants were selected without randomization and based on ease of access and availability to the research team. Participants were mostly found via personal networks, email lists, and advertisement boards, introducing potential self-selection bias. The five studies that used a purposive sample selected participants because they had characteristics that were required in the study. For example, one study aimed to compare young and elderly cyclists and selected participants accordingly [80].

Studies with body sensors and SCTs
This review found that seven studies used body sensors in an evaluation of experiences with an SCT. These studies focus on a single prototype with a small number of participants (between 9 and 37), and they do not use an existing typology to describe the evaluated SCTs. The seven studies examined different SCTs and used different research designs to measure user experiences. The studies can be categorized in three groups as follows.
Firstly, three studies evaluated experiences with an SCT that used body sensor measurements as real-time input for human-computer interaction, but did not use the body measurements as input for expert evaluation of the resulting experience. The resulting experiences were analysed via interviews and grounded theory methodology. Two SCTs adjust the motor support level to improve rider engagement and safety [4,5]. One SCT communicates heartbeat rates visually and in real time via a helmet to support social engagement among cyclists [133]. Secondly, three studies evaluated experiences with an SCT that used real-time measurements, and these studies also used the measurements as input for expert evaluation and quantitative analysis of the resulting experiences. One study uses tactile feedback on the feet to support finding the right cadence [13]. Another study adjusts motor support to fitness levels [30]. The third study measures muscle fatigue to tailor motor support for reducing fatigue [69].
Thirdly, in one study the body measurements were input for the expert evaluation of cycling with an SCT, while the SCT itself did not use data from the body sensors in real time [127]. The SCT in this study is an always-on bicycle light to improve cycling safety.

Data validation and analysis approaches
Next, the approaches for validating and analysing data in selected studies will be reviewed. Validation and analysis are challenging, as they raise fundamental questions about labelling sensor data, establishing the certainty of participants' experiences, and determining if those experiences were caused by the studied phenomenon. Additionally, identifying reliable patterns that indicate specific types of experiences is crucial.
To validate physiological data related to the stress response, studies by Kyriakou et al. [71], Teixeira et al. [125], and Zeile et al. [145] serve as exemplary illustrations. These studies collected labels for sensor data by utilizing an experience sampling method, in which participants provided self-reports through button presses on a handlebar smartphone. The ratings provided insights into the intensity of the stress response during the ride. Comparisons were made between these ratings and data from HR, ST, and EDA sensors to identify frequent data patterns associated with high and low stress intensity. This approach helped establish links between specific data patterns and levels of stress response.
However, for other types of sensors and experiences, there is a lack of robust knowledge regarding ground-truthing processes. In the case of EEG data, studies have attempted to identify which patterns in electrical brain activity correspond to which environmental and behavioural conditions [99,111,112,146]. These studies are, however, too different in research design to draw conclusions about which data patterns can serve as ground truths. Studies employing eye tracking, muscle activity data, or body balance data did not explicitly discuss ground truths, making it challenging to present findings in this regard.
Figure 8 visualizes the data analysis approaches in the selected studies. Notably, most studies that evaluated experiences with an SCT did not do so with quantitative analysis of measurements of experiences. Three studies utilized grounded theory methodology to analyse the experiences that resulted from SCTs that use body sensor data.
Most studies that drew conclusions about reasons for experiences employed data triangulation from various sources, such as body sensors, camera recordings, self-reports, and surveys. Of these studies, eleven controlled statistically for confounders, while seven discussed confounding without statistical control of the confounding variables. Two studies did not triangulate data and used only data from body sensors to infer causation from statistical correlations between variables of interest. Nine studies did not analyse causation; these studies consistently used framing such as "links", "relations", and "associations". One study lacked clarity regarding its data analysis approach. Statistical data analysis techniques used most frequently include ANOVA and regression models (logistic, linear, and multilevel modelling). Eight studies employed rule-based decision processes or thematic analysis, which will not be discussed further. Eye tracking studies utilized proprietary software associated with the eye tracking devices, while one study employed propensity score matching, commonly used for analysing time series data in observational studies.
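The labelling idea described above, pairing self-reported ratings with the sensor signal around the moment of the report, can be sketched as follows. This is a simplified illustration and not the method of the cited studies: the window length, the data layout (lists of `(time, value)` tuples), and the example values are assumptions.

```python
from statistics import mean


def window_mean(samples, t_center, half_width):
    """Mean sensor value within +/- half_width seconds of t_center.

    Returns None when no sample falls inside the window.
    """
    values = [v for t, v in samples if abs(t - t_center) <= half_width]
    return mean(values) if values else None


def label_windows(samples, reports, half_width=5.0):
    """Pair each self-report (time, rating) with the sensor mean around it."""
    labelled = []
    for t_report, rating in reports:
        sensor_mean = window_mean(samples, t_report, half_width)
        if sensor_mean is not None:
            labelled.append({"time": t_report, "rating": rating,
                             "sensor_mean": sensor_mean})
    return labelled


# Hypothetical heart-rate samples (seconds, beats per minute) and
# button-press ratings (seconds, stress rating from 1 to 5).
heart_rate = [(0, 70), (5, 72), (10, 90), (15, 94), (20, 71)]
reports = [(2, 1), (12, 5)]
labelled = label_windows(heart_rate, reports)
for row in labelled:
    print(row)
```

Labelled windows of this kind can then be compared between high and low ratings to look for recurring signal patterns.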
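To illustrate why statistical control for confounders matters, the sketch below compares a naive difference in mean stress between rides with and without an SCT against a stratified difference computed within levels of one confounder. Stratification is used here as a simple stand-in for the regression-based control reported in the reviewed studies; the variable names and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def naive_difference(rows, outcome, treatment):
    """Mean outcome with treatment minus mean outcome without it."""
    treated = [r[outcome] for r in rows if r[treatment]]
    untreated = [r[outcome] for r in rows if not r[treatment]]
    return mean(treated) - mean(untreated)


def stratified_difference(rows, outcome, treatment, confounder):
    """Size-weighted average of within-stratum differences.

    Strata that lack either treated or untreated rows are skipped.
    """
    strata = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        strata[r[confounder]][bool(r[treatment])].append(r[outcome])
    total, weighted = 0, 0.0
    for groups in strata.values():
        if groups[True] and groups[False]:
            n = len(groups[True]) + len(groups[False])
            weighted += n * (mean(groups[True]) - mean(groups[False]))
            total += n
    return weighted / total if total else None


# Hypothetical rides: SCT rides happen mostly in low traffic, so traffic
# volume confounds the comparison of stress ratings.
rides = [
    {"traffic": "low", "sct": True, "stress": 2},
    {"traffic": "low", "sct": True, "stress": 2},
    {"traffic": "low", "sct": False, "stress": 3},
    {"traffic": "high", "sct": True, "stress": 6},
    {"traffic": "high", "sct": False, "stress": 7},
    {"traffic": "high", "sct": False, "stress": 7},
]
print(naive_difference(rides, "stress", "sct"))
print(stratified_difference(rides, "stress", "sct", "traffic"))
```

In this made-up dataset the naive comparison overstates the reduction in stress, because rides with the SCT took place mostly in low traffic; comparing within traffic levels removes that part of the difference.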

Confounding variables
A consistent finding across studies is that confounding variables are acknowledged to be important, but only a few studies control for them statistically, and even then, only a limited set is controlled. Three different and important types of confounding variables emerge from the review: (1) infrastructure and spatial environment, (2) participants as research subjects, and (3) weather.
Regarding the infrastructure and spatial environment, most studies used a subset of variables relating to noise, road width, traffic volume, surface type, land cover type, and intersection type. Eighteen out of 40 studies did not mention if and how variables in this category were analysed. Regarding subject-level variables, studies collected mostly sociodemographic data such as age, gender, height, weight, years and level of cycling experience, and attitudes about safety and comfort. Thirteen out of 40 studies did not mention subject-level variables. Regarding the weather, even fewer studies controlled for weather variables: 24 out of 40 studies neither controlled for nor discussed weather variables. The studies that did mention weather mentioned mostly wind and temperature ranges. A few studies mentioned that data collection took place in same-weather conditions. Most of the studies that mentioned weather mentioned it only in the discussion section.

Conceptual framework for evaluations
This section now presents and discusses the conceptual framework, addressing the second aim of the paper. The framework is developed by integrating the review of selected literature in Sect. 3 with domain knowledge.
The key principle for the framework is that evaluations should triangulate data from multiple sources to determine which changes in experiences can be attributed to the use of SCTs. This data triangulation should include quantitative data analysis and statistical control for confounding variables. Figure 9

Cyclist experiences with SCTs
Determining the specific type(s) of experiences to be evaluated in conjunction with SCTs is a challenging endeavour. The field of research on conceptualizations of experience is continuously evolving, necessitating careful consideration and definition of the relevant dimensions and aspects to be assessed. Furthermore, the emergence of novel and advanced systems, particularly those employing Artificial Intelligence and physical intervention, is predicted to have profound effects on sensorial, perceptual, cognitive, and affective processes [48].
The types of experiences that have been evaluated in the selected studies (see Sect. 3.1) represent only a small subset of the diverse range of experiences that cyclists can have. Existing research has predominantly focused on negative experiences associated with stress, anxiety, and risk perception. It is important to recognize that cyclists encounter a much broader spectrum of experiences, which is worth exploring in future research. Table 2 summarizes existing conceptualizations of cycling experiences, and this table guides choices in future evaluations about which types of experiences to focus on.
An important limitation in the selected studies is the small number of participants, which is too small for understanding causation and variation between types of experiences. Future studies are therefore recommended to increase sample sizes and to use knowledge about choosing appropriate sample sizes [36]. Regarding SCT factors, until now, only a limited set of SCT design factors has been evaluated with body sensor data. The SCTs evaluated with body sensor data thus far were positioned in the bicycle and on the cyclist but not yet on vehicles or in the infrastructure. Table 3 summarizes additional factors in SCTs that warrant evaluation.
The limited set of SCTs evaluated so far means that more attention should be given to research that investigates experiences in which SCTs utilize real-time measurements of user experiences for Human-Computer Interaction and decision-making purposes. The following quote captures this way of using experience data effectively: "The wearable Human-Machine Interface acts as a direct communication path between humans and machines, which involves obtaining physical or electrophysiological signals from consumers and further driving the machine to perform specific functions accordingly" [142]. Studies show that this is an upcoming trend [4,5,69]. A recent dissertation, yet to be replicated, argued that SCTs that act on experience measurements lead to specific types of experiences [3]. Using experience data for the control of SCTs aligns with recent findings that neurological data in Brain-Computer Interfaces can control, for example, robots and prosthetic arms [15,97]. Generally, it is recommended to study the implications of novel, future, and highly advanced forms of Human-Computer Interaction [40,48,82,83].
It is important to acknowledge that cycling experiences have consequences, for example, on subjective wellbeing, self-identity, and travel behaviour. For example, malfunctioning or suboptimal design can cause frustration, which impacts mood, which in turn impacts subjective wellbeing. Also, it has been argued that experiences with future autonomous and intelligent systems lead to changes in one's identity and perception of self [48]. Because the focus of this paper is on evaluating experiences with SCTs, these consequences will not be discussed further but are an important avenue for future research.

Experience measurements
The previous section explained the types of SCTs and cycling experiences that can and should be evaluated. Now, it becomes important to understand which types of sensors link well to types of SCTs and cycling experiences, because different data and methods are relevant for different types of experiences.
Complementing the conceptual framework in Fig. 9, Fig. 10 provides more detail on which types of SCTs, cycling experiences, and body sensors can be linked. Specific links between SCTs, experiences, and sensors are explained in the next paragraph but were not visualized in Fig. 10 because that would make the figure incomprehensible. Figure 10 has been derived from analysis of both the literature selected via the systematic search process in this review, and literature found in adjacent fields (that is, the wider body of research on cycling experiences, SCTs, and sensor technology). The figure distinguishes between findings from both sources of literature.
An arrow from body sensors to SCTs is included, to represent the trend that SCTs increasingly use body sensor data as real-time input.
The following links emerged from the review and literature analysis process: • Although data from HR, HRV, GSR, and ST sensors has weak links to perceived stress [74], once these links are strengthened it is expected that these sensor types will be useful to study SCTs that aim to improve stress, comfort, and perceived safety during cycling.• ET can help to evaluate the impact of visual stimuli from SCTs.Existing literature confirms the value of ET for investigating links between visual stimuli and attention and distraction during experiences [49,79].• EEG readings may fit well to understanding impacts of SCTs on cognitive and mental aspects of cycling experiences, even though it may seem that measuring EEG signals during cycling is unfeasible due to technical challenges and errors induced by move-ment.Studies succeeded in capturing cognitive and mental factors during cycling via EEG readings [5,146].The advent of low-cost, wireless, lightweight, portable EEG devices is also promising for cycling research [44,110].Research has already shown high prediction accuracy for deriving emotions from EEG data [134].• To date, only one of the 40 selected articles used EMG to measure muscle fatigue.Nevertheless, literature shows that EMG has received increasing attention in experience research [124,143].EMG may be suitable to understand how SCTs impact tiredness and fatigue during cycling.
These links are intended only as starting points; the list is neither exclusive nor exhaustive. For example, one recent study used ET to measure stress of athletes in virtual reality [122], a link that has not yet been explored in the selected articles.
Due to rapid advances in sensor technology, it is challenging to list all potentially relevant sensor types. Existing reviews cover wearable sensor innovations [66,115]. Notably, these reviews point to four relevant trends in sensor technology:
1. Miniaturization and mainstream adoption of sensors like those in smartwatches [67] offer opportunities for large-scale data collection with existing hardware. A commonplace example is the inertial measurement unit (IMU), which is almost universally included in modern-day smartphones. IMUs typically include a gyroscope and an accelerometer, to measure physical balance and acceleration. They have become prevalent and can measure body movement, which links intuitively to SCTs that influence steering, braking, and accelerating [2].
2. Multi-sensor networks integrate different sensors [1,72,135], enabling richer datasets via simultaneous measurements like HRV, EMG, and brain oxygen via portable infrared imaging [7,123].
3. Implantable and ingestible sensors are on the rise [62,68,73,117,142], collecting unique data despite privacy and ethical concerns.
4. Sensors are integrated into clothing [42,104,140]. So-called e-textiles offer opportunities both for wearability and as communication devices.
Additional data sources beyond body sensors, like experience sampling methods, sensors in bicycles, surveys, and environmental recordings, are also crucial for comprehensive insights. Such sources will not be further discussed here, since they have recently been reviewed extensively [27,63,138].

Confounding variables
Within experience research, it is common knowledge that factors from the context and human participants have a strong influence on what is experienced [103]. The reviewed articles show a high number of factors that have been included in the analysis. It is beyond the scope of this paper to discuss confounding factors in the evaluation framework in detail because these factors have already been reviewed thoroughly in the literature from the transportation domain [14,56]. Validated questionnaires that can be employed before or after data collection with wearables have also been reviewed [114]. Additionally, there is a growing number of open data platforms, such as OpenCycleDataHub and National Portals of Road Data, which can provide possibly relevant data about confounding variables [85,113].
A trade-off is to be made between more controlled outdoor evaluations on the one hand and more "in the wild" approaches on the other hand. Controlled routes offer a stronger understanding of the effects of confounding variables. However, the experiences are less naturalistic, which lowers the generalizability of findings. Studies where participants can freely choose where and when they ride ("in the wild") offer a great deal of insight into the context of experiences as they occur naturally. However, these studies will be subject to larger error terms and limitations than more controlled studies [70].
It is noteworthy that the selected studies did not explicitly address attitudes and digital skills for new technologies. These factors are important because individuals with higher levels of scepticism towards new technologies and/or with lower digital skills may face potential disadvantages [37,77,128]. As an illustration, it has been shown that designing mobile technologies for elderly people requires adherence to specific design guidelines [59]. The lack of attention for factors related to attitudes and digital skills for new technologies means that these factors need more prominent attention in future evaluations.

Data analysis
To understand the impact of SCTs on cycling experience, it is necessary to answer the following important but challenging question: which changes in experiences are caused by SCTs? The selected studies show that study designs are not always suitable to substantiate claims about causation. The reasons for this lack of suitability are relatively small sample sizes, lack of statistical control for confounding variables, and lack of data analysis approaches that can deal with large quantities of labelled and unlabelled data. Relating the review of data analysis approaches to knowledge about causal inference reveals that the following is necessary to identify causes for experiences:
1. Mixed method approaches should triangulate data from multiple sources to find robust links between sensor data and experience types. Triangulation should include the ground-truthing of sensor data. Combining a body sensor system with camera recordings, experience sampling, and control for subject-level variables via pre- and post-ride surveys has been shown to be an effective combination [71]. Via this combination, "objective" sensor data can be cross-checked with subjective self-reports and confounding factors from the environment of the ride.
2. Control for the influence of confounding factors should be in place, preferably via statistical means. Studies should explicitly describe which confounding factors were controlled for and which were not controlled but could have played a role. The study by Fitch et al. [43] is an excellent example of both listing and statistically controlling for confounding factors.
3. Statistical and/or machine learning methods should be used for determining associations between factors of interest. An excellent example of an extensive approach for statistical analysis to determine relationships between chosen factors is the study by Yang et al. [141].
4. Evaluations should intensify the use of criteria for causality [57,101] to facilitate reflection about the ways in which the findings at hand can be explained.
5. Evaluations should be aware that achieving a complete understanding of all the causes for experiences is nearly impossible. This is because of the subjective, complex, and diverse nature of experiences. It is also necessary to stay aware that some types of SCTs may have only a marginal or negligible influence on cycling experiences [127]. It may be that moods, attitudes, or the environment have a stronger effect on cycling experience than SCTs.
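The second requirement, statistical control for confounders, can be illustrated with a small simulation. The sketch below is not taken from any of the reviewed studies. It assumes a hypothetical SCT that lowers a stress score by 2 units, and it assumes that cyclists switch the SCT on more often in heavy traffic, which itself raises stress. All names, effect sizes, and the data-generating process are invented for illustration. The adjusted estimate uses the Frisch-Waugh-Lovell residualization, which yields the same treatment coefficient as an ordinary least-squares regression of stress on SCT use and traffic.

```python
import math
import random


def simple_slope(x, y):
    """Slope b of the least-squares line y ~ a + b*x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after removing its linear dependence on x."""
    b = simple_slope(x, y)
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]


def adjusted_slope(treatment, confounder, outcome):
    """Treatment coefficient in outcome ~ treatment + confounder."""
    t_res = residuals(confounder, treatment)
    y_res = residuals(confounder, outcome)
    return simple_slope(t_res, y_res)


def simulate_rides(n=5000, true_effect=-2.0, seed=0):
    """Hypothetical rides: traffic drives both SCT use and stress."""
    rng = random.Random(seed)
    sct_on, traffic, stress = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # standardized traffic volume (confounder)
        p_on = 1.0 / (1.0 + math.exp(-1.5 * z))  # SCT used more in heavy traffic
        t = 1.0 if rng.random() < p_on else 0.0
        y = 50.0 + true_effect * t + 5.0 * z + rng.gauss(0.0, 2.0)
        sct_on.append(t)
        traffic.append(z)
        stress.append(y)
    return sct_on, traffic, stress


sct_on, traffic, stress = simulate_rides()
naive_estimate = simple_slope(sct_on, stress)  # ignores traffic
adjusted_estimate = adjusted_slope(sct_on, traffic, stress)  # controls for traffic
```

In this simulated setting, the naive comparison suggests that the SCT increases stress, because SCT use coincides with heavy traffic, whereas the adjusted estimate lies close to the assumed effect. Real evaluations would involve more confounders and, typically, multilevel models, but the same logic applies.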
It is noteworthy that none of the selected studies used approaches such as Bayesian networks, Granger causality, and deep learning. The multilevel regression models that are currently used in many of the reviewed studies are not able to address, for example, raw camera and microphone recordings. This is noteworthy because it is reasonable to expect that future datasets about cycling experiences will increase in size and complexity. Automated data retrieval about subject- and context-level variables via road and weather authorities is expected to enrich these datasets. Additionally, the number of participants is expected and recommended to increase. With complex datasets containing both labelled and unlabelled data from 100s, 1000s, or even more individuals, more advanced data analysis approaches are necessary.
Last, it is remarkable that only one of the selected studies presented or discussed optimal ranges for experience measurements. Only the study by Mantuano et al. [78] reported that an equilibrium in visual attention was found in the combination of attention for the central trajectory and lateral parts of the visual scene. For stress and excitement, a similar balance is expected to exist because it can be argued that cyclists desire to reach a middle ground between a too exciting and a too boring ride. The domain of HRV and stress response analysis provides insights for presenting and discussing such optimal ranges, since various norms for HRV measures are available [17,116].

Discussion
This section will now briefly discuss the outcomes, limitations, and recommendations of this paper.

Outcomes
The first key outcome from the paper is the systematic literature review. The literature review showed that research has focused on evaluating stress response, mostly using data from HR, EDA, and ST sensors in chest belts and wristbands. The current review found only 7 sensor-based evaluations of an SCT. These studies had relatively small sample sizes, no probability sampling approaches, and no extensive control for confounders. This finding means that methods for evaluating SCTs are falling behind research in other domains such as affective gaming and neuromarketing [8,50,95,98]. The relatively low number of sensor-based evaluations of SCTs is remarkable, considering the importance of subjective cycling experiences and the rise of SCTs [63,89]. The findings imply that future research should focus on different types of experiences, including positive experiences, with different types of multimodal sensor networks. This is especially important because cyclists become more susceptible to positive experiences once basic safety conditions are met [54]. The findings also imply that larger-scale field trials are necessary with randomly selected participants, different types of SCTs, and more advanced analysis tools.
The second key outcome from the paper is the conceptual framework. The framework synthesizes insights from research on sensor systems, human-computer interaction, cycling experience, machine learning, and more. The synthesis provides crucial factors and methods for future impact evaluations, as practical guidance for experts who prepare and conduct evaluations. For example, what experiences are to be included, and how are they to be measured, in a naturalistic evaluation of an intelligent speed adaptation (ISA) system for cyclists with 500 participants and multimodal sensor data? An ISA system can provide speed advice to cyclists and may in the future reduce motor assistance levels. Recent progress by the City of Amsterdam motivates an interest in evaluating ISA [61]. A key principle in the framework is that, ideally, data from multiple sources is triangulated in evaluations.
It is worth pointing out how the framework guides evaluation in practice. Findings in the first category of the framework, "experiences with SCTs", mean that experts should choose which aspects of experience and SCTs to focus on. If we follow the example of evaluating ISA, one possible avenue is to understand if and how changes in motor support levels are linked to physiological measures and to psychological flow. Understanding these links can help to design an optimal ISA system, for example, by identifying how interventions help and harm flow. Such an approach complements a more traditional approach of evaluating objective traffic flow in terms of throughput and cycling speeds [107]. Findings in the second category, "experience measurements", mean that new types of sensors integrated in new materials, positions, and form factors provide new avenues for understanding subjective dimensions of cycling. Examples include integrating infrared imaging sensors into headbands [126], push buttons on bicycle handlebars for experience sampling [90], and using a combination of HRV and cadence data to measure psychological flow [18]. The third category, "data analysis", provides insights on methods for moving from association to causation. Criteria to establish causality help to understand whether associations are strong, consistent, specific, biologically plausible, and so forth [57]. Example questions include whether the effects of an ISA system are reproducible in various countries and cultures, how patterns in sensor data match ground truths from experience sampling approaches, and how the participant recruitment strategy influences analysis results. Findings in the fourth and last category, "confounding variables", emphasize a need to control for other variables at play. Public databases can provide the data for such variables, for example, about aspects like road width, traffic volume, weather, and so forth.
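As an illustration of matching sensor data to ground truths from experience sampling, the following minimal Python sketch pairs each self-report (for example, a push-button rating on the handlebar) with the mean sensor value in the time window that precedes it. The 30-second window, the function names, and the data layout are assumptions for illustration and are not taken from the reviewed studies.

```python
from bisect import bisect_right


def window_mean(times, values, t_end, window_s):
    """Mean of sensor values with timestamps in (t_end - window_s, t_end].

    `times` must be sorted in ascending order (seconds since ride start).
    Returns None when the window contains no samples.
    """
    lo = bisect_right(times, t_end - window_s)
    hi = bisect_right(times, t_end)
    if hi == lo:
        return None
    return sum(values[lo:hi]) / (hi - lo)


def label_windows(times, values, reports, window_s=30.0):
    """Pair each (timestamp, rating) self-report with the preceding sensor mean."""
    pairs = []
    for t_report, rating in reports:
        mean_value = window_mean(times, values, t_report, window_s)
        if mean_value is not None:  # skip reports without sensor coverage
            pairs.append((mean_value, rating))
    return pairs


# Hypothetical data: one sensor sample every 10 s and two self-reports.
sensor_times = [0, 10, 20, 30, 40, 50, 60]
sensor_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
self_reports = [(30, 2), (60, 5)]
labelled = label_windows(sensor_times, sensor_values, self_reports)
```

The resulting pairs of sensor summaries and ratings can then serve as labelled data for the statistical or machine learning methods discussed in the "data analysis" category.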
Overall, research on the future of Human-Computer Interaction suggests that it is valuable to imagine future interactions with SCTs as a type of collaboration between humans and Artificial Intelligence. In such a collaboration, SCTs may use real-time body sensor data to deliver communications and interventions to cyclists via biofeedback mechanisms. Attempts to understand the effects of human-AI collaboration while cycling have started [4], and continuing this avenue of research holds tremendous potential.

Limitations
The paper is subject to the limitation that the specific search query may have left out articles due to little consistency in terminology in this field of research [11,63].The current review should therefore be considered an extensive exploration rather than a complete overview.
Another limitation is that this review excluded studies that evaluated an SCT without body sensor data. These studies were excluded because they did not match the aim of the study. Excluding these studies leaves this review with a relatively low number of studies that used body sensor data in evaluating experiences with SCTs: only seven studies were found. Studies on SCTs without body sensors may provide valuable insights about, for example, interaction patterns, communication modalities, effect sizes, and control for confounders. Thus, reviewing such studies is part of the recommendations for future research.

Research directions
Seven recommendations for future research emerge from the literature review and conceptual framework. The first recommendation is to validate the conceptual framework in practice, to establish whether the framework captures all necessary and relevant factors and methods. The framework can then help to develop knowledge about which factors in SCTs are most and least effective in supporting subjective experiences.
The second recommendation is to broaden the perspective on cycling experiences, to include more positive, diverse, and nuanced aspects of cycling experience. Examples include experiences of shared flow, transformational experiences, and changes in self-perception. Reference lists are desirable to know which measurement values point to "optimal" experiences with SCTs, analogous to a list of normal values for heart rate variability [87].
The third recommendation is to utilize advancements in multimodal body sensor networks. This should involve interdisciplinary efforts that draw on knowledge from affective computing [108,114], sensors inside the human body and in e-textiles [129,140], and multi-sensor networks [7,16,52].
The fourth recommendation is to tailor computational methods for causal analysis to large and unstructured datasets about cycling experience. Analyses of cycling experience data would benefit from advancements in sensor fusion, deep learning, and neural networks [135]. Close attention should be paid to statistical control for confounding variables. Automating the retrieval of weather, traffic, and spatial data [138] is expected to help in this regard. Attitudes and skills related to new technologies need more attention [37]. Validated rating scales should be used more often to establish questionnaires for self-reports and sociodemographic data [114].
The fifth recommendation is to improve consistency in reporting. Although it sounds trivial, future research papers should better describe and justify research design aspects such as experience type, sample sizes, sensor types, bicycle types, confounding variables, route choice, and so forth, to allow better comparison of results. Also, considering the wide variety of terminology used to describe types of cycling experience, the theoretical grounding and consistency of key terminology for cycling experiences and SCTs should be strengthened.
The sixth recommendation for further research is to widen the scope of the review. For example, evaluations of SCTs without body sensors and commercial frameworks for measuring cyclist emotions [24,34,120,131] could be included.
The seventh and last recommendation relates to SCTs that are increasing in smartness level [64]. Such systems are increasingly using body sensors in real time and are also increasingly intervening physically in the ride [22,32]. Therefore, SCTs that intervene physically, with autonomy and intelligence, and with biofeedback systems should be investigated more closely. In this regard, cycling experience research may benefit from alignment with the field of "Human-Computer Integration" [39,48,82].

Conclusions
To conclude, this paper responds to a need to promote cycling and to evaluate how cycling can be made safer and more attractive. Numerous types of SCTs are emerging and it is important to have robust methods to establish whether applications like green wave systems, ISA, and collision avoidance systems truly improve cyclist perceptions, emotions, and feelings. That is necessary because such subjective dimensions influence decisions to start and keep cycling. Sensor-based methods to evaluate impacts on these subjective dimensions had not been well investigated in cycling experience research. To contribute to this context, this paper set out to conduct a systematic review and to develop and present a conceptual framework, both to guide future sensor-based evaluations of the impact of SCTs on cycling experience.
Regarding the aim to conduct a systematic literature review, it is concluded that most of the reviewed studies (n = 40) do not use a research design that is readily suitable for large-scale naturalistic evaluations of experiences with SCTs. Exploratory studies with non-random sampling and a lack of strong control for contextual variables limit the understanding of cause-effect relationships. Nevertheless, some studies provide useful insights, for example the studies that used experience sampling for collecting ground truths for sensor data. To deal with these drawbacks and opportunities, this study developed the conceptual framework for future evaluations.
Regarding the aim to develop and present a conceptual framework for future impact evaluations, it is concluded that the key principle for future evaluations is that data from multiple sources should be triangulated. Triangulation is important to future evaluations, to distinguish between effects of SCTs and effects of other context- and subject-level variables. For example, emotions during a bike ride could be due to an intervention by an SCT, a disturbing phone call, or an aggressive car driver. Confounding variables should be statistically controlled, to enrich the understanding of cause-effect relationships. Studies with SCTs that use biofeedback and collaboration between cyclists and artificial intelligence are especially worthy of future research.
Altogether, this paper provides important guidance for future evaluations of the impact of SCTs on cycling experience. These evaluations contribute to the design of future support systems for cyclists, thereby unlocking the myriad advantages associated with cycling.

Fig. 3 Types of experiences evaluated in selected studies

Fig. 4 Types of sensors used in selected studies

Fig. 6 Route types in selected studies

Fig. 7 Number of participants in selected studies

Fig. 9 A conceptualization of evaluations of the impact of SCTs on cycling experience

Fig. 10 SCTs, cycling experiences, and body sensors that should be linked in evaluations

Table 1 A summary of key characteristics of selected studies

Fig. 9 visualizes the framework with factors in blue and methods in yellow, grouped into four categories: experiences with SCTs, experience measurements, confounding variables, and causal analysis. The next sections explain each of the categories.

Table 1 (continued)
ECG electrocardiogram, PPG photoplethysmogram (ECG and PPG sensors were used to measure heart rate and heart rate variability), EDA electrodermal activity, EMG electromyography, ET eye tracking, ANOVA analysis of variance

Table 2 Conceptualizations of cycling experiences
Rundio et al. [105]: Experiences of personal transformation, e.g., identity change in the case of an elderly person losing the ability to cycle. Extraordinary experiences, e.g., experiences of mindset and habit changes after a 10,000 km cycling journey
Andres [3]: SCTs as thrillers, partners, detractors, and assistants, which result in 12 different types of cycling experiences. E.g., SCTs as thrillers lead to experiences of competition, SCTs as detractors lead to experiences of discouragement
Kalra et al. [63]: Perceived safety, perceived comfort, aggression, anxiety, risk perception, emotional stress, conflicts, threats