Since the advent of moving images and audiovisual media, tempo has been manipulated in various forms. Media consumers, whether watching videos on social networks, commercial films, or video clips of musical performances, are increasingly confronted with displays of human motion that deviate from standard movement patterns. Given the high relevance of body movements and visual information for the perception of musical performances (Platz & Kopiez, 2012) and the vital link between tempo, arousal, and emotion (Droit-Volet et al., 2013), this study investigated how tempo-manipulated point-light recordings of dance movements, accompanied by drumbeats, were emotionally perceived in relation to original, unmanipulated movements at the same speeds.
Manipulating time by slowing down or speeding up has been a common technique for elevating emotions in films (Wöllner et al., 2018), commercial video marketing (Yin et al., 2021), or the replay of highlights in sports matches and games (Pan et al., 2001). In the film Matrix (Wachowski & Wachowski, 1999), when Neo bent backward to dodge the bullets in slow motion, film critics called this moment “bullet time” and referred to this moment as majestic and “incredibly cool to see, even after so many years since the film's release in 1999” (Dutta, 2022, para. 1). In contrast, accelerated visual scenes usually speed up the processes that otherwise span over hours and longer in standard time or create an amusing effect such as jittered human movements. Understanding the effects of tempo manipulation on emotions can help leverage such tools by acceleration and deceleration to capture attention (Compton et al., 2003), enhance memories (Levine & Pizarro, 2004), or change the perceived time (Droit-Volet & Berthon, 2017). In music and dance practices, by manipulating tempo, artists can shift the emotional expressions in their performances to create unique interpretations of the same work (Quinn & Watt, 2006). Moreover, tempo manipulation may also be useful in music therapy when assisting treatment (Brownlow, 2017) or regulating emotional experiences.
Relatively few studies have shed light on how tempo manipulations affect emotions. Decelerated slow-motion film scenes, ballet, and sports excerpts were perceived as lower in emotional arousal and more positive in valence than tempo-matched, real-time footage (Wöllner et al., 2018). These results were also present in the physiological responses of participants. Further analyses with tempo-adapted film scenes found smaller pupil dilation and higher fixation ratios with slow-motion than with the real-time videos (Hammerschmidt & Wöllner, 2018), suggesting a lower arousal level and increased attention to detail. The findings are consistent with research that found decreased arousal with slow music (Droit-Volet et al., 2013). Furthermore, manipulation of movement speed is frequently employed in popular culture to enhance the audience’s attention to the scenes' details and increase aesthetic pleasure (Nebens, 2021). However, to our knowledge, no studies have directly examined the effects of tempo manipulation on the two dimensions of emotions: Arousal and valence.
To examine the link between movement speeds and emotions, it is crucial to define how emotions are theorized and measured. Research has developed several systems to describe emotions (Mauss & Robinson, 2009). Among them, the circumplex model (Russell, 1980), which entails a two-dimensional system of valence and arousal, has been adopted to evaluate facial expressions (Calder et al., 2001) and human movements (Pollick et al., 2001). Following Pollick et al.’s (2001) research, several studies suggest that emotion inferences were also possible from static body postures (Coulson, 2004), gestures (Castillo & Neff, 2019), gait (Kang & Gross, 2016), social interaction (Clarke et al., 2005), movements in joint musical improvisation (Wöllner, 2020), and dance movements (Burger & Toiviainen, 2020a). Together, these findings indicate a connection between movement features and the two emotion dimensions.
Studies further investigated the link between emotions, kinematic features, and movement angularity, the latter referring to the perspectives and geometrical relations of movements in a three-dimensional space. Castro and Boone (2015) found a connection between the accuracy of emotion perception and movement angularity. In this case, happy and sad body postures, being located respectively in the circumplex model, were recognized more accurately when participants were more sensitive toward the angles of geometric line patterns. Kinematic features may be critical indicators of emotions when viewing full-body movements. Several kinematic features have been identified to relate to specific basic emotions. A further study on the motion-emotion link revealed that music-induced emotions, reflected in spontaneous dance movements, were characterized by a set of kinematic features (Burger et al., 2013). In this regard, movement fluidity provides an index quantifying the smoothness of the motion, whereas complexity relates to how many dimensions or directions of the movement are used, with higher dimensionality being more complex than movements of lower dimensionality. Burger et al.’s (2013) study indicated that high fluidity correlated with music-inducing emotions of low arousal yet moderately positive valence. In contrast, high movement complexity correlated with music that induced emotions of moderately high arousal and positive valence. This finding was in line with other studies that suggested similar features, and movement fluidity and complexity were reliable indicators of basic emotional states such as happiness or sadness when enacting (Van Dyck et al., 2013) and perceiving human motions (Camurri et al., 2003; Montepare et al., 1999).
Tempo changes influence kinematic features. A study with professional drummers performing at various tempi found that movement fluidity was higher at slow than at fast tempi, while movement complexity was highest at the slow tempo, when the amount of the drummers' movement peaked (Burger & Wöllner, 2023). This finding indicates an influence of tempo but not of tempo manipulation. It is yet to be established whether acceleration and deceleration of motion affect the kinematic features in visual displays. Motion capture with point-light displays (PLDs) is often used to derive kinematic features. PLDs of human movements, initially developed by Johansson (1973), are created by attaching reflective markers to the body and recording the movement of these markers with an optical motion capture system, yielding an accurate three-dimensional representation of the movement (e.g., Burger & Toiviainen, 2020b). PLDs have been widely used in emotion research. A study investigating participants’ discrimination sensitivity towards fluctuations in emotional intensity with PLDs suggested that dynamic PLDs were associated with higher sensitivity than static PLDs, on a level comparable to full-light displays (Atkinson et al., 2004). These findings show the potential for accurate emotion recognition with dynamic PLDs. In addition, PLDs do not carry confounding factors such as the age or clothing of the performer, making them ideal for experimental designs.
Parallel to kinematic features, movement tempo influences the perceived emotions. Temporal attributes of PLDs were found to strongly predict the actor's emotional states in motion-emotion research (Pollick & Paterson, 2008). Fast movements were often associated with happiness and anger, while slow movements were associated with sadness and a neutral mood (de Meijer, 1989; Montepare et al., 1999; Pollick et al., 2001; Roether et al., 2009). Taking this a step further, according to the circumplex model (Feldman Barrett & Russell, 1998), a fast tempo was related to a high arousal level and a slow tempo to low arousal. In contrast, slow and fast movements were found at both ends of the spectrum of the valence scale. While the link between tempo and arousal has often been observed, the correlation between movement tempo and emotional valence is less clear. It cannot be ruled out that a specific tempo represents multiple emotions regarding valence. In a passive viewing task, fast movements were rated angry or happy, while slow movements were rated sad or neutral (Montepare et al., 1999). The ambiguity of the tempo-valence relationship in identifying dance-conveyed emotions was echoed by studies adopting local movements such as knocking (Gross et al., 2010), drinking (Pollick et al., 2001), and walking (Roether et al., 2009). The findings, therefore, call upon further validations, which are part of the goals of the current study.
In addition to tempo, a movement's natural appearance may affect the perceived emotions. The tempo-original and -manipulated movements may be distinguishable from how natural they look. Despite few studies on this topic, naturalness has been frequently used to measure the validity of artificially generated motion in virtual reality (Knopp et al., 2019). Highly natural movements represent a crucial overlap between the temporal patterns of the virtual and realistic movements that match human prediction. Interestingly, Nilsson and colleagues (2015) found that the tempo thresholds for natural treadmill walking in virtual reality are higher than walking in place (WIP), suggesting expectations for the pace of perceptually natural walking depend highly on contexts (treadmill or WIP). The definition of natural walking tempo differs by whether a person is moving forward or not. The finding indicates the possibility that viewers who controlled the avatar for walking in virtual space also considered the temporal attributes of different types of motion. Similarly, when viewing real-time compared to decelerated or accelerated movements, naturalness may be tied to how much they fit with typical, realistic movement features at the corresponding speeds. However, research has yet to explore the connection between tempo and tempo manipulation and perceived naturalness. Although no direct evidence has been found, Chen and colleagues’ (2023) study revealed that prototypical (most representative of the motion type) walking was perceived as more natural and aesthetically pleasing than atypical walking. This study suggested that high naturalness mediates the effects of visual attributes on aesthetic pleasure. Such an effect might be extended to positive emotional valence. The current study aims to investigate whether tempo manipulation affects the perceived naturalness of body movements and, if yes, whether increases in naturalness affect emotional valence positively.
Similar to tempo, the sensory modality also affects the perceived emotions. Preferences for multimodal rather than unimodal information were found for an emotion recognition task, in which facial expressions of fear and disgust, with or without vocal sounds of the consistent emotions, were presented to viewers (Collignon et al., 2008). A significant improvement in emotion discrimination performances was observed when PLDs of human movements were integrated with voices that expressed affective states, such as anger and fear, compared to neutral voices (Jessen et al., 2012). Multimodal information also increases perceived arousal. A study investigated how visual kinematic features and auditory information contribute to emotion perception in musical performances (Vuoskoski et al., 2016). The findings suggest that the audiovisual condition was perceived higher in emotional arousal than the visual-only condition, while the latter was also rated less positive in valence. The impact of tempo (e.g., comparing 72 BPM to 184 BPM in Droit-Volet et al., 2013; Wöllner et al., 2018), particularly tempo-manipulated movements, is yet to be investigated. Tempo acceleration and deceleration are typically accompanied by auditory information in movies, commercials, or sports, giving rise to strong emotional responses (e.g., Wöllner et al., 2018; Yin et al., 2021). Understanding the effect of sensory modalities in tempo manipulation will shed light on the realistic use of movements in audiovisual media. Therefore, another goal of the current study is to compare visual and audiovisual presentations of the same movements in tempo-original and -manipulated conditions for their effects on the perceived emotions.
Taken together, the current study aims to investigate how real-time (original), decelerated, and accelerated (audio-) visual stimuli influence perceived emotions. With the results, we hope to provide insights into how the presentations of music and dance performances affect the viewers’ emotional experiences. Additionally, we hope to offer solutions to regulate one’s moods by managing the tempo of videos and music playlists in everyday life.
We predict that tempo manipulation affects the movement features, which in turn affect emotional valence and arousal. When movements are presented at decelerated speeds, fluidity should increase and complexity decrease, leading to lower perceived arousal and higher perceived valence than in the tempo-original condition (Burger et al., 2013). Furthermore, we hypothesize that tempo manipulation as an independent variable affects naturalness as a dependent variable: The larger the extent of manipulation (manipulated – original tempo), the less natural the body movements are perceived to be. We further expect that naturalness mediates the effect of tempo manipulation on perceived valence and arousal. The larger the extent of manipulation, the less natural the stimulus is perceived to be, and the more negative the valence and the lower the arousal should be. Therefore, naturalness will act as an independent variable in this analysis to predict changes in emotional arousal and valence as dependent variables. Following Pollick et al. (2001), we hypothesize that a faster presentation tempo leads to higher perceived arousal. Finally, the presentation modality is expected to moderate the perceived emotions such that the presence of auditory drumbeats, synchronized with the visual movements, should lead to higher arousal and more positive valence.
Method
Participants
To determine the minimum sample required to test our hypotheses, an a priori power analysis was conducted using G*Power Version 3.1.9 (Faul et al., 2009), suggesting a minimum of 53 participants to achieve 80% power with a medium effect size (f² = 0.3) for linear multiple regression (fixed model; significance level α set at 0.05). An international sample of 62 participants was recruited for an online experiment using the platform SoSci Survey (Leiner, 2019) (29 females, one gender undisclosed; Mage = 29.23 years, SDage = 8.83). According to the Musical Training dimension of the Goldsmiths Musical Sophistication Index (Müllensiefen et al., 2014), participants had received an average of 3.31 years of music training (SD = 4.06). The Dance Training dimension of the Goldsmiths Dance Sophistication Index (Rose et al., 2020) suggested that participants had, on average, 0.79 years of active dance training (SD = 1.63). This suggests a low prevalence of professional music and dance training in the sample. The majority of the sample had achieved bachelor’s (N = 22) or master’s (N = 27) degrees, representing a generally well-educated group (overall 79%). The study was approved by the Ethics Committee at the Faculty of Humanities, University of Hamburg, and participants provided their informed consent before the study. A lottery of two prizes worth € 30 was carried out at the end of data collection among those who had opted to leave their email address.
Apparatus
The stimuli consisted of visual and audiovisual presentations of human movements at three tempi: 86 (slow), 130 (medium), and 195 BPM (fast). The visual stimuli, detailed in Allingham et al. (2021), were human movements recorded by an eleven-camera motion-capture system (Qualisys Oqus 700) at a frame rate of 200 frames per second. The performer (male, 32 years old) jumped from one leg to the other while raising the arms parallel to the ground, flexing and extending the wrists ipsilaterally to the leg motions. The movements, each lasting 10 seconds, entailed bilateral hand flaps and left-right jumps (see Figure 1) and were recorded at the three above-mentioned original tempi. The MATLAB Motion Capture (MoCap) Toolbox (Burger & Toiviainen, 2013) was used to time-shift the original data and create animations that matched each original movement with the other two tempi; for example, the performance at 130 BPM was slowed down to match 86 BPM as well as sped up to 195 BPM (see Figure 2).
Figure 1
Figure 2
The tempo transformation yielded nine video excerpts: slow-original, slow-to-medium, slow-to-fast, medium-original, medium-to-slow, medium-to-fast, fast-original, fast-to-slow, fast-to-medium. A slow-to-fast stimulus, for example, has an original tempo of 86 BPM and a presentation tempo of 195 BPM. Additionally, nine temporally synchronized audiovisual presentations were produced by aligning the visual material with an auditory drum beat (sequence of isochronous beats at the respective BPM synthesized in an online beat generator, https://drumbit.app, in Apple iFilm 10.1.12). In addition, the stimulus set included eight catch trials, which varied in duration to examine if participants paid attention to the displays. Half of the catch trials lasted 5 seconds, while the other half took 15 seconds. All of them were presented with synchronized drumbeats. In total, 18 experimental trials and eight catch trials were created.
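The nine-cell design can be written out programmatically. The following is only an illustrative sketch: the `rate` field (presentation tempo divided by original tempo) is a helper quantity introduced here to show how strongly each recording is sped up or slowed down, and it is not a variable reported in the study.

```python
from itertools import product

# Original performance tempi in BPM, as reported in the text.
TEMPI = {"slow": 86, "medium": 130, "fast": 195}


def build_conditions(tempi=TEMPI):
    """Return one record per (original tempo, presentation tempo) pairing."""
    conditions = []
    for (orig_name, orig_bpm), (pres_name, pres_bpm) in product(tempi.items(), repeat=2):
        if orig_name == pres_name:
            label = f"{orig_name}-original"
        else:
            label = f"{orig_name}-to-{pres_name}"
        conditions.append({
            "label": label,
            "original_bpm": orig_bpm,
            "presentation_bpm": pres_bpm,
            # Hypothetical helper: > 1 means accelerated, < 1 means decelerated.
            "rate": pres_bpm / orig_bpm,
        })
    return conditions


conditions = build_conditions()
for c in conditions:
    print(f"{c['label']:>16}: {c['original_bpm']} -> {c['presentation_bpm']} BPM, "
          f"rate = {c['rate']:.2f}")
```

Crossing the three original tempi with the three presentation tempi gives the nine visual excerpts; presenting each with and without drumbeats gives the 18 experimental trials.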
Procedure
Invitations to the online experiment on SoSci Survey (https://www.soscisurvey.de) were distributed through email lists and social media sites. Participants were provided information about the experiment and asked to give their informed consent. Due to the restrictions of online experiments, no control was imposed on screen resolution, the distance to the screen, or the device through which sounds were played. However, at the beginning of the experiment, explicit instructions were given that participation should take place in a quiet environment on a computer with compatible browsers (Chrome or Firefox) and with head- or earphones. A 15-second music excerpt was presented to test the sound volume. Participants were instructed to adjust to a comfortable sound level and to keep the level consistent throughout the experiment. They were then asked to fill in demographic information and short questionnaires about their music training and active dance experience (Factor 3 Musical Training from the Goldsmiths Musical Sophistication Index, Müllensiefen et al., 2014, and Factor 4 Dance Training from the Goldsmiths Dance Sophistication Index, Rose et al., 2020).
In the experiment, participants were presented with two blocks of randomized stimuli, each consisting of 18 experimental trials, including the complete sets of audiovisual and visual-only presentations, and four catch trials balanced in length (two of 5 s and two of 15 s). The stimulus (700 x 394-pixel resolution) appeared at the center of the screen. A repeated-measures design was used to control for within-subject variability and increase the experiment’s efficiency. Following each stimulus, participants were asked to rate emotional arousal from 1 (calm) to 7 (excited), emotional valence from 1 (negative) to 7 (positive), and naturalness from 1 (unnatural) to 7 (natural). No time restriction was imposed, though each video could be watched only once. A test trial using a different dancer was presented at the beginning to familiarize the participants with the experiment. With regard to the catch trials, all participants differentiated reliably between the different durations of catch trials and experimental trials, F(2, 2481) = 68.46, p < .001. The effect size (η² = 0.05) indicates a medium effect. Therefore, no participant was excluded.
Analyses
In the first analysis, the model was intended to determine whether tempo manipulation influenced the perceived naturalness, which, in turn, might influence emotional arousal and valence. A mixed linear regression was conducted. A group of independent variables was adopted to identify possible predictors of the perceived naturalness from a group of relevant variables: Movement complexity, fluidity, presentation tempo, tempo manipulation, and stimulus modality. The multilinear regression can be found in Equation I in the Appendix. The model was selected based on the lowest Akaike Information Criterion (AIC) values. To select the model of the highest goodness of fit for each dependent variable, maximum likelihood ratio (MLR) tests were conducted (see Table A4 in the Appendix). In the MLR tests, predictors were added one after another from the baseline model, in which only the random effects were present. The variances of participants and conditions were considered as random effects. While adding significantly to the previous model, models with the lowest AIC were considered the final models.
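The stepwise logic described above (add a predictor, test it against the previous model, keep it only if the AIC also drops) can be sketched as follows. This is a simplified illustration, not the authors' analysis script: it assumes each step adds exactly one parameter, it stops at the first step that fails, and the log-likelihood values are made-up placeholders rather than results from the study. The function names are likewise hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def lr_test_1df(loglik_reduced, loglik_full):
    """Likelihood-ratio test for nested models differing by ONE parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(x / 2)).
    """
    stat = max(2 * (loglik_full - loglik_reduced), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2))


def select_model(nested_models, alpha=0.05):
    """Walk a chain of nested models given as (name, log_likelihood, n_params).

    A step is accepted only if it improves fit significantly AND lowers the AIC;
    the walk stops at the first step that fails (a simplifying assumption).
    """
    best = nested_models[0]
    for candidate in nested_models[1:]:
        _, p_value = lr_test_1df(best[1], candidate[1])
        if p_value < alpha and aic(candidate[1], candidate[2]) < aic(best[1], best[2]):
            best = candidate
        else:
            break
    return best


# HYPOTHETICAL log-likelihoods, for illustration only (not values from the study).
models = [
    ("random effects only", -4000.0, 4),
    ("+ tempo manipulation", -3990.0, 5),
    ("+ presentation tempo", -3989.5, 6),
]
final = select_model(models)
print(final[0], aic(final[1], final[2]))
```

With these placeholder values the second predictor is rejected because it neither improves the fit significantly nor lowers the AIC.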
The dependent variable for Equation I is perceived naturalness, and the participants rated it on a 7-point scale in response to the question, “Please rate how natural the movement feels.” 1 represents the least natural movement, and 7 represents the most natural movement. The predictor variables were created in the same way as the variables for the second and third models (Equations II and III).
The second and third analyses also used a mixed linear regression to identify possible emotional arousal and valence predictors from relevant variables: Movement complexity, fluidity, perceived naturalness, presentation tempo, tempo manipulation, and stimulus modality. Post-hoc analyses with Bonferroni correction were conducted to follow up on significant main and interaction effects. The independent variables selected include movement complexity and fluidity. The models can be found in Equations II and III in the Appendix.
The dependent variables from the arousal and valence models include the following:
Perceived arousal: Ratings from 1 (calm) to 7 (excited).
Perceived valence: Ratings from 1 (negative) to 7 (positive) in response to the question, “Please rate whether the emotion of the video is negative or positive.”
The independent variables from all models above include the following:
Fluidity of body movement: The smoothness of the movement (the ratio between velocity and acceleration). This variable was standardized to range from 1 (least fluid) to 2 (most fluid) using the equation ((x – min(x)) + (max(x) – min(x))) / (max(x) – min(x)), which is equivalent to (x – min(x)) / (max(x) – min(x)) + 1.
Movement complexity: This refers to the dimensionality of the movements, which is based on a Principal Component Analysis of the movement data. In this case, the movement is considered simple when the first five principal components can explain a large amount of the variance. The movement would be more complex when the first five components explain less variance. See Burger et al. (2013) for more explanation of the features. This variable has also been standardized from 1 (least complex) to 2 (most complex).
Modality: The modality (visual-only, audiovisual) of each stimulus.
Presentation tempo: The tempo presented to the participants after manipulation. For tempo-original stimuli, the presentation tempo equals the original tempo.
Tempo manipulation: The extent and direction of tempo manipulation, where -1 represents the largest deceleration, 0 represents no manipulation, and 1 represents the largest acceleration. The index is calculated as follows: First, the gap between the manipulated and original tempo is calculated as x. The new minimum (new_min) is set to -1, representing the largest negative gap, and the new maximum (new_max) is set to 1, representing the largest positive gap. Then x is linearly rescaled as ((x – min(x)) / (max(x) – min(x))) * (new_max – new_min) + new_min.
Perceived naturalness: Ratings from 1 (least natural) to 7 (most natural) were selected as a control variable to disentangle the contributions of the movement features to changes in emotional valence and arousal.
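The two rescaling steps described in the variable list can be illustrated with a short sketch. It assumes ordinary min-max rescaling, as the definitions above describe; the function name `rescale` and the fluidity input values are hypothetical and serve only to show the mapping. The tempo gaps, by contrast, follow directly from the three tempi given in the text.

```python
def rescale(values, new_min, new_max):
    """Linearly map values so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant sequence")
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


# Tempo manipulation index: gap = presentation tempo - original tempo (in BPM).
TEMPI = (86, 130, 195)
pairs = [(orig, pres) for orig in TEMPI for pres in TEMPI]
gaps = [pres - orig for orig, pres in pairs]
manipulation_index = dict(zip(pairs, rescale(gaps, -1.0, 1.0)))

for (orig, pres), index in manipulation_index.items():
    print(f"{orig:>3} -> {pres:>3} BPM: index = {index:+.2f}")

# Fluidity and complexity use the same mapping with a target range of 1 to 2.
# HYPOTHETICAL raw feature values, for illustration only.
fluidity_scaled = rescale([2.0, 5.0, 8.0], 1.0, 2.0)
```

Because the largest deceleration (195 to 86 BPM) and the largest acceleration (86 to 195 BPM) are equal in size, unmanipulated stimuli land exactly at 0 on this index.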
Results
Two kinematic features were extracted from the motion capture data of the nine tempo-original and tempo-manipulated performances. Since original performance tempi were tempo-manipulated in two directions (acceleration, deceleration) or stayed the same, movements at the same presentation tempo may have different fluidity and complexity features (see Figure 3), depending on whether they were accelerated, decelerated, or performed initially at this speed.
Figure 3
Pearson correlation coefficients were computed to examine the linear relationship between presentation tempo and movement features. Negative correlations between tempo and fluidity (r = –0.89, p < .001) and between manipulation (acceleration) and fluidity (r = –0.92, p < .001) were found. Thus, the faster the presentation tempo and the larger the extent of tempo acceleration, the less fluid the body movements. Positive correlations were found between tempo and complexity (r = 0.09, p < .001) and between acceleration and complexity (r = 0.70, p < .001), indicating that the faster the presentation tempo and the larger the extent of acceleration, the more complex the movements.
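For readers who want to see the computation spelled out, Pearson's r can be written in a few lines. The kinematic feature values are not reproduced in the text, so the sketch below applies the formula to two design variables instead: the presentation tempo and the tempo gap (presentation minus original tempo) of the nine stimuli. The function name is hypothetical, and the example is an illustration of the statistic, not a reanalysis of the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Design variables of the nine stimuli (three original x three presentation tempi).
TEMPI = (86, 130, 195)
presentation = [pres for orig in TEMPI for pres in TEMPI]
gap = [pres - orig for orig in TEMPI for pres in TEMPI]

r = pearson_r(presentation, gap)
print(f"r(presentation tempo, tempo gap) = {r:.3f}")
```

In a fully crossed design of this kind, presentation tempo and tempo gap are themselves positively correlated, which is worth keeping in mind when both enter the same model.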
The Effects of Movement Features on Perceived Naturalness
The linear mixed models revealed a significant main effect of tempo manipulation on perceived naturalness. A significant two-way interaction between presentation tempo and manipulation was also found (see Table 1). Post-hoc comparisons with Bonferroni correction suggested that, at the medium presentation tempo, accelerated movements were perceived as more natural (M = 4.57, SD = 1.39) than original (M = 4.40, SD = 1.39) and decelerated movements (M = 4.41, SD = 1.42), p < .001. The effect size of the model, as measured by Cohen’s f², was f² = 0.04, indicating a small effect. Note that this comparison was not possible for the other tempo conditions, as neither includes all levels of manipulation.
Table 1
Variable | B | SE B | t | p |
---|---|---|---|---|
 | 0.37 | 0.09 | 4.03 | < .001*** |
 | –0.14 | 0.03 | –5.24 | < .001*** |
Note. For tempo manipulation, -1 to 0 represents deceleration, 0 stands for no manipulation, and 0 to 1 represents acceleration. For modality, 1 = Visual-only, 2 = Audiovisual. Values and definitions of the abbreviations are consistent with those in Table 1. For the full table, please refer to Table A2 in the Appendix.
***p < .001.
A one-way ANOVA was run to examine the overall effect of manipulation on perceived naturalness across all presentation tempi, F(2, 2229) = 6.51, p = .002, with a small effect (η² = 0.006). The significant main effect suggests that decelerated movements were perceived as significantly less natural than tempo-original ones (see Figure 4). No significant difference between accelerated and original movements was found.
Figure 4
The Effects of Movement Features on Emotional Arousal
A mixed linear regression revealed significant main effects of presentation tempo, modality, and perceived naturalness on emotional arousal. The effect size of the model, as measured by Cohen’s f², was f² = 0.71, indicating a large effect. A faster tempo was associated with higher arousal. Audiovisual stimuli, which included the drumbeats, were perceived as higher in arousal than the visual-only ones. In addition, higher naturalness was associated with higher arousal (see Figure 5, lower pane, and Table 2). A significant interaction between fluidity and presentation tempo was found. Post-hoc comparisons with Bonferroni correction suggested that, at the medium tempo, low fluidity (below the 50th percentile) was associated with higher arousal (M = 4.47, SD = 1.21) than high fluidity (M = 3.98, SD = 1.29), p < .001. Note that this comparison was not possible for the other tempo conditions, as neither includes both fluidity levels.
Figure 5
Table 2
Variable | B | SE B | t | p |
---|---|---|---|---|
 | –2.91 | 0.99 | –2.94 | .003** |
 | 0.10 | 0.02 | 0.10 | < .001*** |
 | 0.28 | 0.05 | 0.28 | < .001*** |
 | 3.60 | 0.81 | 3.60 | < .001*** |
 | –1.74 | 0.58 | –3.03 | .002** |
Note. Tempo = Presentation tempo. Fluid = Fluidity of body movements. Natural = Perceived naturalness. Modality = Sensory modality: 1 = visual-only; 2 = audiovisual. For the full table, please refer to Table A3 in the Appendix.
**p < .01. ***p < .001.
A one-way ANOVA was run to examine the overall effect of fluidity on emotional arousal across all tempo conditions. Fluidity was split into three levels: low (below the 33rd percentile), medium (33rd to 67th percentile), and high (above the 67th percentile). The analysis showed a significant effect of fluidity on arousal, F(2, 2229) = 445.54, p < .001 (see Figure 5, upper pane). The effect size (η² = 0.29) indicates a large effect. The significant main effect suggests that, across all tempo conditions, high-fluidity movements were perceived as significantly lower in arousal than medium- and low-fluidity movements.
The Effects of Movement Features on Emotional Valence
The mixed linear regression revealed significant main effects of modality and perceived naturalness (Table 3), while movement features, tempo, and manipulation showed no significant effects. Increases in naturalness were linked with higher valence (Figure 6). The audiovisual stimuli were perceived as less positive than the visual-only stimuli. The effect size of the model, as measured by Cohen’s f², was f² = 0.27, indicating a medium effect.
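To put the three model effect sizes side by side, the sketch below converts each reported f² into the proportion of variance it would imply and labels it with Cohen's conventional benchmarks (0.02 small, 0.15 medium, 0.35 large). It assumes the global form f² = R² / (1 − R²); how R² was obtained for the mixed models is not stated in the text, so the implied R² values are illustrative rather than reported results. The function names are hypothetical.

```python
def f2_from_r2(r2):
    """Cohen's f-squared from a proportion of explained variance."""
    if not 0 <= r2 < 1:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 / (1 - r2)


def r2_from_f2(f2):
    """Inverse of f2_from_r2: the R-squared implied by a given f-squared."""
    if f2 < 0:
        raise ValueError("f2 must be non-negative")
    return f2 / (1 + f2)


def label_f2(f2):
    """Label an f-squared value using Cohen's conventional benchmarks."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


# Model effect sizes as reported in the text.
reported_f2 = {"naturalness": 0.04, "arousal": 0.71, "valence": 0.27}

for model, f2 in reported_f2.items():
    print(f"{model:>11}: f2 = {f2:.2f} ({label_f2(f2)}), implied R2 = {r2_from_f2(f2):.3f}")
```

Under this assumption, the arousal model would account for roughly four tenths of the variance, the valence model for about one fifth, and the naturalness model for only a few percent.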
Figure 6
Discussion
The current study investigated the effects of tempo manipulation on perceived emotional arousal and valence. Variables investigated were movement fluidity and complexity, the tempo-manipulated and original movements' perceived naturalness, the presentation tempo, and the sensory modality. Results suggest that: 1) Arousal was influenced by movement fluidity and naturalness. The higher the movement fluidity and the lower the naturalness, the lower the arousal. 2) Naturalness was affected by tempo manipulation. At a medium presentation tempo, accelerated movements were rated more natural than decelerated or even original ones. Overall, decelerated movements were perceived to be the least natural. 3) Emotional valence was most strongly affected by naturalness rather than tempo or kinematic features, such that more natural movements were also rated more positive in valence. Taken together, changing the speed of movements in audiovisual presentations has manifold consequences for the perception of movement qualities and emotions.
The manipulation of presentation tempo, as is often undertaken in video clips and films for various emotional purposes (Wöllner et al., 2018), affected the fluidity of movements. Higher movement fluidity, in turn, was associated with lower emotional arousal, confirming the results of previous studies that found, for example, tender-related motions to be less jerky than anger-elicited ones (Burger et al., 2013). According to Dahl and Friberg (2007), anger was related to low movement smoothness and happiness with high smoothness—the latter was similar to anger in its high emotional arousal but is characterized by higher emotional valence. Similar results were found for dance (Camurri et al., 2003) and music-induced movements (Boone & Cunningham, 2001), such that happiness and anger which are both relatively high in the arousal dimension, induced more frequent tempo changes and higher jerkiness in movements than sadness. Dance movements with high fluidity were also correlated with tenderness and sadness (Burger & Toiviainen, 2020b).
In our study, given that tempo acceleration was strongly correlated with lower fluidity (see Figure 5), participants distinguished between the tempo manipulations in terms of fluidity differences in the tempo-original and tempo-manipulated stimuli. They also perceived the emotional arousal of the two types of movements accordingly. Fluidity might thus be a more salient indicator of tempo manipulation than complexity. It should be stated that movement complexity is inherently related to tempo for stimuli of the same movement types: The faster the tempo, the more movement samples are displayed in a fixed window of time, leading to an increase in complexity. Thus, a potential effect of complexity on arousal can also be partially attributed to changes in tempo. Since the movements in the current study were highly controlled PLD displays, it should be further investigated if such effects can be replicated with naturalistic stimuli such as movie scenes or sports video clips. On the other hand, fluidity affects arousal but not valence, which could be caused by the different salience of the two emotional dimensions in perceived movements. Movements with positive or negative associations may thus exhibit similar fluidity, while valence perception in movements could be more ambiguous (Gross et al., 2010; Pollick et al., 2001).
In contrast, valence perception was significantly affected by naturalness: The more natural a stimulus was perceived to be, the more positive its valence (see Figure 6), and the tempo-original and accelerated movements were rated as more natural than tempo-decelerated ones. As the perceived validity of the stimuli can define the naturalness (Knopp et al., 2019), the effects may be due to an incongruence between the participants’ expectations of a given movement and how the PLD movements appeared. The smaller this gap, the more pleasant the movement was perceived to be. Movements perceived more positively in valence may also possess a unique combination of features, also known as movement “fingerprints” (Van Vugt et al., 2013), that led to an elevated quality. Van Vugt and colleagues (2013) investigated movement “fingerprints”, or individuality, in professional pianists’ movements during unexpressive and muted performances. Their study found that the timing of the pianists’ movements differed for each individual. Participants may expect movement features similar to those of real-life motions, which manipulated motions may lack. In this way, complex motions that are less smooth in their trajectories could be most plausible. Similarly, a study found a congruence effect such that audiovisual stimuli consistent in arousal and valence level induced significant psychophysiological responses compared to inconsistent ones (Christensen et al., 2014). However, the link between naturalness and valence should be explored with a more extensive variety of movements.
Not surprisingly, we found that faster presentation tempo was linked to higher arousal. Various studies have shown that the faster the sequence, the higher the arousal (Droit-Volet et al., 2013; Sievers et al., 2013; Wöllner et al., 2018), both in music and in human movements. Furthermore, the absence of a tempo effect on emotional valence is consistent with previous findings in which tempo predicted the arousal level rather than the emotional valence (Khalfa et al., 2008). In addition, our results suggest that emotional arousal was significantly higher in audiovisual than in visual-only presentations, whereas emotional valence was more negative in audiovisual than in visual-only presentations. Audiovisual stimuli evoke stronger emotional arousal than unimodal stimuli (e.g., Vuoskoski et al., 2016; Wöllner et al., 2018). Our finding implies that the multisensory effect persists despite tempo manipulations of the visual inputs, thus shedding light on, for example, the usage of multimedia and slow-motion videos in real-world scenarios.
A limitation of the study lies in the generalizability of the current findings, which could be enhanced with a larger number of PLD movements than in the current sample. In future research, movements could include more scenarios that cover, for instance, interpersonal interactions and a variety of emotions. Furthermore, a higher number of smaller tempo manipulation steps may allow more detailed comparisons between original and tempo-manipulated conditions. In our experiment, the movements were generated specifically to match various performance and presentation tempo conditions in a discrete and recognizable way and could thus have been unusual in comparison with day-to-day activities. By contrast, in Burger et al.’s (2013) and Burger and Toiviainen’s (2020b) studies, the PLD movements were extracted from humans moving or dancing naturally to music.
Conclusions
In this study, we investigated the impact of tempo manipulation on emotional arousal and valence. Our findings reveal that increased movement fluidity led to decreased perceived emotional arousal, and that decreases in perceived naturalness with tempo-decelerated movements resulted in more negative emotional valence. Thus, tempo manipulations influence both emotional dimensions, particularly when slowing down from the original tempo. The ratings from our study corroborate earlier research examining the emotional and peripheral physiological responses to slow-motion scenes from movies, sports, and dance (Hammerschmidt & Wöllner, 2018; Wöllner et al., 2018).
The findings also point to a possible mechanism by which tempo manipulations are experienced, namely through the perception of movement features such as fluidity. Identifying the gap between manipulated and tempo-original movements may provide insights into producing artificial movements with high plausibility (Chen et al., 2023), creating advertisements with high perceived arousal and positive valence (Yin et al., 2021), or simply understanding the emotional responses that movements could elicit in various scenarios. Our results shed light on the possibilities of shifting the emotional experiences of viewers with tempo-manipulated body movements through music/dance performances or day-to-day media consumption. Future research may investigate the extent to which fluidity affects the perceived emotions, which other movement features apart from tempo affect the perceived emotions, and whether the movement features play the same role in a set of naturalistic and manipulated scenes, for which perceived naturalness should be a key factor.