Correlation-based Estimation of Ego-Motion and Structure from Motion and Stereo
Computer Engineering, Vol. 47, No. 1, January 2021

Human Rehabilitation Action Recognition Based on Pose Estimation and GRU Network

YAN Hang 1,2, CHEN Gang 1,2, TONG Yao 2,3, JI Bo 1, HU Beichen 1
(1. College of Information Engineering, Zhengzhou University, Zhengzhou 450001, China; 2. Internet Medical and Health Service Collaborative Innovation Center, Zhengzhou University, Zhengzhou 450001, China; 3. College of Nursing and Health, Zhengzhou University, Zhengzhou 450001, China)

Abstract: Rehabilitation exercise is an important treatment for stroke patients. To improve the accuracy and real-time performance of rehabilitation action recognition, and to better assist patients with long-term rehabilitation training at home, this paper proposes Pose-AMGRU, a human rehabilitation action recognition algorithm that combines pose estimation with a Gated Recurrent Unit (GRU) network.
The algorithm uses the OpenPose pose estimation method to extract skeleton joints from video frames. The pose data are preprocessed to obtain key action features that describe limb movement, and an attention mechanism is used to build a GRU network that fuses three layers of time-series features to classify rehabilitation actions.
Experimental results show that the algorithm reaches recognition accuracies of 98.14% and 100% on the KTH dataset and the rehabilitation action dataset, respectively, and runs at 14.23 frame/s on a GTX 1060 GPU, so it offers both high recognition accuracy and real-time performance.

Key words: rehabilitation training; action recognition; pose estimation; Gated Recurrent Unit (GRU); attention mechanism

Citation: YAN Hang, CHEN Gang, TONG Yao, et al. Human rehabilitation action recognition based on pose estimation and GRU network[J]. Computer Engineering, 2021, 47(1): 12-20.

DOI: 10.19678/j.issn.1000-3428.0058201

0 Overview

The number of stroke cases rises every year. Stroke has become a major disease threatening the life and health of people worldwide, and it has a very high disability rate, with severe disability accounting for about 40% of cases [1].
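The attention-based fusion of three layers of temporal features mentioned in the abstract above can be illustrated with a small sketch. This excerpt does not give the scoring function or the network dimensions, so everything here is an assumption for illustration: `attention_fuse`, the dot-product scoring, and the `query` vector are hypothetical names and choices, not the Pose-AMGRU implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(layer_features, query):
    """Fuse one feature vector per GRU layer into a single vector.

    layer_features: list of equal-length vectors, one per layer
                    (hypothetical stand-ins for the three GRU outputs).
    query:          vector of the same length used to score each layer
                    (assumed; in a trained network it would be learned).
    Returns (fused_vector, attention_weights).
    """
    # Dot-product score of each layer's features against the query.
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in layer_features]
    weights = softmax(scores)
    dim = len(query)
    # The weighted sum of the layer features is the fused representation.
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, layer_features))
        for d in range(dim)
    ]
    return fused, weights
```

In a full model the fused vector would feed a classification layer. The point of the sketch is only that the attention weights are non-negative, sum to one, and let more informative layers contribute more.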
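An Analysis of the Antecedents and Consequences of Taking Charge in Organizations

Abstract: Taking charge refers to employees' voluntary, constructive efforts to initiate organizationally functional change so that work is carried out more effectively within their jobs, work units, or organizations.
This article introduces the concept and measurement of taking charge, together with its antecedents and consequences.
The antecedents fall into two categories: individual factors (such as proactive personality, perceived organizational support, and positive emotions) and contextual factors (such as job autonomy, management openness, and innovative climate). The consequences mainly include job performance evaluations, work attitudes, and perceptions of transformational leadership.
Future research needs to further improve measurement instruments, examine the influence of factors outside the organization, test additional moderators of the consequences, and explore taking-charge behavior by leaders.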
Keywords: organizational citizenship behavior; challenging behavior; taking charge; antecedents; consequences

Abstract: In recent years, change-oriented organizational citizenship behaviors (OCBs) have received a great deal of attention from scholars in the field of managerial psychology. The organizational behavior literature has placed growing emphasis on extra-role behavior, that is, employee behavior that goes beyond role expectations. Scholars have argued that this phenomenon is critical for organizational effectiveness because managers cannot fully anticipate the activities that they may desire or need employees to perform. Although these extra-role activities are important, they are not sufficient for ensuring the continued viability of an organization; organizations also need employees who are willing to challenge the present state of operations to bring about constructive changes. Hence, in this study, we focus on a form of extra-role behavior that has been largely neglected, namely taking charge.

Taking charge refers to voluntary and constructive efforts by individual employees to effect organizationally functional change with respect to how work is executed within the contexts of their jobs, work units, or organizations. This paper introduces the definition and measurement of taking charge and its relationships with relevant variables, and then summarizes the antecedents and consequences of such behavior. Because taking charge is conceptually distinct from more traditional forms of extra-role behavior, such as OCB, models that have been advanced to explain those behaviors are inappropriate for explaining taking charge, and scholars suggest that it is motivated by factors that have not previously been studied in the context of these more traditional forms of extra-role behavior. Taking charge may be viewed as threatening by peers or supervisors. Thus, an employee who is trying to bring about improvement may actually incite disharmony and tension that will detract from performance.

The factors that positively affect taking charge can be classified into two categories: (1) individual-level factors, such as proactive personality, perceived organizational support, and positive emotions; and (2) contextual factors, such as job autonomy, management openness, and innovative climate. The consequences of taking charge that past research has examined include in-role performance evaluation, job satisfaction, affective commitment, and perception of transformational leadership. Finally, the paper recommends that future research focus particularly on the following four aspects: (1) improving the measurement of taking charge; (2) examining the impact of factors outside the organization (e.g., environmental dynamism and industry competition); (3) investigating more contingencies that moderate the consequences of taking charge; and (4) exploring the issue of leader taking charge in the Chinese organizational context.

This study expands current understanding of extra-role behavior and suggests ways in which organizations can motivate employees to go beyond the boundaries of their jobs to bring about positive changes. Despite a growing body of work in this area, existing research has provided a limited view of extra-role behavior by neglecting activities aimed at changing the status quo. We provide insight into more challenging, risky, and effortful forms of discretionary employee behavior. The study thereby broadens current conceptualizations of extra-role behaviors within organizations, going beyond the more mundane cooperative and helping behaviors that have been the focus of the existing research.

1. Introduction

Scholars in organizational behavior have long shown great enthusiasm for research on organizational citizenship behavior and its antecedents and consequences.
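Attachment 1: Translation of the foreign-language material

System Configuration of the Intelligent Parking Assist System

An intelligent parking-lot management system uses advanced technology and highly automated electromechanical equipment, combining mechanical, computer, and automatic-control equipment with smart IC card technology. Through computer management it handles vehicle entry and exit, stores data automatically, can run offline, and provides an efficient management service.
The new generation of intelligent parking lots combines lifestyle concepts with architecture, information technology, and computer electronics to provide a user-friendly system that is simple to operate, convenient to use, and functionally advanced. Relying on high technology and a people-oriented design with a graphical human-machine interface, it offers a safer, more comfortable, more convenient, faster, and more open intelligent living space, and promotes the healthy development of the human environment.

This paper describes the configuration of the currently developed IPAS (Intelligent Parking Assist System).
IPAS lets the driver designate the target position with three complementary methods: monocular-vision-based recognition of parking-slot markings, ultrasonic-sensor-based parking-slot recognition, and a drag-and-drop GUI (graphical user interface).
IPAS generates an optimal path to the designated target position.
During the parking operation, IPAS estimates the ego-vehicle pose using ESP (Electronic Stability Program) sensors such as the wheel speed sensors, the brake pedal switch, and the steering angle sensor.
IPAS automates braking and steering by sending the required actuation commands to the ESP and the EPS (Electric Power Steering) over CAN.
IPAS keeps the driver informed of the ongoing parking operation by drawing the estimated trajectory on the rear-view image. The system was validated through vehicle experiments.

Keywords: intelligent parking assist system, driver assistance system

Introduction

Our IPAS consists of six components, including ego-vehicle pose estimation, a path generator and tracker, an active braking system, an active steering system, and an HMI (human-machine interface). IPAS can operate as a semi-automatic parking assist system, in which only steering is automated,
or as an automatic parking assist system, in which both steering and braking are automated.
Target position designation locates the target position of the automatic or semi-automatic parking operation.
We developed three complementary methods: monocular-vision-based recognition of parking-slot markings, ultrasonic-sensor-based recognition of parallel parking slots, and a drag-and-drop GUI (graphical user interface).
Ego-vehicle pose estimation implements an Ackermann-model-based vehicle pose estimator using several sensors, including the wheel speed sensors, the steering angle sensor, the brake pedal switch, and a wheel angle sensor.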
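The Ackermann-model pose estimation described above can be sketched as a dead-reckoning update. The source gives no equations, so this is a minimal illustration under stated assumptions: a kinematic bicycle (single-track) simplification of the Ackermann geometry, a rear-axle reference point, and the hypothetical names `update_pose`, `distance`, `steering_angle`, and `wheelbase`. It is not the IPAS implementation.

```python
import math


def update_pose(x, y, heading, distance, steering_angle, wheelbase):
    """Advance the vehicle pose by one odometry step.

    x, y:           rear-axle position in metres (assumed reference point)
    heading:        yaw angle in radians
    distance:       distance travelled in this step, e.g. from wheel speed pulses
    steering_angle: front wheel steering angle in radians
    wheelbase:      distance between front and rear axles in metres
    """
    if abs(steering_angle) < 1e-9:
        # Driving straight: no change in heading.
        return (
            x + distance * math.cos(heading),
            y + distance * math.sin(heading),
            heading,
        )
    # Turning radius of the rear axle for the given steering angle.
    radius = wheelbase / math.tan(steering_angle)
    dtheta = distance / radius
    # Exact integration along a circular arc of that radius.
    new_x = x + radius * (math.sin(heading + dtheta) - math.sin(heading))
    new_y = y - radius * (math.cos(heading + dtheta) - math.cos(heading))
    return new_x, new_y, heading + dtheta
```

Calling `update_pose` once per sensor sample accumulates the pose over the parking manoeuvre. Because it integrates odometry only, drift grows with distance, which is tolerable over the short path of a parking operation.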
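Advanced A/B Testing Series 1: Audience Targeting in A/B Tests, Heterogeneous Treatment Effects (HTE), and Uplift Models: a Paper List

Machine learning has long hoped to answer the 'what if' question, that is, to guide decisions. If I send a user a coupon, will the user stay? If a patient takes this drug, will their blood pressure drop? If the app adds this feature, will users spend more time in it? If this monetary policy is implemented, will it effectively boost the economy? Such questions are hard to answer because the ground truth cannot be observed in reality: a patient who took the drug has lower blood pressure, but we have no way of knowing whether their blood pressure would also have dropped at that same moment had they not taken it.
At this point the analysts among us will say: run an A/B test! We estimate the overall difference; if it is significant the treatment works, and if it is not the treatment does not work.
But is that all we can do? Of course not, because every individual is different, and no effect overall does not mean no effect in a subgroup. If only 5% of users are sensitive to coupons, can we reach just those users? If different users respond at different coupon thresholds, how can we adjust the threshold to attract more users? If an antihypertensive drug works only for patients with particular symptoms, how do we find those patients? If some users dislike a new app feature and others love it, can I find ways to improve the feature by comparing these users? The methods below approach this problem from different angles, but the basic idea is the same: we cannot observe each user's treatment effect, but we can find a group of similar users and estimate the effect of the experiment on them.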
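The basic idea in the last sentence, estimating the effect on a group of similar users, can be sketched in its simplest form as a per-segment difference in means. This is only an illustration under assumptions: the segments are taken as given (the tree, forest, and meta-learner methods listed below learn them from data), treatment is assumed to be randomized within each segment, and `uplift_by_segment` and the record layout are hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate the treatment effect separately for each user segment.

    records: iterable of (segment, treated, outcome) tuples, where
             treated is True for the experiment group and outcome is numeric.
    Returns {segment: mean(outcome | treated) - mean(outcome | control)}
    for every segment that has at least one user in each group.
    """
    # Per segment: [treated_sum, treated_count, control_sum, control_count]
    totals = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for segment, treated, outcome in records:
        bucket = totals[segment]
        if treated:
            bucket[0] += outcome
            bucket[1] += 1
        else:
            bucket[2] += outcome
            bucket[3] += 1

    effects = {}
    for segment, (t_sum, t_n, c_sum, c_n) in totals.items():
        if t_n and c_n:  # skip segments missing one of the two groups
            effects[segment] = t_sum / t_n - c_sum / c_n
    return effects
```

A segment whose estimate is clearly positive is a candidate for targeting, even when the overall experiment shows no significant difference. In practice each estimate also needs a confidence interval before acting on it.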
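In later posts I will work through the similarities and differences among the methods below, starting with the second Causal Tree paper, Recursive partitioning for heterogeneous causal effects.
The whole field is still developing, and several of the open-source implementations were released only recently, so this post will keep being updated.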
If you come across good articles or engineering implementations, feel free to leave a comment below.

Uplift Modelling/Causal Tree
1. Nicholas J Radcliffe and Patrick D Surry. Real-world uplift modelling with significance based uplift trees. White Paper TR-2011-1, Stochastic Solutions, 2011.
2. Rzepakowski, P. and Jaroszewicz, S., 2012. Decision trees for uplift modeling with single and multiple treatments. Knowledge and Information Systems, 32(2), pp.303-327.
3. Yan Zhao, Xiao Fang, and David Simchi-Levi. Uplift modeling with multiple treatments and general response types. Proceedings of the 2017 SIAM International Conference on Data Mining, SIAM, 2017.
4. Athey, S., and Imbens, G. W. 2015. Machine learning methods for estimating heterogeneous causal effects. stat 1050(5)
5. Athey, S., and Imbens, G. 2016. Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences.
6. C. Tran and E. Zheleva, "Learning triggers for heterogeneous treatment effects," in Proceedings of the AAAI Conference on Artificial Intelligence, 2019

Forest Based Estimators
1. Wager, S. & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association.
2. M. Oprescu, V. Syrgkanis and Z. S. Wu. Orthogonal Random Forest for Causal Inference. Proceedings of the 36th International Conference on Machine Learning (ICML), 2019

Double Machine Learning
1. V. Chernozhukov, D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, and W. Newey. Double Machine Learning for Treatment and Causal Parameters. ArXiv e-prints
2. V. Chernozhukov, M. Goldman, V. Semenova, and M. Taddy. Orthogonal Machine Learning for Demand Estimation: High Dimensional Causal Inference in Dynamic Panels. ArXiv e-prints, December 2017.
3. V. Chernozhukov, D. Nekipelov, V. Semenova, and V. Syrgkanis. Two-Stage Estimation with a High-Dimensional Second Stage. 2018.
4. X. Nie and S. Wager. Quasi-Oracle Estimation of Heterogeneous Treatment Effects. arXiv preprint arXiv:1712.04912, 2017.
5. D. Foster and V. Syrgkanis. Orthogonal Statistical Learning. arXiv preprint arXiv:1901.09036, 2019

Meta Learner
1. C. Manahan, 2005. A proportional hazards approach to campaign list selection. In SAS User Group International (SUGI) 30 Proceedings.
2. Green DP, Kern HL (2012) Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opinion Quarterly 76(3):491–511.
3. Sören R. Künzel, Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences, 2019.

Deep Learning
1. Fredrik D. Johansson, U. Shalit, D. Sontag. ICML (2016). Learning Representations for Counterfactual Inference
2. Shalit, U., Johansson, F. D., & Sontag, D. ICML (2017). Estimating individual treatment effect: generalization bounds and algorithms. Proceedings of the 34th International Conference on Machine Learning
3. Christos Louizos, U. Shalit, J. Mooij, D. Sontag, R. Zemel, M. Welling. NIPS (2017). Causal Effect Inference with Deep Latent-Variable Models
4. Alaa, A. M., Weisz, M., & van der Schaar, M. (2017). Deep Counterfactual Networks with Propensity-Dropout
5. Shi, C., Blei, D. M., & Veitch, V. NeurIPS (2019). Adapting Neural Networks for the Estimation of Treatment Effects

Uber Special
It was Uber's blog that first helped me find my direction in the vast sea of papers. I hear their AI Lab is now being disbanded, which is a little sad. As the maintainers of the most-starred open-source HTE project, they deserve a section of their own.
1. Shuyang Du, James Lee, Farzin Ghaffarizadeh, 2017, Improve User Retention with Causal Learning
2. Zhenyu Zhao, Totte Harinen, 2020, Uplift Modeling for Multiple Treatments with Cost Optimization
3. Will Y. Zou, Smitha Shyam, Michael Mui, Mingshi Wang, 2020, Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects
4. Will Y. Zou, Shuyang Du, James Lee, Jan Pedersen, 2020, Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing

If you want more papers on causal inference and A/B testing, look here; the list is continuously updated.
Neuron
Article

Episodic Future Thinking Reduces Reward Delay Discounting through an Enhancement of Prefrontal-Mediotemporal Interactions

Jan Peters 1,* and Christian Büchel 1
1 NeuroimageNord, Department of Systems Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg 20246, Germany
*Correspondence: j.peters@uke.uni-hamburg.de
DOI 10.1016/j.neuron.2010.03.026

SUMMARY

Humans discount the value of future rewards over time. Here we show using functional magnetic resonance imaging (fMRI) and neural coupling analyses that episodic future thinking reduces the rate of delay discounting through a modulation of neural decision-making and episodic future thinking networks. In addition to a standard control condition, real subject-specific episodic event cues were presented during a delay discounting task. Spontaneous episodic imagery during cue processing predicted how much subjects changed their preferences toward more future-minded choice behavior. Neural valuation signals in the anterior cingulate cortex and functional coupling of this region with hippocampus and amygdala predicted the degree to which future thinking modulated individual preference functions. A second experiment replicated the behavioral effects and ruled out alternative explanations such as date-based processing and temporal focus. The present data reveal a mechanism through which neural decision-making and prospection networks can interact to generate future-minded choice behavior.

INTRODUCTION

The consequences of choices are often delayed in time, and in many cases it pays off to wait. While agents normally prefer larger over smaller rewards, this situation changes when rewards are associated with costs, such as delays, uncertainties, or effort requirements. Agents integrate such costs into a value function in an individual manner. In the hyperbolic model of delay discounting (also referred to as intertemporal choice), for example, a subject-specific discount parameter accurately describes how individuals discount delayed
rewards in value (Green and Myerson, 2004; Mazur, 1987). Although the degree of delay discounting varies considerably between individuals, humans in general have a particularly pronounced ability to delay gratification, and many of our choices only pay off after months or even years. It has been speculated that the capacity for episodic future thought (also referred to as mental time travel or prospective thinking) (Bar, 2009; Schacter et al., 2007; Szpunar et al., 2007) may underlie the human ability to make choices with high long-term benefits (Boyer, 2008), yielding higher evolutionary fitness of our species.

At the neural level, a number of models have been proposed for intertemporal decision-making in humans. In the so-called β-δ model (McClure et al., 2004, 2007), a limbic system (β) is thought to place special weight on immediate rewards, whereas a more cognitive, prefrontal-cortex-based system (δ) is more involved in patient choices. In an alternative model, the values of both immediate and delayed rewards are thought to be represented in a unitary system encompassing medial prefrontal cortex (mPFC), posterior cingulate cortex (PCC), and ventral striatum (VS) (Kable and Glimcher, 2007; Kable and Glimcher, 2010; Peters and Büchel, 2009). Finally, in the self-control model, values are assumed to be represented in structures such as the ventromedial prefrontal cortex (vmPFC) but are subject to top-down modulation by prefrontal control regions such as the lateral PFC (Figner et al., 2010; Hare et al., 2009). Both the β-δ model and the self-control model predict that reduced impulsivity in intertemporal choice, induced for example by episodic future thought, would involve prefrontal cortex regions implicated in cognitive control, such as the lateral PFC or the anterior cingulate cortex (ACC).

Lesion studies, on the other hand, also implicated medial temporal lobe regions in decision-making and delay discounting.
In rodents, damage to the basolateral amygdala (BLA) increases delay discounting (Winstanley et al., 2004), effort discounting (Floresco and Ghods-Sharifi, 2007; Ghods-Sharifi et al., 2009), and probability discounting (Ghods-Sharifi et al., 2009). Interactions between the ACC and the BLA in particular have been proposed to regulate behavior in order to allow organisms to overcome a variety of different decision costs, including delays (Floresco and Ghods-Sharifi, 2007). In line with these findings, impairments in decision-making are also observed in humans with damage to the ACC or amygdala (Bechara et al., 1994, 1999; Manes et al., 2002; Naccache et al., 2005).

Along similar lines, hippocampal damage affects decision-making. Disadvantageous choice behavior has recently been documented in patients suffering from amnesia due to hippocampal lesions (Gupta et al., 2009), and rats with hippocampal damage show increased delay discounting (Cheung and Cardinal, 2005; Mariano et al., 2009; Rawlins et al., 1985). These observations are of particular interest given that hippocampal damage impairs the ability to imagine novel experiences (Hassabis et al., 2007). Based on this and a range of other studies, it has recently been proposed that hippocampus and parahippocampal cortex play a crucial role in the formation of vivid event representations, regardless of whether they lie in the past, present, or future (Schacter and Addis, 2009). The hippocampus may thus contribute to decision-making through its role in self-projection into the future (Bar, 2009; Schacter et al., 2007), allowing an organism to evaluate future payoffs through mental simulation (Johnson and Redish, 2007; Johnson et al., 2007). Future thinking may thus affect intertemporal choice through hippocampal involvement.

138 Neuron 66, 138–148, April 15, 2010 ©2010 Elsevier Inc.

Here we used model-based fMRI, analyses of functional coupling, and extensive behavioral procedures to investigate how episodic future thinking affects delay discounting. In Experiment 1, subjects
performed a classical delay discounting task (Kable and Glimcher, 2007; Peters and Büchel, 2009) that involved a series of choices between smaller immediate and larger delayed rewards, while brain activity was measured using fMRI. Critically, we introduced a novel episodic condition that involved the presentation of episodic cue words (tags) obtained during an extensive prescan interview, referring to real, subject-specific future events planned for the respective day of reward delivery. This design allowed us to assess individual discount rates separately for the two experimental conditions, allowing us to investigate neural mechanisms mediating changes in delay discounting associated with episodic thinking. In a second behavioral study, we replicated the behavioral effects of Experiment 1 and addressed a number of alternative explanations for the observed effects of episodic tags on discount rates.

RESULTS

Experiment 1: Prescan Interview

On day 1, healthy young volunteers (n = 30, mean age = 25, 15 male) completed a computer-based delay discounting procedure to estimate their individual discount rate (Peters and Büchel, 2009). This discount rate was used solely for the purpose of constructing subject-specific trials for the fMRI session (see Experimental Procedures). Furthermore, participants compiled a list of events that they had planned in the next 7 months (e.g., vacations, weddings, parties, courses, and so forth) and rated them on scales from 1 to 6 with respect to personal relevance, arousal, and valence. For each participant, seven subject-specific events were selected such that the spacing between events increased with increasing delay to the episode, and that events were roughly matched based on personal relevance, arousal, and valence. Multiple regression analysis of these ratings across the different delays showed no linear effects (relevance: p = 0.867, arousal: p = 0.120, valence: p = 0.977, see Figure S1 available online). For each subject, a separate set of seven delays was computed that was later
used as delays in the control condition. Median and range for the delays used in each condition are listed in Table S1 (available online). For each event, a label was selected that would serve as a verbal tag for the fMRI session.

Experiment 1: fMRI Behavioral Results

On day 2, volunteers performed two sessions of a delay discounting procedure while fMRI was measured using a 3T Siemens Scanner with a 32-channel head-coil. In each session, subjects made a total of 118 choices between 20 € available immediately and larger but delayed amounts. Subjects were told that one of their choices would be randomly selected and paid out following scanning, with the respective delay. Critically, in half the trials, an additional subject-specific episodic tag (see above, e.g., "vacation paris" or "birthday john") was displayed based on the prescan interview (see Figure 1) indicating which event they had planned on the particular day (episodic condition), whereas in the remaining trials, no episodic tag was presented (control condition). Amount and waiting time were thus displayed in both conditions, but only the episodic condition involved the presentation of an additional subject-specific event tag. Importantly, nonoverlapping sets of delays were used in the two conditions. Following scanning, subjects rated for each episodic tag how often it evoked episodic associations during scanning (frequency of associations: 1, never; to 6, always) and how vivid these associations were (vividness of associations: 1, not vivid at all; to 6, highly vivid; see Figure S1). Additionally, written reports were obtained (see Supplemental Information). Multiple regression revealed no significant linear effects of delay on postscan ratings (frequency: p = 0.224, vividness: p = 0.770). We averaged the postscan ratings across events and the frequency/vividness dimensions, yielding an "imagery score" for each subject.

Figure 1. Behavioral Task
During fMRI, subjects made repeated choices between a fixed immediate reward of 20 € and larger but delayed amounts. In the control condition, amounts were paired with a waiting time only, whereas in the episodic condition, amounts were paired with a waiting time and a subject-specific verbal episodic tag indicating to the subjects which event they had planned at the respective day of reward delivery. Events were real and collected in a separate testing session prior to the day of scanning.

Individual participants' choice data from the fMRI session were then analyzed by fitting hyperbolic discount functions to subject-specific indifference points to obtain discount rates (k-parameters), separately for the episodic and control conditions (see Experimental Procedures). Subjective preferences were well-characterized by hyperbolic functions (median R² episodic condition = 0.81, control condition = 0.85). Discount functions of four exemplary subjects are shown in Figure 2A. For both conditions, considerable variability in the discount rate was observed (median [range] of discount rates: control condition = 0.014 [0.003–0.19], episodic condition = 0.013 [0.002–0.18]). To account for the skewed distribution of discount rates, all further analyses were conducted on the log-transformed k-parameters. Across subjects, log-transformed discount rates were significantly lower in the episodic condition compared with the control condition (t(29) = 2.27, p = 0.016), indicating that participants' choice behavior was less impulsive in the episodic condition. The difference in log-discount rates between conditions is henceforth referred to as the episodic tag effect. Fitting hyperbolic functions to the median indifference points across subjects also showed reduced discounting in the episodic condition (discount rate control condition = 0.0099, episodic condition = 0.0077). The size of the tag effect was not related to the discount rate in the control condition (p = 0.56).
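The hyperbolic fit and the tag effect described above can be illustrated with a short sketch. It assumes the standard hyperbolic form SV = A / (1 + kD), with indifference points expressed as fractions of the delayed amount, and it uses a simple grid search over log(k). The fitting procedure actually used is described in the article's Experimental Procedures, which are not part of this excerpt, so the function names and the search method here are hypothetical.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Subjective value of `amount` delivered after `delay` at discount rate k."""
    return amount / (1.0 + k * delay)


def fit_discount_rate(delays, indifference_fractions,
                      log_k_min=-10.0, log_k_max=2.0, steps=2400):
    """Least-squares fit of k to indifference points by grid search on log(k).

    indifference_fractions[i] is the immediate amount judged equal to the
    delayed reward at delays[i], divided by the delayed amount (assumed format).
    """
    best_k, best_error = None, float("inf")
    for i in range(steps + 1):
        k = math.exp(log_k_min + (log_k_max - log_k_min) * i / steps)
        error = sum(
            (hyperbolic_value(1.0, d, k) - f) ** 2
            for d, f in zip(delays, indifference_fractions)
        )
        if error < best_error:
            best_k, best_error = k, error
    return best_k


def tag_effect(k_control, k_episodic):
    """Difference in log discount rates; positive means less discounting with tags."""
    return math.log(k_control) - math.log(k_episodic)
```

With the median rates reported above, `tag_effect(0.014, 0.013)` is positive, matching the direction of the reported effect. Working on log(k) follows the text's rationale that raw discount rates are strongly skewed.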
We next hypothesized that the tag effect would be positively correlated with postscan ratings of episodic thought (imagery scores, see above). Robust regression revealed an increase in the size of the tag effect with increasing imagery scores (t = 2.08, p = 0.023, see Figure 2B), suggesting that the effect of the tags on preferences was stronger the more vividly subjects imagined the episodes. Examples of written postscan reports are provided in the Supplemental Results for participants from the entire range of imagination ratings. We also correlated the tag effect with standard neuropsychological measures, the Sensation Seeking Scale (SSS) V (Beauducel et al., 2003; Zuckerman, 1996) and the Behavioral Inhibition Scale/Behavioral Approach Scale (BIS/BAS) (Carver and White, 1994). The tag effect was positively correlated with the experience-seeking subscale of the SSS (p = 0.026) and inversely correlated with the reward-responsiveness subscale of the BIS/BAS scales (p < 0.005).

Repeated-measures ANOVA of reaction times (RTs) as a function of option value (lower, similar, or higher relative to the reference option; see Experimental Procedures and Figure 2C) did not show a main effect of condition (p = 0.712) or a condition × value interaction (p = 0.220), but revealed a main effect of value (F(1.8, 53.9) = 16.740, p < 0.001). Post hoc comparisons revealed faster RTs for higher-valued options relative to similarly (p = 0.002) or lower valued options (p < 0.001) but no difference between lower and similarly valued options (p = 0.081).

FMRI Data

FMRI data were modeled using the general linear model (GLM) as implemented in SPM5. Subjective value of each decision option was calculated by multiplying the objective amount of each delayed reward with the discount fraction estimated behaviorally based on the choices during scanning, and included as a parametric regressor in the GLM. Note that discount rates were estimated separately for the control and episodic conditions (see above and Figure 2), and we thus used condition-specific k-parameters
for calculation of the subjective value regressor. Additional parametric regressors for inverse delay-to-reward and absolute reward magnitude, orthogonalized with respect to subjective value, were included in the GLM.

Figure 2. Behavioral Data from Experiment 1
Shown are experimentally derived discount functions from the fMRI session for four exemplary participants (A), correlation with imagery scores (B), and reaction times (RTs) (C). (A) Hyperbolic functions were fit to the indifference points separately for the control (dashed lines) and episodic (solid lines, filled circles) conditions, and the best-fitting k-parameters (discount rates) and R² values are shown for each subject. The log-transformed difference between discount rates was taken as a measure of the effect of the episodic tags on choice preferences. (B) Robust regression revealed an association between log-differences in discount rates and imagery scores obtained from postscan ratings (see text). (C) RTs were significantly modulated by option value (main effect value p < 0.001) with faster responses in trials with a value of the delayed reward higher than the 20 € reference amount. Note that although seven delays were used for each condition, some data points are missing, e.g., only five delay indifference points for the episodic condition are plotted for sub20. This indicates that, for the two longest delays, this subject never chose the delayed reward. ***p < 0.005. Error bars = SEM.

Episodic Tags Activate the Future Thinking Network

We first analyzed differences in the condition regressors without parametric modulation. Compared to those of the control condition, BOLD responses to the presentation of the delayed reward in the episodic condition yielded highly significant activations (corrected for whole-brain volume) in an extensive network of brain regions previously implicated in episodic future thinking (Addis et al., 2007; Schacter et al., 2007; Szpunar et al., 2007) (see Figure
3 and Table S2), including retrosplenial cortex (RSC)/PCC (peak MNI coordinates: −6, −54, 14, peak z value = 6.26), left lateral parietal cortex (LPC, −44, −66, 32, z value = 5.35), and vmPFC (−8, 34, −12, z value = 5.50).

Distributed Neural Coding of Subjective Value

We then replicated previous findings (Kable and Glimcher, 2007; Kable and Glimcher, 2010; Peters and Büchel, 2009) using a conjunction analysis (Nichols et al., 2005) searching for regions showing a positive correlation between the height of the BOLD response and subjective value in the control and episodic conditions in a parametric analysis (Figure 4A and Table S3). Note that this is a conservative analysis that requires that a given voxel exceed the statistical threshold in both contrasts separately. This analysis revealed clusters in the lateral orbitofrontal cortex (OFC, −36, 50, −10, z value = 4.50) and central OFC (−18, 12, −14, z value = 4.05), bilateral VS (right: 10, 8, 0, z value = 4.22; left: −10, 8, −6, z value = 3.51), mPFC (6, 26, 16, z value = 3.72), and PCC (−2, −28, 24, z value = 4.09), representing subjective (discounted) value in both conditions.

We next analyzed the neural tag effect, i.e., regions in which the subjective value correlation was greater for the episodic condition as compared with the control condition (Figure 4B and Table S4). This analysis revealed clusters in the left LPC (−66, −42, 32, z value = 4.96), ACC (−2, 16, 36, z value = 4.76), left dorsolateral prefrontal cortex (DLPFC, −38, 36, 36, z value = 4.81), and right amygdala (24, 2, −24, z value = 3.75). Finally, we performed a triple-conjunction analysis, testing for regions that were correlated with subjective value in both conditions, but in which the value correlation increased in the episodic condition. Only left LPC showed this pattern (−66, −42, 30, z value = 3.55, see Figure 4C and Table S5), the same region that we previously identified as delay-specific in valuation (Peters and Büchel, 2009). There were no regions in which the subjective value correlation was greater in the control condition when
compared with the episodic condition at p < 0.001 uncorrected.

ACC Valuation Signals and Functional Connectivity Predict Interindividual Differences in Discount Function Shifts

We next correlated differences in the neural tag effect with interindividual differences in the size of the behavioral tag effect. To this end, we performed a simple regression analysis in SPM5 on the single-subject contrast images of the neural tag effect (i.e., subjective value correlation episodic > control) using the behavioral tag effect [log(k_control) − log(k_episodic)] as an explanatory variable. This analysis revealed clusters in the bilateral ACC (right: 18, 34, 18, z value = 3.95, p = 0.021 corrected, left: −20, 34, 20, z value = 3.52, Figure 5, see Table S6 for a complete list). Coronal sections (Figure 5C) clearly show that both ACC clusters are located in gray matter of the cingulate sulcus.

Because ACC-limbic interactions have previously been implicated in the control of choice behavior (Floresco and Ghods-Sharifi, 2007; Roiser et al., 2009), we next analyzed functional coupling with the right ACC from the above regression contrast (coordinates 18, 34, 18, see Figure 6A) using a psychophysiological interaction analysis (PPI) (Friston et al., 1997). Note that this analysis was conducted on a separate first-level GLM in which control and episodic trials were modeled as 10 s miniblocks (see Experimental Procedures for details). We first identified regions in which coupling with the ACC changed in the episodic condition compared with the control condition (see Table S7) and then performed a simple regression analysis on these coupling parameters using the behavioral tag effect as an explanatory variable. The tag effect was associated with increased coupling between ACC and hippocampus (−32, −18, −16, z value = 3.18, p = 0.031 corrected, Figure 6B) and ACC and left amygdala (−26, −4, −26, z value = 2.95, p = 0.051 corrected, Figure 6B, see Table S8 for a complete list of activations). The same regression analysis in a second PPI with the seed
voxel placed in the contralateral ACC region from the same regression contrast (−20, 34, 22, see above) yielded qualitatively similar, though subthreshold, results in these same structures (hippocampus: −28, −32, −6, z value = 1.96, amygdala: −28, −6, −16, z value = 1.97).

Figure 3. Categorical Effect of Episodic Tags on Brain Activity
Greater activity in lateral parietal cortex (left) and posterior cingulate/retrosplenial and ventromedial prefrontal cortex (right) was observed in the episodic condition compared with the control condition. p < 0.05, FWE-corrected for whole-brain volume.

Experiment 2

We conducted an additional behavioral experiment to address a number of alternative explanations for the observed effects of tags on choice behavior. First, it could be argued that episodic tags increase subjective certainty that a reward would be forthcoming. In Experiment 2, we therefore collected postscan ratings of reward confidence. Second, it could be argued that events, always being associated with a particular date, may have shifted temporal focus from delay-based to more date-based processing. This would represent a potential confound, because date-associated rewards are discounted less than delay-associated rewards (Read et al., 2005). We therefore now collected postscan ratings of temporal focus (date-based versus delay-based). Finally, Experiment 1 left open the question of whether the tag effect depends on the temporal specificity of the episodic cues. We therefore introduced an additional experimental condition that involved the presentation of subject-specific temporally unspecific future event cues. These tags (henceforth referred to as unspecific tags) were obtained by asking subjects to imagine events that could realistically happen to them in the next couple of months, but that were not directly tied to a particular point in time (see Experimental Procedures).

Episodic Imagery, Not Temporal
Specificity, Reward Confidence, or Temporal Focus, Predicts the Size of the Tag Effect

In total, data from 16 participants (9 female) are included. Analysis of pretest ratings confirmed that temporally unspecific and specific tags were matched in terms of personal relevance, arousal, valence, and preexisting associations (all p > 0.15). Choice preferences were again well described by hyperbolic functions (median R²: control = 0.84, unspecific = 0.81, specific = 0.80). We replicated the parametric tag effect (i.e., increasing effect of tags on discount rates with increasing posttest imagery scores) in this independent sample for both temporally specific (p = 0.047, Figure 7A) and temporally unspecific (p = 0.022, Figure 7A) tags, showing that the effect depends on future thinking, rather than being specifically tied to the temporal specificity of the event cues. Following testing, subjects rated how certain they were that a particular reward would actually be forthcoming. Overall, confidence in the payment procedure was high (Figure 7B), and neither unspecific nor specific tags altered these subjective certainty estimates (one-way ANOVA: F(2,45) = 0.113, p = 0.894).

Subjects also rated their temporal focus as either delay-based or date-based (see Experimental Procedures), i.e., whether they based their decisions on the delay-to-reward that was actually displayed, or whether they attempted to convert delays into the corresponding dates and then made their choices based on these dates. There was no overall significant effect of condition on temporal focus (one-way ANOVA: F(2,45) = 1.485, p = 0.237, Figure 7C), but a direct comparison between the control and the temporally specific condition showed a significant difference (t(15) = 3.18, p = 0.006). We therefore correlated the differences in temporal focus ratings between conditions (control:unspecific and control:specific) with the respective tag effects (Figure 7D). There were no correlations (unspecific: p = 0.71, specific: p = 0.94), suggesting that the observed differences in discounting cannot be attributed to differences in temporal focus.

Figure 4. Neural Representation of Subjective Value (Parametric Analysis)
(A) Regions in which the correlation with subjective value (parametric analysis) was significant in both the control and the episodic conditions (conjunction analysis) included central and lateral orbitofrontal cortex (OFC), bilateral ventral striatum (VS), medial prefrontal cortex (mPFC), and posterior cingulate cortex (PCC), replicating previous studies (Kable and Glimcher, 2007; Peters and Büchel, 2009).
(B) Regions in which the subjective value correlation was greater for the episodic compared with the control condition included lateral parietal cortex (LPC), anterior cingulate cortex (ACC), dorsolateral prefrontal cortex (DLPFC), and the right amygdala (Amy).
(C) A conjunction analysis revealed that only LPC activity was positively correlated with subjective value in both conditions, but showed a greater regression slope in the episodic condition. No regions showed a better correlation with subjective value in the control condition.
Error bars = SEM. All peaks are significant at p < 0.001, uncorrected; (A) and (B) are thresholded at p < 0.001 uncorrected and (C) is thresholded at p < 0.005, uncorrected for display purposes.

High-Imagery, but Not Low-Imagery, Subjects Adjust Their Discount Function in an Episodic Context

For a final analysis, we pooled the samples of Experiments 1 and 2 (n = 46 subjects in total), using only the temporally specific tag data from Experiment 2. We performed a median split into low- and high-imagery participants according to posttest imagery scores (low-imagery subjects: n = 23 [15/8 Exp1/Exp2], imagery range = 1.5–3.4; high-imagery subjects: n = 23 [15/8 Exp1/Exp2], imagery range = 3.5–5). The tag effect was significantly greater than 0 in the high-imagery group (t(22) = 2.6, p = 0.0085, see Figure 7D), where subjects reduced their discount rate by
on average 16% in the presence of episodic tags. In the low-imagery group, on the other hand, the tag effect was not different from zero (t(22) = 0.573, p = 0.286), yielding a significant group difference (t(44) = 2.40, p = 0.011).

DISCUSSION

We investigated the interactions between episodic future thought and intertemporal decision-making using behavioral testing and fMRI. Experiment 1 shows that reward delay discounting is modulated by episodic future event cues, and the extent of this modulation is predicted by the degree of spontaneous episodic imagery during decision-making, an effect that we replicated in Experiment 2 (episodic tag effect). The neuroimaging data (Experiment 1) highlight two mechanisms that support this effect: (1) valuation signals in the lateral ACC and (2) neural coupling between ACC and hippocampus/amygdala, both predicting the size of the tag effect.

The size of the tag effect was directly related to posttest imagery scores, strongly suggesting that future thinking significantly contributed to this effect. Pooling subjects across both experiments revealed that high-imagery subjects reduced their discount rate by on average 16% in the episodic condition, whereas low-imagery subjects did not. Experiment 2 addressed a number of alternative accounts for this effect. First, reward confidence was comparable for all conditions, arguing against the possibility that the tags may have somehow altered subjective certainty that a reward would be forthcoming. Second, differences in temporal focus between conditions (date-based

Figure 5. Correlation between the Neural and Behavioral Tag Effect
(A) Glass brain and (B and C) anatomical projection of the correlation between the neural tag effect (subjective value correlation episodic > control) and the behavioral tag effect (log difference between discount rates) in the bilateral ACC (p = 0.021, FWE-corrected across an anatomical mask of bilateral ACC).
(C) Coronal sections of the same contrast at a liberal threshold of p < 0.01 show that both left and right ACC clusters encompass gray matter of the cingulate gyrus.
(D) Scatter plot depicting the linear relationship between the neural and the behavioral tag effect in the right ACC.
(A) and (B) are thresholded at p < 0.001 with 10 contiguous voxels, whereas (C) is thresholded at p < 0.01 with 10 contiguous voxels.

Figure 6. Results of the Psychophysiological Interaction Analysis
(A) The seed for the psychophysiological interaction (PPI) analysis was placed in the right ACC (18, 34, 18).
(B) The tag effect was associated with increased ACC-hippocampal coupling (p = 0.031, corrected across bilateral hippocampus) and ACC-amygdala coupling (p = 0.051, corrected across bilateral amygdala). Maps are thresholded at p < 0.005, uncorrected for display purposes and projected onto the mean structural scan of all participants; HC, hippocampus; Amy, amygdala; rACC, right anterior cingulate cortex.
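The behavioral tag effect used in these analyses is the log difference between discount rates, log(k_control) − log(k_episodic). The sketch below is a minimal illustration in Python, assuming the standard hyperbolic form SV = A / (1 + kD) for the "hyperbolic functions" mentioned in the text (the formula itself is not spelled out here). The function names, the reward of 40 units, the 30-day delay, and the discount rates are hypothetical values chosen for illustration; they are not data from the study.

```python
import math


def hyperbolic_value(amount: float, delay_days: float, k: float) -> float:
    """Subjective value under hyperbolic discounting: SV = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def behavioral_tag_effect(k_control: float, k_episodic: float) -> float:
    """Log difference between discount rates: log(k_control) - log(k_episodic)."""
    return math.log(k_control) - math.log(k_episodic)


# Hypothetical discount rates (per day) for one subject; not values from the study.
k_control = 0.010
k_episodic = k_control * (1.0 - 0.16)  # a discount rate reduced by 16%

effect = behavioral_tag_effect(k_control, k_episodic)

# Subjective value of a hypothetical reward of 40 units delayed by 30 days.
sv_control = hyperbolic_value(40.0, 30.0, k_control)
sv_episodic = hyperbolic_value(40.0, 30.0, k_episodic)

print(f"tag effect = {effect:.3f}")
print(f"SV control = {sv_control:.2f}, SV episodic = {sv_episodic:.2f}")
```

A positive tag effect means the discount rate is lower in the episodic condition, so the same delayed reward retains more of its subjective value there.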
In the realm of academic research and practical problem-solving, group investigations play a pivotal role. These collaborative efforts not only foster a sense of teamwork but also enhance the depth and breadth of the findings. The following narrative illustrates the experience of a group investigation that I, as a seasoned English teacher, have had the privilege to observe and guide.

The group consisted of a diverse set of students, each bringing their unique perspectives and skills to the table. The task at hand was to investigate the impact of social media on the mental health of teenagers. This was a topic of significant relevance, given the pervasive nature of social media in the lives of young people today.

The initial phase of the investigation involved brainstorming sessions where ideas were freely exchanged. The students were encouraged to voice their opinions without fear of judgment, creating a safe space for open dialogue. This approach was instrumental in generating a wide array of ideas and hypotheses that would guide the subsequent stages of the research.

One of the students, a keen observer of social trends, suggested looking into the correlation between the amount of time spent on social media platforms and the levels of reported anxiety and depression among teenagers. Another student, with a background in psychology, proposed examining the role of social comparison and its potential to exacerbate feelings of inadequacy and low self-esteem.

As the group delved deeper into the investigation, they divided the work among themselves based on their areas of interest and expertise. Some focused on literature reviews, meticulously sifting through existing research to identify gaps and establish a solid theoretical foundation for their study. Others embarked on designing a survey, ensuring that the questions were clear, unbiased, and relevant to the research objectives.

The process was not without its challenges. Disagreements arose over the interpretation of data and the direction the investigation should take. However, these moments served as valuable learning experiences, teaching the students the importance of critical thinking and the art of negotiation.

One particularly memorable incident involved a heated debate over the statistical significance of their findings. The group had to revisit their methodology and consider alternative explanations for their results. This exercise not only strengthened their analytical skills but also highlighted the importance of being open to new perspectives and being willing to revise one's initial conclusions.

The culmination of the group investigation was the presentation of their findings at a school-wide seminar. The students had prepared a comprehensive report, complete with visual aids and an engaging narrative that brought their research to life. The audience, comprising fellow students, teachers, and invited guests, was captivated by their insights and the depth of their analysis.

The group's investigation revealed that while social media can provide a platform for connection and self-expression, it can also contribute to feelings of social isolation and self-doubt, particularly when used excessively or in a manner that encourages comparison with others. The findings underscored the need for a balanced approach to social media usage and the importance of fostering resilience and a healthy self-image among teenagers.

The experience of guiding this group investigation was both rewarding and enlightening. It was a testament to the power of collaboration and the potential for young minds to contribute meaningfully to important societal conversations. As an educator, I was proud to witness the growth and development of these students, not just in terms of their research skills but also in their ability to navigate complex issues with empathy, curiosity, and a commitment to truth and integrity.
Journal of Business Economics (商业经济与管理), No. 3 (General No. 389), March 2024

Impact of Perceived Algorithmic Control of Gig Workers on Work Engagement: A Dual-Pathway Model Based on Cognitive and Affective Factors

ZHANG Lanxia, LI Jiamin, MAO Mengyu
(School of Business Administration, Northeastern University, Shenyang 110169, China)

Received: 2023-12-08

Funding: National Natural Science Foundation of China project "How to 'play to strengths and avoid weaknesses'? The double-edged-sword effects of deviant innovation on multiple actors, human resource management intervention mechanisms, and system dynamic processes" (72172032); National Social Science Fund of China project "The integrated effects of machine affective learning on the dual pathways of human-machine collaboration and on the supply-demand balance of elderly care and companionship services" (23BGL239); Liaoning Provincial Social Science Planning Fund project "Strategies for improving the performance of Liaoning manufacturing enterprises in the context of digitalization and servitization" (L21BGL019); Natural Science Foundation of Hebei Province project "The impact of servitization, digitalization, and their integration strategies on manufacturing enterprise performance" (G2021501006).

About the authors: ZHANG Lanxia, female, professor, doctoral supervisor, PhD in management, works mainly on organizational behavior and human resource management. LI Jiamin (corresponding author), male, doctoral candidate, works mainly on organizational behavior and human resource management. MAO Mengyu, male, doctoral candidate, works mainly on organizational behavior and human resource management.

CLC number: F270. Document code: A. Article ID: 1000-2154(2024)03-0047-12. DOI: 10.14134/33-1336/f.2024.03.004

Abstract: The rapid development of algorithms and their application in management activities are changing the way employees work and bringing new challenges to organizational management practice. Yet little research has examined the mechanism by which employees' perceived algorithmic control affects their work engagement. Drawing on conservation of resources theory, this study constructs a dual-path model of the impact of perceived algorithmic control on work engagement from cognitive and emotional perspectives. It examines the mediating roles of cognitive load and emotional exhaustion, as well as the moderating role of algorithm transparency. From a two-stage survey analysis of 397 gig workers, this study finds that perceived algorithmic
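control significantly negatively affects work engagement. Cognitive load and emotional exhaustion not only mediate the relationship between perceived algorithmic control and work engagement but also act as chain mediators of that relationship. In addition, algorithm transparency not only moderates the effects of perceived algorithmic control on cognitive load and emotional exhaustion but also moderates the mediating roles of cognitive load and emotional exhaustion. This study enriches the empirical research in the field of algorithmic management and provides a basis for organizations to develop scientific employee management measures.

Key words: perceived algorithmic control; work engagement; algorithm transparency; conservation of resources theory

I. Introduction

Digitalization has promoted the development of the gig economy worldwide. In China in particular, gig workers, typified by ride-hailing drivers and food-delivery couriers, find employment through online labor platforms and contribute to the country's economic development [1]. Many enterprises now rely on platform algorithms to manage gig workers, so algorithmic management is gradually spreading in management practice [2]. As enterprises apply algorithms, gig workers frequently perceive algorithmic monitoring at work, and their income and performance are derived from data fed back by algorithms, so the influence of perceived algorithmic control on gig workers is becoming increasingly prominent [3]. Gig workers' perceived algorithmic control has therefore attracted growing attention from managers and scholars, and academic research on it has produced many results.

Perceived algorithmic control refers to employees' overall perception of how algorithms exercise real-time, dynamic control over the process by which they provide online labor services, through standardized guidance, tracking and evaluation, and behavioral constraint [4]. A review of the literature shows that scholars have made initial explorations of the outcomes of perceived algorithmic control and found that it can affect employees' behavior, performance, and health [5-6]. Overall, these studies deepen our understanding of its outcome variables and offer reliable managerial implications for how organizations can respond to it effectively. Regrettably, however, they all neglect the effect of perceived algorithmic control on gig workers' work state. In fact, algorithms are gradually changing employees' work states. Work state not only has an important influence on employees' physical and mental health and work behavior but also affects enterprise performance. It is therefore necessary to examine the effect of perceived algorithmic control on employees' work state.

In the sporadic research on employees' work states, scholars have separately tested the effects of perceived algorithmic control on employees' cognition and affect. For example, Sun Rui et al. [6] found that perceived algorithmic control affects gig workers' emotional exhaustion. Current research, however, does not integrate cognition and affect. Perceived algorithmic control has important effects on both, and algorithmic management research based on cognition and affect awaits integration. In addition, some scholars argue that changes in individual cognition further lead to changes in affect [7]. This paper therefore also considers the chain mediation from cognition to affect. Notably, existing findings on the effects of algorithmic control on cognition and affect are inconsistent. For example, the effect of perceived algorithmic control on work engagement is not a simple linear relationship, and perceived algorithmic control has a "double-edged sword" effect on emotional exhaustion [6]. We argue that these inconsistent conclusions arise because scholars lack an understanding of the boundary conditions of perceived algorithmic control, which therefore need to be further revealed.

Work engagement, a positive work state characterized by vigor, dedication, and similar qualities, is the link connecting individual characteristics, job factors, and job performance [8]. This paper takes work engagement as the work-state variable and examines the mechanism by which perceived algorithmic control affects employees' work state. Conservation of resources theory offers a new theoretical perspective for explaining this mechanism. The theory holds that individuals are motivated to protect their own resources and that their existing stock of resources further affects their work attitudes and behavior [9]. Because algorithms monitor gig workers comprehensively, workers who perceive this control experience work pressure, tension, and unease [4]. Based on conservation of resources theory, perceived algorithmic control causes gig workers to deplete resources rapidly, leading to a resource shortage. These lost resources can manifest as negative psychological states that arise after perceiving algorithmic control, such as cognitive and affective states [2-3]. On the one hand, gig workers who perceive algorithmic control want to finish their work on time but also worry that racing against the clock puts them in danger; they therefore go through a conflicting cognitive process that consumes cognitive resources and produces cognitive load. On the other hand, gig workers experience work pressure after perceiving real-time algorithmic monitoring and tracking, which consumes their emotional resources and leads to emotional exhaustion [9]. Cognitive load is thus a negative cognition and emotional exhaustion a negative affect, and an increase in either implies a reduction in gig workers' psychological resources [10]. As resources decline, gig workers display negative work attitudes and behaviors. On this basis, we argue that cognitive load and emotional exhaustion mediate between perceived algorithmic control and work engagement.

In addition, according to conservation of resources theory, if gig workers can understand how the algorithmic system operates, they can take measures at work to avoid the adverse effects of algorithms and reduce resource consumption. Conversely, if they cannot understand how the algorithmic system is used, they become worried and anxious, which consumes their resources [9-11]. Algorithm transparency, which measures the extent to which users understand why and how an algorithmic system operates, may affect the degree to which perceived algorithmic control depletes employees' cognitive and emotional resources [11]. We therefore argue that algorithm transparency is an important boundary condition in the process by which perceived algorithmic control affects work engagement, and that it is necessary to examine its moderating role between perceived algorithmic control and cognitive load and emotional exhaustion, as well as its moderation of the mediating effects of cognitive load and emotional exhaustion.

In summary, drawing on conservation of resources theory, this paper examines the effect of gig workers' perceived algorithmic control on work engagement and clarifies the mediating roles of cognitive load and emotional exhaustion and the moderating role of algorithm transparency, with the aim of enriching empirical research on algorithmic management and providing a reference for organizations carrying out scientific employee management.

II. Theoretical Basis and Research Hypotheses

(1) Perceived Algorithmic Control and Work Engagement

Perceived algorithmic control is employees' overall perception of how algorithms exercise real-time, dynamic control over the process by which they provide online labor services, through standardized guidance, tracking and evaluation, and behavioral constraint; it has important effects on both employees and organizations [12-14]. Work engagement, a positive work state characterized by vigor, dedication, and similar qualities, is the link connecting individual characteristics, job factors, and job performance [8]. Conservation of resources theory provides a theoretical perspective for explaining the effect of perceived algorithmic control on work engagement. The theory holds that individuals are motivated to protect and maintain their resources and to acquire new ones. When their resources are depleted, individuals protect what remains from further loss. At the same time, they feel threatened and consequently develop negative psychological experiences and work behaviors [9]. Specifically, perceived algorithmic control expresses gig workers' understanding of the standards, policies, and content set by the platform; they form judgments on this basis, which in turn shape their behavior. When gig workers perceive algorithmic control, they believe the platform is pressing them to work more efficiently and is constraining their behavior. To complete tasks on time and efficiently as the platform requires, gig workers deplete their resources. They then reduce work engagement so as not to consume their resources further. Existing research has found that perceived algorithmic control brings negative effects such as workload and job insecurity [15-16]. After experiencing these negative effects, gig workers reduce their work engagement. On this basis, we propose the following hypothesis:

H1: Gig workers' perceived algorithmic control has a significant negative effect on work engagement.

(2) Cognitive Path: The Mediating Role of Cognitive Load

Cognitive load is the total amount of cognitive resources an individual consumes in collecting, analyzing, and processing information at work [17]. Existing research has found that information complexity [18] and characteristics of the network environment [19], among other factors, affect employees' cognitive load. When gig workers use algorithms to assist their work and perceive algorithmic control, they also face complex information and a complex network environment, which increases cognitive load. Conservation of resources theory holds that the sense of potential threat triggered by resource loss affects how individuals respond to their environment [9]. Specifically, when gig workers perceive algorithmic control, they go through a conflicting cognitive process. For example, when a food-delivery courier's delivery deadline is about to pass, the platform urges the courier to deliver on time or lose part of the bonus. The courier then faces an internal conflict: speed up to complete the task, or complete the task while staying safe and obeying traffic rules. Such complex mental activity usually consumes a large amount of cognitive resources and further raises gig workers' cognitive load [20]. We therefore argue that gig workers' perceived algorithmic control has a significant positive effect on cognitive load.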
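Furthermore, cognitive load is one of the important stressors at work and hinders employees from completing their tasks [21]. According to conservation of resources theory, gig workers coping with stress must invest more resources to relieve the threat it poses, which accelerates the depletion and loss of resources and leads them to reduce work engagement [22]. The findings of Breevaart and Bakker [23] provide empirical support for this argument: they found that employees' daily cognitive load has a significant negative effect on work engagement. We therefore argue that gig workers' cognitive load has a significant negative effect on work engagement.

Based on conservation of resources theory, individuals exhibit stress responses when they lose resources. In coping with stress they invest still more resources and thus fall into a resource loss spiral [9]. In sum, gig workers who perceive algorithmic control need more mental effort to ensure that tasks are completed, which produces cognitive load. As an important stressor, cognitive load leads to further loss of psychological resources and thereby lowers work engagement [9,22]. Hu et al. [24] have demonstrated the mediating role of cognitive load between user familiarity and user satisfaction, which supports this argument. We therefore argue that cognitive load mediates between perceived algorithmic control and work engagement. On this basis, we propose the following hypothesis:

H2: Cognitive load mediates the relationship between perceived algorithmic control and work engagement.

(3) Affective Path: The Mediating Role of Emotional Exhaustion

Emotional exhaustion is a psychological state of agitation that arises when individuals excessively consume their emotional resources while handling interpersonal relationships and applying knowledge and skills [25]. When gig workers perceive algorithmic control, they feel the demands and restrictions that algorithms place on them, under which they may even have to make choices immediately [26]. Based on conservation of resources theory, in making these choices gig workers may experience role conflict and psychological discomfort, which deplete emotional resources and lead to emotional exhaustion [27]. We therefore argue that gig workers' perceived algorithmic control has a significant positive effect on emotional exhaustion.

Furthermore, emotional exhaustion is an important manifestation of job burnout. Emotionally exhausted employees not only lose enthusiasm and motivation for work but also experience frustration, anxiety, and similar feelings [28]. Emotional exhaustion involves the consumption of resources; when their emotional resources are depleted, gig workers reduce their emotional investment to keep their remaining resources from being consumed further, and thus reduce work engagement. As Wan Jin et al. [29] found, employees' emotional exhaustion has a significant negative effect on work engagement. We therefore argue that gig workers' emotional exhaustion has a significant negative effect on work engagement.

Conservation of resources theory holds that individuals deplete their resources in completing tasks, which makes them feel threatened and produces negative psychological experiences [9]. Perceived algorithmic control causes gig workers psychological discomfort and thus leads to emotional exhaustion. The loss of emotional resources lowers gig workers' vigor and enthusiasm for work and leaves them in negative emotions such as frustration and anxiety, which reduces work engagement. Consistent with this reasoning, Yu Weina et al. [30] verified the mediating role of emotional exhaustion between leader inclusiveness and recovery experience during non-work time, and Yao Zhu and Luo Jinlian [31] verified the mediating role of emotional exhaustion between time pressure and knowledge hiding. We therefore argue that emotional exhaustion mediates between perceived algorithmic control and work engagement. On this basis, we propose the following hypothesis:

H3: Emotional exhaustion mediates the relationship between perceived algorithmic control and work engagement.

(4) The Chain Mediating Role of Cognitive Load and Emotional Exhaustion

Combining hypotheses H2 and H3 and drawing on conservation of resources theory, when gig workers perceive algorithmic control over them, they experience tension and unease, and their resources decrease [4]. They may then need to switch tasks frequently or handle different types of work, which requires mobilizing substantial cognitive resources [17]. Because perceived algorithmic control involves processing and understanding information, gig workers may feel a heavier cognitive load. High cognitive load may cause stress and fatigue, which may trigger negative changes in emotion [25]. Emotional exhaustion refers to prolonged psychological stress and fatigue, which may reduce individuals' emotional investment in work [25]. Thus, once cognitive load arises, gig workers' resources decline further, and this appears as the depletion of affective resources, that is, emotional exhaustion. Long-term emotional exhaustion may leave gig workers tired and unmotivated [28]. Because cognitive resources are limited, fatigued individuals may find it harder to concentrate, remain efficient, and stay actively engaged in their tasks [8]. Ultimately, employees reduce work engagement to prevent further consumption of their resources. We therefore argue that cognitive load and emotional exhaustion play a chain mediating role between perceived algorithmic control and work engagement. On this basis, we propose the following hypothesis:

H4: Cognitive load and emotional exhaustion play a chain mediating role between perceived algorithmic control and work engagement.

(5) The Moderating Role of Algorithm Transparency

Algorithm transparency refers to the extent to which users can understand why and how an algorithmic system operates [10]. The opacity and black-box nature of algorithms lead many employees to distrust algorithms strongly, and some even develop algorithm aversion [11]. Meanwhile, gig workers have to accept the results of algorithmic decisions at work, which brings them a sense of uncertainty; these feelings also arise because they cannot understand the mechanism by which algorithms operate [32]. Research has even found that when organizations use algorithms to manage employees, those who have a clear and accurate understanding of the algorithmic system show a stronger sense of autonomy and competence at work [33]. Algorithm transparency may therefore be an important boundary condition in the process by which perceived algorithmic control affects work engagement.

Specifically, gig workers who perceive high algorithm transparency better understand the principles behind the algorithm, are more willing to accept its supervision and reminders, and may even regard it as a work partner that helps improve their performance [34]. When they perceive algorithmic control, they therefore believe the algorithm is helping them complete their work efficiently. Their resources are then replenished, and the positive effects of perceived algorithmic control on cognitive load and emotional exhaustion are weakened [35]. Conversely, gig workers who perceive low algorithm transparency do not understand how the algorithm runs. They merely complete tasks according to the work it assigns and the reminders it issues, find it hard to process the complex information it conveys, and may even experience stronger job uncertainty and job insecurity [36]. When they perceive algorithmic control, they therefore believe the algorithm hinders their work. Their resources are further depleted, and the positive effects of perceived algorithmic control on cognitive load and emotional exhaustion are strengthened. Pei Jialiang et al. [37] provide empirical support for this view: they found that algorithm transparency moderates the effects of perceived algorithmic control on autonomous motivation and controlled motivation. On this basis, we propose the following hypothesis:

H5: Algorithm transparency weakens the positive relationships between perceived algorithmic control and (a) cognitive load and (b) emotional exhaustion; that is, the higher the algorithm transparency, the weaker the positive effect of perceived algorithmic control on (a) cognitive load and (b) emotional exhaustion.

In sum, cognitive load and emotional exhaustion both carry indirect effects in the dual pathways through which perceived algorithmic control affects work engagement. We therefore further propose a moderated mediation hypothesis: the dual pathways through which gig workers' perceived algorithmic control affects work engagement via cognitive load and emotional exhaustion are moderated by algorithm transparency. When gig workers perceive high algorithm transparency, the mediating roles of cognitive load and emotional exhaustion are weakened. Specifically, gig workers who perceive low algorithm transparency do not understand the principles behind the algorithm, so they feel pressure and threat after perceiving algorithmic control and show higher cognitive load and emotional exhaustion. Cognitive load and emotional exhaustion in turn deplete individual resources and reduce work engagement. On this basis, we propose the following hypothesis:

H6: Algorithm transparency weakens the indirect effects of (a) cognitive load and (b) emotional exhaustion; that is, the higher the algorithm transparency, the weaker the indirect effects of (a) cognitive load and (b) emotional exhaustion between perceived algorithmic control and work engagement.

The theoretical model of this paper is shown in Figure 1.

Figure 1. Research model

III. Research Methods

(1) Sample and Procedure

Data were collected by questionnaire survey. The sample consists of gig workers at platform enterprises in Jinan, Shandong Province, including food-delivery couriers, ride-hailing drivers, errand runners, and online domestic service workers. The researchers first imported the questionnaire into the survey platform Wenjuanxing (问卷星) to generate an electronic link and a QR code, and then asked the persons in charge of the relevant enterprises to help distribute them. Relying on alumni connections, the researchers first contacted 9 platform enterprises that use algorithmic management. After discussions with the persons in charge, 7 enterprises agreed to help distribute the link and QR code. Questionnaires were completed anonymously, and participation was entirely voluntary. Because all variables were self-reported by employees, a two-wave design was used to reduce the adverse effect of common method bias. To match responses across waves accurately, participants were asked to provide the last 6 digits of their mobile phone number. Although employees of these enterprises work with algorithms, a screening item was included to ensure accuracy and to prevent the link and QR code from reaching irrelevant respondents: participants were asked to confirm whether they use algorithms at work, and the questionnaire ended automatically if they answered "No".

The first wave was conducted in January 2023 and measured perceived algorithmic control, algorithm transparency, and demographic variables. In this wave 465 questionnaires were distributed and 445 were returned. After removing unqualified questionnaires (for example, those with very short completion times or obviously patterned answers), 431 valid questionnaires remained, an effective response rate of 92.69%. The second wave was conducted two months later and measured cognitive load, emotional exhaustion, and work engagement. This wave followed up the valid respondents from the first wave; matching on the last 6 digits of the phone number yielded 415 returned questionnaires. After removing unqualified questionnaires by the same criteria, 397 valid questionnaires remained, an effective response rate of 92.11%.

In the valid sample, men account for 52.39% and women for 47.61%. By age, 33.21% are 18 to 25, 25.14% are 26 to 30, 35.21% are 31 to 40, and 6.44% are 41 or older. By tenure, 33.21% have worked 2 years or less, 29.31% for 3 to 5 years, 25.73% for 6 to 8 years, and 11.75% for 9 years or more. Respondents with a bachelor's degree form the largest group (66.75% of the sample), and those with junior high school education or below the smallest (8.81%).

(2) Measures

All variables were measured with established scales from the Chinese and international literature. Foreign-language scales were prepared following a strict translation and back-translation procedure [38]. To ensure that the translated scales were applicable in the Chinese context, two doctoral students in organizational behavior and one gig worker reviewed their content. All items used a 5-point Likert scale, where 1 means "strongly disagree" and 5 means "strongly agree".

1. Perceived algorithmic control. Measured with the 11-item scale developed by Pei Jialiang et al. [4]. A sample item is "The algorithm gives standardized instructions for my work according to platform standards". The internal consistency coefficient is 0.88.

2. Algorithm transparency. Following Pei Jialiang et al. [37], items were adapted from the 3-item transparency scale developed by Durcikova and Gray [39]. A sample item is "I can look up information about how the platform algorithm operates at any time". The internal consistency coefficient is 0.85.

3. Cognitive load. Measured with the 3-item scale developed by Mohr et al. [40]. A sample item is "I find it hard to relax after work". The internal consistency coefficient is 0.80.

4. Emotional exhaustion. Measured with the 3-item scale developed by Boswell et al. [41]. A sample item is "I feel that my work drains me mentally". The internal consistency coefficient is 0.81.

5. Work engagement. Measured with the 9-item scale developed by Schaufeli et al. [42]. A sample item is "I get carried away when I am working". The internal consistency coefficient is 0.90.

6. Control variables. Following prior research [43], the demographic variables age, gender, education, and tenure were used as controls. Age and tenure were fill-in items answered directly by participants; gender and education were multiple-choice items.

IV. Data Analysis and Results

(1) Common Method Bias Test

Common method bias was examined with Harman's single-factor test [44]. An unrotated exploratory factor analysis of all items showed that the first principal component had an eigenvalue greater than 1 and explained 30.47% of the variance, below the 40% threshold, indicating that common method bias is not serious in this study. Because Harman's single-factor test may be insensitive, a method (error) factor was also added to the five-factor model. Comparing this model with the five-factor model showed little change in the fit indices (ΔCFI = 0.011, ΔTLI = 0.012, ΔRMSEA = 0.010), which again indicates that common method bias is not serious [45].

(2) Descriptive Statistics

Table 1 reports the means, standard deviations, and correlations of the variables. Perceived algorithmic control is significantly positively correlated with emotional exhaustion (r = 0.33, p < 0.01) and cognitive load (r = 0.32, p < 0.01) and significantly negatively correlated with work engagement (r = −0.30, p < 0.01). Emotional exhaustion is significantly negatively correlated with work engagement (r = −0.29, p < 0.01), as is cognitive load (r = −0.33, p < 0.01). H1 thus receives preliminary support. In addition, the square root of each variable's AVE is greater than that variable's correlations with the other variables.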
Proceedings of the International Conference on Computer Vision (ICCV'98), Kerkyra, Greece, September 1999.

Correlation-Based Estimation of Ego-motion and Structure from Motion and Stereo

R. Mandelbaum, G. Salgian, H. Sawhney
Sarnoff Corporation
CN5300, Princeton, NJ 08543
rmandelbaum,gsalgian,hsawhney@

Abstract

This paper describes a correlation-based, iterative, multi-resolution algorithm which estimates both scene structure and the motion of the camera rig through an environment from the stream(s) of incoming images. Both single-camera rigs and multiple-camera rigs can be accommodated. The use of multiple synchronized cameras results in more rapid convergence of the iterative approach. The algorithm uses a global ego-motion constraint to refine estimates of inter-frame camera rotation and translation. It uses local window-based correlation to refine the current estimate of scene structure. All analysis is performed at multiple resolutions.

In order to combine, in a straightforward way, the correlation surfaces from multiple viewpoints and from multiple pixels in a support region, each pixel's correlation surface is modeled as a quadratic. This parameterization allows direct, explicit computation of incremental refinements for ego-motion and structure using linear algebra.
Batches can be of arbitrary size, allowing a trade-off between accuracy and latency. Batches can also be daisy-chained for extended sequences. Results of the algorithm are shown on synthetic and real outdoor image sequences.

Keywords: ego-motion, stereo, shape recovery, structure, multi-baseline, multi-resolution

1 Introduction

The ego-motion of a camera rig moving through an environment provides useful information for tasks such as navigation and self-localization within a map. Similarly, recovering the structure of the environment is useful for tasks such as obstacle avoidance, terrain-based velocity control, and 3D map construction. In general, the problems of estimating ego-motion and structure from image sequences are mutually dependent. Prior accurate knowledge of ego-motion allows structure to be computed by triangulation from corresponding image points. This is the principle behind standard parallel-axis stereo algorithms, where the baseline is known accurately from calibration. In this case, knowledge of the epipolar geometry provides for efficient search for corresponding points.

On the other hand, if prior information is available regarding the structure of the scene, then ego-motion can be computed directly. Essentially, one considers the space of all possible poses of the camera. One then searches for the pose for which the perspective projection of the environment onto the image plane most closely matches the actual image obtained.

This paper addresses the case where neither accurate ego-motion nor structure information is available. We propose a correlation-based algorithm which assumes a very coarse starting point for both ego-motion and structure, and then alternately and iteratively refines the estimates of both. The updated estimate of ego-motion is used to obtain an improved estimate of structure, which in turn is used to refine the estimate of ego-motion.

1.1 Related Work

Several researchers have addressed similar problems [3,5,6,9,11,12,13]. In [4], it is pointed out that
existing algorithms fall into two classes: (i) those that use the epipolar constraint and assume that the motion field is available, and (ii) those that utilize the "positive depth" constraint, also known as "direct" algorithms. The algorithm described in this paper does not fall into either of these categories. Although we do use the epipolar constraint, we explicitly avoid solving for optical flow as an intermediate step, and thus avoid the problems associated with flow estimation. Instead, we describe a correlation-based approach, where we approximate each correlation surface with a quadratic. We then use the full second-order model for ego-motion and structure estimation. In other words, we employ the full shape of the correlation surface, and not just the location parameter, as in optic-flow techniques. This allows us to perform outlier rejection.

A similar correlation-based approach is used in [8] to align images from multiple modalities (visible and IR cameras). However, in that case, a 2D motion model is used (we use a 3D model), and Beaudet masks are employed to estimate first and second derivatives of the correlation surface (we fit quadratics). The correlation-based approach has several advantages, especially for the types of terrain encountered during off-road autonomous navigation. We discuss these advantages in section 4.1.

A factorization approach is described in [10]. However, this method addresses depth estimation at feature points; dense structure estimation is not addressed. With regard to iterative ego-motion and dense structure estimation, the work closest to this paper is that of [6] and of [12]. However, instead of correlation, both of these algorithms rely on the image-brightness constraint.

Another aspect which differentiates this work is the fact that, in addition to addressing monocular image streams, our algorithm is explicitly geared to be able to handle long sequences from a moving stereo rig, which is typical apparatus on outdoor autonomous vehicles. In this regard, our algorithm
addresses issues similar to those in [1]. However, in [1], the emphasis is on the use of the trilinear tensor to guarantee consistency of ego-motion estimation; estimation of structure over long sequences is not addressed.

In this paper, we introduce the algorithm, and show proof-of-concept examples from synthetic data, and from monocular and stereo image sequences. We elaborate on several advantages of this approach, but delay a detailed quantitative comparison with other methods for a future document.

1.2 Motivation

A primary domain of application of this algorithm is outdoor mobility of autonomous vehicles. Within this context, we seek a unified approach to processing image streams from the vehicle's navigation and surveillance cameras, which addresses the following issues:

1. Detection, segmentation, and representation of moving objects: For the sake of self-preservation and safe operation among human personnel, the vision system must be capable of operating in dynamic environments, and must be able to detect other moving vehicles and personnel.

2. Extended terrain representation: The algorithm must be able to register imagery and integrate information over time. In this way, a coherent representation of the environment can be incorporated in a higher-level module performing 3D map maintenance.

3. Far-range sensing: A persistent issue in autonomous mobility is the need to detect shape at long distances. For example, a vehicle may need to locate the gap in a distant tree-line indicating a potential path. Using traditional structure recovery algorithms such as traditional stereo does not suffice because the baseline separation between the cameras cannot be made large enough on the vehicle.
We would like an algorithm which can perform stereo-like computations over time (as opposed to traditional snapshot stereo). This allows for arbitrarily large baselines, and hence greatly enhanced depth estimation at a distance.

4. Easy incorporation of other modalities: Many outdoor robotic vehicles are equipped with multiple sensor modalities, such as LADAR, an Inertial Navigation System (INS), and odometry. We seek an algorithm which provides for easy integration and fusion of data from these modalities.

The algorithm described in this paper expressly addresses all these needs.

1.3 Notation

Vector quantities are denoted in boldface and matrices by capital letters. Images are filtered with the standard Laplacian-of-Gaussian operator.

1.4 Algorithm overview

The approach described in this paper iteratively refines estimates for ego-motion and structure, in a pyramid-based multi-resolution framework. The algorithm not only alternates between estimating ego-motion and structure, but also iterates over multiple image resolutions, and over multiple sizes of support regions for correlation.

The fundamental input to the algorithm is a batch of images obtained from a camera moving through the environment. Batches are of arbitrary length, though significant image overlap is required within each batch. Large batches are useful for cases where increased accuracy is required in structure and ego-motion estimates. On the other hand, for low-latency, real-time operation, shorter batches can be used. In all cases, the results from successive batches can be daisy-chained together. Image overlap is required only within each batch and between successive batches; there is no requirement for image overlap among all batches. As a result, (i) the structure estimate for each batch is bootstrapped by the previous batch, allowing for rapid convergence of the iterative algorithm, and thus facilitating real-time implementation, (ii) the structure-constancy constraint is preserved across long chains of batches, and (iii) the net output of
the algorithm is an extended 3D mosaic of the environment, together with ego-motion. This paper focuses on the processing of individual batches.

Bootstrapping: The algorithm is bootstrapped by providing a priori coarse estimates of inter-image relative orientation (ego-motion) and scene structure. These a priori estimates can come from a number of sources. While daisy-chaining, the initial structure estimate is the output of the previous batch process, while the initial ego-motion estimate can be obtained using Kalman Filter prediction based on the ego-motion in the previous batch. If the camera rig resides on a vehicle equipped with other sensor modalities, these can also be used. A LADAR, for instance, can be used to bootstrap estimates of structure, while inertial sensors or odometry can be used to bootstrap ego-motion estimates. In the absence of such devices, other image-based procedures can be used, such as those described in [1,10]. Alternatively, if a stereo camera pair is available, the plane corresponding to the least-squares affine registration transform between the left and right cameras can be used as an initial estimate of the groundplane horopter [2]. Note that bootstrapping can be very coarse. Finally, traditional stereo algorithms may be used to bootstrap estimates of structure. Indeed, we explore this last approach in some detail in this paper. Several of these bootstrapping mechanisms, their requirements and utility are summarized in the following table:

Table: bootstrapping mechanisms and requirements (structure: horopter-based groundplane fitting, hardware device, multi-modal alignment; relative orientation: feature tracking, hardware device).

Iterative refinement: Based on these initial coarse estimates of ego-motion and structure, the algorithm iteratively refines the estimates as illustrated in Figure 1, according to the following steps:

(1) For a batch of images, designate one image as the reference image and the remaining images as inspection images.
(2) Warp all inspection images according to current ego-motion estimates for rotation and translation, and for the current
estimate of structure for all points in the image.
(3) Compute correlation surfaces between all warped inspection images and the reference image.
(4) Fit a quadratic surface to each correlation surface for each point in every inspection image.
(5) Prune all points whose quadratic models are not elliptic paraboloids.
(6) Add all non-pruned correlation surfaces for each inspection image. Use this global cumulative correlation surface to compute incremental refinements of ego-motion ( and ) for that inspection image.
(7) Add all non-pruned correlation surfaces for a particular point, over all inspection images. Use this local cumulative correlation surface to compute an incremental refinement of structure at that point for all points .
(8) Update estimates of ego-motion and structure from these incremental refinements.
(9) Go to step 2 above, unless the maximum number of iterations has been reached.

2 Theoretical background

This section describes the linear algebra needed to compute the incremental refinements , , and from quadratic models of correlation surfaces, as described in iteration steps 6 and 7 of section 1.4.

2.1 Quadratic Models for correlation surfaces

Consider a sequence of images of a static scene taken with a camera of focal length . Let be the reference image-coordinate system. Using the perspective projection camera model and the derivative of the 3D position of a moving object, the image flow of an image point in image , , is given by (see [7])

(1)

where , is the depth of the scene at , and are the camera rotation and translation corresponding to image , and

(2)

Let denote the image motion corresponding to estimates of ego-motion , and structure from a previous iteration or resolution. Then, from equation 1, an equation for incremental image motion is

(3)

Next, let denote the image formed by warping by the flow for all points . Denote the Laplacian-of-Gaussian images and by and respectively. Consider the correlation surface formed by correlating with the reference.
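As a rough illustration of steps 3 to 5 of section 1.4 (correlate, fit a quadratic, prune), the following Python sketch builds a correlation surface for one point, fits a second-order model to it by least squares, and tests whether the fit is an elliptic paraboloid. It is a minimal sketch under assumptions that are not taken from the paper: the pixel operation is the squared difference (so the extremum sought is a minimum), shifts are integer, the input images are assumed to be already Laplacian-of-Gaussian filtered and warped, and all function names, the window size and the search range are hypothetical.

```python
import numpy as np


def correlation_surface(ref, insp, center, half_win=3, search=2):
    """Sum-of-squared-differences surface over integer shifts (assumed operation).

    Entry [iy, ix] corresponds to the shift (sx, sy) = (ix - search, iy - search).
    """
    cy, cx = center
    ref_patch = ref[cy - half_win:cy + half_win + 1, cx - half_win:cx + half_win + 1]
    size = 2 * search + 1
    surf = np.empty((size, size))
    for iy, sy in enumerate(range(-search, search + 1)):
        for ix, sx in enumerate(range(-search, search + 1)):
            y, x = cy + sy, cx + sx
            patch = insp[y - half_win:y + half_win + 1, x - half_win:x + half_win + 1]
            surf[iy, ix] = np.sum((patch - ref_patch) ** 2)
    return surf


def fit_quadratic(surf, search=2):
    """Least-squares fit of c + g.s + 0.5 * s^T H s to the surface samples."""
    s = np.arange(-search, search + 1)
    sx, sy = np.meshgrid(s, s)  # sx varies along columns, sy along rows
    sx, sy = sx.ravel(), sy.ravel()
    design = np.column_stack(
        [np.ones_like(sx), sx, sy, 0.5 * sx**2, sx * sy, 0.5 * sy**2]
    ).astype(float)
    coef, *_ = np.linalg.lstsq(design, surf.ravel(), rcond=None)
    c, gx, gy, hxx, hxy, hyy = coef
    grad = np.array([gx, gy])                  # vector of first derivatives
    hess = np.array([[hxx, hxy], [hxy, hyy]])  # matrix of second derivatives
    return c, grad, hess


def is_elliptic_paraboloid(hess, eps=1e-9):
    """With a squared-difference surface a minimum is sought, so the
    Hessian must be positive definite; other points would be pruned."""
    return bool(np.all(np.linalg.eigvalsh(hess) > eps))


def subpixel_shift(grad, hess):
    """Stationary point of the fitted model: solve H s = -g."""
    return -np.linalg.solve(hess, grad)
```

With multiplication as the pixel operation the same fit would apply, but the test would have to require a negative definite Hessian, since the extremum sought is then a maximum.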
That is, for an image-to-image shift of , , where denotes an operation on two corresponding pixels. In traditional stereo computations, is multiplication, absolute difference, or the square of the difference. Note that depends on . In fact, the procedure of correlation-based flow-estimation consists of finding the value of or , depending on the choice of .

We assume that a second-order (quadratic) model can be fit to for each point . Let denote the current estimate of image flow in image at point . By Taylor expansion,

(4)

(the vector of first derivatives), and denotes the Hessian of (matrix of second derivatives). That is, let

Figure 1: General overview of the iterative refinement procedure for one batch (see section 1.4 for details). [Surviving figure label: Correlation surface (m x n)]

2.2 Global estimation of ego-motion

In order to update our estimate of ego-motion between frame and frame , we wish to find the extremum of the global correlation surface obtained by adding the correlation surfaces for all points in the image, . During each iteration, we alternate between estimating the incremental refinements and . During estimation of , we hold our current estimates for translation and structure constant. In other words, we perform gradient descent along each of the “axes” of , and independently. Thus, for estimation of the incremental refinement , we temporarily assume and . Hence, from equation 3,

(6)

since, for any vector ,

(8)

To find the extremum of , we take the derivative of with respect to , and equate it to . By the Chain Rule, and using equations 8 and 4, for all

(11)

To find the extremum of , we take the derivative of with respect to , and equate it to . By the Chain Rule, and using equations 11 and 4, whence

(13)

Figure 2: The ego-motion of every frame in a sequence is described by the camera translation and rotation relative to the reference frame. In the particular case of a stereo rig with a fixed baseline, the relative motion between and is constant for the whole sequence.

3 Experiments

The algorithm was tested on synthetic data as well as real video sequences. The
synthetic example allows quantitative evaluation of the performance against ground truth.

3.1 Synthetic data

Let to be four images rendered by four cameras arranged as in Figure 2 (a stereo rig formed by and moving forward into , , respectively). The synthetic data was a surface made up of fronto-parallel tiles textured with a random dot pattern and situated at various depths from the camera (in a camera-centered reference frame). For , , the position relative to is given by the actual , values in Table 1. The input for the algorithm consisted of the four images , , the focal length , and the stereo baseline ( , ). First, an initial shape estimate is computed from the first stereo pair. Second, the ego-motion for , are computed and the baseline (ego-motion for ) is refined using the current shape estimate. Finally, both shape and ego-motion estimates are refined concurrently (iterating for translation, rotation and shape at every pyramid level in a coarse-to-fine fashion). The final values for the recovered ego-motion are given in Table 1, while Figure 3 shows the recovered depth map.

[Table 1; surviving cell text: Recovered; (-5, 0, 0); (0, 0, 0); 0; (-5.25, -0.5, -0.5); (0, 0, 0)]

Figure 4: First frame of the “city” sequence

Figure 5: Depth map for the “city” sequence, computed from a batch of 16 frames. Brighter points are closer.

the left camera), with different batch sizes. Using the notation in Figure 2, a batch size of consists of frames in the binocular case and frames in the monocular case, where . Scene structure was initialized with an estimate of the ground plane, based on the camera height and the downward tilt angle. In the stereo case, the sequence of iterations was similar to the one described in section 3.1 (shape from the first stereo pair, ego-motion for every frame, then concurrent refinement of ego-motion and shape estimates). For the monocular case, the sequence was similar to that in section 3.2 (ego-motion for every frame, then concurrent refinement of ego-motion and shape estimates).

Figure 7 shows the recovered depth maps for different batch sizes, as well as a direct
comparison of monocular analysis (left column), regular stereo (top right corner) and the combination of motion and stereo cues (lower three rows in the right column). The depth recovered using stereo only is qualitatively correct, with the rocks standing out against the distant background. Since there is no overlap between the left and right frame for the left edge, this part shows the unrefined initial estimate of the ground plane. The depth estimate from 2-frame monocular analysis is noisy, a consequence of the small inter-frame vehicle motion (approximately 10 cm). The depth map becomes smoother as more frames are processed, but there still are inherent problems with monocular analysis alone, such as poor range estimation near the focus of expansion.

The bottom part of the right column illustrates the advantages of combining motion and stereo cues. The depth information obtained from forward motion “fills in” the region where there is no stereo overlap. Only half of the frames (corresponding to the left camera) contribute to this region, therefore its appearance is similar to the corresponding region from the previous row in the left column.

Figure 6: First pair of the “rocks” stereo sequence

Figure 7: Depth map for the “rocks” sequence (see section 3.3). All depth maps are in the coordinate system of the left image of Figure 6. [Surviving panel label: (left camera only)]

4 Discussion

In this paper we presented a correlation-based algorithm for estimating both ego-motion and scene structure from multiple video streams. The algorithm can run on the input video streams alone, or use information from a variety of sources (LADAR, Inertial Sensors, odometry) to bootstrap and speed up the estimation process by reducing the number of iterations. In particular, the algorithm is explicitly designed to be able to handle long batches, and to daisy-chain results from multiple batches from a moving stereo rig, which is typical apparatus on outdoor autonomous vehicles.

4.1 Advantages of the approach

In the particular domain of outdoor autonomous
navigation, there are several advantages of using correlation-based estimation and of quadratic surface-fitting to correlation surfaces, as described in this paper:

Extended search range: In a multi-resolution pyramid-based approach, a correlation-based approach allows the search range to be extended (with the associated increase in computational cost). Contrast this with an image-brightness-constraint approach (see [6, 12]), where the reference and inspection images must always be aligned to within 1 pixel at whatever the current processing resolution is. A correlation-based approach allows a more flexible use of multi-resolution analysis. This is useful for scenes in which different frequency bands may have differing amounts of image energy. Off-road grassy terrain, for example, generally has large high-frequency, but small low-frequency content. For these cases, a correlation-based approach can adaptively use large search areas in high-frequency bands of a pyramid representation of the scene. Alternatively, images with more low-frequency content can be analyzed coarse-to-fine, exploiting the computational advantages of pyramid-based image-processing.

Direct estimation: By fitting quadratic surfaces to correlation surfaces, the incremental refinements , , and for rotation, translation, and structure can be computed directly using linear algebra.

Pruning of outliers: The fitting of quadratic surfaces to the correlation surfaces provides for a simple measure of quality of correlation by analyzing the curvature of the quadratic correlation associated with a pixel. Thus image points in low-texture regions with poor correlation surfaces may be detected and pruned.

Adaptive multi-resolution processing: The quality measure of quadratic correlation surfaces provides a simple confidence measure which can be used to adaptively adjust correlation search regions and multi-resolution processing.

Adaptive support regions: A comparison of correlation confidence for multiple sizes of correlation support windows
provides a mechanism for detecting depth discontinuities and low-texture regions in the image(s).

4.2 Motivational issues revisited

Finally, we recapitulate the primary issues which motivated the development of this algorithm, as described in section 1.2, and summarize how our approach addresses each issue. Details of each issue are beyond the scope of this paper.

Detection, segmentation, and representation of moving objects: Our algorithm enforces structure constancy over a batch of images. This structure is bootstrapped using a stereo pair of images taken simultaneously, and therefore not subject to misinterpretation caused by independently moving objects. Thus, moving objects can be detected by comparing the structure estimated over time with the initial structure obtained with stereo.

Extended terrain representation: Our algorithm can be used to generate 3D mosaics by daisy-chaining the results from successive batches (see section 1.4).

Far-range sensing: Since the algorithm integrates information over long periods of time, the approach can be used for detecting shape at long distances, by utilizing arbitrarily large baselines.

Easy incorporation of other modalities: As mentioned above, our algorithm can utilize data from other sensing modalities (such as LADAR, INS, and odometry) directly by using this information to bootstrap structure and ego-motion estimates.

References

[1] S. Avidan and A. Shashua. Threading fundamental matrices. In Proceedings of the fifth European Conference on Computer Vision (ECCV’98), volume 1, pages 124–140, June 1998.
[2] P. Burt, L. Wixson, and G. Salgian. Electronically directed ‘focal’ stereo. Proceedings of the Fifth International Conference on Computer Vision (ICCV’95), pages 94–101, June 1995.
[3] K. Daniilidis and H. H. Nagel. The coupling of rotation and translation in motion estimation of planar surfaces. In CVPR, pages 188–193, 1993.
[4] C. Fermüller and Y. Aloimonos. What is computed by structure from motion algorithms? In Proceedings of the fifth European Conference on Computer
Vision (ECCV’98), volume 1, pages 359–375, June 1998.
[5] K. J. Hanna. Direct multi-resolution estimation of ego-motion and structure from motion. In Workshop on Visual Motion, pages 156–162, Princeton, NJ, October 1991.
[6] K. J. Hanna and N. E. Okamoto. Combining stereo and motion analysis for direct estimation of scene structure. In Proc. Intl. Conf. on Computer Vision, pages 357–365, 1993.
[7] B. Horn. Robot Vision. McGraw-Hill, 1986.
[8] M. Irani and P. Anandan. Robust multi-sensor image alignment. In Proceedings of the sixth International Conference on Computer Vision (ICCV’98), pages 959–965, January 1998.
[9] T. Kanade, Okutomi, and Nakahara. A multiple-baseline stereo method. In DARPA IU Workshop, pages 409–426, January 1992.
[10] D. Morris and T. Kanade. A unified factorization algorithm for points, line segments and planes with uncertainty models. In Proceedings of the sixth International Conference on Computer Vision (ICCV’98), pages 696–702, January 1998.
[11] S. M. Seitz and C. R. Dyer. Photorealistic scene reconstruction by voxel coloring. In Computer Vision and Pattern Recognition Conference, pages 1067–1073, 1997.
[12] G. P. Stein and A. Shashua. Model-based brightness constraints: On direct estimation of structure and motion. In Computer Vision and Pattern Recognition Conference, pages 400–406, 1997.
[13] T. Vieville, O. Faugeras, and Q. Luong. Motion of points and lines in the uncalibrated case. IJCV, 17(1):7–42, 1996.