Abstract WaveRead Automatic measurement of relative gene expression levels from microarrays


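A Method for Intelligent Specific Emitter Identification of Radar Emitters

Lu Jianxiong; Chen Qi; Man Xin
[Journal] Electronics Optics & Control (《电光与控制》)
[Year (Volume), Issue] 2024, 31(4)
[Abstract] To address the insufficient recognition performance of conventional convolutional neural networks in intelligent radar specific emitter identification, particularly at low signal-to-noise ratio (SNR), a radar specific emitter identification method based on the short-time Fourier transform (STFT) and EfficientNet is proposed.

First, the STFT is applied to the radar signal to extract time-frequency features. The stacked MBConv modules in EfficientNet then combine the time-frequency feature maps to mine deeper, more complex and more abstract time-frequency features hidden in the signal images, including the distribution of signal intensity, time-frequency patterns and periodic variations. These features are used to classify and identify the individual emitter.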
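The STFT step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame length, hop size, Hann window and the two-tone toy signal are hypothetical choices, not parameters reported by the authors, and a naive DFT stands in for a real FFT library.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame (O(N^2), acceptable for a sketch).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def tone(freq_hz, n_samples, fs):
    """Generate a sine tone (hypothetical stand-in for a radar pulse)."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) for t in range(n_samples)]


# Toy signal: a 1 kHz tone followed by a 3 kHz tone, sampled at 16 kHz.
FS = 16000
pulse = tone(1000, 256, FS) + tone(3000, 256, FS)
spec = stft_magnitude(pulse)
# Strongest frequency bin in each frame; bin spacing is FS / frame_len = 250 Hz.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

The resulting list of spectra is the kind of time-frequency image that would be fed to an image classifier; the frequency change between the two halves of the toy signal shows up as a shift in the peak bin.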

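EfficientNet scales three parameters at once, namely network depth, network width and input image resolution, and the authors state that this resolves problems such as vanishing and exploding gradients.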
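The joint scaling of depth, width and resolution can be illustrated with the compound-scaling rule from the original EfficientNet paper. The constants below come from that paper, not from this abstract, so treat them as an assumption about what the authors used.

```python
# Base multipliers reported in the EfficientNet paper (assumed here, not
# stated in the abstract): depth, width and resolution growth per unit phi.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution, flops) multipliers for coefficient phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # FLOPs grow roughly with depth * width^2 * resolution^2.
    flops = depth * width ** 2 * resolution ** 2
    return depth, width, resolution, flops


for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Because ALPHA * BETA^2 * GAMMA^2 is close to 2, each increment of phi roughly doubles the compute budget while keeping the three dimensions in balance.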

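Experimental results show that the method based on STFT and EfficientNet achieves better recognition performance than conventional convolutional neural networks in low signal-to-noise-ratio environments.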

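[Pages] 6 (pp. 115-120)
[Affiliation] College of Electronic Engineering, Naval University of Engineering
[Language] Chinese
[CLC number] TN974
[Related literature]
1. A radar specific emitter identification method based on deep reinforcement learning
2. A radar specific emitter identification method based on random forests
3. An individual ship target identification method based on KNN and inter-pulse parameters of radar emitters
4. A lidar emitter individual identification method based on multi-source information fusion
5. A radar specific emitter identification method fusing bispectrum features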

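Pedestrian Attribute Recognition in Surveillance Scenes Based on Convolutional Neural Networks

Hu Cheng; Chen Liang; Zhang Xun; Sun Shaoyuan
[Abstract] Visual pedestrian attributes in real surveillance scenes, such as gender and clothing type, are important for pedestrian retrieval and person re-identification. Traditional pedestrian attribute recognition algorithms rely on hand-crafted features and ignore the correlations among attributes. Inspired by the strong performance of convolutional neural networks (CNNs) on conventional computer vision tasks, a CNN-based method is proposed for recognizing pedestrian attributes in surveillance scenes. The CNN extracts pedestrian features automatically during training, and a redefined loss function takes the relationships among all pedestrian attributes into account at once. Compared with traditional methods, the algorithm is simple to implement and achieves higher recognition accuracy.
[Journal] Modern Computer (Professional Edition) (《现代计算机（专业版）》)
[Year (Volume), Issue] 2018(000)001
[Pages] 5 (pp. 22-26)
[Keywords] visual pedestrian attributes; convolutional neural network; loss function
[Affiliations] College of Information Science and Technology, Donghua University, Shanghai 201620; Engineering Research Center of Digitized Textile and Apparel Technology, Ministry of Education, Donghua University, Shanghai 201620 (both affiliations are listed for each of the four authors)
[Language] Chinese

0 Introduction

Because it carries high-level semantic information, pedestrian visual attribute recognition can link low-level human features with high-level cognition. It is therefore an active research direction in computer vision and has already been applied successfully in areas such as image retrieval, object detection and face recognition.

In recent years, following the introduction of the "Safe City" concept, tens of thousands of surveillance cameras have been installed throughout cities to protect public safety. Recognizing pedestrian visual attributes in surveillance scenes therefore has considerable research value, and it also has strong market prospects in intelligent video surveillance and intelligent commercial video.

Most current research on pedestrian attribute recognition targets two application scenarios: natural scenes and surveillance scenes.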
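The excerpt mentions a redefined loss that considers all pedestrian attributes together but does not give its form. A common starting point for multi-attribute recognition is a weighted sum of per-attribute sigmoid cross-entropy terms, sketched below. The function name, the weights and the example values are hypothetical, and this is not claimed to be the authors' loss.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multi_attribute_loss(logits, labels, weights=None):
    """Weighted sum of binary cross-entropy terms, one per attribute.

    logits  -- raw network outputs, one per attribute (e.g. gender, backpack)
    labels  -- 0/1 ground truth for each attribute
    weights -- optional per-attribute weights (assumed, e.g. to offset
               class imbalance); defaults to 1.0 for every attribute
    """
    if weights is None:
        weights = [1.0] * len(logits)
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y, w in zip(logits, labels, weights):
        p = sigmoid(z)
        total += -w * (y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total


# Two attributes: the first present, the second absent.
good = multi_attribute_loss([5.0, -5.0], [1, 0])   # confident and correct
bad = multi_attribute_loss([-5.0, 5.0], [1, 0])    # confident and wrong
```

Summing over all attributes in one loss lets a single backward pass update shared features for every attribute at once, which is one simple way to let attributes influence each other during training.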

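Research on Deep-Learning-Based Automatic Recognition of Played Musical Instruments

As deep learning continues to develop and find new applications, automatic recognition of played musical instruments is becoming increasingly mature.

The technology has a wide range of applications, including the music industry, smart homes and games.

This article discusses research on deep-learning-based automatic instrument recognition and its prospects.

I. Technical principle

Deep-learning-based automatic instrument recognition usually relies on convolutional neural networks (CNNs).

First, the audio signal of the played instrument is fed into the CNN model for feature extraction and transformation.

Because different instruments have different audio characteristics, the model must be trained on examples of each instrument.

The model then learns from the available audio signals and builds a learned representation.

Finally, the model is applied to new audio signals and classifies them automatically, determining which instrument each signal belongs to.

This whole process is called automatic recognition of played instruments.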
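The forward pass of such a classifier can be sketched as convolution, non-linearity, pooling and a softmax over instrument classes. Everything below is a toy assumption: the kernels and class weights are hand-picked rather than trained, the input is a short list instead of real audio, and a practical system would use a deep-learning framework and spectrogram inputs.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D cross-correlation of signal with kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def global_avg_pool(xs):
    return sum(xs) / len(xs)


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(signal, kernels, class_weights):
    """Return class probabilities for one signal."""
    # One pooled feature per convolution kernel.
    features = [global_avg_pool(relu(conv1d(signal, k))) for k in kernels]
    # Linear layer: one score per (hypothetical) instrument class.
    scores = [sum(w * f for w, f in zip(ws, features)) for ws in class_weights]
    return softmax(scores)


# Hand-picked, untrained parameters for illustration only.
kernels = [[1.0, -1.0], [0.5, 0.5]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
probs = classify([0, 1, 0, 1, 0, 1], kernels, class_weights)
```

Training would adjust the kernels and class weights from labeled recordings; only the inference path is shown here.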

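II. Application scenarios

Automatic instrument recognition is used in the music industry, smart homes, games and other fields.

In the music industry, musicians can use it to identify quickly which instruments are used in a piece of music, which speeds up composition.

In smart homes, audio recognition can be used to estimate the number, location and activity of people in a room, so that smart devices can be controlled more accurately.

In game development, automatic instrument recognition can improve the experience of music games and help players understand and master the rules and controls.

III. Technical challenges

Although deep-learning-based automatic instrument recognition has made good progress, challenges remain.

First, instrument timbres are complex and variable, so very large training datasets are required.

Second, noise and other interference degrade recognition accuracy.

The keys to solving these problems are to improve the quality and quantity of the datasets, to design finer feature extraction and transformation methods, and to optimize the recognition algorithm.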
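One standard way to make a dataset more robust to the noise problem mentioned above is to augment clean recordings with noise at a controlled signal-to-noise ratio. The sketch below assumes additive white Gaussian noise and a target SNR in decibels; both are illustrative choices, not something the text prescribes.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Return signal plus white Gaussian noise at the requested SNR (dB)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR actually obtained, for checking the augmentation."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is repeatable
clean = [math.sin(2 * math.pi * 5 * t / 4000) for t in range(4000)]
noisy = add_noise_at_snr(clean, 10.0, rng)
```

Generating several noisy copies of each clean clip at different SNR values is a cheap way to enlarge a training set and to evaluate how accuracy degrades as noise increases.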

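IV. Prospects

Deep-learning-based automatic instrument recognition is expected to be applied more widely in the future.

As the technology matures, it has strong prospects in music composition, music education, smart homes, games and other fields.

In addition, with the spread of mobile internet technology, automatic instrument recognition is likely to become an important application on mobile devices such as phones and tablets.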

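Common Expressions in SCI Paper Abstracts

To write a good abstract, build a personal bank of sentence patterns. The vocabulary below is drawn from highly cited SCI papers.

Introduction
(1) Reviewing the research background: review, summarize, present, outline, describe.
(2) Stating the purpose: purpose, attempt, aim. An infinitive used as an adverbial of purpose (for example "To investigate ...") also works.
(3) Introducing the main content or scope of the paper: study, present, include, focus, emphasize, emphasis, attention.

Methods
(1) Describing the research or experimental process: test, study, investigate, examine, experiment, discuss, consider, analyze, analysis.
(2) Describing the research or experimental method: measure, estimate, calculate.
(3) Introducing applications and uses: use, apply, application.

Results
(1) Presenting the results: show, result, present.
(2) Introducing the conclusions: summary, introduce, conclude.

Discussion
(1) Stating the paper's argument and the authors' views: suggest, report, present, expect, describe.
(2) Describing the supporting evidence: support, provide, indicate, identify, find, demonstrate, confirm, clarify.
(3) Recommendations and suggestions: suggest, suggestion, recommend, recommendation, propose, necessity, necessary, expect.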

Introduction examples, (1) reviewing the research background

Example keyword: review
Author(s): ROBINSON, TE; BERRIDGE, KC
Title: THE NEURAL BASIS OF DRUG CRAVING - AN INCENTIVE-SENSITIZATION THEORY OF ADDICTION
Source: BRAIN RESEARCH REVIEWS, 18 (3): 247-291 SEP-DEC 1993 (Netherlands)
SCI citations: 1774
Abstract (excerpt): We review evidence for this view of addiction and discuss its implications for understanding the psychology and neurobiology of addiction.

Example keyword: summarize
Author(s): Barnett, RM; Carone, CD;
Title: Particles and field .1. Review of particle physics
Source: PHYSICAL REVIEW D, 54 (1): 1-+ Part 1 JUL 1 1996 (USA)
SCI citations: 1571
Abstract: This biennial review summarizes much of Particle Physics. Using data from previous editions, plus 1900 new measurements from 700 papers, we list, evaluate, and average measured properties of gauge bosons, leptons, quarks, mesons, and baryons. We also summarize searches for hypothetical particles such as Higgs bosons, heavy neutrinos, and supersymmetric particles. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as the Standard Model, particle detectors, probability, and statistics. A booklet is available containing the Summary Tables and abbreviated versions of some of the other sections of this full Review.

Example keyword: outline
Author(s): TIERNEY, L
Title: MARKOV-CHAINS FOR EXPLORING POSTERIOR DISTRIBUTIONS
Source: ANNALS OF STATISTICS, 22 (4): 1701-1728 DEC 1994 (USA)
SCI citations: 728
Abstract: Several Markov chain methods are available for sampling from a posterior distribution. Two important examples are the Gibbs sampler and the Metropolis algorithm. In addition, several strategies are available for constructing hybrid algorithms. This paper outlines some of the basic methods and strategies and discusses some related theoretical and practical issues. On the theoretical side, results from the theory of general state space Markov chains can be used to obtain convergence rates, laws of large numbers and central limit theorems for estimates obtained from Markov chain methods. These theoretical results can be used to guide the construction of more efficient algorithms. For the practical use of Markov chain methods, standard simulation methodology provides several variance reduction techniques and also gives guidance on the choice of sample size and allocation.

Example keyword: present
Author(s): LYNCH, M; MILLIGAN, BG
Title: ANALYSIS OF POPULATION GENETIC-STRUCTURE WITH RAPD MARKERS
Source: MOLECULAR ECOLOGY, 3 (2): 91-99 APR 1994 (UK)
SCI citations: 661
Abstract: Recent advances in the application of the polymerase chain reaction make it possible to score individuals at a large number of loci. The RAPD (random amplified polymorphic DNA) method is one such technique that has attracted widespread interest. The analysis of population structure with RAPD data is hampered by the lack of complete genotypic information resulting from dominance, since this enhances the sampling variance associated with single loci as well as induces bias in parameter estimation. We present estimators for several population-genetic parameters (gene and genotype frequencies, within- and between-population heterozygosities, degree of inbreeding and population subdivision, and degree of individual relatedness) along with expressions for their sampling variances. Although completely unbiased estimators do not appear to be possible with RAPDs, several steps are suggested that will insure that the bias in parameter estimates is negligible. To achieve the same degree of statistical power, on the order of 2 to 10 times more individuals need to be sampled per locus when dominant markers are relied upon, as compared to codominant (RFLP, isozyme) markers. Moreover, to avoid bias in parameter estimation, the marker alleles for most of these loci should be in relatively low frequency. Due to the need for pruning loci with low-frequency null alleles, more loci also need to be sampled with RAPDs than with more conventional markers, and some problems of bias cannot be completely eliminated.

Example keyword: describe
Author(s): CLONINGER, CR; SVRAKIC, DM; PRZYBECK, TR
Title: A PSYCHOBIOLOGICAL MODEL OF TEMPERAMENT AND CHARACTER
Source: ARCHIVES OF GENERAL PSYCHIATRY, 50 (12): 975-990 DEC 1993 (USA)
SCI citations: 926
Abstract: In this study, we describe a psychobiological model of the structure and development of personality that accounts for dimensions of both temperament and character. Previous research has confirmed four dimensions of temperament: novelty seeking, harm avoidance, reward dependence, and persistence, which are independently heritable, manifest early in life, and involve preconceptual biases in perceptual memory and habit formation. For the first time, we describe three dimensions of character that mature in adulthood and influence personal and social effectiveness by insight learning about self-concepts. Self-concepts vary according to the extent to which a person identifies the self as (1) an autonomous individual, (2) an integral part of humanity, and (3) an integral part of the universe as a whole. Each aspect of self-concept corresponds to one of three character dimensions called self-directedness, cooperativeness, and self-transcendence, respectively. We also describe the conceptual background and development of a self-report measure of these dimensions, the Temperament and Character Inventory. Data on 300 individuals from the general population support the reliability and structure of these seven personality dimensions. We discuss the implications for studies of information processing, inheritance, development, diagnosis, and treatment.

Introduction examples, (2) stating the purpose (common words: purpose, attempt, aim)

Example keyword: attempt
Author(s): Donoho, DL; Johnstone, IM
Title: Adapting to unknown smoothness via wavelet shrinkage
Source: JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 90 (432): 1200-1224 DEC 1995
SCI citations: 429
Abstract: We attempt to recover a function of unknown smoothness from noisy sampled data. We introduce a procedure, SureShrink, that suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: A threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein unbiased estimate of risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N log(N) as a function of the sample size N. SureShrink is smoothness adaptive: If the unknown function contains jumps, then the reconstruction (essentially) does also; if the unknown function has a smooth piece, then the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothness adaptive: It is near minimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods (kernels, splines, and orthogonal series estimates), even with optimal choices of the smoothing parameter, would be unable to perform in a near-minimax way over many spaces in the Besov scale. Examples of SureShrink are given. The advantages of the method are particularly evident when the underlying function has jump discontinuities on a smooth background.

Example keyword: To investigate
Author(s): OLTVAI, ZN; MILLIMAN, CL; KORSMEYER, SJ
Title: BCL-2 HETERODIMERIZES IN-VIVO WITH A CONSERVED HOMOLOG, BAX, THAT ACCELERATES PROGRAMMED CELL-DEATH
Source: CELL, 74 (4): 609-619 AUG 27 1993
SCI citations: 3233
Abstract: Bcl-2 protein is able to repress a number of apoptotic death programs. To investigate the mechanism of Bcl-2's effect, we examined whether Bcl-2 interacted with other proteins. We identified an associated 21 kd protein partner, Bax, that has extensive amino acid homology with Bcl-2, focused within highly conserved domains I and II. Bax is encoded by six exons and demonstrates a complex pattern of alternative RNA splicing that predicts a 21 kd membrane (alpha) and two forms of cytosolic protein (beta and gamma). Bax homodimerizes and forms heterodimers with Bcl-2 in vivo. Overexpressed Bax accelerates apoptotic death induced by cytokine deprivation in an IL-3-dependent cell line. Overexpressed Bax also counters the death repressor activity of Bcl-2. These data suggest a model in which the ratio of Bcl-2 to Bax determines survival or death following an apoptotic stimulus.

Example keyword: purposes
Author(s): ROGERS, FJ; IGLESIAS, CA
Title: RADIATIVE ATOMIC ROSSELAND MEAN OPACITY TABLES
Source: ASTROPHYSICAL JOURNAL SUPPLEMENT SERIES, 79 (2): 507-568 APR 1992 (USA)
SCI citations: 512
Abstract: For more than two decades the astrophysics community has depended on opacity tables produced at Los Alamos. In the present work we offer new radiative Rosseland mean opacity tables calculated with the OPAL code developed independently at LLNL. We give extensive results for the recent Anders-Grevesse mixture which allow accurate interpolation in temperature, density, hydrogen mass fraction, as well as metal mass fraction. The tables are organized differently from previous work. Instead of rows and columns of constant temperature and density, we use temperature and follow tracks of constant R, where R = density/(temperature)^3. The range of R and temperature are such as to cover typical stellar conditions from the interior through the envelope and the hotter atmospheres. Cool atmospheres are not considered since photoabsorption by molecules is neglected. Only radiative processes are taken into account so that electron conduction is not included. For comparison purposes we present some opacity tables for the Ross-Aller and Cox-Tabor metal abundances. Although in many regions the OPAL opacities are similar to previous work, large differences are reported. For example, factors of 2-3 opacity enhancements are found in stellar envelope conditions.

Example keyword: aim
Author(s): EDVARDSSON, B; ANDERSEN, J; GUSTAFSSON, B; LAMBERT, DL; NISSEN, PE; TOMKIN, J
Title: THE CHEMICAL EVOLUTION OF THE GALACTIC DISK .1. ANALYSIS AND RESULTS
Source: ASTRONOMY AND ASTROPHYSICS, 275 (1): 101-152 AUG 1993
SCI citations: 934
Abstract: With the aim to provide observational constraints on the evolution of the galactic disk, we have derived abundances of O, Na, Mg, Al, Si, Ca, Ti, Fe, Ni, Y, Zr, Ba and Nd, as well as individual photometric ages, for 189 nearby field F and G disk dwarfs. The galactic orbital properties of all stars have been derived from accurate kinematic data, enabling estimates to be made of the distances from the galactic center of the stars' birthplaces. Our extensive high resolution, high S/N, spectroscopic observations of carefully selected northern and southern stars provide accurate equivalent widths of up to 86 unblended absorption lines per star between 5000 and 9000 angstrom. The abundance analysis was made with greatly improved theoretical LTE model atmospheres. Through the inclusion of a great number of iron-peak element absorption lines the model fluxes reproduce the observed UV and visual fluxes with good accuracy. A new theoretical calibration of T_eff as a function of Stromgren b - y for solar-type dwarfs has been established. The new models and T_eff scale are shown to yield good agreement between photometric and spectroscopic measurements of effective temperatures and surface gravities, but the photometrically derived very high overall metallicities for the most metal rich stars are not supported by the spectroscopic analysis of weak spectral lines.

Author(s): PAYNE, MC; TETER, MP; ALLAN, DC; ARIAS, TA; JOANNOPOULOS, JD
Title: ITERATIVE MINIMIZATION TECHNIQUES FOR ABINITIO TOTAL-ENERGY CALCULATIONS - MOLECULAR-DYNAMICS AND CONJUGATE GRADIENTS
Source: REVIEWS OF MODERN PHYSICS, 64 (4): 1045-1097 OCT 1992 (USA, American Physical Society)
SCI citations: 2654
Abstract: This article describes recent technical developments that have made the total-energy pseudopotential the most powerful ab initio quantum-mechanical modeling method presently available. In addition to presenting technical details of the pseudopotential method, the article aims to heighten awareness of the capabilities of the method in order to stimulate its application to as wide a range of problems in as many scientific disciplines as possible.

Introduction examples, (3) introducing the main content or scope of the paper

Example keyword: includes
Author(s): MARCHESINI, G; WEBBER, BR; ABBIENDI, G; KNOWLES, IG; SEYMOUR, MH; STANCO, L
Title: HERWIG 5.1 - A MONTE-CARLO EVENT GENERATOR FOR SIMULATING HADRON EMISSION REACTIONS WITH INTERFERING GLUONS
Source: COMPUTER PHYSICS COMMUNICATIONS, 67 (3): 465-508 JAN 1992 (Netherlands, Elsevier)
SCI citations: 955
Abstract: HERWIG is a general-purpose particle-physics event generator, which includes the simulation of hard lepton-lepton, lepton-hadron and hadron-hadron scattering and soft hadron-hadron collisions in one package. It uses the parton-shower approach for initial-state and final-state QCD radiation, including colour coherence effects and azimuthal correlations both within and between jets. This article includes a brief review of the physics underlying HERWIG, followed by a description of the program itself. This includes details of the input and control parameters used by the program, and the output data provided by it. Sample output from a typical simulation is given and annotated.

Example keyword: presents
Author(s): IDSO, KE; IDSO, SB
Title: PLANT-RESPONSES TO ATMOSPHERIC CO2 ENRICHMENT IN THE FACE OF ENVIRONMENTAL CONSTRAINTS - A REVIEW OF THE PAST 10 YEARS RESEARCH
Source: AGRICULTURAL AND FOREST METEOROLOGY, 69 (3-4): 153-203 JUL 1994 (Netherlands, Elsevier)
SCI citations: 225
Abstract: This paper presents a detailed analysis of several hundred plant carbon exchange rate (CER) and dry weight (DW) responses to atmospheric CO2 enrichment determined over the past 10 years. It demonstrates that the percentage increase in plant growth produced by raising the air's CO2 content is generally not reduced by less than optimal levels of light, water or soil nutrients, nor by high temperatures, salinity or gaseous air pollution. More often than not, in fact, the data show the relative growth-enhancing effects of atmospheric CO2 enrichment to be greatest when resource limitations and environmental stresses are most severe.

Example keyword: emphasizing
Author(s): BESAG, J; GREEN, P; HIGDON, D; MENGERSEN, K
Title: BAYESIAN COMPUTATION AND STOCHASTIC-SYSTEMS
Source: STATISTICAL SCIENCE, 10 (1): 3-41 FEB 1995 (USA)
SCI citations: 296
Abstract: Markov chain Monte Carlo (MCMC) methods have been used extensively in statistical physics over the last 40 years, in spatial statistics for the past 20 and in Bayesian image analysis over the last decade. In the last five years, MCMC has been introduced into significance testing, general Bayesian inference and maximum likelihood estimation. This paper presents basic methodology of MCMC, emphasizing the Bayesian paradigm, conditional probability and the intimate relationship with Markov random fields in spatial statistics. Hastings algorithms are discussed, including Gibbs, Metropolis and some other variations. Pairwise difference priors are described and are used subsequently in three Bayesian applications, in each of which there is a pronounced spatial or temporal aspect to the modeling. The examples involve logistic regression in the presence of unobserved covariates and ordinal factors; the analysis of agricultural field experiments, with adjustment for fertility gradients; and processing of low-resolution medical images obtained by a gamma camera. Additional methodological issues arise in each of these applications and in the Appendices. The paper lays particular emphasis on the calculation of posterior probabilities and concurs with others in its view that MCMC facilitates a fundamental breakthrough in applied Bayesian modeling.

Example keyword: focuses
Author(s): HUNT, KJ; SBARBARO, D; ZBIKOWSKI, R; GAWTHROP, PJ
Title: NEURAL NETWORKS FOR CONTROL-SYSTEMS - A SURVEY
Source: AUTOMATICA, 28 (6): 1083-1112 NOV 1992 (Netherlands, Elsevier)
SCI citations: 427
Abstract: This paper focuses on the promise of artificial neural networks in the realm of modelling, identification and control of nonlinear systems. The basic ideas and techniques of artificial neural networks are presented in language and notation familiar to control engineers. Applications of a variety of neural network architectures in control are surveyed. We explore the links between the fields of control science and neural networks in a unified presentation and identify key areas for future research.

Example keyword: focus
Author(s): Stuiver, M; Reimer, PJ; Bard, E; Beck, JW;
Title: INTCAL98 radiocarbon age calibration, 24,000-0 cal BP
Source: RADIOCARBON, 40 (3): 1041-1083 1998 (USA)
SCI citations: 2131
Abstract: The focus of this paper is the conversion of radiocarbon ages to calibrated (cal) ages for the interval 24,000-0 cal BP (Before Present, 0 cal BP = AD 1950), based upon a sample set of dendrochronologically dated tree rings, uranium-thorium dated corals, and varve-counted marine sediment. The C-14 age-cal age information, produced by many laboratories, is converted to Δ14C profiles and calibration curves, for the atmosphere as well as the oceans. We discuss offsets in measured C-14 ages and the errors therein, regional C-14 age differences, tree-coral C-14 age comparisons and the time dependence of marine reservoir ages, and evaluate decadal vs. single-year C-14 results. Changes in oceanic deepwater circulation, especially for the 16,000-11,000 cal BP interval, are reflected in the Δ14C values of INTCAL98.

Example keyword: emphasis
Author(s): LEBRETON, JD; BURNHAM, KP; CLOBERT, J; ANDERSON, DR
Title: MODELING SURVIVAL AND TESTING BIOLOGICAL HYPOTHESES USING MARKED ANIMALS - A UNIFIED APPROACH WITH CASE-STUDIES
Source: ECOLOGICAL MONOGRAPHS, 62 (1): 67-118 MAR 1992 (USA)
Abstract: The understanding of the dynamics of animal populations and of related ecological and evolutionary issues frequently depends on a direct analysis of life history parameters. For instance, examination of trade-offs between reproduction and survival usually rely on individually marked animals, for which the exact time of death is most often unknown, because marked individuals cannot be followed closely through time. Thus, the quantitative analysis of survival studies and experiments must be based on capture-recapture (or resighting) models which consider, besides the parameters of primary interest, recapture or resighting rates that are nuisance parameters. This paper synthesizes, using a common framework, these recent developments together with new ones, with an emphasis on flexibility in modeling, model selection, and the analysis of multiple data sets. The effects on survival and capture rates of time, age, and categorical variables characterizing the individuals (e.g., sex) can be considered, as well as interactions between such effects. This "analysis of variance" philosophy emphasizes the structure of the survival and capture process rather than the technical characteristics of any particular model. The flexible array of models encompassed in this synthesis uses a common notation. As a result of the great level of flexibility and relevance achieved, the focus is changed from fitting a particular model to model building and model selection.

Methods examples
(1) Describing the research or experimental process: test, study, investigate, examine, experiment, discuss, consider, analyze, analysis.
(2) Describing the research or experimental method: measure, estimate, calculate.
(3) Introducing applications and uses: use, apply, application.

Example keyword: discusses (describing the research or experimental process)
Author(s): LIANG, KY; ZEGER, SL; QAQISH, B
Title: MULTIVARIATE REGRESSION-ANALYSES FOR CATEGORICAL-DATA
Source: JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 54 (1): 3-40 1992
SCI citations: 298
Abstract: It is common to observe a vector of discrete and/or continuous responses in scientific problems where the objective is to characterize the dependence of each response on explanatory variables and to account for the association between the outcomes. The response vector can comprise repeated observations on one variable, as in longitudinal studies or genetic studies of families, or can include observations for different variables. This paper discusses a class of models for the marginal expectations of each response and for pairwise associations. The marginal models are contrasted with log-linear models. Two generalized estimating equation approaches are compared for parameter estimation. The first focuses on the regression parameters; the second simultaneously estimates the regression and association parameters. The robustness and efficiency of each is discussed. The methods are illustrated with analyses of two data sets from public health research.

Example keyword: examines (describing the research or experimental process)
Author(s): Huo, QS; Margolese, DI; Stucky, GD
Title: Surfactant control of phases in the synthesis of mesoporous silica-based materials
Source: CHEMISTRY OF MATERIALS, 8 (5): 1147-1160 MAY 1996 (USA)
SCI citations: 643
Abstract: The low-temperature formation of liquid-crystal-like arrays made up of molecular complexes formed between molecular inorganic species and amphiphilic organic molecules is a convenient approach for the synthesis of mesostructure materials. This paper examines how the molecular shapes of covalent organosilanes, quaternary ammonium surfactants, and mixed surfactants in various reaction conditions can be used to synthesize silica-based mesophase configurations, MCM-41 (2d hexagonal, p6m), MCM-48 (cubic Ia3d), MCM-50 (lamellar), SBA-1 (cubic Pm3n), SBA-2 (3d hexagonal P6_3/mmc), and SBA-3 (hexagonal p6m from acidic synthesis media). The structural function of surfactants in mesophase formation can to a first approximation be related to that of classical surfactants in water or other solvents with parallel roles for organic additives. The effective surfactant ion pair packing parameter, g = V/(a_0 l), remains a useful molecular structure-directing index to characterize the geometry of the mesophase products, and phase transitions may be viewed as a variation of g in the liquid-crystal-like solid phase. Solvent and cosolvent structure direction can be effectively used by varying polarity, hydrophobic/hydrophilic properties and functionalizing the surfactant molecule, for example with hydroxy group or variable charge. Surfactants and synthesis conditions can be chosen and controlled to obtain predicted silica-based mesophase products. A room-temperature synthesis of the bicontinuous cubic phase, MCM-48, is presented. A low-temperature (100 degrees C) and low-pH (7-10) treatment approach that can be used to give MCM-41 with high-quality, large pores (up to 60 Angstrom), and pore volumes as large as 1.6 cm^3/g is described.

Example keyword: estimates (describing the research or experimental process)
Author(s): KESSLER, RC; MCGONAGLE, KA; ZHAO, SY; NELSON, CB; HUGHES, M; ESHLEMAN, S; WITTCHEN, HU; KENDLER, KS
Title: LIFETIME AND 12-MONTH PREVALENCE OF DSM-III-R PSYCHIATRIC-DISORDERS IN THE UNITED-STATES - RESULTS FROM THE NATIONAL-COMORBIDITY-SURVEY
Source: ARCHIVES OF GENERAL PSYCHIATRY, 51 (1): 8-19 JAN 1994 (USA)
SCI citations: 4350
Abstract: Background: This study presents estimates of lifetime and 12-month prevalence of 14 DSM-III-R psychiatric disorders from the National Comorbidity Survey, the first survey to administer a structured psychiatric interview to a national probability sample in the United States. Methods: The DSM-III-R psychiatric disorders among persons aged 15 to 54 years in the noninstitutionalized civilian population of the United States were assessed with data collected by lay interviewers using a revised version of the Composite International Diagnostic Interview. Results: Nearly 50% of respondents reported at least one lifetime disorder, and close to 30% reported at least one 12-month disorder. The most common disorders were major depressive episode, alcohol dependence, social phobia, and simple phobia. More than half of all lifetime disorders occurred in the 14% of the population who had a history of three or more comorbid disorders. These highly comorbid people also included the vast majority of people with severe disorders. Less than 40% of those with a lifetime disorder had ever received professional treatment, and less than 20% of those with a recent disorder had been in treatment during the past 12 months. Consistent with previous risk factor research, it was found that women had elevated rates of affective disorders and anxiety disorders, that men had elevated rates of substance use disorders and antisocial personality disorder, and that most disorders declined with age and with higher socioeconomic status. Conclusions: The prevalence of psychiatric disorders is greater than previously thought to be the case. Furthermore, this morbidity is more highly concentrated than previously recognized in roughly one sixth of the population who have a history of three or more comorbid disorders. This suggests that the causes and consequences of high comorbidity should be the focus of research attention. The majority of people with psychiatric disorders fail to obtain professional treatment. Even among people with a lifetime history of three or more comorbid disorders, the proportion who ever obtain specialty sector mental health treatment is less than 50%. These results argue for the importance of more outreach and more research on barriers to professional help-seeking.

Example keyword: measure (describing the research or experimental method)
Author(s): Schlegel, DJ; Finkbeiner, DP; Davis, M
Title: Maps of dust infrared emission for use in estimation of reddening and cosmic microwave background radiation foregrounds
Source: ASTROPHYSICAL JOURNAL, 500 (2): 525-553 Part 1 JUN 20 1998 (USA)
SCI citations: 2972
Abstract (excerpt): The primary use of these maps is likely to be as a new estimator of Galactic extinction. To calibrate our maps, we assume a standard reddening law and use the colors of elliptical galaxies to measure the reddening per unit flux density of 100 µm emission. We find consistent calibration using the B-R color distribution of a sample of the 106 brightest cluster ellipticals, as well as a sample of 384 ellipticals with B-V and Mg line strength measurements. For the latter sample, we use the correlation of intrinsic B-V versus Mg index to tighten the power of the test greatly. We demonstrate that the new maps are twice as accurate as the older Burstein-Heiles reddening estimates in regions of low and moderate reddening. The maps are expected to be significantly more accurate in regions of high reddening. These dust maps will also be useful for estimating millimeter emission that contaminates cosmic microwave background radiation experiments and for estimating soft X-ray absorption. We describe how to access our maps readily for general use.

Example keyword: application (introducing applications and uses)
Author(s): MALLAT, S; ZHONG, S
Title: CHARACTERIZATION OF SIGNALS FROM MULTISCALE EDGES
Source: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 14 (7): 710-732 JUL 1992 (USA)
SCI citations: 508
Abstract (excerpt): A multiscale Canny edge detection is equivalent to finding the local maxima of a wavelet transform. We study the properties of multiscale edges through the wavelet

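A Survey of Deep-Learning-Based Visual Hand Gesture Estimation

Intelligent Computer and Applications (《智能计算机与应用》), Vol. 13, No. 11, Nov. 2023
Article ID: 2095-2163(2023)11-0232-07; CLC number: TP183; Document code: A

A Survey of Deep-Learning-Based Visual Hand Gesture Estimation (published English title: "Overview of visual gesture estimation based on depth learning")
WU Sheng, QIN Haodong
(China Nanhu Academy of Electronics and Information Technology, Jiaxing, Zhejiang 314001, China)

Abstract: Visual hand gesture estimation based on deep learning has long been a key research topic in computer vision. With the great progress of deep learning and neural network research, deep-learning methods now far outperform traditional methods on the problems of high degrees of freedom, skin color, environmental interference and occlusion in gesture estimation. Deep-learning-based 3D gesture estimation mainly builds a neural network that abstracts and interprets image features in order to predict the 3D coordinates and angles of the finger keypoints, from which a hand model is constructed. Accurate 3D gesture estimation can accelerate the development of the AR/VR industry, because immersion and interaction are the key elements of AR/VR, and visual gesture interaction gives users a more convenient, faster and more realistic AR/VR experience. This paper first describes current gesture estimation schemes and their advantages and disadvantages, then introduces deep-learning-based gesture estimation methods, related datasets and evaluation metrics, and finally discusses the challenges facing 3D gesture estimation and its future development in light of the research results.
Keywords: gesture estimation; deep learning; keypoint detection; neural network

About the authors: QIN Haodong (2000-), male, master's student; research interests: AR software development and gesture recognition. Corresponding author: WU Sheng (1991-), male, senior engineer; research interests: AR software research and design, the Unity3D rendering engine and gesture recognition. Email: wusheng@mail.ustc.edu.cn
Received: 2022-11-14

0 Introduction

3D hand pose estimation predicts the positions of hand keypoints from captured images or video [1] and then infers the pose of the hand from the positions of the hand joints. It mainly involves object recognition, segmentation and regression-based detection. Traditional gesture estimation is affected by lighting, camera angle and occlusion, which limits its accuracy and real-time performance. With the development of deep learning models such as convolutional neural networks, recurrent neural networks and generative adversarial networks [2], together with growing GPU compute, deep learning has made great progress in image segmentation, image recognition and image classification, and gesture estimation with deep convolutional neural networks yields more accurate predictions. Current deep-learning approaches fall broadly into three classes: deep neural networks based on point clouds, on voxels, and on multiple views.

In addition, with the rapid development of computer graphics, computer vision, artificial intelligence and related disciplines, Apple, Google, Huawei, Microsoft and others have released AR/VR engines, and AR/VR results are now widely used in education, medicine, the military and other fields. Interaction between the virtual and the real is an indispensable part of augmented reality, and gesture interaction [3] remains the most important mode of interaction in AR/VR. It strengthens the user's sense of immersion and enables applications such as remote operation and sign language recognition, which in turn drives further development of visual gesture estimation.

This paper reviews and analyzes 3D hand pose estimation, describes deep-learning-based gesture estimation methods, collects the related datasets and evaluation metrics, and discusses current problems and future trends.

1 Related work on gesture estimation

1.1 Classification of gesture estimation schemes

Gesture estimation schemes fall into three classes: those based on wearable devices, those based on depth-sensor tracking, and those based on vision.

(1) Wearable data gloves [4] collect hand motion data through built-in sensors. They mainly use one of three sensor technologies: inertial, optical fiber, or optical. Inertial data gloves are cheap but suffer from serious drift. Optical data gloves capture hand data with several infrared or other cameras and are generally expensive and prone to occlusion. Fiber-optic data gloves offer good accuracy and stability but are also very expensive and easily damaged. Wearing a data glove for a long time makes the hand sweat and reduces the sense of immersion, so data gloves have not been widely adopted.

(2) Depth-sensor-based gesture tracking [5], for example Leap Motion and Kinect, ships with built-in algorithms for recognizing key hand information and is simple and convenient to use. However, capture and recognition accuracy depends on the camera orientation, which restricts the user's movement, and the recognition rate is low with complex backgrounds, occlusion or large lighting changes.

(3) Gesture estimation based on image vision [6-7] avoids the problems of high cost and inconvenient wear but is still troubled by occlusion and lighting. Given the rapid development of computer graphics, artificial intelligence and related disciplines, vision-based gesture recognition remains the mainstream research direction. Vision-based methods can be divided into binocular methods, RGB methods and RGB-D methods. The spread of phones with dual cameras and depth sensors has made visual gesture estimation practical. RGB-D methods, which fuse a depth map with a color image, have advantages that other methods lack:
① A depth map alone loses accuracy beyond a certain distance, whereas a color camera can zoom and easily capture objects that are farther away.
② Converting 3D information to 2D inevitably loses some data, and the lost data can be recovered from the color image.
③ Depth computed from a color image alone has errors, which the depth map can compensate.
The gesture pose estimation schemes are shown in Figure 1.

[Figure 1. Gesture pose estimation. Recoverable labels: binocular, monocular, RGB-D, wearable device, smart camera, RGB image, gesture estimation.]

1.2 Kinematic analysis of the hand

The hand consists of 27 interconnected bones in the fingers, palm and wrist. The core problem of gesture estimation is to recognize, segment, track and estimate the joints of the wrist and the finger phalanges together with the fingertips. The distribution of the hand bones is shown in Figure 2.

[Figure 2. Distribution of human hand bones. Recoverable labels: metacarpals, phalanges, joints.]

The human hand is an actuator with 26 degrees of freedom (DOF). Each phalangeal joint has 1 flexion DOF; each metacarpal joint has 1 flexion DOF and 1 rotation DOF, giving 2 DOF; and the wrist has 6 DOF. The total is therefore 1 × 2 × 5 + 2 × 5 + 6 = 26 DOF. The 26-DOF model of the palm is shown in Figure 3.

[Figure 3. 26-degree-of-freedom model of the palm. Recoverable labels: 1 DOF, 2 DOF, 6 DOF.]

From the finger joints, the palm model and the motion analysis, it follows that the parts of the hand that take part in interaction are mainly the finger joints, the metacarpophalangeal joints and the wrist [8]. Current mainstream hand-model joint encodings therefore use 14, 16 or 21 joints, and most papers and datasets adopt the 21-joint model. Estimating the coordinates of the joints in 3D space makes it possible to predict the hand pose. The different palm models are shown in Figure 4.

[Figure 4. Models of the palm with different degrees of freedom. Panels (a), (b), (c).]

1.3 Recognition pipeline

Gesture estimation comprises four steps: hand detection, segmentation, tracking and estimation. Hand detection locates the hand region in order to reduce the influence of background noise and the cost of later processing. Hand segmentation extracts the hand at pixel level to obtain precise hand information. Hand tracking predicts the next hand position from consecutive frames, which reduces the time spent locating the hand. Gesture estimation regresses the complete hand pose from the image and finally yields the 3D coordinates of the joints.

2 Deep-learning methods for gesture estimation

Since deep learning was first introduced into vision-based 3D gesture estimation, it has become a mainstream research area for visual gestures. A growing number of researchers train on large amounts of sample data to strengthen model performance, obtain more accurate features and improve robustness and generalization. Deep-learning-based visual estimation can be divided into methods based on artificial neural networks, graph neural networks, convolutional neural networks, deep neural networks and others [9-10]. According to the survey by Erol et al. [11], 3D gesture tracking algorithms can be divided into discriminative and generative methods [12], and hybrid methods have been proposed to combine the advantages of both.

2.1 Discriminative methods

Discriminative methods, also called data-driven methods, depend heavily on data and need several high-quality datasets. They learn a mapping from the image feature space to the gesture feature space and use it to predict the gesture. Depending on whether they treat gesture tracking as estimation or detection, discriminative methods can be further divided into regression-based and detection-based methods. Because they can be trained offline and do not need a large number of hand models, they are better suited to real-time applications.

In 2014, Tompson et al. [13] were the first to apply convolutional neural networks to gesture estimation. They used a CNN to extract hand image features and generate 2D heatmaps for the hand keypoints, then used inverse kinematics to extract features from the heatmaps and estimated the 3D hand pose by minimizing an objective function. This inspired many others to use CNNs and heatmaps for hand pose estimation. Sinha et al. [14] used a CNN to obtain image features and combined them with depth data in nearest-neighbor feature matching to complete the gesture parameters. Because gesture estimation is complex, the joints estimated from an image may deviate from the true joints. To address this, Ge et al. [15] first proposed a new method that obtains hand joints from multiple views of the depth map and then fuses them by regression to estimate the gesture coordinates. Later, inspired by Qi et al. [17], Ge et al. [16] applied PointNet++ to 3D gesture estimation: the 3D point cloud of the hand depth map is sampled and normalized, fed into the PointNet network for point-cloud feature extraction, and a fingertip refinement network is added to optimize the fingertip positions. Ge et al. [18] then changed the network structure further, replacing the hierarchical sampling architecture with a two-level encoder-decoder architecture to predict the 3D joint positions and improve estimation accuracy.

Until then, most hand estimation methods stopped at regressing 3D hand keypoints and could not reflect the shape of the hand precisely, whereas AR/VR needs more realistic hand models. Graph neural networks can handle complex structural relationships, so researchers introduced them into gesture estimation. Ge et al. [19] proposed a new graph convolutional neural network trained end to end, which turns latent feature variables such as 2D heatmaps into a dense hand mesh and derives the 3D joint coordinates from the mesh coordinates. The principle is shown in Figure 5. Fang et al. [20] also proposed joint graph reasoning based on graph convolutional networks to estimate the complex relationships among joints. By strengthening the per-pixel representation, their method estimates an offset for each pixel and then computes a weighted sum over all pixels to estimate the hand.

2.2 Hybrid methods

Generative methods, also called model-based methods, estimate and recognize the pose on the basis of a fixed hand model. A model that satisfies the morphological constraints of the hand must be created in advance from kinematic principles and then matched. The main procedure is as follows: first a suitable hand model is matched to the input image, then the model parameters are initialized, a loss function between the actual model and the input model is defined, and the loss is minimized iteratively to obtain the optimal hand model. Generative methods are mainly optimized through the way the objective function is minimized and through the use of prior gestures to match the data. They are not described in detail in this paper.

To make the best use of both generative and discriminative methods, hybrid methods have been proposed. A discriminative method provides a pose prior that guides the optimization of the generative model, and the generative method then refines the hand shape and position, which reduces tracking error and improves the robustness of tracking and estimation in complex scenes.

[Figure 5 (caption lost in extraction). Recoverable block labels: stacked hourglass network, heatmap, feature map, latent feature, residual network, heatmap loss, graph convolutional network, mesh loss, 3D hand mesh, pose regressor, 3D hand pose, depth map, depth map loss, (x, y, z), pseudo-label.]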
T标签网格损失网格渲染器(a)(b)图５㊀Ｇｅ等学者［１９］提出的网络原理图Ｆｉｇ．５㊀ＳｃｈｅｍａｔｉｃｄｉａｇｒａｍｏｆｔｈｅｎｅｔｗｏｒｋｐｒｏｐｏｓｅｄｂｙＧｅｅｔａｌ［１９］㊀㊀Ｙｅ等学者［２１］提出基于层次的混合手势估计方法，通过变换输入空间与输出空间的方式，将多阶段与多层回归集成到ＣＮＮ中，在多层级之间，通过粒子群算法把运动学约束施加到ＣＮＮ中，该方法可以减少关节与视角的变化，纠正手势估计的结果㊂Ｍｕｅｌｌｅｒ等学者［２２］先利用卷积神经网络定位手关节，再使用深度值计算得出手的三维信息㊂Ｚｈａｎｇ等学者［２３］先对深度图中的手掌进行分割，并通过预训练的ＬＳＴＭ预测当前的手势，最后重建对象模型㊂３㊀数据集与评价指标３．１㊀数据集大规模精准标注的数据集是手势估计的基础，而早期由于缺少专业相机方阵，数据集较小㊂随机光学组件相关硬件以及计算机软件的发展，使得手势估计数据集已经非常丰富，不仅有手动标注数据㊁自动标注数据㊁半自动标注数据，还有全自动合成数据［２４］，无论在数据质量㊁还是数据规模上已经有质的飞越㊂手动标记数据有Ｄｅｘｔｅｒ－１㊁ＭＳＲＡ１４等，由于手工标注数据是一件繁琐的事，因此该类数据集规模相对较小，不适合用于大规模数据驱动的手势估计㊂半自动标注的手势数据有ＩＣＶＬ㊁ＭＳＲＡ１５㊁ＮＹＵ等，半自动标注方法一般先估算出三维手部关节点，再使用人工标注方法进行修正或者于初始先手动标注出二维手部关节点，再使用算法预测出三维手部关节点，即使使用半自动标注，收集以及标注大数据集的手势数据也是一个繁琐复杂的大工程㊂为了获得更高质量㊁更大规模的数据集，出现了全自动以及合成数据集方法㊂全自动标注数据有ＨａｎｄＮｅｔ㊁ＢｉｇＨａｎｄ２．２Ｍ等，全自动标注数据先让受试者带上数据手套，在采集图像时进行手部关节数据标注，相较于半自动标注来说自动标注效率大大提高，适合创建大型手势标注数据集㊂合成数据有ＭＳＲＣ㊁ＲＨＤ等，合成数据使用软件先基于手势模型生成不同姿态的仿真图像数据，再自动标记三维关节信息㊂合成数据标记效率高，可以创建大规模的数据集，但合成数据很难对真实图像的丰富纹理特征进行建模，而且因为反关节等各种原因导致数据特征丢失，同时受限于手部的多自由度以及手部肤色，因此就目前来说，合成数据质量相对不高，但随着计算机相关学科的发展，合成数据必将是手势标注数据的发展方向㊂表１列出了手势估计公共数据集，随着时间的进行，数据量整体呈现上升趋势，从中挑选一个合成数据集㊁一个超大型数据集以及一个中文手语数据集进行介绍㊂（１）ＲＨＤ（ＲｅｎｄｅｒｅｄＨａｎｄＰｏｓｅ）㊂是一个４１２５８个训练集以及２７２８个测试集的手势估计的图像数据集，是由弗莱堡大学在２０１７年发布的合成渲染数据集，每个样本共有深度图㊁ＲＧＢ图㊁分割图，图像像素为３２０ˑ３２０㊂每只手都有２１个关键点的精确二维以及三维注释㊂（２）ＦｒｅｉＨａｎｄ㊂是一个包含３２个人进行的手部动作采集，共有３２５６０个训练样本以及３９６０个测试样本图像数据集㊂是由弗莱堡大学与Ａｄｏｂｅ研究院于２０１９年发布的，可用于图像检测㊁分类任务㊂（３）ＩｎｔｅｒＨａｎｄ２．６Ｍ㊂是第一个具有准确ＧＴ３Ｄ双手交互的大规模手部实拍数据集㊂由ＦａｃｅｂｏｏｋＲｅａｌｉｔｙＬａｂ于２０２０年发布，包括２６０万张手势图像㊂可为学者提供了一个双手交互的手势估计数据集㊂５３２第１１期武胜，等：基于深度学习的视觉手势估计综述表１㊀三维手势估计常用数据集Ｔａｂ．１㊀Ｃｏｍｍｏｎｄａｔａｓｅｔｏｆ３Ｄｇｅｓｔｕｒｅｅｓｔｉｍａｔｉｏｎ数据集时间图像数量标记方式尺寸ＳＴＢ［２５］２０１５３６０００手动６４０ˑ４８０ＭＳＲＡ１４［２６］２０１４２４００手动３２０ˑ２４０Ｄｅｘｔｅｒ１［２７］２０１３２１３７手动３２０ˑ２４０ＩｎｔｅｒＨａｎｄ２．６Ｍ［２８］２０２０２６０Ｗ半自动５１２ˑ３３４ＦｒｅｉＨａｎｄ［２９］２０１７１３３０００半自动２２４ˑ２２４ＭＳＲＡ１５［３０］２０１５７６３７５半自动６４０ˑ４８０ＮＹＵ［１３］２０１４８１００９半自动６４０ˑ４８０ＩＣＶＬ［３１］２０１４１７６０４半自动３２０ˑ２４０ＨａｎｄＮｅｔ［３２］２０１５２１２９２８自动３２０ˑ２４０ＢｉｇＨａｎｄ２．２Ｍ［３３］２０１７２．２Ｍ自动６４０ˑ４８０ＭＳＲＣ［３４］２０１５１０２０００合成５１２ˑ４２４ＲＨＤ［３５］２０１７４３７００合成３２０ˑ３２０３．２㊀评价指标手势评价的标准是指相对于标注的手势点相差多少㊂常见的评价指标可分述如下㊂（１）平均关节位置误差（ＭｅａｎＰｅｒＪｏｉｎｔＰｏｓｉｔｉｏｎＥｒｒｏｒ，ＭＰＪＰＥ）［３６］，定义为预测关节点位置与真实三维关节点位置的平均欧几里得距离，单位为ｍｍ㊂指标值越小㊁姿态估计算法越好，计算公式
如下：ＭＰＪＰＥｊ＝ðｉ（ｐｉｊ－ｐｇｔｉｊ）Ｎ（１）㊀㊀其中，Ｎ表示手指节点数；ｐｉｊ表示预测点；ｐｇｔｉｊ表示真实标注点㊂（２）端点误差（ＥｎｄＰｏｉｎｔＥｒｒｏｒ，ＥＰＥ）［３７］㊂定义为手部跟关节对齐后预测的三维手部坐标与真实坐标之间的平均欧式距离，单位为ｍｍ㊂计算公式如下：ＥＰＥ＝ðＳｍ＝１（ðｉｋ＝１（ｙｍｋ－ｙｍ０）－ｙ＾ｍｋ－ｙ＾ｍ０ｍａｘ（ｗ，ｈ）ｉˑＳ（２）㊀㊀其中，Ｓ为样本数；ｉ为关节点数；ｙ表示真实值；ｙ＾表示预测值㊂（３）正确关键点百分比（ＰｅｒｃｅｎｔａｇｅｏｆＣｏｒｒｅｃｔＫｅｙＰｏｉｎｔｓ，ＰＣＫ）［３８］表示手势估计结果预测值与真实值相差的欧氏距离在一定可接受范围内，则认定为预测准确㊂Ｊｋ计算公式如下：ＰＣＫｋｉ＝ðｐδｄｐｉｄｐｄｃｆɤＴｋæèçöø÷ðｐ１（３）㊀㊀其中，Ｔｋ表示阈值㊂㊀㊀（４）工作特征曲线下面积（ＡｒｅａＵｎｄｅｒＣｕｒｖｅ，ＡＵＣ）［３９］㊂在手势估计中，ＡＵＣ被定义为ＰＣＫ曲线与坐标轴围成的面积，相同标准下ＡＵＣ值越大表示估计误差越小，精度越高㊂不同算法在ＲＨＤ以及ＳＴＢ公开数据集上执行精度对比见表２㊂表２㊀不同算法的精度比较Ｔａｂ．２㊀Ｐｒｅｃｉｓｉｏｎｃｏｍｐａｒｉｓｏｎｏｆｄｉｆｆｅｒｅｎｔａｌｇｏｒｉｔｈｍｓ方法ＡＵＣ（ＳＴＢ）ＡＵＣ（ＲＨＤ）Ｋｒｅｊｏｖ等［４０］０．９９１０．８４９Ｙａｎｇ等［４１］０．９９６０．９０１Ｇｅ等［１９］０．９９８０．９２０ＧＵ等［４２］０．９９６０．８８７Ｍｃｅｕ等［４３］０．９６５０．５６０Ｚｈｏｕ等［４４］０．９９１０．８９３Ｃｈｅｎ等［４５］０．９９００．９３９４㊀问题与挑战当前已经有较多的学者参与研究三维手势估计，基于单目ＲＧＢ㊁双目㊁ＲＧＢ－Ｄ的估计在特定场景设备下已经取得了较大进步，但是在特殊环境进行复杂操作时仍然有较多的问题亟待解决，例如：环境背景与手掌肤色贴合㊁光照变化较大㊁进行复杂的自遮挡动作等［４６］㊂４．１㊀复杂场景环境为了精准分割出手势图像，大部分手势估计方法均在背景单一㊁且单手条件下进行，而正常环境下可能无法控制在环境光照变化较强的场景或者与手肤色相近的背景或者反光面㊁玻璃等背景下的多手协作㊂因为，高光照在这种复杂的背景环境中无疑加大了手势检测㊁分割的难度㊂例如：强光照射手部或阴影投射手部均使手与背景不明显㊂如何提高手势估计在复杂场景背景下的手势检测与分割的精准性，进而提高复杂场景的手势交互能力，将会是未来的一个研究方向㊂４．２㊀高自由度人手有２６个自由度，可以实现３００ʎ／ｓ旋转以及５ｍ／ｓ的快速运动，因此十分灵活，手势估计姿态的复杂度随着自由度以及运动速度的增加而呈指数的增长㊂目前仍存在较多精度较低㊁无法贴合手部结构的运动模型㊂如何在高自由度的快速运动的手部图像序列中进行精准识别高维时序特征，快速预测手部关节值仍然是一个热点问题㊂６３２智㊀能㊀计㊀算㊀机㊀与㊀应㊀用㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀第１３卷㊀４．３㊀自遮挡因为手部的高自由度导致手部具有多样性以及多异性㊂人类很容易实现的自握拳㊁自握手等无疑会出现手部自遮挡㊁自碰撞㊂而且因为肤色㊁年龄等差异较大，加上自遮挡问题，可能使得手部在图像中所占面积较小，进而丢失较多手部细节信息，导致手势估计不准确或者完全失效㊂４．４㊀实时性与准确性当前较多研究是在实验室环境中使用高性能计算机进行检测㊁分割，其运行速率可达９０ＦＰＳ以上，而在手机或者ＡＲ眼镜上，加上复杂的环境等因素，其处理速度可能达不到１０ＦＰＳ，ＡＲ／ＶＲ应用的理想运行速率不低于６０ＦＰＳ㊂因此，在复杂的环境下，需要实现准确性与实时性，仍然有较多问题需要解决㊂５㊀展㊀望基于深度学习的三维手势估计方法不断进行优化，极大地提升了手势估计的效果，基于上文提出的问题，研究者可以从以下几个方面进行优化㊂５．１㊀利用时序信息基于时间序列的手势估计可以利用双向长短时记忆网络模型获取前后帧之间的时序特征，挖掘出更加丰富的特征信息，进而辅助预测出后续手掌位置㊁甚至手势关键节点信息，解决自遮挡等复杂环境背景下手势识别的准确性以及手势估计的速度问题㊂５．２㊀优化网络模型深度学习的手势估计中，网络模型是一个重要的主题㊂如何优化出轻量级的网络模型解决复杂的场景下手势检测与分割以及特征提取等手势估计的准确性问题，进而提高网络的运行速度，是助力手势估计研究的一个重要学术方向㊂５．３㊀利用混合法判别法对遮挡等有较强的鲁棒性问题可以快速从错误中恢复，而且其运行速度较快，但是却无法利用时序帧，导致手势估计容易出现跟踪丢失现象，而生成法可以利用时序帧，使用拟合模型处理高维数据和复杂环境下的手势估计㊂如何平衡使用判别法与混合法，充分利用二者的优势，可加快手势估计跟踪的性能㊂６㊀结束语本文对基于深度学习的手势估计算法以及数据集和评价指标进
行了回顾，探讨了手势估计目前所面临的挑战以及未来的研究方向㊂手势交互是最重要的人机交互之一，应用在ＡＲ／ＶＲ㊁手语识别㊁远程操控等方面，虽然不少学者在手势估计方面的研究已经取得了一定成果，但是距离实际应用还有较长的路要走㊂因此，也希望相关研究学者继续进行复杂场景的手势研究，让手势估计早日在中低端设备上落地应用㊂参考文献［１］解迎刚，王全．基于视觉的动态手势识别研究综述［Ｊ］．计算机工程与应用，２０２１，５７（２２）：６８－７７．［２］ＫＲＩＺＨＥＶＳＫＹＡ，ＳＵＴＳＫＥＶＥＲＩ，ＨＩＮＴＯＮＧＥ．Ｉｍａｇｅｎｅｔｃｌａｓｓｉｆｉｃａｔｉｏｎｗｉｔｈｄｅｅｐｃｏｎｖｏｌｕｔｉｏｎａｌｎｅｕｒａｌｎｅｔｗｏｒｋｓ［Ｊ］．ＣｏｍｍｕｎｉｃａｔｉｏｎｓｏｆｔｈｅＡＣＭ，２０１７，６０（６）：８４－９０．［３］易靖国，程江华，库锡树．视觉手势识别综述［Ｊ］．计算机科学，２０１６，４３（Ｓ１）：１０３－１０８．［４］ＪＩＡＮＧＬｉｎｊｕｎ，ＸＩＡＨａｉｌｕｎ，ＧＵＯＣａｉｌｉ．Ａｍｏｄｅｌ－ｂａｓｅｄｓｙｓｔｅｍｆｏｒｒｅａｌ－ｔｉｍｅａｒｔｉｃｕｌａｔｅｄｈａｎｄｔｒａｃｋｉｎｇｕｓｉｎｇａｓｉｍｐｌｅｄａｔａｇｌｏｖｅａｎｄａｄｅｐｔｈｃａｍｅｒａ［Ｊ］．Ｓｅｎｓｏｒｓ，２０１９，１９（２１）：４６８０－１７８８．［５］ＱＩＡＮＪｉｎｇ，ＭＡＪｉａｊｕｎ，ＬＩＸｉａｎｇｙｕ，ｅｔａｌ．Ｐｏｒｔａｌｂｌｅ：Ｉｎｔｕｉｔｉｖｅｆｒｅｅ－ｈａｎｄｍａｎｉｐｕｌａｔｉｏｎｉｎｕｎｂｏｕｎｄｅｄｓｍａｒｔｐｈｏｎｅ－ｂａｓｅｄａｕｇｍｅｎｔｅｄｒｅａｌｉｔｙ［Ｃ］／／Ｐｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅ３２ｎｄＡｎｎｕａｌＡＣＭＳｙｍｐｏｓｉｕｍｏｎＵｓｅｒＩｎｔｅｒｆａｃｅＳｏｆｔｗａｒｅａｎｄＴｅｃｈｎｏｌｏｇｙ．Ｍｏｎｔｒｅａｌ，Ｃａｎａｄａ：ＡＣＭ，２０１９：１３３－１４５．［６］方林普．基于视觉的手势交互关键技术研究［Ｄ］．广州：华南理工大学，２０２１．［７］陈红梅，赖重远，张洋，等．基于深度数据的手势识别研究进展［Ｊ］．江汉大学学报（自然科学版），２０１８，４６（２）：１０１－１０８．［８］ＤＯＯＳＴＩＢ．Ｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ：Ａｓｕｒｖｅｙ［Ｊ］．ａｒＸｉｖｐｒｅｐｒｉｎｔａｒＸｉｖ：１９０３．０１０１３，２０１９．［９］武国梁．基于深度学习的手势估计研究［Ｄ］．长春：中国科学院大学（中国科学院长春光学精密机械与物理研究所），２０２１．［１０］王健．基于深度学习的手势识别算法研究［Ｄ］．长春：长春理工大学，２０２１．［１１］ＥＲＯＬＡ，ＢＥＢＩＳＧ，ＮＩＣＯＬＥＳＣＵＭ，ｅｔａｌ．Ｖｉｓｉｏｎ－ｂａｓｅｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ：Ａｒｅｖｉｅｗ［Ｊ］．ＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＩｍａｇｅＵｎｄｅｒｓｔａｎｄｉｎｇ，２００７，１０８（１／２）：５２－７３．［１２］张继凯，李琦，王月明，等．基于单目ＲＧＢ图像的三维手势跟踪算法综述［Ｊ］．计算机科学，２０２２，４９（４）：１７４－１８７．［１３］ＴＯＭＰＳＯＮＪ，ＳＴＥＩＮＭ，ＬＥＣＵＮＹ，ｅｔａｌ．Ｒｅａｌ－ｔｉｍｅｃｏｎｔｉｎｕｏｕｓｐｏｓｅｒｅｃｏｖｅｒｙｏｆｈｕｍａｎｈａｎｄｓｕｓｉｎｇｃｏｎｖｏｌｕｔｉｏｎａｌｎｅｔｗｏｒｋｓ［Ｊ］．ＡＣＭＴｒａｎｓａｃｔｉｏｎｓｏｎＧｒａｐｈｉｃｓ，２０１４，３３（５）：１－１０．［１４］ＳＩＮＨＡＡ，ＣＨＯＩＣ，ＲＡＭＡＮＩＫ．ＤｅｅｐＨａｎｄ：ＲｏｂｕｓｔＨａｎｄＰｏｓｅＥｓｔｉｍａｔｉｏｎｂｙＣｏｍｐｌｅｔｉｎｇａＭａｔｒｉｘＩｍｐｕｔｅｄｗｉｔｈＤｅｅｐＦｅａｔｕｒｅｓ［Ｃ］／／２０１６ＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ（ＣＶＰＲ）．ＬａｓＶｅｇａｓ，ＮＶ，ＵＳＡ：ＩＥＥＥ，２０１６：４１５０－４１５８．［１５］ＧＥＬｉｕｈａｏ，ＬＩＡＮＧＨｕｉ，ＹＵＡＮＪｕｎｓｏｎｇ，ｅｔａｌ．Ｒｏｂｕｓｔ３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｉｎｓｉｎｇｌｅｄｅｐｔｈｉｍａｇｅｓ：ｆｒｏｍｓｉｎｇｌｅ－ｖｉｅｗｃｎｎｔｏｍｕｌｔｉ－ｖｉｅｗＣＮＮｓ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ，ＬａｓＶｅｇａｓ，ＵＳＡ：ＩＥＥＥ，２０１６：３５９３－３６０１．［１６］ＧＥＬｉｕｈａｏ，ＣＡＩＹｕｊｕｎ，ＷＥＮＧＪｕｎｗｕ，ｅｔａｌ．Ｈａｎｄｐｏｉｎｔｎｅｔ：３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｕｓｉｎｇｐｏｉｎｔｓｅｔｓ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅ
ｃｏｇｎｉｔｉｏｎ．ＳａｌｔＬａｋｅＣｉｔｙ：ＩＥＥＥ，２０１８：８４１７－８４．［１７］ＱＩＣＲ，ＹＩＬｉ，ＳＵＨａｏ，ｅｔａｌ．Ｐｏｉｎｔｎｅｔ＋＋：Ｄｅｅｐｈｉｅｒａｒｃｈｉｃａｌ７３２第１１期武胜，等：基于深度学习的视觉手势估计综述ｆｅａｔｕｒｅｌｅａｒｎｉｎｇｏｎｐｏｉｎｔｓｅｔｓｉｎａｍｅｔｒｉｃｓｐａｃｅ［Ｃ］／／ＡｄｖａｎｃｅｓｉｎＮｅｕｒａｌＩｎｆｏｒｍａｔｉｏｎＰｒｏｃｅｓｓｉｎｇＳｙｓｔｅｍｓ．ＬｏｎｇＢｅａｃｈ，Ｃａｌｉｆｏｒｎｉａ，ＵＳＡ：ＮＩＰＳＦｏｕｎｄａｔｉｏｎ，２０１７：５０９９－５１０８．［１８］ＧＥＬｉｕｈａｏ，ＲＥＮＺｈｏｕ，ＹＵＡＮＪｕｎｓｏｎｇ．Ｐｏｉｎｔ－ｔｏ－ｐｏｉｎｔｒｅｇｒｅｓｓｉｏｎｐｏｉｎｔｎｅｔｆｏｒ３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＥｕｒｏｐｅａｎＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ（ＥＣＣＶ）．Ｍｕｎｉｃｈ，Ｇｅｒｍａｎｙ：ｄｂｌｐ，２０１８：４７５－４９１．［１９］ＧＥＬｉｕｈａｏ，ＲＥＮＺｈｏｕ，ＬＩＹｕｎｃｈｅｎｇ，ｅｔ，ａｌ．３ＤＨａｎｄｓｈａｐｅａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｆｒｏｍａｓｉｎｇｌｅＲＧＢｉｍａｇｅ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ（ＣＶＰＲ）．ＬｏｎｇＢｅａｃｈ：ＩＥＥＥ，２０１９：１０８３３－１０８４２．［２０］ＦＡＮＧＬｉｎｐｕ，ＬＩＵＸｉｎｇｙａｎ，ＬＩＵＬｉ，ｅｔａｌ．ＪＧＲ－Ｐ２Ｏ：Ｊｏｉｎｔｇｒａｐｈｒｅａｓｏｎｉｎｇｂａｓｅｄｐｉｘｅｌ－ｔｏ－ｏｆｆｓｅｔｐｒｅｄｉｃｔｉｏｎｎｅｔｗｏｒｋｆｏｒ３Ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｆｒｏｍａｓｉｎｇｌｅｄｅｐｔｈｉｍａｇｅ［Ｃ］／／ＥｕｒｏｐｅａｎＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ．Ｃｈａｍ：Ｓｐｒｉｎｇｅｒ，２０２０：１２０－１３７．［２１］ＹＥＱｉ，ＹＵＡＮＳｈａｎｘｉｎ，ＫＩＭＴＫ．Ｓｐａｔｉａｌａｔｔｅｎｔｉｏｎｄｅｅｐｎｅｔｗｉｔｈｐａｒｔｉａｌｐｓｏｆｏｒｈｉｅｒａｒｃｈｉｃａｌｈｙｂｒｉｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ［Ｃ］／／ＥｕｒｏｐｅａｎＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ．Ｃｈａｍ：Ｓｐｒｉｎｇｅｒ，２０１６：３４６－３６１．［２２］ＭＵＥＬＬＥＲＦ，ＭＥＨＴＡＤ，ＳＯＴＮＹＣＨＥＮＫＯＯ．ｅｔａｌ．Ｒｅａｌ－ｔｉｍｅｈａｎｄｔｒａｃｋｉｎｇｕｎｄｅｒｏｃｃｌｕｓｉｏｎｆｒｏｍａｎｅｇｏｃｅｎｔｒｉｃＲＧＢ－Ｄｓｅｎｓｏｒ［Ｃ］／／２０１７ＩＥＥＥＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ．Ｖｅｎｉｃｅ，Ｉｔａｌｙ：ＩＥＥＥ，２０１７：１１６３－１１７２．［２３］ＺＨＡＮＧＨａｏ，ＢＯＺｉｈａｏ，ＹＯＮＧＪｕｎｈｕｉ，ｅｔａｌ．ＩｎｔｅｒａｃｔｉｏｎＦｕｓｉｏｎ：Ｒｅａｌ－ｔｉｍｅｒｅｃｏｎｓｔｒｕｃｔｉｏｎｏｆｈａｎｄｐｏｓｅｓａｎｄｄｅｆｏｒｍａｂｌｅｏｂｊｅｃｔｓｉｎｈａｎｄ－ｏｂｊｅｃｔｉｎｔｅｒａｃｔｉｏｎｓ［Ｊ］．ＡＣＭＴｒａｎｓａｃｔｉｏｎｓｏｎＧｒａｐｈｉｃｓ（ＴＯＧ），２０１９，３８（４）：１－１１．［２４］王丽萍，汪成，邱飞岳，等．深度图像中的３Ｄ手势姿态估计方法综述［Ｊ］．小型微型计算机系统，２０２１，４２（６）：１２２７－１２３５．［２５］ＷＨＥＡＴＬＡＮＤＮ，ＷＡＮＧＹｉｎｇｙｉｎｇ，ＳＯＮＧＨｕａｇｕａｎｇ，ｅｔａｌ．Ｓｔａｔｅｏｆｔｈｅａｒｔｉｎｈａｎｄａｎｄｆｉｎｇｅｒｍｏｄｅｌｉｎｇａｎｄａｎｉｍａｔｉｏｎ［Ｊ］．ＣｏｍｐｕｔｅｒＧｒａｐｈｉｃｓＦｏｒｕｍ，２０１５，３４（２）：７３５－７６０．［２６］ＱＩＡＮＣｈｅｎ，ＳＵＮＸｉａｏ，ＷＥＩＹｉｃｈｅｎ，ｅｔａｌ．Ｒｅａｌｔｉｍｅａｎｄｒｏｂｕｓｔｈａｎｄｔｒａｃｋｉｎｇｆｒｏｍｄｅｐｔｈ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．Ｃｏｌｕｍｂｕｓ，ＯＨ，ＵＳＡ：ＩＥＥＥ，２０１４：１１０６－１１１３．［２７］ＳＲＩＤＨＡＲＳ，ＯＵＬＡＳＶＩＲＴＡＡ，ＴＨＥＯＢＡＬＴＣ．ＩｎｔｅｒａｃｔｉｖｅｍａｒｋｅｒｌｅｓｓａｒｔｉｃｕｌａｔｅｄｈａｎｄｍｏｔｉｏｎｔｒａｃｋｉｎｇｕｓｉｎｇＲＧＢａｎｄｄｅｐｔｈｄａｔａ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ．Ａｕｓｔｒａｌｉａ：ＩＥＥＥ，２０１３：２４５６－２４６３．［２８］ＭＯＯＮ
Ｇ，ＹＵＳＩ，ＷＥＮＨｅ，ｅｔａｌ．ＩｎｔｅｒＨａｎｄ２．６Ｍ：Ａｄａｔａｓｅｔａｎｄｂａｓｅｌｉｎｅｆｏｒ３ｄｉｎｔｅｒａｃｔｉｎｇｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｆｒｏｍａｓｉｎｇｌｅｒｇｂｉｍａｇｅ］Ｊ］．ａｒＸｉｖｐｒｅｐｒｉｎｔａｒＸｉｖ：２００８．０９３０９，２０２０．［２９］ＺＨＡＮＧＪｉａｗｅｉ，ＪＩＡＯＪｉａｎｂｏ，ＣＨＥＮＭｉｎｇｌｉａｎｇ，ｅｔａｌ．Ａｈａｎｄｐｏｓｅｔｒａｃｋｉｎｇｂｅｎｃｈｍａｒｋｆｒｏｍｓｔｅｒｅｏｍａｔｃｈｉｎｇ［Ｃ］／／２０１７ＩＥＥＥＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＩｍａｇｅＰｒｏｃｅｓｓｉｎｇ（ＩＣＩＰ）．Ｂｅｉｊｉｎｇ，Ｃｈｉｎａ：ＩＥＥＥ，２０１７：９８２－９８６．［３０］ＳＵＮＸｉａｏ，ＷＥＩＹｉｃｈｅｎ，ＬＩＡＮＧＳｈｕａｎｇ，ｅｔａｌ．Ｃａｓｃａｄｅｄｈａｎｄｐｏｓｅｒｅｇｒｅｓｓｉｏｎ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．Ｘｉᶄａｎ，Ｃｈｉｎａ：ＩＥＥＥ，２０１５：８２４－８３２．［３１］ＴＯＭＰＳＯＮＪ，ＳＴＥＩＮＭ，ＬＥＣＵＮＹ，ｅｔａｌ．Ｒｅａｌ－ｔｉｍｅｃｏｎｔｉｎｕｏｕｓｐｏｓｅｒｅｃｏｖｅｒｙｏｆｈｕｍａｎｈａｎｄｓｕｓｉｎｇｃｏｎｖｏｌｕｔｉｏｎａｌｎｅｔｗｏｒｋｓ［Ｊ］．ＡＣＭＴｒａｎｓａｃｔｉｏｎｓｏｎＧｒａｐｈｉｃｓ，２０１４，３３（５）：１－１０．［３２］ＴＡＮＧＤａｎｈａｎｇ，ＪＩＮＣＨ，ＴＥＪＡＮＩＡ，ｅｔａｌ．Ｌａｔｅｎｔｒｅｇｒｅｓｓｉｏｎｆｏｒｅｓｔ：Ｓｔｒｕｃｔｕｒｅｄｅｓｔｉｍａｔｉｏｎｏｆ３Ｄａｒｔｉｃｕｌａｔｅｄｈａｎｄｐｏｓｔｕｒｅ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．Ｃｏｌｕｍｂｕｓ：ＩＥＥＥ，２０１４：３７８６－３７９３．［３３］ＷＥＴＺＬＥＲＡ，ＳＬＯＳＳＢＥＲＧＲ，ＫＩＭＭＥＬＲ．Ｒｕｌｅｏｆｔｈｕｍｂ：Ｄｅｅｐｄｅｒｏｔａｔｉｏｎｆｏｒｉｍｐｒｏｖｅｄｆｉｎｇｅｒｔｉｐｄｅｔｅｃｔｉｏｎ［ＥＢ／ＯＬ］．［２０１５］．ｈｔｔｐｓ：／／ａｒｘｉｖ．ｏｒｇ／ａｂｓ／１５０７．０５７２６．［３４］ＹＵＡＮＳｈａｎｘｉｎ，ＹＥＱｉ，ＳＴＥＮＧＥＲＢ，ｅｔａｌ．Ｂｉｇｈａｎｄ２．２ｍｂｅｎｃｈｍａｒｋ：Ｈａｎｄｐｏｓｅｄａｔａｓｅｔａｎｄｓｔａｔｅｏｆｔｈｅａｒｔａｎａｌｙｓｉｓ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．Ｖｅｎｉｃｅ，Ｉｔａｌｙ：ＩＥＥＥ，２０１７：４８６６－４８７４．［３５］ＳＨＡＲＰＴ，ＫＥＳＫＩＮＣ，ＲＯＢＥＲＴＳＯＮＤ，ｅｔａｌ．Ａｃｃｕｒａｔｅ，ｒｏｂｕｓｔ，ａｎｄｆｌｅｘｉｂｌｅｒｅａｌ－ｔｉｍｅｈａｎｄｔｒａｃｋｉｎｇ［Ｃ］／／Ｐｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅ３３ｒｄＡｎｎｕａｌＡＣＭＣｏｎｆｅｒｅｎｃｅｏｎＨｕｍａｎＦａｃｔｏｒｓｉｎＣｏｍｐｕｔｉｎｇＳｙｓｔｅｍｓ．Ｓｅｏｕｌ：ＡＣＭ，２０１５：３６３３－３６４２．［３６］ＺＩＭＭＥＲＭＡＮＮＣ，ＢＲＯＸＴ．Ｌｅａｒｎｉｎｇｌｏｅｓｔｉｍａｔｅ３ｄｈａｎｄｐｏｓｅｆｒｏｍｓｉｎｇｌｅｒｇｂｉｍａｇｅｓ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ（ＩＣＣＶ）．Ｖｅｎｉｃｅ，Ｉｔａｌｙ：ＩＥＥＥ，２０１７：４９０３－４９１１．［３７］ＬＯＢＯＪＭ，ＪＩＭＥＮＥＺ－ＶＡＬＶＥＲＤＥＡ，ＲＥＡＬＲ．ＡＵＣ：Ａｍｉｓｌｅａｄｉｎｇｍｅａｓｕｒｅｏｆｔｈｅｐｅｒｆｏｒｍａｎｃｅｏｆｐｒｅｄｉｃｔｉｖｅｄｉｓｔｒｉｂｕｔｉｏｎｍｏｄｅｌｓ［Ｊ］．ＧｌｏｂａｌｅｃｏｌｏｇｙａｎｄＢｉｏｇｅｏｇｒａｐｈｙ，２００８，１７（２）：１４５－１５１．［３８］ＭＣＫＥＥＩＷ，ＷＩＬＬＩＡＭＳＯＮＰＣ，ＬＡＭＥＷ，ｅｔａｌ．Ｔｈｅａｃｃｕｒａｃｙｏｆ４ｐａｎｏｒａｍｉｃｕｎｉｔｓｉｎｔｈｅｐｒｏｊｅｃｔｉｏｎｏｆｍｅｓｉｏｄｉｓｔａｌｔｏｏｔｈａｎｇｕｌａｔｉｏｎｓ［Ｊ］．Ａｍｅｒｉｃａｎｊｏｕｒｎａｌｏｆｏｒｔｈｏｄｏｎｔｉｃｓａｎｄｄｅｎｔｏｆａｃｉａｌｏｒｔｈｏｐｅｄｉｃｓ，２００２，１２１（２）：１６６－１７５．［３９］ＶＩＮＴＰＦ，ＨＩＮＲＩＣＨＳＲＮ．Ｅｎｄｐｏｉｎｔｅｒｒｏｒｉｎｓｍｏｏｔｈｉｎｇａｎｄｄｉｆｆｅｒｅｎｔｉａｔｉｎｇｒａｗｋｉｎｅｍａｔｉｃｄａｔａ：Ａｎｅｖａｌｕａｔｉｏｎｏｆｆｏｕｒｐｏｐｕｌａｒｍｅｔｈｏｄｓ［Ｊ］．Ｊ
ｏｕｒｎａｌｏｆＢｉｏｍｅｃｈａｎｉｃｓ，１９９６，２９（１２）：１６３７－１６４２．［４０］ＫＲＥＪＯＶＰ，ＢＯＷＤＥＮＲ．Ｍｕｌｔｉ－ｔｏｕｃｈｌｅｓｓ：Ｒｅａｌ－ｔｉｍｅｆｉｎｇｅｒｔｉｐｄｅｔｅｃｔｉｏｎａｎｄｔｒａｃｋｉｎｇｕｓｉｎｇｇｅｏｄｅｓｉｃｍａｘｉｍａ［Ｃ］／／２０１３１０ｔｈＩＥＥＥＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅａｎｄＷｏｒｋｓｈｏｐｓｏｎＡｕｔｏｍａｔｉｃＦａｃｅａｎｄＧｅｓｔｕｒｅＲｅｃｏｇｎｉｔｉｏｎ（ＦＧ）．Ｓｈａｎｇｈａｉ，Ｃｈｉｎａ：ＩＥＥＥ，２０１３：１－７．［４１］ＹＡＮＧＬｉｎｌｉｎ，ＹＡＯＡ．Ｄｉｓｅｎｔａｎｇｌｉｎｇｌａｔｅｎｔｈａｎｄｓｆｏｒｉｍａｇｅｓｙｎｔｈｅｓｉｓａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ（ＣＶＰＲ）．ＬｏｎｇＢｅａｃｈ：ＩＥＥＥ，２０１９：９８７７－９８８６．［４２］ＹＡＮＧＬｉｎｌｉｎ，ＬＩＳｈｉｌｅ，ＬｅｅＤ，ｅｔａｌ．Ａｌｉｇｎｉｎｇｌａｔｅｎｔｓｐａｃｅｓｆｏｒ３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＩｎｔｅｒｎａｔｉｏｎａｌＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎ．Ｓｅｏｕｌ：ＩＥＥＥ，２０１９：２３３５－２３４３．［４３］ＧＵＪｉａｊｕｎ，ＷＡＮＧＺｈｉｙｏｎｇ，ＯＵＹＡＮＧＷａｎｌｉ，ｅｔａｌ．３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎｗｉｔｈｄｉｓｅｎｔａｎｇｌｅｄｃｒｏｓｓｍｉｏｄａｌｌａｔｅｎｔｓｐａｃｅ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＷｉｎｔｅｒＣｏｎｆｅｒｅｎｃｅｏｎＡｐｐｌｉｃａｔｉｏｎｓｏｆＣｏｍｐｕｔｅｒＶｉｓｉｏｎ（ＷＡＣＶ）．ＳｎｏｗｍａｓｓＶｉｌｌａｇｅ：ｄｂｌｐ，２０２０：３９１－４００．［４４］ＭＣＥＵＥＲＦ，ＢＥＲＮＡＲＤＦ，ＳＯＴＮＹＣＨＥＮＫＯＯ，ｅｔａｌ．ＧＡＮｅｒａｔｅｄｈａｎｄｓｆｏｒｒｅａｌ－ｔｉｍｅ３ｄｈａｎｄｔｒａｃｋｉｎｇｆｒｏｍｍｏｎｏｃｕｌａｒＲＧＢ［Ｊ］．ａｒＸｉｖｐｒｅｐｒｉｎｔａｒＸｉｖ：１７１２．０１０５７，２０１７．［４５］ＺＨＯＵＹｕｘｉａｏ，ＨＡＢＥＲＮＡＮＮＭ，ＸｕＷｅｉｐｅｎｇ，ｅｔａｌ．Ｍｏｎｏｃｕｌａｒｒｅａｌ－ｔｉｍｅｈａｎｄｓｈａｐｅａｎｄｍｏｔｉｏｎｃａｐｔｕｒｅｕｓｉｎｇｍｕｌｔｉ－ｍｏｄａｌｄａｔａ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＣｏｎｆｅｒｅｎｃｅｏｎＣｏｍｐｕｔｅｒＶｉｓｉｏｎａｎｄＰａｔｔｅｒｎＲｅｃｏｇｎｉｔｉｏｎ．Ｓｅａｔｔｌｅ：ＩＥＥＥ，２０２０：５３４６－５３５５．［４６］ＣＨＥＮＬｉａｎｇｊｉａｎ，Ｌ１ＮＳＹ，ＸＩＥＹｕｓｈｅｎｇ，ｅｔａｌ．ＤＧＧＡＮ：Ｄｅｐｔｈ－ｉｍａｇｅｇｕｉｄｅｄｇｅｎｅｒａｔｉｖｅａｄｖｅｒｓａｒｉａｌｎｅｔｗｏｒｋｓｆｏｒｄｉｓｅｎｔａｎｇｌｉｎｇｒｇｂａｎｄｄｅｐｔｈｉｍａｇｅｓｉｎ３ｄｈａｎｄｐｏｓｅｅｓｔｉｍａｔｉｏｎ［Ｃ］／／ＰｒｏｃｅｅｄｉｎｇｓｏｆｔｈｅＩＥＥＥ／ＣＶＦＷｉｎｔｅｒＣｏｎｆｅｒｅｎｃｅｏｎＡｐｐｌｉｃａｔｉｏｎｓｏｆＣｏｍｐｕｔｅｒＶｉｓｉｏｎ（ＷＡＣＶ）．Ｓｅａｔｔｌｅ：ＩＥＥＥ，２０２０：４１１－４１９．［４７］梁晓辉．手部姿态估计方法综述［Ｊ］．山西大学学报（自然科学版），２０２２，４５（３）：６３１－６４０．８３２智㊀能㊀计㊀算㊀机㊀与㊀应㊀用㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀㊀第１３卷㊀。
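The evaluation metrics in Section 3.2 of the survey above (MPJPE, PCK and the AUC of the PCK curve) can be illustrated with a minimal Python sketch. It is not taken from the survey: the function names, the use of plain coordinate tuples, the unnormalized PCK threshold and the trapezoidal rule for the AUC are all assumptions made for illustration.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)

def pck(pred, gt, threshold):
    """Fraction of joints whose error is within the threshold.

    Assumption: distances are compared directly with the threshold,
    without the per-sample normalization factor that some PCK variants use.
    """
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return hits / len(pred)

def pck_auc(pred, gt, thresholds):
    """Area under the PCK-versus-threshold curve, normalized to [0, 1].

    Assumption: trapezoidal integration over an increasing threshold list.
    """
    values = [pck(pred, gt, t) for t in thresholds]
    area = 0.0
    for i in range(1, len(thresholds)):
        width = thresholds[i] - thresholds[i - 1]
        area += 0.5 * (values[i] + values[i - 1]) * width
    return area / (thresholds[-1] - thresholds[0])

# Three hypothetical joints (coordinates in mm) with errors of 5, 0 and 12 mm.
ground_truth = [(0, 0, 0), (10, 0, 0), (0, 10, 0)]
predicted = [(3, 4, 0), (10, 0, 0), (0, 10, 12)]
```

With these toy points the per-joint errors are 5 mm, 0 mm and 12 mm, so a 5 mm threshold accepts two of the three joints. A real evaluation would average over all samples and joints of a dataset.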

Graduation project 93: Speech recognition based on continuous hidden Markov models (2)

SHANGHAI UNIVERSITY
UNDERGRADUATE PROJECT (THESIS)
Title: Speech recognition based on continuous hidden Markov models
School: Mechatronic Engineering and Automation
Major: Automation
Student ID: 03122669
Student: 金微
Supervisor: 李昕
Dates: 20 March to 6 June 2007

Contents
Abstract
ABSTRACT
Introduction
Chapter 1 Fundamentals of speech
  Section 1 Basic content of speech recognition
  Section 2 Difficulties in implementing speech recognition
Chapter 2 Theoretical basis of the HMM
  Section 1 Definition of the HMM
  Section 2 Mathematical description of the hidden Markov model
  Section 3 Types of HMM
  Section 4 The three basic problems of the HMM and their solutions
Chapter 3 Problems in implementing HMM algorithms
  Section 1 Choice of HMM state type and of parameter B
  Section 2 Problems to be solved in HMM training
Chapter 4 Design of the speech recognition system
  Section 1 Development environment of the speech recognition system
  Section 2 Design of the HMM-based speech recognition system
  Section 3 Experimental results
Chapter 5 Conclusion
Acknowledgements
References

Abstract
The most important part of a speech recognition system is the construction of the acoustic model. As a statistical model of the speech signal, the hidden Markov model describes the non-stationarity and time variability of speech well, and it is therefore widely used in speech recognition.

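Static sign language recognition based on edge gradient orientation histograms

孙丽娟; 张立材
【Journal】微电子学与计算机 (Microelectronics & Computer)
【Year (volume), issue】2010(0)3
【Abstract】This paper uses the edge gradient orientation histogram as the feature vector of a hand gesture for sign language recognition. A normalized edge gradient histogram is built, and gesture features are matched by template matching with the Euclidean distance, so that gesture feature vectors are compared quickly. Experiments show that the method is invariant to image brightness, scaling and translation, that it is simple and fast to compute, and that it can be used in sign language recognition systems.
【Total pages】4 (P148-150)
【Keywords】sign language recognition; gradient; edge gradient histogram
【Authors】孙丽娟; 张立材
【Affiliation】School of Information and Control Engineering, Xi'an University of Architecture and Technology
【Language】Chinese
【CLC number】TP391
【Related literature】
1. Face recognition algorithm based on Weber gradient orientation histograms [J], 杨恢先; 唐金鑫; 陶霞; 姜德财; 颜微
2. Image retrieval algorithm based on edge gradient orientation histograms [J], 杨晓强
3. Image retrieval based on edge gradient orientation histograms [J], 余胜; 谢莉
4. Gesture recognition based on sparse autoencoders and gradient orientation histograms [J], 缑新科; 高庆东
5. Polarization spectrum recognition of withered plants and bare soil based on gradient orientation histograms [J], 杨威; 侯鲲; 赵云升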
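The abstract of the sign language record above describes a normalized edge gradient orientation histogram matched against templates by Euclidean distance. A minimal Python sketch of that idea follows. Only the abstract is available here, so the details are assumptions rather than the authors' implementation: central-difference gradients, eight orientation bins, magnitude-weighted votes, and the helper names are all hypothetical.

```python
import math

def edge_orientation_histogram(img, bins=8, min_magnitude=1e-6):
    """Normalized histogram of gradient directions at edge pixels.

    img is a 2-D list of gray levels. Gradients use central differences,
    and each edge pixel votes with its gradient magnitude (an assumption).
    """
    height, width = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude <= min_magnitude:
                continue  # flat region, not an edge pixel
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalizing removes the effect of a global brightness gain.
    return [v / total for v in hist] if total > 0 else hist

def match(query_hist, templates):
    """Return the label of the template closest in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(query_hist, templates[label]))

def vertical_edge(size=6, split=3, low=0, high=10):
    """Toy image: dark left part, bright right part."""
    return [[low if x < split else high for x in range(size)] for _ in range(size)]

def horizontal_edge(size=6, split=3, low=0, high=10):
    """Toy image: dark top part, bright bottom part."""
    return [[low if y < split else high for _ in range(size)] for y in range(size)]

templates = {
    "vertical": edge_orientation_histogram(vertical_edge()),
    "horizontal": edge_orientation_histogram(horizontal_edge()),
}
# A brighter, shifted vertical edge should still match the "vertical" template.
query = edge_orientation_histogram(vertical_edge(split=2, low=0, high=30))
best = match(query, templates)
```

Because the histogram discards pixel positions and is normalized, shifting the edge or multiplying the brightness leaves the feature vector unchanged in this toy case, which is the kind of invariance the abstract reports.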

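Research on visual inspection algorithms for defects in textured objects (outstanding graduation thesis)

Abstract
In fiercely competitive automated industrial production, machine vision plays a decisive role in product quality control, and its use in defect detection is becoming widespread. Compared with conventional inspection techniques, automated visual inspection systems are more economical, faster, more efficient and safer. Textured objects are common in industrial production: substrates and light-emitting diodes used in semiconductor assembly and packaging, printed circuit boards in modern electronic systems, and cloth and fabrics in the textile industry can all be regarded as objects with texture features. This thesis focuses on defect detection for textured objects and aims to provide efficient and reliable detection algorithms for their automated inspection.

Texture is an important feature for describing image content, and texture analysis has been applied successfully to texture segmentation and texture classification. This study proposes a defect detection algorithm based on texture analysis and on comparison with a reference. The algorithm tolerates the image registration errors caused by object deformation and is also robust to the influence of texture. It is designed to provide rich and meaningful physical information about the detected defect regions, such as their size, shape, brightness contrast and spatial distribution. Moreover, when a reference image is available, the algorithm can be used to inspect both homogeneous and non-homogeneous textured objects, and it also gives good results on non-textured objects.

Throughout the detection process we use texture analysis and reconstruction with the steerable pyramid. Unlike traditional wavelet texture analysis, we add to the wavelet domain a tolerance-control algorithm that handles object deformation and texture influence, so that the method tolerates object deformation and remains robust to texture. Finally, reconstruction from the steerable pyramid ensures that the physical meaning of the defect regions is recovered accurately. In the experiments we tested a series of images of practical value. The results show that the proposed defect detection algorithm for textured objects is efficient and easy to implement.

Keywords: defect detection, texture, object distortion, steerable pyramid, reconstruction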

Research on text-independent speaker recognition based on wavelet neural networks


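A handwritten digit recognition system with a neural network

张滨
【Journal】江西大学学报：自然科学版 (Journal of Jiangxi University: Natural Science Edition)
【Year (volume), issue】1992(016)003
【Abstract】Handwritten digit recognition is an important problem in pattern recognition, and the back-propagation (BP) model is an artificial neural network model with good performance. This paper uses a BP model combined with association, improves the BP model, and introduces a voting decision algorithm. This both saves training time and improves system performance and the recognition rate. A handwritten digit recognition system with a neural network was built in preliminary form on an ST-286H using Turbo C 2.0.
【Total pages】6 (P301-306)
【Author】张滨
【Affiliation】none
【Language】Chinese
【CLC number】TP391.4
【Related literature】
1. Handwritten digit recognition system based on BP neural networks [J], 宋平
2. A new feature extraction scheme for handwritten digit recognition systems [J], 邓丽华; 崔志强
3. A new feature extraction scheme for handwritten digit recognition systems [J], 宋曰聪; 胡伟
4. A new feature extraction method for handwritten digit recognition systems [J], 钟乐海; 胡伟
5. Handwritten digit recognition system based on convolutional neural networks [J], 陈岩; 李洋洋; 余乐; 王瑶; 吴超; 李阳光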


WaveRead: Automatic measurement of relative gene expression levels from microarrays using wavelet analysis

Ghislain Bidaut a,1, Frank J. Manion a, Christophe Garcia b, Michael F. Ochs a,*
a Bioinformatics, Division of Population Science, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111-2497, USA
b France Telecom R&D, 4 rue du Clos Courtel, BP 59, 35512 Cesson-Sévigné Cedex, France
* Corresponding author. Fax: +1 215 728 2513. E-mail address: M_Ochs@ (M.F. Ochs).
1 Present address: Center for Bioinformatics, Penn Genomics Institute, University of Pennsylvania, 14th Floor Blockley Hall, 423 Guardian Drive, Philadelphia, PA 19104-6021, USA.

Received 2 March 2005. Available online 15 November 2005.
Journal of Biomedical Informatics 39 (2006) 379–388. doi:10.1016/j.jbi.2005.10.001

Abstract
Gene expression microarrays monitor the expression levels of thousands of genes in an experiment simultaneously. To utilize the information generated, each of the thousands of spots on a microarray image must be properly quantified, including background correction. Most present methods require manual alignment of grids to the image data, and still often require additional minor adjustments on a spot by spot basis to correct for spotting irregularities. Such intervention is time consuming and also introduces inconsistency in the handling of data. A fully automatic, tested system would increase throughput and reliability in this field. In this paper, we describe WaveRead, a fully automated, standalone, open-source system for quantifying gene expression array images. Through the use of wavelet analysis to identify the spot locations and diameters, the system is able to automatically grid the image and quantify signal intensities and background corrections without any user intervention. The ability of WaveRead to perform proper quantification is demonstrated by analysis of both simulated images containing spots with donut shapes, elliptical shapes, and Gaussian intensity distributions, as well as of standard images from the National Cancer Institute.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Microarrays; Image analysis; Wavelet decomposition

1. Introduction

Recent advances in microarray technology have led to an explosion in the amount of data available for understanding cellular function and pathways, with the potential for revealing the underlying cellular behavior responsible for disease [1–4]. Studies have already shown that it is possible in some cases to identify disease states more accurately using mRNA expression profiles than can be done using classic pathology methods [5,6]. The fundamental idea of a microarray experiment is to perform simultaneously thousands of hybridizations of mRNA targets derived from the experimental system (e.g., cells, tumors, etc.) to cDNA probes affixed to a substrate, usually a glass microscope slide. The targets, one control and one experimental, are labeled with fluorescent probes prior to hybridization, so that for each experiment two microarray images are created (each one corresponding to the hybridization signal of a single probe). The images are then quantified and data analysis is performed.

Many tools for statistical inference have been developed for microarray measurements, including SAM [7], VERA and SAM [8], ANOVA techniques [9,10], Bayesian approaches [11,12], and rank tests [13]. In addition, a number of data mining and statistical pattern recognition techniques have been applied to microarray data (for a review, see [14]). These included unsupervised techniques such as hierarchical clustering [15], principal component analysis [16], multidimensional scaling [17], Bayesian mixture models [18,19], and other clustering methods [20–24]. Supervised techniques have been used primarily for classification problems and include support vector machines [25] and artificial neural networks [26]. Notably a number of wavelet methods have been applied for analyzing the preprocessed data, including work determining cell cycle and metabolic periodicities [27–29], normalization [30], and analysis of array CGH measurements [31]. Successful analysis of microarray data depends on proper quantification of microarray images, where the measured fluorescence levels must be converted from intensity to relative transcript level. Present methods accessible to most academic and small laboratories inevitably involve significant user interaction, which both slows the high-throughput process and introduces opportunity for bias, which can seriously affect data mining.

Some image analysis approaches limit user intervention, such as the method of Yang et al. [32], that relies on the presence of a batch of images where a first image is manually gridded and the others are automatically adjusted if they have a similar geometric structure. More recently, Steinfath et al. [33] have introduced an approach based on a histogram segmentation to detect the spots. Then, projections along the axes locate the grid corners and permit rotation correction. The gridding is done by dividing the input image into blocks and using their vertical and horizontal projections. In the work of Jain et al. [34], the input image is directly projected along the axes and the grid mapped using the set of local maxima of the projections. Katzer et al. [35] reviewed the existing automatic gridding methods and presented a segmentation algorithm formulated on Markov random fields. Wavelets have also been applied for some preprocessing of microarray images, including denoising [36] and spot identification using modulus maxima [37]. However, this latter method does not include rotation correction, grid recovery that is required on arrays with many low intensity spots, or automated background estimation.

WaveRead is aimed to answer investigators' needs and provide part of a high-throughput microarray system.
The method derives its power from the application of wavelet decomposition, which is used to segment the microarray image into signal and background, estimate spot diameters, and provide locations for recovery of grids and sub-grids. Wavelet theory has been described in detail [38] and has been applied in signal processing and imaging applications such as denoising and compression [39] and feature detection [40,41]. Wavelets consist of a signal decomposition on a family of functions defined by shifted and elongated versions of the mother wavelet, which is usually a compact function of average 0. Unlike a Fourier transform that decomposes the signal to be analysed on a family of sine functions, which is well adapted to the extraction of regularities of a signal, a wavelet transform is more adapted to identification of signal irregularities, such as features of interest in an image (i.e., microarray spots). In image processing, the wavelet decomposition is equivalent to applying a quadrature mirror filter [42] in two dimensions with high pass and low pass components on the input image, and iteration reapplies the same filter at a higher frequency on the generated images. Combining wavelet analysis to identify spot locations and diameters with additional steps provides a system that does spot detection, spot size estimation, rotation correction, sub-grid determination, and a link to spot identifications.
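The two-dimensional decomposition described above, with low-pass and high-pass components in each direction and iteration on the approximation, can be sketched in Python. This is an illustration only: it uses the Haar filter pair with averaging normalization as a stand-in, not the wavelet filter that WaveRead takes from its reference [40], and the function name `haar_level` is hypothetical.

```python
def haar_level(img):
    """One level of a 2-D Haar decomposition of an image with even dimensions.

    Returns (A, H, V, D): the approximation (low pass in both directions),
    the detail that is high pass along rows (horizontal direction),
    the detail that is high pass along columns (vertical direction),
    and the diagonal detail (high pass in both directions).
    Assumption: Haar filters with averaging normalization (division by 4).
    """
    rows, cols = len(img), len(img[0])
    A, H, V, D = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 4)
            h_row.append((p - q + s - t) / 4)
            v_row.append((p + q - s - t) / 4)
            d_row.append((p - q - s + t) / 4)
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D

# Toy image with a single vertical edge between the first and second columns.
image = [[0, 8, 8, 8] for _ in range(4)]
A1, H1, V1, D1 = haar_level(image)  # first pass on the image
A2, H2, V2, D2 = haar_level(A1)     # second pass on the approximation
```

In this toy case the vertical edge appears only in the horizontally high-passed detail at both levels, while the vertically high-passed and diagonal details stay zero, which illustrates how direction-specific detail images isolate the edges of a feature.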
WaveRead provides an open-source,standalone micro-array image analysis application that integrates array design information and linksﬁnal image analysis results to gene annotations.2.Methods2.1.AlgorithmFor microarray image analysis,we incorporate a wavelet ﬁlter described in[40]that is especially well adapted to fea-ture detection in images.However,the wavelet is imple-mented within a subroutine,and the code is annotated, allowing users to incorporate other wavelets.The imple-mented wavelet creates an approximation image A(low pass in both dimensions)and three detail images in speciﬁc directions,horizontal(H,high pass in the horizontal direc-tion only),vertical(V,high pass in the vertical direction only),and diagonal(D,high pass in both dimensions). The approximation image is further analyzed by a second pass through the wavelet transform,creating new detail images.For microarray images,the presence of a spot in the original image results in an identiﬁable signal in the horizontal and vertical detail images from this second transform.This signal is a set of localized parallel vertical bands and parallel horizontal bands occurring simulta-neously in the V and H detail images,respectively.Identi-ﬁcation of those bands in both detail images guarantees the detection of spots and avoids most detection of irregular signals arising from scratches and dust.There areﬁve main steps in quantifying each image:(1) rotation correction,(2)spot detection,(3)meta-grid identi-ﬁcation,(4)sub-grid mapping,and(5)quantiﬁcation with background estimation.First,the input image is normalized so that the mini-mum pixel is zero and the maximum pixel is65,535to improve the signal for rotation correction and spot detec-tion.Next,the input image relative to the array grid must be accurately corrected for rotation(see Fig.1).A series of rotated images S0are created from the input image by rotating in2°increments betweenÀ10°and+10°.For each S0,projections are made along the horizontal and 
vertical axes.Each projection is autocorrelated and Fourier trans-formed.To correct the rotation,we use the Fourier trans-form of the projection along the largest dimension of the input image to insure the best accuracy.The optimal angle is the one that corresponds to a periodic signal in the auto-correlation and that gives the highest amplitude in the sig-niﬁcant peak(see Fig.1C).After this rough correction,the procedure is iterated on a[À1°,+1°]interval centered on the optimal angle retained previously with an increment of0.1°to obtain aﬁner correction.The peak in the Fourier380G.Bidaut et al./Journal of Biomedical Informatics39(2006)379–388transforms of the vertical and horizontal projections gives the periodicity of the spotting of the array horizontally and vertically,respectively (fgrix and fgriy ).We use this periodicity to deﬁne a search window used later in the pat-tern recognition stage to locate the spots.Once the microarray has been rotated,the spots need to be identiﬁed.To speed the computation,a variance ﬁlter is applied to the input image so that the search for spot sig-natures is only performed on areas were there is a local maximum in the intensity.The variance ﬁlter is essentially an averaging ﬁlter using a mask of size fgridx by fgridy fol-lowed by a maxima detection stage where areas of size fgridx by fgridy with variance below a certain threshold are masked out,to avoid a feature search on low variance regions.The image is then analyzed using a wavelet trans-form as described above,and the H and V detail images from the ﬁrst A image are retained.Pattern recognition on these images is used to ﬁnd the spots through identiﬁca-tion of the characteristic double lines with uniform separa-tion (see Fig.2).To increase the processing speed,only every other pixel in the H and V images is processed by the pattern recognition routine.Each pixel is classiﬁed as a spot center if it is at the center of the signals (extracted from the H and V images)with two 
positive and two neg-ative peaks of amplitude at least a 1and a 2separated by a distance d (see Fig.2).The acceptable values for d ,a 1,and a 2permit proper grid detection for images,such as those from the project normal data set [43]and NCI stan-dard microarray images (/dataSets/geawQCandIA ),as well as from Agilent microarrays.(misaligned)(aligned)ACBG.Bidaut et al./Journal of Biomedical Informatics 39(2006)379–388381Default values for d,a1,and a2can be overridden and entered by the user as parameters.Once all the pixels have been classiﬁed as signal/nonsignal,they are clustered together on a distance basis,and the center of each cluster is retained as a spot center.Finally,we apply a spatialﬁlter that retains only the spots having at least two neighbors separated appropriately by the grid spacings fgridx and fgridy.Thisﬁltering stage allows the suppression of extra detections that could prevent the meta-gridding algorithm from performing properly.Next,the sub-grids or blocks(meta-grids)correspond-ing to a printing group must be identiﬁed.At this stage, we use the information provided by the user under the form of a gene IDﬁle that describes the spatial organization of the microarray,to extract the number of sub-grids to be found in the data.Meta-grid size mgridx and mgridy are found by multiplications of the number of spots found from the estimated spacing values fgridx and fgridy.Verti-cal and horizontal histograms of the detections are extract-ed and morphologically closed(i.e.,a dilation followed by an erosion)with a mask size of fgridx and fgridy,respec-tively.This leads to a proﬁle where local minima corre-sponding to the inter-grid spacing remain and minima related to inter-spot spacing have beenﬁltered out.Meta-grids are then separated by locating minima on sections of size2(mgridx)or2(mgridy)on the morphologically closed proﬁles.Once the meta-grids have been detected,the next stage is the sub-grid mapping.Each area separated in the previ-ous step is 
processed separately as an individual image matching a sub-grid (i.e., a pin group). Since not every spot on the array is detected in the pattern recognition stage, a grid is mapped to the input area by overlaying the detected spots with a set of possible grids, using dynamic programming guided by regularity of spacing to identify the locations of undetected spots. This allows the elimination of some false positives that are detected as spots but do not fully align with other spots, and it also locates spots with low signal intensity. This procedure treats the vertical and horizontal axes separately. The average spacing of the grid is calculated, spacings that are far removed from the average (less than half or greater than twice) are removed, and the average spacing s is recalculated. The first line is chosen, and then a line is added at a distance s from the first line. If it matches an existing line the cost is zero, while a mismatch results in an increased cost. This is repeated until a line is added near the last line of the grid. The method is then repeated starting with the second line, the third, and so forth. The regular grid with the lowest cost is chosen as the true grid for the array. This is done for both the horizontal and vertical grids.

[Fig. 2 caption, only partially recovered: "After the rotation correction described in the text, yielding ... over them. The filter extracts ... from the H and V images, ... allowed spacing between the ... search window is classified ... center of the detected spots."]

The spot locations and radii ("detection discs") are determined by the separation of the lines (d in Fig. 2) defining the pattern for spot detection where possible, and by using a grid location and average radius where no direct spot detection was made. The local background intensity is estimated for each spot by creating a histogram of pixels not in the detection discs for the area including the spot and all immediately adjacent spots. The radius of the disk is estimated by measuring the
distance between the maxima on each of the vertical and the horizontal profiles (V and H images). If there is no direct detection of a spot by the wavelet analysis, an average radius is used. In practice this makes little difference, as undetected spots tend to be very near background, and the background correction will adjust for reading too much nonsignal. The spot intensity is determined by integration over the disk area, subtracting the background value for each pixel, giving a total signal. Mean and median pixel intensities as well as standard deviations are also determined. The local background intensity is estimated by extracting the median of the histogram of pixel intensities in the local area of size 3 × 3 spots centered on the current spot, with all signal pixels removed.

The kernel of WaveRead has been developed in C, and a Java front-end using the Java Advanced Imaging interface handles image display, loading, and data visualization.

2.2. Application to simulated data

Two 16-bit gray TIFF image files are created, one in each fluorescent wavelength (typically cy3 and cy5), using standard protocols including confocal or other scanning techniques applied to a hybridized microarray (an example protocol from the Stanford group can be found in [1]).
Those images are quantified and background corrected separately by WaveRead, without any user input.

The analysis described here has been tested on multiple simulations with varying noise, including "good" spots (step functions), donut-shaped spots, elliptical spots, and Gaussian spots. In addition, the system was used for analysis of multiple images from a Genetic Microsystems GMS 418 Array Scanner (Affymetrix, Santa Clara, CA) following hybridization of spotted microarrays with cy3- and cy5-labeled targets, and of images from Agilent arrays processed on an Agilent G2565AA scanner.

For the simulations, the background of each image was derived from a real microarray, and synthetic spots of various shapes and known values were superimposed on it (round, elliptic, Gaussian, donut, and donut2, a donut with no central value), as shown in Fig. 3. Images were characterized by median signal-to-noise ratios (SNRs) ranging from 21.0 for the Gaussian spots to 10.9 for the donut2 spots (worst). For each image, WaveRead automatically separated the sub-grids and determined the spot diameters and background.
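The per-spot quantification described in Section 2.1 (integration over the detection disc after subtracting a local median background) can be illustrated with a minimal sketch. This is not the WaveRead kernel, which is written in C; the function names `disc_mask` and `quantify_spot`, the use of NumPy, and the choice of a window of 1.5 grid spacings on each side to approximate the "3 × 3 spots" area are all assumptions made for illustration.

```python
import numpy as np

def disc_mask(shape, center, radius):
    """Boolean mask, True inside the disc with the given (row, col) center."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def quantify_spot(image, centers, radii, index, spacing):
    """Return (total_signal, background) for the spot at position `index`.

    centers : list of (row, col) spot centers (hypothetical input format)
    radii   : matching list of detection-disc radii, in pixels
    spacing : (row_spacing, col_spacing) of the grid, in pixels
    """
    image = np.asarray(image, dtype=float)
    r0, c0 = centers[index]

    # Local window of roughly 3 x 3 spots centered on the current spot
    # (assumed here to be 1.5 grid spacings on each side).
    half_r, half_c = int(1.5 * spacing[0]), int(1.5 * spacing[1])
    top, bottom = max(r0 - half_r, 0), min(r0 + half_r + 1, image.shape[0])
    left, right = max(c0 - half_c, 0), min(c0 + half_c + 1, image.shape[1])

    # Every pixel inside any detection disc counts as signal.
    signal = np.zeros(image.shape, dtype=bool)
    for center, radius in zip(centers, radii):
        signal |= disc_mask(image.shape, center, radius)

    # Background: median of the non-signal pixels in the local window.
    window = image[top:bottom, left:right]
    background = float(np.median(window[~signal[top:bottom, left:right]]))

    # Total signal: background-subtracted sum over the spot's own disc.
    own_disc = disc_mask(image.shape, centers[index], radii[index])
    total = float(np.sum(image[own_disc] - background))
    return total, background
```

Using the median rather than the mean for the background keeps the estimate robust to a few bright artifact pixels (dust, scratches) in the local window, which matches the motivation given in the text.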
The noise level was then systematically increased to probe how WaveRead handles poorer quality images, up to the levels shown in Fig. 5, where noise levels were seven times normal for the donut-shaped spots with no central intensity (panel E) and 30 times normal for the others. SNRs ranged from a median value of 2.63 (best) for the Gaussian spots to 1.76 for the elliptical spots (worst). Table 1 summarizes median, minimum, and maximum SNR values for the series of plotted images.

Fig. 3. Simulated microarray images. To quantify the system behavior under different types of signal, we created microarray images by superimposing different types of spots on a background extracted from a real microarray. We simulated plain round spots (A), elliptical (B), Gaussian (C), "donut-like" spots with a minimal amount of fluorescence in the middle (D), and "donut-like" spots with no fluorescence in the middle (E). SNR levels are given in Table 1.

2.3. Application to NCI gold standard images

The system was also tested by analysis of the NCI standard microarray images. Those images were generated specifically to serve as reference images for comparing different quantification methods. They are organized into two sets of 70 microarrays from different manufacturers.
Each set compares human tissues from heart, brain, and placenta with different types of cancer. They have been properly quantified by NCI using GenePix. In each case the system automatically identified many of the spots, produced the correct grid, and read all values and backgrounds. This included numerous low-quality images with significant artifactual signals from dust or scratches as well as poor background. The computation time, including rotation correction, for a 10 Mbyte TIFF image from NCI organized in four sub-grids containing a total of 10,000 spots is about 200 s on an 800 MHz Pentium III Linux system, and scales linearly with the image size, the number of pixels in the search window, and the number of spots.

3. Results

3.1. Application to simulated data

The simulations on the synthetic microarrays gave good results. For the images in Fig. 3, the scatter plots of measured intensity vs. known intensity are shown in Fig. 4. The correlations for all images are excellent, with correlation coefficients ranging from 0.957 for donut-shaped spots with no central intensity (panel E) to 0.999 for elliptical and round spots (panels A and B). For the increased noise levels, the quality of the quantification decreased slightly with increasing noise for the round, elliptical, and Gaussian spots, while for the donut-shaped spots the decrease was sharper. For donuts with some inner intensity (typical of those seen in microarray images), the decrease was uniform with noise.
However, for donuts with no central intensity, there was a sudden loss of fidelity when the noise level reached eight times the typical noise level seen (SNR ~ 5). At this point the gridding failed, resulting in the grid size being misestimated (25 × 20 instead of 24 × 24). This illustrates one of the advantages of WaveRead: gross misestimation of transcript levels corresponds to incorrect gridding, a parameter that can be checked automatically while reading a series of images. Fig. 6 shows the scatter plots of measured intensity vs. known intensity for the images in Fig. 5.

The fidelity of the reading was further studied by calculating the correlation coefficient between the known values in the simulations and the quantification for increasing noise levels (Fig. 7). As can be seen, the correlation remains excellent even at high noise for well-defined spots (round, elliptical, Gaussian), with some decrease in fidelity for typical donut spots. For donut spots with no central intensity, as noted earlier, the loss of fidelity is dramatic but easily identified from the incorrect gridding.

Table 1
Signal-to-noise ratio measurements for simulated images

                              Round   Elliptic  Gaussian  Donut   Donut2
SNR Fig. 3 (normal noise)
  Median                      20.5    14.0      21.0      15.7    10.9
  Min                         0.086   0.059     0.088     0.066   0.046
  Max                         41.8    28.6      42.7      32.0    22.1
SNR Fig. 5 (maximum noise)
  Median                      2.57    1.76      2.62      1.97    3.01
  Min                         0.011   0.007     0.011     0.008   0.013
  Max                         5.24    3.58      5.35      4.01    6.13

The median, minimum, and maximum signal-to-noise ratios are given for the simulated images in Figs. 3 and 5, with the median representing a typical spot and the maximum and minimum values giving the range over the full array.

Fig. 5. Simulated microarray images with higher noise. The same types of spots are generated and superimposed on a real background with 30 times the normal level of noise for (A), (B), and (C). The "donut-like" spots (D) and (E) are simulated at 7 times the normal level of noise. SNR levels are given in Table 1.

3.2. Application to NCI gold standard images

We tested the system by comparing the results for automatic quantification of the NCI test images, available at /dataSets/geawQCandIA, with values determined by NCI. The results for the normalized values from the two methods are plotted in Fig. 8. The values read automatically by this system agree closely with the standard values determined by NCI when compared by linear regression of the corrected WaveRead intensity estimates against the NCI gold standard values (I_WaveRead = 1.017 I_NCI, R² = 0.98, where I_WaveRead is the WaveRead estimate of spot intensity, I_NCI is the NCI gold standard value, and R² is the squared correlation coefficient). In addition, we read the microarray images with a widely used commercial product and compared the results to the NCI measurements. The results gave a similar fit and correlation in linear regression analysis to the automatic reading by the wavelet system (I_commercial = 1.077 I_NCI, R² = 0.97, where I_commercial is the commercial program estimate of spot intensity), although with much more manual intervention in grid alignment and sub-grid identification required of the user.

4. Conclusion

WaveRead provides a reliable and fast tool for the high-throughput analysis of microarray images. Presently, a great deal of time and effort is required to obtain high-quality, reproducible measurements from microarray images. Present methods used in academic laboratories generally involve manual placing of grids and flagging of spots, which can introduce bias that feeds into later data mining. WaveRead is designed to avoid both the effort of manual placement of grids and detection discs and the potential introduction of bias.

WaveRead uses a number of mathematical methods to correctly locate and measure each spot on the microarray.
An initial correction for misalignment of the array with the image is performed by autocorrelation and Fourier analysis of the projections of the input image. The sub-grids are also determined by autocorrelation of the projections. For each sub-grid, a wavelet decomposition of the corresponding area of the input image is performed and used as a two-dimensional edge detector. The spot radii are estimated using the edges detected during the wavelet decomposition stage. The grid is determined by a simple dynamic programming method with a strong constraint on the regularity of the grid to ensure that the best possible grid is retained. Finally, the spots and background are measured by using the estimated radii for image segmentation, with the background being estimated by the median of the local background. The accuracy of the quantification has been verified both by simulation and by analysis of the NCI standard.

Here, we propose a new method that does not require any user intervention, can rectify misaligned images, and supports microarrays with multiple sub-grids. The source code and executables are available on the Fox Chase Bioinformatics website (footnote 2).

Acknowledgments

We thank Drs. R. Randy Hardy and Andrew K. Godwin of the Fox Chase Cancer Center for providing images for tuning our analysis system. This work was supported by the National Institutes of Health, National Cancer Institute (Comprehensive Cancer Center Core Grant CA06927 to R. Young, Ovarian SPORE P50 CA83638 pilot grant to M.F.O.) and the Pew Foundation. We also acknowledge the assistance of the Bioinformatics and Microarray Facilities at Fox Chase Cancer Center.
References

[1] Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998;9(12):3273–97.
[2] Iyer VR, Eisen MB, Ross DT, Schuler G, Moore T, Lee JCF, et al. The transcriptional program in the response of human fibroblasts to serum. Science 1999;283(5398):83–7.
[3] Diehn M, Eisen MB, Botstein D, Brown PO. Large-scale identification of secreted and membrane-associated gene products using DNA microarrays. Nat Genet 2000;25(1):58–62.
[4] Ross DT, Scherf U, Eisen MB, Perou CM, Rees C, Spellman P, et al. Systematic variation in gene expression patterns in human cancer cell lines. Nat Genet 2000;24(3):227–35.
[5] Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000;403(6769):503–11.
[6] Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999;286(5439):531–7.
[7] Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 2001;98(9):5116–21.
[8] Ideker T, Thorsson V, Siegel AF, Hood LE. Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data. J Comput Biol 2000;7(6):805–17.
[9] Kerr MK, Martin M, Churchill GA. Analysis of variance for gene expression microarray data. J Comput Biol 2000;7(6):819–37.
[10] Kerr MK, Afshari CA, Bennett L, Bushel P, Martinez J, Walker NJ, et al. Statistical analysis of a gene expression microarray experiment with replication. Stat Sinica 2002;12(1):203–18.
[11] Newton MA, Kendziorski CM, Richmond CS, Blattner FR, Tsui KW. On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. J Comput Biol 2001;8(1):37–52.
[12] Parmigiani G, Garrett E, Anbazhagan R, Gabrielson E. A statistical framework for expression-based molecular classification in cancer. J Roy Stat Soc B 2002;64:717–36.
[13] Troyanskaya OG, Garber ME, Brown PO, Botstein D, Altman RB. Nonparametric methods for identifying differentially expressed genes in microarray data. Bioinformatics 2002;18(11):1454–61.
[14] Ochs MF, Godwin AK. Microarrays in cancer: research and applications. Biotechniques 2003;34:S4–S15.
[15] Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 1998;95(25):14863–8.
[16] Alter O, Brown PO, Botstein D. Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA 2000;97(18):10101–6.
[17] Khan J, Simon R, Bittner M, Chen Y, Leighton SB, Pohida T, et al. Gene expression profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Res 1998;58(22):5009–13.
[18] Medvedovic M, Sivaganesan S. Bayesian infinite mixture model based clustering of gene expression profiles. Bioinformatics 2002;18(9):1194–206.
[19] Medvedovic M, Yeung KY, Bumgarner RE. Bayesian mixture model based clustering of replicated microarray data. Bioinformatics 2004.
[20] Gasch AP, Eisen MB. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biol 2002;3(11):RESEARCH0059.
[21] Getz G, Levine E, Domany E. Coupled two-way clustering analysis of gene microarray data. Proc Natl Acad Sci USA 2000;97(22):12079–84.
[22] Ben-Dor A, Shamir R, Yakhini Z. Clustering gene expression patterns. J Comput Biol 1999;6(3–4):281–97.
[23] Heyer LJ, Kruglyak S, Yooseph S. Exploring expression data: identification and analysis of coexpressed genes. Genome Res 1999;9(11):1106–15.
[24] Lukashin AV, Fuchs R. Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters. Bioinformatics 2001;17(5):405–14.
[25] Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, et al. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA 2000;97(1):262–7.
[26] Khan J, Wei JS, Ringner M, Saal LH, Ladanyi M, Westermann F, et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat Med 2001;7(6):673–9.
[27] Klevecz RR, Dowse HB. Tuning in the transcriptome: basins of attraction in the yeast cell cycle. Cell Prolif 2000;33(4):209–18.
[28] Klevecz RR. Dynamic architecture of the yeast cell cycle uncovered by wavelet decomposition of expression microarray data. Funct Integr Genomics 2000;1(3):186–92.
[29] Klevecz RR, Bolen J, Forrest G, Murray DB. A genomewide oscillation in transcription gates DNA replication and cell cycle. Proc Natl Acad Sci USA 2004;101(5):1200–5.
[30] Wang J, Ma JZ, Li MD. Normalization of cDNA microarray data using wavelet regressions. Comb Chem High Throughput Screen 2004;7(8):783–91.

2 /software/software_open.shtml.