Automatic Feature Weighting in Automatic Transcription of Specified Part in Polyphonic Music
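Multiple Extended Target Tracking Algorithms for Nonlinear Systems
A multi-target tracking algorithm for nonlinear systems is an algorithm that can track several targets simultaneously when the underlying system is nonlinear.
Many practical applications need to track several targets at once, for example UAV trajectory planning, autonomous driving, and intelligent transportation systems.
Multi-target tracking in nonlinear systems is a complex and challenging problem: such systems have complicated dynamics, and the mutual influence and interference among targets must also be taken into account.
This article introduces a multi-target tracking method for nonlinear systems based on extended target tracking and discusses it in depth.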
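I. Introduction to Extended Target Tracking
Extended target tracking (ETT) is a class of algorithms for the multi-target tracking problem.
Unlike traditional target tracking algorithms, ETT takes the extent of a target into account: a target may be spread out in both space and time.
Because of this extent, a target is no longer a point but a region, and the tracking algorithm has to model that extent.
ETT can handle crossing interference and mutual occlusion among multiple targets, so it tends to perform well in complex environments.
The basic idea of ETT is to describe a target by its extent, treating it as a probability distribution rather than a single deterministic point.
Given the motion model of the target and the observation model of the sensor, the target state is estimated and predicted with Bayesian filtering.
Filters commonly used in ETT include the Kalman filter and the particle filter; by repeatedly updating the probability distribution of the target, they yield its trajectory and state.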
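For multi-target tracking in nonlinear systems, extended target tracking can itself be extended by using nonlinear filtering to track several extended targets.
In a nonlinear system the motion and observation models of a target are usually nonlinear, so traditional linear filtering no longer applies.
Nonlinear filters such as the extended Kalman filter (EKF) or the unscented Kalman filter (UKF) are needed to handle multi-target tracking in nonlinear systems.
In a nonlinear system the target state is usually a vector of several parameters such as position, velocity, and acceleration, and the observations of the target may also be nonlinear functions of that state.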
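To make the EKF idea above concrete, here is a minimal sketch of one predict/update cycle for a scalar state. The scenario is an assumption made purely for illustration and is not taken from the article: a target moves along a line at known speed `v`, and a sensor offset by a distance `d` from that line measures only the range, which is a nonlinear function of the position. The names `ekf_step` and `run_demo` are hypothetical.

```python
import math

def ekf_step(x, P, z, v, dt, d, Q, R):
    """One predict/update cycle of a scalar extended Kalman filter."""
    # Predict with the motion model x' = x + v * dt (its Jacobian is 1).
    x_pred = x + v * dt
    P_pred = P + Q
    # Nonlinear observation model: range from a sensor offset by d.
    z_pred = math.hypot(x_pred, d)
    H = x_pred / z_pred            # Jacobian dh/dx evaluated at the prediction
    S = H * P_pred * H + R         # innovation variance
    K = P_pred * H / S             # Kalman gain
    x_new = x_pred + K * (z - z_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

def run_demo(steps=5):
    """Track a target from a deliberately wrong initial estimate."""
    x_true, x_est, P = 10.0, 8.0, 4.0
    v, dt, d = 1.0, 1.0, 5.0
    for _ in range(steps):
        x_true += v * dt
        z = math.hypot(x_true, d)  # noise-free range measurement
        x_est, P = ekf_step(x_est, P, z, v, dt, d, Q=0.01, R=0.01)
    return x_true, x_est, P
```

A multi-target tracker would run one such filter per target (with vector states and matrix Jacobians) and add a data-association step, which this sketch leaves out.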
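Semantic Segmentation Models Commonly Used in Autonomous Driving
Semantic segmentation models play an important role in autonomous driving systems.
A semantic segmentation model assigns every pixel of an image to a specific class, which gives a fine-grained understanding of the image.
Below I introduce, from a human perspective, the semantic segmentation models commonly used in autonomous driving.
I. Concept and Principle of Semantic Segmentation Models
Semantic segmentation is an important technique in computer vision whose goal is to assign each pixel of an image to a semantic class such as road, vehicle, or pedestrian.
Semantic segmentation models are usually based on deep learning: they are trained on large numbers of annotated images and learn feature representations for the different semantic classes.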
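II. Commonly Used Semantic Segmentation Models
1. FCN (Fully Convolutional Network)
FCN is one of the classic models in semantic segmentation.
It adapts a conventional convolutional neural network so that it produces a dense prediction map, which is upsampled back to the input resolution to obtain a pixel-level segmentation.
2. U-Net
U-Net has a distinctive structure consisting of an encoder and a decoder.
The encoder extracts feature representations of the image; the decoder upsamples the features back to the original resolution, fuses them with the encoder features, and outputs the segmentation result.
3. DeepLab
DeepLab uses dilated (atrous) convolution to enlarge the receptive field and so capture more of the context in the image.
DeepLab also fuses information from multiple scales, which improves the accuracy and robustness of the segmentation.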
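As a small illustration of how dilated convolution enlarges the receptive field without adding weights, the sketch below applies a 1-D kernel with gaps between its taps. It is a toy, one-dimensional stand-in for the 2-D layers used in DeepLab, and the function names are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1

def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D convolution (no kernel flip) with a dilation factor."""
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(k * signal[start + i * dilation]
                       for i, k in enumerate(kernel)))
    return out
```

With a 3-tap kernel, dilation 1 covers 3 input samples per output while dilation 2 covers 5, using the same three weights.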
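III. Applications of Semantic Segmentation in Autonomous Driving
Semantic segmentation models serve several purposes in autonomous driving.
First, they help the system understand elements such as roads, vehicles, and pedestrians accurately, which supports planning and control of the vehicle.
Second, they provide accurate obstacle detection and recognition, enabling fine-grained perception of the environment.
They can also be used for traffic scene analysis, behavior prediction, and other key tasks, giving the system a more complete understanding of its surroundings.
In summary, semantic segmentation holds an important place in autonomous driving.
By associating every pixel of an image with a semantic class, it gives the system fine-grained perception and understanding of the environment.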
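Some Methods of Semantic Analysis (Part 1)
In this article, semantic analysis means using machine learning methods to mine and learn the deeper concepts in text, images, and similar data.
Wikipedia defines it as follows: "In machine learning, semantic analysis of a corpus is the task of building structures that approximate concepts from a large set of documents (or images)."
Over the past few years I have worked on a range of projects, including search ads, social ads, Weibo ads, brand ads, and content ads.
To maximize the return of our advertising platform, we first need to understand the user, the context (in which the ad will be shown), and the ad itself; only then can we show each user the most suitable ad.
This depends on semantic analysis of users, contexts, and ads, which gave rise to several sub-projects such as text semantic analysis, image semantic understanding, semantic indexing, semantic association of short strings, and user-ad semantic matching.
Below I describe the semantic analysis methods I am familiar with. Our work has mostly been results-driven, so my grasp of the theory may not be deep; treat this as a personal summary, and please point out anything that is wrong. Thank you.
The article has four parts: basic text processing, text semantic analysis, image semantic analysis, and a summary of semantic analysis.
It begins with basic text processing methods, which form the foundation of semantic analysis.
Two sections then cover semantic analysis methods for text and for images; although they are presented separately, the methods for the two have much in common and are closely related.
Finally, we briefly describe how semantic analysis is applied to "user-ad matching" in Guangdiantong and look ahead to future semantic analysis methods.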
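1 Basic Text Processing
Before discussing text semantic analysis, we first cover basic text processing, because it forms the foundation of semantic analysis.
Text processing has many aspects; given the topic of this article, only Chinese word segmentation and term weighting are covered here.
1.1 Chinese Word Segmentation
Given a piece of text, the first step is usually word segmentation.
Segmentation methods generally fall into the following categories:
• Segmentation based on string matching.
This method scans the text in one of several directions and looks up candidate words in a dictionary one by one.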
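One common instance of string-matching segmentation is forward maximum matching: scan left to right and, at each position, take the longest dictionary word that matches. The sketch below is illustrative only; the tiny vocabulary and the function name `forward_max_match` are assumptions, not something the article specifies.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy left-to-right segmentation using the longest dictionary match."""
    if max_len is None:
        max_len = max(len(word) for word in vocab)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

vocab = {"研究", "研究生", "生命", "起源"}
tokens = forward_max_match("研究生命起源", vocab)
# tokens == ["研究生", "命", "起源"]
```

The example also shows the well-known weakness of greedy matching: the intended reading is 研究 / 生命 / 起源, but the longest-match rule picks 研究生 first. Scanning in the reverse direction, or combining both directions, is a common way to reduce such errors.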
Lightweight Design of an Automotive Gearbox Based on a Genetic Algorithm
CHU Yongkang; WEN Guilin; CUI Zhong; WEN Deng
[Abstract] A lightweight optimization design method for a gearbox housing is proposed based on an approximation model and a genetic algorithm (GA). The precondition for lightweight optimization is that performance requirements such as strength, stiffness, and vibration resistance are still met. The Latin method is used in the experimental design to generate test sample points, and the approximation model is built with Gaussian radial basis functions. Global optimization is then performed with the GA. The mass of the optimized housing is reduced by 21%. The results show that this method can provide engineering guidance for the lightweight optimization of gearboxes.
[Journal] Journal of Hefei University of Technology (Natural Science)
[Year (Volume), Issue] 2011 (034) 010
[Pages] 5 (P1461-1465)
[Keywords] gearbox housing; lightweight; approximation model; genetic algorithm
[Affiliation] State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha 410082, Hunan, China (all four authors)
[Language] Chinese
[CLC number] U463.212
0 Introduction
The development of automotive technology is now constrained by environmental and resource concerns, and lightweight vehicles are more competitive in fuel economy and emission levels. Vehicle lightweighting has therefore become an essential research topic.
Manual
true:balance
Tap the power of spectral balance

Contents
Welcome to true:balance
Install
Authorization
User interface
Key Readouts
Spectrum Display
Spectral References
Common References
Channel Data Section
Width & Correlation
Output Meter
Balance Check
Mono Check
Presets
Settings

Welcome to true:balance
true:balance is a spectral analyzer plug-in that makes it easy to compare the spectral distribution of your mix in reference to different spectral targets. Additionally, the plug-in provides you with information on the width and correlation of your track that will help you to avoid issues with mono compatibility or a blurry low end.
Grab true:balance whenever you need reliable insights into the overall balance of your mix. You can use its common genre references or load custom reference tracks for comparisons. Plus, true:balance assists you with suggestions for modifications your track might need in order to match the spectral distribution of your chosen reference. If you appreciate additional guidance, just use the check features to get suggestions for modifications to rectify issues.
Get started with true:balance and have fun getting your track ready for its big release.

Install
Mac OSX
To start the installation process, please open the disk image sonible_truebalance_osx_x.x.x.dmg. This will mount the image and open a finder window showing the content of the installation package.
To install true:balance on your system, run the installation file truebalance.pkg.
The installer will now guide you through the necessary steps to install true:balance on your computer.
true:balance will automatically be installed in the default locations for audio plug-ins.
Default folders:
Audio Unit: /Library/Audio/Plug-Ins/Components/
VST: /Library/Audio/Plug-Ins/VST/
VST3: /Library/Audio/Plug-Ins/VST3/
AAX: /Library/Application Support/Avid/Audio/Plug-Ins/

Windows
To start the installation process, extract the downloaded zip file sonible_truebalance_win_x.x.x.zip onto your hard disk and run the installer.
The installer will now guide you through the necessary steps to install true:balance on your computer. true:balance will automatically be installed in the default locations for audio plug-ins.
Default folders:
VST3: C:\Program Files\Common Files\VST3\
VST: C:\Program Files\Common Files\VST\
AAX: C:\Program Files\Common Files\Avid\Audio\Plug-Ins

System requirements
CPU: Intel Core i5, Apple M1
RAM: 4GB
Operating systems: Windows 10+ (64 bit), Mac OS 10.12+
Graphics: OpenGL Version 3.2+

Authorization
Unlocking
If you purchased a license for true:balance online, you receive your license key via email.
If you don't receive the email within minutes, please check your junk folder first before contacting our support (*******************).
Machine-based unlocking
When opening true:balance for the first time, a notification window will be displayed asking you to unlock true:balance with a valid license key.
Please make sure that your computer is connected to the internet before starting the registration process.
Enter your license key and click "register". The plug-in will now communicate with our server to check if the license is valid. If it is, enjoy! :)
iLok
If you transferred your license to an iLok, simply attach the iLok to your computer. The plug-in will then be automatically registered. Enjoy!
Trial version
To run true:balance in demo mode, simply click "try" and you will then be able to use true:balance for a couple of days without any limitations.
(Please refer to our website to find out more about the current demo period of true:balance.) When the demo period expires, you will need to purchase a full license in order to continue using the plug-in.
Internet connection requirements
sonible plug-ins only need an internet connection during the trial period and for initial license activation. During the trial period, the plug-in needs to go online every time it is used. Once the license of your plug-in has successfully been activated, an internet connection is no longer needed.
Licensing system
You can select between two licensing systems: machine-based or iLok (USB dongle).
By creating a user account on and registering your products (if they are not already visible in your Dashboard), you can manage your plug-in activations.
Machine-based
Each license key allows you to install true:balance on two computers with unique system IDs. These system IDs are computed during license activation.
The same license can be used by multiple users, but each user has to individually unlock the full version of true:balance under their account.
In case a system ID is changed (e.g. replacement of the hard drive), you can revoke/activate the plug-in next to the respective system ID in the Dashboard of your sonible user account.
iLok
If you want to transfer one activation to your iLok, just make sure the plug-in is registered in your sonible user account. Click on the button "transfer to iLok" next to the plug-in in your Dashboard and follow the instructions.
Note: 1st gen iLok dongles and the iLok Cloud are currently not supported.

User interface
Reference Dropdown: Select a genre for easy comparison with typical spectral references.
Spectrum Display: Observe the real-time (average) spectrum and compare it to the distribution of your chosen reference.
... and deviations from the chosen reference in the low, mid and high frequency range.
Reference Tracks: Load up to 8 reference tracks for easy comparison with existing mixes.
Metering Section: Monitor the peak and RMS value of your track.
Channel Data Section: Check the width and correlation of your track to avoid mono compatibility issues.

Key Readouts
The spectral balance of a track is all about the overall level relation of different frequency regions and not necessarily about the exact spectral shape. Three readouts showing the average levels in the low, mid and high frequency range help to focus on these key level relations.
Target Indicator
Target indicators below each readout value show the level in relation to the chosen reference.
The indicator and the readout both turn green if the measured value meets the value of the reference.

Spectrum Display
The spectrum display provides detailed real-time information about the spectral distribution of a mix. To account for the human perception of levels in different frequencies, the analyzer uses a perceptually motivated frequency summing that leads to a bathtub-like distribution for pink noise and an upwards slope in high frequencies for white-noise-like signals. Narrow-band signals (e.g. sine waves) will have a constant level across all frequencies. This weighting helps to better represent critical level differences as perceived by a mixing engineer.
Real-time Spectrum
Average Spectrum
... the key readout section.
Reference Zone
... the signal energy. This mode can help to hit a fixed target, for example when trying to hit similar overall levels for multiple tracks.
... spectrum and the reference zone.
Drag the level scale up or down to adapt the display range to your signal.

Spectral References
While true:balance can be used as a classical spectrum analyzer, its main strength comes into play when comparing a mix to different spectral references, like the typical spectral distribution of different common genres or the average distribution of multiple reference tracks.
true:balance provides two simple, yet precise ways to compare the qualities of a mix with references:
Common References: predefined genre-based spectral distributions
Custom Reference Tracks: user-defined reference tracks that create custom reference targets
Reference Selector
By clicking the reference selector, you can define whether the common references or the reference tracks should be used as the current spectral reference targets.

Common References
The common references provide an extensive list of predefined references based on the typical spectral distribution of different genres. The references make it easy to see if a track meets the average spectral characteristics of a certain genre.
For example, if you are producing an EDM track, you can select the "Electronic" genre as reference and compare your track with it. If your track meets the reference in the mids and highs but overshoots in the low end, it indicates that you should probably tame your bass or kick a bit. Note that it will take a couple of seconds until changes made to your signal are reflected by the green average spectrum line.

Channel Data Section
The lower section of true:balance focuses on the width and correlation of the analyzed signal.
The values are computed for the overall signal as well as the three frequency regions low, mid and high.
While a good spectral balance is essential for a great mix, it's also important to make sure that the overall sound and feeling of a (stereo) mix remains intact when played back in mono. Checking the width and correlation helps to identify potential mono-compatibility issues or spatial balancing problems in the mix.
A mono signal is generated by summing the left and right channel of a stereo signal. This loss of the width layer means that all signals covering a certain frequency region are now coming from the same direction and are no longer separated by their spatial distance to each other. The collapse of all sources into one location can lead to problematic masking effects. A mix with clearly distinguishable sources in stereo may sound muddy in mono, and quiet components may even be fully masked by competing sources.
Besides, problematic temporal relationships between similar signal components on both channels can lead to phase-cancellation issues and the so-called comb-filtering effect. A comb filter emerges when two signals are summed together that carry similar frequency components with a problematic phase shift (e.g. 180°). These frequency components will cancel each other out when summed together, leading to a metallic and hollow sound.

Width & Correlation
Width
The width indicates how wide the stereo image will be perceived. A very low width indicates that most of the signal's energy is coming from the center (this is typically a good idea for the low end), while a very high width shows that a lot of signal energy is coming from the sides.
Correlation
...lation problems.
Width Indicators
... range.
... width of the signal.
Make sure that your low end is very narrow or mono.
Since very low frequencies are non-directional when played back, you should always try to keep them mono.
Bass signals in stereo are particularly prone to phase-cancellation issues, so always make sure that the width of your bass is not unnecessarily wide.
Be careful with extremely wide panning.
The further left or right a signal is panned in the stereo mix, the better sources overlapping in frequency are separated by the additional layer of width. If you listen to your mix in mono and realize that one of your sources suddenly disappears, you may try to pan them closer to the center in the stereo mix (reduce the width).
... is a mono signal.
Correlation Indicators
...tive frequency range.
... signal.

Output Meter
The output meter displays the current peak and RMS value for each channel. The small number above the meter shows the current RMS value.

Balance Check
The balance check feature analyzes the current spectral distribution of your signal and compares the result with the balance of the chosen reference target. Based on the analysis, small info boxes let you know if the levels in different frequency ranges are on track or if you should probably tweak the mix before publishing it.
Info box states:
All good!
There's a potential issue with this parameter that should be fixed.

Mono Check
The mono check feature analyzes the width and correlation of your signal. It will point out potential problems with high width values for the low end and it will assess the current correlation value in small info boxes.
Mono Filter
While mono check is active, the filter caused by mono-summing will be displayed as an additional yellow line inside the spectrum display. A more or less static filter shape indicates that the mono-summing will lead to spectral problems (e.g. comb filters), while a constantly varying filter shows that summing to mono will not lead to any static filtering effects.

Presets
A preset saves the settings of the plug-in, including all currently loaded reference tracks. This means that a preset can be used to compare multiple different tracks (e.g. of an album) against the same custom references.
Save
Save your preset.
Preset Dropdown
Load a saved preset from the preset dropdown.
To delete a preset or change its name, go to the preset folder in your local file explorer.
You can easily share your presets among different workstations. All presets are saved with the file extension ".spr" in the following folders:
Preset Folders
OSX: ~/Library/Audio/Presets/sonible/truebalance
Windows: My Documents\Presets\sonible\truebalance

Settings
To visit the settings page, click the cogwheel in the upper right corner.
Show Tooltips
Enable/disable tooltips on hover.
Use OpenGL
OpenGL might cause rendering issues on certain computer hardware. Use this option to disable OpenGL.
Share anonymous user data with sonible
Enable to share fully anonymous user data with sonible and help us improve our plug-ins.
License Information
This will display your license state and number (when not licensed via iLok).
Plug-in Information
Here you can find the name and version of your plug-in.
Start the welcome tour, a quick overview of the plug-in features, by clicking on "show tutorial".
Update Notification
When a new version of the plug-in is available, you'll receive a notification here, and it's also indicated by a little dot on the cogwheel in the main view of true:level. Click on the green text to download the latest version.

sonible GmbH, Haydngasse 10/1, 8010 Graz, Austria
phone: +43 316 912288
*******************
All specifications are subject to change without notice.
©2022, sonible GmbH. All rights reserved.
Engineered & designed by sonible in Austria.
/truebalance
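Research on Machine-Learning-Based Vehicle Detection and Tracking
Vehicle detection and tracking is one of the key research areas in autonomous driving.
With advances in machine learning algorithms and computing hardware, machine-learning-based methods have made clear progress in vehicle detection and tracking.
This article studies machine-learning-based vehicle detection and tracking and introduces the relevant methods and techniques.
I. Introduction
Vehicle detection and tracking has important practical value in autonomous driving.
Vehicle detection means recognizing that vehicles are present in an image or video and locating them accurately; vehicle tracking means continuously following the detected vehicles and predicting their motion.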
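II. Machine-Learning-Based Vehicle Detection
Machine-learning-based vehicle detection methods fall into two classes: traditional machine learning methods and deep learning methods.
1. Traditional machine learning methods
Traditional methods rely on feature engineering: hand-designed features are extracted from the image or video and then passed to a classifier that performs the detection.
(1) Feature extraction
Common feature extraction methods in traditional pipelines include Haar features, edge features, and HOG (Histogram of Oriented Gradients) features.
These methods capture information such as the edges, shape, and texture of vehicles in the image or video.
(2) Classifiers
Once the feature vector is available, common classifiers include the support vector machine (SVM), AdaBoost, and random forests.
These classifiers learn the feature patterns of vehicles from training samples and then perform detection.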
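As a rough illustration of the HOG feature mentioned above, the sketch below computes a gradient-orientation histogram for a single cell of a grayscale image. It is a simplified, assumption-laden version: it uses central differences, unsigned orientations in 9 bins of 20 degrees, and hard bin assignment, and it omits the interpolation and block normalization of full HOG. The function name is hypothetical.

```python
import math

def orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    # Skip the border so that central differences stay inside the cell.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

In a traditional pipeline, the histograms of many cells would be concatenated into one feature vector and passed to a classifier such as an SVM.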
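2. Deep learning methods
Deep learning methods have brought significant progress in vehicle detection.
They use multi-layer neural networks to learn features from images or video end to end.
(1) Convolutional neural networks (CNN)
The convolutional neural network is one of the most widely used architectures in deep learning.
Through stacked convolution and pooling layers it learns the features in an image or video automatically.
Detectors built on convolutional networks that are commonly used for vehicle detection include Faster R-CNN, YOLO, and SSD.
(2) Recurrent neural networks (RNN)
Recurrent neural networks are mainly used for tracking and predicting vehicles.
By remembering earlier state information, an RNN can follow a vehicle continuously through a video and predict its future position.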
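III. Machine-Learning-Based Vehicle Tracking
Machine-learning-based vehicle tracking extends and refines vehicle detection, focusing on the trajectory of a vehicle and the prediction of its future motion.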
Adaptive Tracking Algorithm for Variable-Scale Fast-Moving Targets Based on Mean Shift
YANG Zhiju; LIU Baohua
[Journal] Journal of Terahertz Science and Electronic Information Technology
[Year (Volume), Issue] 2015 (000) 002
[Abstract] An adaptive tracking algorithm for fast-moving targets is proposed, based on an improvement of the traditional Mean Shift tracking algorithm, in order to track fast-moving targets of variable scale well. The algorithm first takes the color image formed by the spatially weighted pixels of the target region as the object template of the initial frame; the true position of the target is then obtained by Mean Shift iteration, which achieves spatial localization of the fast-moving target. Next, the features of the target in adjacent frames are matched with the Scale Invariant Feature Transform (SIFT) operator, the kernel bandwidth for the next frame is updated in real time according to the scaling factor of the target, and the size of the tracking window is corrected so that it adapts to the changing scale of the target, which achieves scale localization. Finally, experiments show that, compared with the traditional Mean Shift tracking algorithm, the tracking accuracy of the proposed algorithm is above 97% and that it can accurately track fast-moving targets of variable scale.
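The Mean Shift iteration that the abstract relies on for spatial localization can be sketched as follows. This is a generic mode-seeking version on 2-D points with a Gaussian kernel, written only to show the iteration; it is not the authors' tracker, which works on spatially weighted color histograms and updates the bandwidth from SIFT matches. The function name and parameters are assumptions.

```python
import math

def mean_shift(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Move `start` uphill to a local density mode of `points`."""
    x, y = start
    for _ in range(max_iter):
        weight_sum = wx = wy = 0.0
        for px, py in points:
            # Gaussian kernel weight of each sample relative to the estimate.
            sq_dist = (px - x) ** 2 + (py - y) ** 2
            w = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
            weight_sum += w
            wx += w * px
            wy += w * py
        new_x, new_y = wx / weight_sum, wy / weight_sum
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:            # converged: the mean-shift vector vanished
            break
    return x, y
```

The `bandwidth` argument plays the role of the kernel bandwidth that the paper adapts from frame to frame as the target changes scale.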
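Modern Electronics Technique, Vol. 46, No. 23, December 1, 2023
0 Introduction
E-commerce platforms, advertising companies, and similar businesses need to recommend content according to user preferences, so predicting user preferences is very important.
In recommendation tasks, finding meaningful feature combinations is critical [1-2].
Recommendation algorithms generally fall into two classes: traditional recommendation models and deep-learning-based recommendation models.
Traditional models mainly use collaborative filtering [3-4] and matrix factorization [5-6], but they ignore other feature information associated with users and items and cannot exploit features effectively.
The factorization machine (FM) model [7] can cross low-order features but cannot mine high-order information; the field-aware factorization machine (FFM) model [8] strengthens feature crossing on this basis but still essentially targets low-order features.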
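To show what "crossing low-order features" means for an FM, the sketch below evaluates the standard second-order FM prediction, in which each pair of features interacts through the inner product of two latent vectors. The parameter names (`w0`, `w`, `V`) follow common notation and are assumptions here, since the excerpt does not give the formula.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine output for one sample.

    x: feature values, w0: bias, w: linear weights,
    V: one latent vector (list of k floats) per feature.
    """
    n, k = len(x), len(V[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} a_i a_j = ((sum a_i)^2 - sum a_i^2) / 2,
        # which avoids the explicit double loop over feature pairs.
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise
```

Only pairwise (second-order) interactions appear in this expression, which is the limitation the text points to when it says FM cannot mine high-order information.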
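The rapid development of deep learning in natural language processing, computer vision, adversarial attacks, and other fields has opened up new opportunities for recommendation algorithms.
Researchers have found that fully connected layers [9] perform well at mining high-order features.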
Researchers have therefore mainly taken deep neural networks (DNN) as the core and combined them with traditional recommendation models to improve and innovate.
Feature automatic interactive recommendation algorithm integrating gating unit and multi-head self-attention mechanism
YU Jinping, LI Yu, YAO Xuanchen, LUO Chen
(School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Abstract: In recommendation algorithms, enumerating all feature combinations by handcrafting and feature engineering incurs large storage and computational costs. In addition, useless feature interactions introduce noise, which complicates model training. To address this, a feature automatic interactive recommendation algorithm integrating a multi-head self-attention mechanism is proposed. The algorithm first uses a gating mechanism to filter the input features; the features are then fed into the multi-head self-attention mechanism, which selects key features and combines them at different orders; finally, a residual network fuses the features and outputs the prediction. The proposed algorithm can effectively improve prediction accuracy while maintaining good interpretability.
Keywords: gating unit; automatic feature interaction; multi-head self-attention mechanism; recommendation algorithm; feature combination; interpretability
CLC number: TN911.1-34; TP301.6  Document code: A  Article ID: 1004-373X(2023)23-0126-07
DOI: 10.16652/j.issn.1004-373x.2023.23.023
Citation: YU Jinping, LI Yu, YAO Xuanchen, et al. Feature automatic interactive recommendation algorithm integrating gating unit and multi-head self-attention mechanism [J]. Modern Electronics Technique, 2023, 46(23): 126-132.
Received: 2023-06-07  Revised: 2023-06-28
Funding: Central Government Guided Local Science and Technology Development Special Fund (20201ZDI03003)
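The three stages named in the abstract (gating, self-attention, residual fusion) can be sketched in a heavily simplified form as below. This is an illustration under stated assumptions, not the authors' model: it uses a single attention head, identity query/key/value projections, fixed gate logits instead of learned ones, and a plain additive residual. All function names are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gate(features, gate_logits):
    """Scale each feature embedding by a sigmoid gate in (0, 1)."""
    return [[v * sigmoid(g) for v in emb]
            for emb, g in zip(features, gate_logits)]

def self_attention(features):
    """Single-head scaled dot-product self-attention with a residual add."""
    dim = len(features[0])
    weights = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in features]
        weights.append(softmax(scores))
    # Each output is an attention-weighted mix of all embeddings plus the input.
    out = [[sum(w * emb[j] for w, emb in zip(row, features)) + query[j]
            for j in range(dim)]
           for row, query in zip(weights, features)]
    return out, weights
```

The attention weights returned alongside the output are what would typically be inspected for interpretability, since each row shows how strongly one feature attends to the others.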
An Improved Heuristic Algorithm for UAV Path Planning in 3D Environment
Zhang Qi1, Zhenhai Shao1, Yeo Swee Ping2, Lim Meng Hiot3, Yew Kong LEONG4
1 School of Communication Engineering, University of Electronic Science and Technology of China
2 Microwave Research Lab, National University of Singapore
3 Intelligent Systems Center, Nanyang Technological University
4 Singapore Technology
e-mail: beijixing2006@, zhenhai.shao@, eleyeosp@.sg, emhlim@.sg, leongyk@
Abstract: Path planning is one of the core problems of UAV technology. This paper presents an improved heuristic algorithm to solve the 3D path planning problem. The path planning model is first built on a digital map, and a virtual terrain is then introduced to eliminate a significant amount of search space by reducing it from three dimensions to two. Subsequently, the improved heuristic A* algorithm is applied to generate the UAV trajectory. The algorithm features variable search steps and a weighting factor for each cost component. Simulations have been carried out to validate the effectiveness of the algorithm.
Keywords: unmanned aerial vehicle (UAV); path planning; virtual terrain; heuristic A* algorithm
I. INTRODUCTION
Path planning is required for an unmanned aerial vehicle (UAV) to meet the objectives specified for any military or commercial application. The general purpose of path planning is to find the optimal path from a start point to a destination point subject to the different operational constraints (trajectory length, radar exposure, collision avoidance, fuel consumption, etc.) imposed on the UAV for a particular mission; if, for example, the criterion is simply to minimize flight time, the optimization process is then reduced to a minimal cost problem.
Over the decades, several path planning algorithms have been investigated.
Bortoff [1] presented a two-step path planning algorithm based on Voronoi partitioning: a graph search method is first applied to generate a rough-cut path, which is thereafter smoothed in accordance with his proposed virtual-force model. Anderson et al. [2] also employed Voronoi approaches to generate a family of feasible trajectories. Pellazar [3], Nikolos et al. [4] and Lim et al. [5] opted for genetic algorithms to navigate the UAV. The calculus-of-variations technique has been adopted in [6]-[7] to find an optimal path with minimum radar illumination.
In this paper, an improved heuristic algorithm is presented for UAV path planning. The path planning environment is built in Section II, the algorithm is described in Section III, and the following section presents experimental results that validate the effectiveness of the proposed algorithm.
II. PATH PLANNING MODEL
Several factors must be taken into account in the path planning problem: terrain information, threat information, and UAV kinetics. These factors form flight constraints which must be handled in the planning procedure.
Many studies use a mathematical function to simulate the terrain environment [4]. This method is quick and simple, but compared with the real terrain that the UAV flies across, it lacks realism and generality. In this study, terrain information is constructed from DEM (digital elevation model) data, which is released by the USGS (U.S. Geological Survey) as the true terrain representation.
Threat information is also considered in path planning. In modern warfare, almost all anti-air weapons need radar to track and lock onto an air target. Here the main threat is radar illumination. Radar threat density can be represented by the radar equation, because the intrinsic radar parameters are determined before path planning.
The threat density can be regarded as inversely proportional to R^4, where R is the distance from the UAV's current location to a particular radar site.
For simplicity, the UAV is modeled as a mass point traveling at a constant velocity, and its minimum turning radius is treated as a fixed parameter.
III. PATH PLANNING APPROACH
A. Virtual terrain for three-dimensional path planning
Unlike route planning for ground vehicles, UAV path planning is a 3D problem in real scenarios. In 3D space, not only terrain and threat information is taken into account, but UAV specifications such as maximum heading angle, vertical angle, and turning radius are also incorporated.
The straightforward method for UAV path planning is to partition the 3D space into a 3D grid and then apply some algorithm to generate a path. However, for any algorithm the computational time depends mainly on the size of the search space. Therefore, for efficiency, a novel concept is proposed: constructing a 2D search space, called the virtual terrain, from the original 3D search space. The virtual terrain is constructed above the real terrain according to the required flight safety clearance height, as shown in Figure 1. A'B'C'D' is the real terrain and ABCD is the virtual terrain. H is the clearance height between the two surfaces. The virtual terrain enables path planning on a 2D surface instead of a 3D grid and can reduce the search space by an order of magnitude.
Figure 1. Virtual terrain above real terrain
(2010 Second International Conference on Intelligent Human-Machine Systems and Cybernetics)
B. Path planning algorithm
The A* algorithm [8]-[9] is a well-known graph search procedure that uses a heuristic function to guide its search. Given a consistent, admissible heuristic, A* search is guaranteed to yield an optimal path [8]. At the core of the algorithm is a list containing all of the current states.
At each iterative step, the algorithm expands and evaluates the adjacent states of all current states and decides whether any of them should be added to the list (if not in the list) or updated (if already in the list) based on the cost function:
f(n) = g(n) + h(n)    (1)
where f(n) is the total cost at the current vertex, g(n) denotes the actual cost from the start point to the current point n, and h(n) refers to the pre-estimated cost from the current point n to the destination point. For applications that entail searching on a map, the heuristic function h(n) is taken to be the Euclidean distance.
UAV path planning is a multi-criteria search problem. The actual cost g(n) in this study is composed of three items: distance cost D(n), climb cost C(n), and threat cost T(n). So g(n) can be described as follows:
g(n) = D(n) + C(n) + T(n)    (2)
Usually, the three components of g(n) are not treated equally during a UAV task; one or two are preferred over the others. This can be achieved by introducing weighting factors into (2):
g(n) = w_1 D(n) + w_2 C(n) + w_3 T(n)    (3)
where w_i is a weighting factor and \sum_{i=1}^{m} w_i = 1. For example, if the threat cost T(n) is of greater concern in a particular task, the corresponding weight should be increased.
C. The improvement of path planning strategy
The virtual terrain in part A enhances computational efficiency by transforming the 3D path planning space into a 2D search plane. Further improvement can be achieved by applying a newly developed strategy, by which the path planner expands and evaluates the next waypoint on the virtual terrain, as shown in Figs. 2 and 3. This planning strategy employs variable search steps by defining a searching window that represents the information acquired by the UAV's on-board sensors. It enables different search steps to suit different threat cost distributions. After the searching window is set, the UAV performance limits are imposed within the searching window on the virtual terrain.
Here the UAV performance limits include the turning radius, heading angle, and vertical angle. In Fig. 3, the point P(x, y, z) is the current state, and the arrow represents the current velocity vector. The gray points show the states that the UAV can reach in the next step under the limits imposed by UAV performance.
Figure 2. Searching window
Figure 3. Available searching states at P(x, y, z)
IV. SIMULATION
The simulation is implemented based on Sections II and III. In this simulation, terrain data is read from the USGS 1-degree DEM. The DEM has a 3 arc-second interval along both longitude and latitude. Five radar threats are also represented in the simulation environment according to the radar equation. Here the clearance height h is set to 200 to define the virtual terrain. UAV maximal heading angle and vertical angle is 20
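The weighted cost in Eq. (3) combined with A* search can be sketched on a small grid as follows. This is a minimal illustration under several assumptions that are not in the paper: a 4-connected grid stands in for the virtual terrain, the climb cost is the positive height difference between neighboring cells, the threat cost is the sum of 1/R^4 over radar sites with a clamped minimum range, and the searching window and turning-radius limits are omitted. The function names and default weights are hypothetical.

```python
import heapq
import math

def radar_threat(cell, radars, min_range=0.5):
    """Threat density taken as inversely proportional to R^4 (range clamped)."""
    return sum(1.0 / max(math.dist(cell, radar), min_range) ** 4
               for radar in radars)

def plan_path(heights, radars, start, goal, weights=(0.4, 0.2, 0.4)):
    """A* on a grid with g(n) = w1*D(n) + w2*C(n) + w3*T(n)."""
    rows, cols = len(heights), len(heights[0])
    w1, w2, w3 = weights
    # Heap entries are (f, g, node); h is the weighted Euclidean distance.
    open_heap = [(w1 * math.dist(start, goal), 0.0, start)]
    best_g = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if g > best_g[node]:
            continue                     # stale heap entry
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            climb = max(0.0, heights[nr][nc] - heights[r][c])
            step = w1 * 1.0 + w2 * climb + w3 * radar_threat(nxt, radars)
            new_g = g + step
            if new_g < best_g.get(nxt, math.inf):
                best_g[nxt] = new_g
                parent[nxt] = node
                f = new_g + w1 * math.dist(nxt, goal)
                heapq.heappush(open_heap, (f, new_g, nxt))
    return None                          # goal unreachable
```

Raising the third weight makes the planner take longer detours around radar sites, which mirrors the role of the weighting factors described for Eq. (3).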
Automatic Feature Weighting in Automatic Transcription ofSpecified Part in Polyphonic MusicKatsutoshi Itoyama,Tetsuro Kitahara,Kazunori Komatani,Tetsuya Ogata and Hiroshi G.Okuno Dept.of Intelligence Science and Technology,Graduate School of Informatics,Kyoto UniversitySakyo-ku,Kyoto606-8501,Japan{itoyama,kitahara,komatani,ogata,okuno}@kuis.kyoto-u.ac.jpAbstractWe studied the problem of automatic music transcription (AMT)for polyphonic music.AMT is an important task for music information retrieval because AMT results enable retrieving musical pieces,high-level annotation,demixing, etc.We attempted to transcribe a part played by an instru-ment specified by users(specified part tracking).Only two timbre models are required in the specified part tracking to identify the specified musical instrument even when the number of instruments increases.This transcription is for-mulated into a time-series classification problem with mul-tiple features.We furthermore attempted to automatically estimate weights of the features,because the importance of these features varies for each musical signal.We esti-mated quasi-optimal weights of the features using a genetic algorithm for each musical signal.We tested our AMT sys-tem using trio stereo musical signals.Accuracies with our feature weighting method were69.8%on average,whereas those without feature weighting were66.0%.Keywords:automatic music transcription,specified part track-ing,feature weighting,genetic algorithm1.IntroductionRecently,because of the growth of the digital music indus-try,demand for music information retrieval(MIR)and man-agement of musical data has been increasing.Automatic music transcription(AMT)is needed to improve MIR be-cause musical scores enable MIR by melody or musical in-strument,etc.AMT for polyphonic music generally con-sists of two successive processes:note formation,which es-timates the onset time and pitch of each note,and stream formation,which classifies the formed notes by their 
instruments (parts). The latter problem has not been studied enough, which has left AMT incomplete. Therefore, a method to form streams is strongly required to realize AMT for polyphonic music.

Previous studies of stream formation fall broadly into two approaches. One identifies the musical instruments of all parts and labels all the given instruments [1, 2, 3]. In this approach, training data for all instruments that could be contained in musical pieces is required to separate all instruments exactly. The other forms streams without information about the musical instruments contained in musical pieces. This approach does not require training data [4]. However, users cannot extract the streams they want because the obtained streams carry no label of the target instrument.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2006 University of Victoria

Figure 1. Overview of Specified Part Tracking

We developed a new approach in which the AMT system is given one of the musical instruments included in the musical pieces and transcribes that part of the musical pieces. By focusing on only the one musical instrument that a user specified, we require only two timbre models to identify the specified musical instrument. Specified part tracking is the stream formation based on our approach.

We also developed a method for automatically estimating the weights of the features used in the specified part tracking. The importance of these features depends on the musical signal. For example, the reliability of directional information depends on the placement of the instruments, and timbre features can be distorted by noise. We developed a method for estimating quasi-optimal weights for each musical signal using a genetic algorithm.

2. Problem Specification

Specified part tracking classifies musical notes into the
set of notes of a specified instrument $N$ and that of other instruments $\bar{N}$. We define the pair $H = (N, \bar{N})$ as a hypothesis of the specified part tracking. The specified part tracking is performed as follows:

1. Generate two initial hypotheses $(\{n_1\}, \emptyset)$ and $(\emptyset, \{n_1\})$ for the first note $n_1$.
2. Expand each hypothesis $H = (N, \bar{N})$ on notes $n_1, \dots, n_k$ into two new hypotheses $H_0 = (N \cup \{n_{k+1}\}, \bar{N})$ and $H_1 = (N, \bar{N} \cup \{n_{k+1}\})$, and calculate the reliability (corresponding to a likelihood) of each hypothesis.
3. If the number of hypotheses exceeds a constant $K$, delete all hypotheses except those whose reliabilities are in the top $K$.
4. Iterate steps 2 and 3 for all notes.
5. After expanding hypotheses and calculating their reliabilities through the whole note list, output the hypothesis with the maximum reliability as the result of the specified part tracking.

3. Implementation

We implemented the specified part tracking with four features. The features fall into two classes: we used "Timbre Similarity to the Model" to evaluate the similarity between a note $n$ and the specified instrument, and "Timbre Similarity to the Specified Part," "Proximity of Localization," and "Pitch Transition Frequency" to evaluate the similarity between $n$ and the specified part. The latter features were designed based on Sakuraba et al. [4].

Timbre Similarity to the Model ($f_I$). This feature represents the timbre similarity between a note $n$ and the specified instrument. The timbre of note $n$ is described by the vector $x(n)$ proposed by Kitahara et al. [5]. The distance of $n$ to the model of the specified instrument $M$ represents the similarity between $n$ and the instrument, but features extracted from mixed sounds are frequently distorted. We therefore used a global model $G$, which does not depend on any musical instrument, and evaluated the similarity by the distance of $n$ to $M$, $d(n, M)$, divided by the distance of $n$ to $G$, $d(n, G)$. $f_I(n)$ is described as the statistical probability calculated by an F-test:

$$f_I(n) = \int_{d_I(n)}^{\infty} \frac{\xi^{m/2-1}}{B(m/2, m/2)\,(\xi + 1)^{m}} \, d\xi,$$

where $d_I(n) = d(n, M) / d(n, G)$, $m = \dim(x(n))$, and $B(m_1, m_2)$ is the Beta function.

Timbre Similarity to the Specified Part ($f_S$). This feature represents the timbre similarity between a note $n$ and the part $N$. The timbre features are described in the same way as above. We used the distance of $n$ to the distribution of the timbre features of $\tilde{n} \in N$, $d_S(n)$, to evaluate the similarity. $f_S(n)$ is calculated by a $\chi^2$-test:

$$f_S(n) = \int_{d_S(n)}^{\infty} \frac{\xi^{m/2-1} e^{-\xi/2}}{2^{m/2}\,\Gamma(m/2)} \, d\xi,$$

where $\Gamma(\cdot)$ is the Gamma function.

Localizational Proximity ($f_L$). This feature represents the localizational proximity of $n$ to $N$. The localization of a note is the mode of the interaural phase difference (IPD) over all of its frames. We used the distance of $n$ to the distribution of the localization of $\tilde{n} \in N$, $d_L(n)$, to evaluate the proximity. $f_L(n)$ is calculated by a $\chi^2$-test:

$$f_L(n) = \int_{d_L(n)}^{\infty} \frac{1}{\sqrt{2\pi}}\, \xi^{-1/2} e^{-\xi/2} \, d\xi.$$

Pitch Transition Frequency ($f_T$). This feature represents the frequency of pitch transitions. It is the trigram probability that $n$ follows $N$. We modeled the pitch transition as a trigram model in which the pitch occurrence probability depends on the pitches of the two adjacent notes, $\mathrm{pitch}(n_{c-1})$ and $\mathrm{pitch}(n_c)$. $f_T(n)$ is described as a posterior probability under $N$:

$$f_T(n) = p(\mathrm{pitch}(n) \mid \mathrm{pitch}(n_{c-1}), \mathrm{pitch}(n_c)).$$

We used two different timbre features, $f_I$ and $f_S$. There is a risk that the timbre features of a musical piece and those of the training data are not similar, in which case the reliability of $f_I$ becomes lower. Even if they are not similar, the reliability of $f_S$ is maintained because $f_S$ compares the timbre features of musical notes within the same musical piece.

The reliability of a hypothesis, $f(H)$, is calculated based on multiple features as:

$$f(H) = f_I(H) \times \Bigl( \sum_{i \in \{S, L, T\}} w_i f_i(H) \Bigr), \qquad f_i(H) = \sum_{n \in N} f_i(n) - \sum_{n \in \bar{N}} f_i(n).$$

$f_I$ is privileged and is not given a weight, because our aim is to track the part that the user specified and $f_I$ is the only feature that evaluates the timbre similarity to the instrument that the user specified.

4. Automatic Weighting of
Multiple Features

In the evaluation of hypotheses, the optimal weights of the features differ depending on the recording conditions of the acoustic signals, etc. Therefore, these weights must be automatically estimated from the acoustic signals. To do this, we designed a fitness of the specified part tracking. Optimal weights can be estimated by searching for the weights that maximize this fitness. We defined the following two conditions to estimate the fitness of $H = (N, \bar{N})$ and designed a quantitative measure for each.

1. The number of notes derived from the specified instrument is greater in $N$ than in $\bar{N}$. We measured this by the difference between $N$ and $\bar{N}$ in the feature of timbre similarity to the model:

$$E[f_I(N)] - E[f_I(\bar{N})].$$

2. The majority of notes included in $N$ are derived from the same sound source. We measured this by the sum, over the features, of the ratio of between-class variance to within-class variance:

$$\sum_{i \in \{I, S, L, T\}} \frac{\bigl( E[f_i(N)] - E[f_i(\bar{N})] \bigr)^2}{\mathrm{Var}[f_i(N)] + \mathrm{Var}[f_i(\bar{N})]}.$$

We defined the product of these two values as the fitness of the specified part tracking. We used a genetic algorithm to search for quasi-optimal weights because theoretical calculation of the weights that maximize the fitness is difficult.
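The fitness described above can be illustrated with a minimal sketch. It assumes that the expectation and variance are taken over the per-note feature values of the notes in $N$ and in $\bar{N}$; the function name `tracking_fitness`, the dictionary layout, and the small constant `eps` that guards against a zero denominator are illustrative assumptions, not part of the original method.

```python
import numpy as np

# Feature labels: I (model timbre), S (part timbre), L (localization), T (pitch transition).
FEATURES = ("I", "S", "L", "T")


def tracking_fitness(feats_in, feats_out, eps=1e-12):
    """Fitness of a hypothesis H = (N, N-bar).

    feats_in / feats_out map a feature label to the per-note feature values
    of the notes assigned to N / N-bar (assumed layout, for illustration).
    eps is an assumption added to avoid division by zero.
    """
    # Condition 1: difference of the mean model-timbre similarity.
    timbre_gap = np.mean(feats_in["I"]) - np.mean(feats_out["I"])

    # Condition 2: squared difference of the means divided by the sum of
    # the variances, summed over all four features.
    separation = 0.0
    for name in FEATURES:
        a = np.asarray(feats_in[name], dtype=float)
        b = np.asarray(feats_out[name], dtype=float)
        between = (a.mean() - b.mean()) ** 2
        within = a.var() + b.var()
        separation += between / (within + eps)

    # The fitness is the product of the two measures.
    return timbre_gap * separation


if __name__ == "__main__":
    high = {name: [0.8, 0.9, 1.0] for name in FEATURES}
    low = {name: [0.1, 0.2, 0.3] for name in FEATURES}
    print(tracking_fitness(high, low))   # well-separated hypothesis
    print(tracking_fitness(low, high))   # same split with the roles swapped
```

Because the first factor is signed, a hypothesis that places the specified instrument's notes in $\bar{N}$ instead of $N$ receives a negative fitness under this sketch, even though the second factor is unchanged.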
The procedure of automatic weight estimation is as follows:

1. Generate initial genes randomly.
2. Track the specified part with the weights of each gene.
3. Calculate the fitness of each gene from the results of the specified part tracking.
4. Select genes by elite and roulette wheel selection.
5. Cross over two randomly selected parents and generate a new gene whose weights are the mean of the weights of the parents.
6. Mutate randomly selected genes into randomly calculated weights.
7. Output the weights of the gene with the highest fitness after the above steps have been repeated $L$ times ($L$ is a constant).

5. Experiment

We conducted three experiments on AMT for polyphonic music to show the effectiveness of our method:

1. We evaluated the effectiveness of automatic feature weighting. We used a trio musical signal including violin, flute and piano, and tracked each instrument part.
2. We evaluated the robustness of the specified part tracking with automatic feature weighting to errors derived from automatic note formation. We used HTC [6] as a baseline method of note formation.
3. We evaluated whether automatic feature weighting can estimate appropriate weights, that is, whether the estimated weights reflect the importance of the features. We tested the specified part tracking and automatic feature weighting to see whether the weight for proximity of localization decreases according to the reliability of localization. We created musical pieces with several reliabilities of localization by adding the following deviation to the localization of each note:

$$50 \times X \times (\text{Variance Rate of Localization}),$$

where $X$ is a random variable drawn from $N(0, 1)$.

We evaluated the accuracy $F$ using the F-measure, which is defined as

$$P = \frac{\text{\# of notes which are correctly tracked}}{\text{\# of notes the system outputs}}, \qquad R = \frac{\text{\# of notes which are correctly tracked}}{\text{\# of notes which are on the score}}, \qquad F = \frac{2 \times P \times R}{P + R}.$$

In experiment 2, correctly tracked notes are notes that have the correct pitch and whose onset time deviation is at most 10 ms. We compared three feature weightings: even weights;
weights estimated by our method; and weights estimated by our method using the accuracy as the fitness (upper limit).

5.1. Data for Experiments

The polyphonic musical signal we used was "Auld lang syne," played for about 1 minute, which included 242 notes. This musical signal was generated by mixing audio data taken from RWC-MDB-I-2001 [7] according to a standard MIDI file (SMF) on a computer. To create the timbre model and the global model, we used mixed sound templates [5]. For the mixed sound templates, we used duo and trio musical pieces generated according to the SMFs from RWC-MDB-C-2001 (Piece Nos. 13, 16 and 17) [8]. We also used SMFs from RWC-MDB-C-2001 (Piece Nos. 1–50) to create the trigram model of pitch transition.

5.2. Experimental Results

The results of experiments 1 and 2 are listed in Tables 1 and 2, respectively. Using automatic feature weighting, we improved the accuracies from 66.0% to 69.8% on average in experiment 1. This shows that the introduction of weights avoided incorrect part tracking (e.g., tracking the violin part even though the flute part was specified). We improved the accuracies from 44.5% to 55.3% on average in experiment 2. This shows the robustness of our feature weighting method to errors derived from automatic note formation. The results of experiment 3 are listed in Table 3. This shows that the larger the variance of localization in a musical signal, the more the weight $w_L$ decreases (i.e., appropriate weights were estimated according to the importance of the features). The accuracies were also improved by feature weighting.

In experiment 1, the accuracy of the piano part decreased from 98.3% to 93.1%. This shows that our feature weighting method cannot always estimate better weights than even weights.
However, the results also show that the number of false alarms decreased with feature weighting. This means the estimated weights can reject notes of other parts and noise derived from note formation.

Notably, the flute part showed the opposite trend to the violin and piano parts: its accuracy was higher in experiment 2 than in experiment 1. We explain this as follows. The timbre features of the flute notes were often distorted in polyphonic music, because the power of the flute notes at their onset time is smaller than the power of the other instruments. However, the timbre features of the flute notes are hardly distorted if the onset time varies slightly, because the flute notes have a gradual power envelope at the onset. The note formation detects a strong attack as the onset time, so the onset time of the flute notes was estimated slightly late. The distortion of the timbre features of the flute notes caused by this onset time deviation was smaller than that caused by mixed sounds. Therefore, the transcription was more accurate with automatic note formation.

Table 1. Results of Experiment 1

Tracking    F with Weights          Upper
Part        Even      Estimated     Limit of F
Vn          85.7%     91.9%         92.5%
Fl          14.2%     24.6%         24.6%
Pf          98.3%     93.1%         99.1%
total       66.0%     69.8%         72.1%

Table 2. Results of Experiment 2

Tracking    F with Weights          Upper
Part        Even      Estimated     Limit of F
Vn          30.6%     48.7%         50.0%
Fl          36.9%     40.7%         43.2%
Pf          66.0%     76.6%         77.6%
total       44.5%     55.3%         56.9%

Table 3. Results of Experiment 3

Variance Rate      Estimated Weights        F with Weights
of Localization    w_S     w_L     w_T      Even      Est.
0                  0.40    0.55    0.05     92.6%     97.4%
1                  0.59    0.20    0.21     85.7%     91.9%
2                  0.28    0.12    0.60     74.4%     81.5%

6. Conclusion

We developed the specified part tracking and automatic feature weighting, and showed that our method can estimate better weights than even weights in many cases. We also confirmed its robustness to the errors derived from automatic note formation. We need to improve our feature weighting to bring the estimated weights closer to the optimal weights, specifically by investigating the fitness used in the GA.

We did not refer to conventional methods of note
formation. Since the accuracies of note formation differed among the parts, the results of experiment 2 were affected by note formation. Many studies have been done on note formation, and we need to examine several note formation methods. We are also planning to evaluate more complex musical pieces (e.g., pieces including drums and commercial CD music).

We designed four features for the specified part tracking. Specifically, we used two different features for timbre similarity because humans can often distinguish instrument sounds from the preceding content of a musical piece even if they have not heard the instruments before. In addition, we used only two timbre models to identify musical instruments: the model of the specified instrument and the global model. Whereas conventional studies on musical instrument identification have used models of all the instruments that a musical piece contains, our method takes a new approach.

Many studies on musical instrument identification require that all instruments are known. However, this approach has several weak points: when the number of instruments increases, new data for those instruments must be created, etc.
By contrast, the specified part tracking scales with the number of instruments, because the system needs to prepare data only for the instruments that users want to track. We did not evaluate how the method behaves as the number of instruments in a musical piece changes; this is part of our future work.

7. Acknowledgements

This research was partially supported by the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Grant-in-Aid for Scientific Research, and the Informatics Research Center for Development of Knowledge Society Infrastructure (COE program of MEXT, Japan). This research used the RWC Music Database (Classic, Musical Instrument Sound) [7, 8], and we thank everyone who contributed to this database.

References

[1] K. Kashino and H. Murase, "A Sound Source Identification System for Ensemble Music Based on Template Adaptation and Music Stream Extraction," Speech Communication, vol. 27, pp. 337–349, Mar. 1999.
[2] T. Kinoshita, S. Sakai and H. Tanaka, "Musical Sound Source Identification Based on Frequency Component Adaptation," Proc. IJCAI CASA Workshop, pp. 18–24, Aug. 1999.
[3] E. Vincent and X. Rodet, "Instrument Identification in Solo and Ensemble Music Using Independent Subspace Analysis," in Proc. ISMIR, pp. 576–581, 2004.
[4] Y. Sakuraba, T. Kitahara and H. G. Okuno, "Comparing Features for Forming Music Streams in Automatic Music Transcription," in Proc. ICASSP, vol. IV, pp. 273–276, 2004.
[5] T. Kitahara, M. Goto, K. Komatani, T. Ogata and H. G. Okuno, "Instrument Identification in Polyphonic Music: Feature Weighting with Mixed Sounds, Pitch-Dependent Timbre Modeling and Use of Musical Context," in Proc. ISMIR, pp. 558–563, 2005.
[6] H. Kameoka, T. Nishimoto and S. Sagayama, "Harmonic-Temporal Structured Clustering via Deterministic Annealing EM Algorithm for Audio Feature Extraction," in Proc. ISMIR, pp. 115–122, 2005.
[7] M. Goto, H. Hashiguchi, T. Nishimura and R. Oka, "RWC Music Database: Music Genre Database and Musical Instrument Sound Database," in Proc. ISMIR, pp. 229–230, 2002.
[8] M. Goto, H. Hashiguchi, T. Nishimura and R. Oka, "RWC Music Database: Popular, Classical, and Jazz Music Databases," in Proc. ISMIR, pp. 287–288, 2002.