Multi-Label Classification of Music into Emotions
Music Classification (an English essay)
Music is a universal language that transcends cultural and linguistic barriers. It has the power to evoke emotions, inspire creativity, and bring people together. Throughout history, music has evolved and diversified into various genres and styles, each with its own unique characteristics and appeal. In this essay, we will explore the classification of music based on its genre, and how each genre contributes to the rich tapestry of musical expression.

One of the most common ways to classify music is by its genre, which refers to the specific style or category of music. Some of the most popular music genres include classical, jazz, rock, pop, hip-hop, electronic, and folk, among others. Each genre has its own distinct sound, instrumentation, and lyrical content, catering to different tastes and preferences.

Classical music, for example, is known for its intricate compositions, rich harmonies, and use of orchestral instruments such as the violin, piano, and cello. It encompasses a wide range of styles, from the Baroque and Classical periods to the Romantic and Contemporary eras. Classical music is often associated with sophistication and elegance, and is commonly performed in concert halls and opera houses.

Jazz, on the other hand, is a genre that originated in the African American communities of New Orleans in the late 19th and early 20th centuries. It is characterized by its improvisational nature, syncopated rhythms, and expressive melodies. Jazz music often features instruments such as the saxophone, trumpet, and double bass, and is known for its vibrant and energetic performances.

Rock and pop music are two of the most commercially successful genres in the modern music industry. Rock music is characterized by its heavy use of electric guitars, powerful vocals, and driving rhythms, while pop music is known for its catchy melodies, upbeat tempo, and emphasis on commercial appeal. Both genres have a wide appeal and are often associated with youth culture and mainstream popularity.

Hip-hop and electronic music are two genres that have gained significant prominence in recent decades. Hip-hop music, which originated in the Bronx, New York City, is characterized by its use of rap vocals, sampled beats, and turntable scratching. It has become a global phenomenon, influencing fashion, language, and popular culture. Electronic music is created using electronic instruments and technology, and encompasses a wide range of subgenres such as techno, house, and trance.

Folk music is another genre that holds a special place in the hearts of many music lovers. It is characterized by its simplicity, authenticity, and emphasis on storytelling. Folk music often features acoustic instruments such as the guitar, banjo, and fiddle, and is deeply rooted in the traditions and heritage of different cultures around the world.

In addition to these main genres, there are numerous subgenres and fusion genres that further diversify the musical landscape. For example, reggae is a subgenre of Jamaican music that combines elements of ska and rocksteady, while country music is a genre that originated in the rural Southern United States and has evolved to encompass a wide range of styles, from traditional to contemporary.

In conclusion, music is a diverse and dynamic art form that encompasses a wide range of genres and styles. Each genre has its own unique characteristics and appeal, catering to different tastes and preferences.
Whether it's the timeless elegance of classical music, the improvisational spirit of jazz, the raw energy of rock and pop, the urban poetry of hip-hop, the electronic sounds of techno, or the heartfelt storytelling of folk music, there is a genre for every mood and occasion. As music continues to evolve and innovate, it will undoubtedly continue to enrich our lives and inspire future generations of musicians and music lovers.
Multi-Label Text Classification: Metric Calculation Formulas

Multi-label text classification is a crucial task in natural language processing (NLP) that involves assigning multiple labels to a given text. Accurate evaluation of such a task requires appropriate evaluation metrics. This article surveys the common evaluation metrics for multi-label text classification and gives the corresponding calculation formulas.

1. Accuracy: Accuracy is the simplest evaluation metric, defined as the ratio of correctly classified instances to the total number of instances. However, it can be misleading in multi-label classification, as it does not account for the imbalance between classes.

Accuracy = \frac{TP + TN}{TP + FP + FN + TN}

TP: True Positives (correctly predicted positive instances)
TN: True Negatives (correctly predicted negative instances)
FP: False Positives (incorrectly predicted positive instances)
FN: False Negatives (incorrectly predicted negative instances)

2. Precision and Recall: Precision and recall are two widely used evaluation metrics in multi-label classification. Precision measures the proportion of predicted positive instances that are correct, while recall measures the proportion of actual positive instances that are correctly predicted.

Precision = \frac{TP}{TP + FP}
Recall = \frac{TP}{TP + FN}

3. F1 Score: The F1 score is the harmonic mean of precision and recall, providing a single metric that balances both.

F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}

4. Hamming Loss: Hamming loss measures the fraction of labels that are incorrectly predicted. It is particularly useful in multi-label classification, where the goal is to minimize the number of misclassified labels. With q denoting the total number of labels:

Hamming\ Loss = \frac{1}{nq} \sum_{i=1}^{n} \sum_{j=1}^{q} xor(y_{ij}, \hat{y}_{ij})

n: number of instances
q: number of labels
y_{ij}: true value of label j for instance i
\hat{y}_{ij}: predicted value of label j for instance i
xor: exclusive OR operation

5. Subset Accuracy: Subset accuracy measures the fraction of instances whose entire set of true labels is predicted exactly. It is a strict metric that requires perfect prediction of all labels.

Subset\ Accuracy = \frac{1}{n} \sum_{i=1}^{n} [L_i = \hat{L}_i]

L_i: set of true labels for instance i
\hat{L}_i: set of predicted labels for instance i
[L_i = \hat{L}_i]: indicator function that returns 1 if L_i equals \hat{L}_i, and 0 otherwise

In conclusion, accurate evaluation of multi-label text classification requires appropriate metrics. Accuracy, precision, recall, F1 score, Hamming loss, and subset accuracy together provide a comprehensive toolbox for assessing the performance of multi-label text classification models.
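To make the formulas concrete, the sketch below (an illustration, not part of the original article) evaluates two toy label-indicator matrices with scikit-learn; for multi-label indicator input, sklearn's accuracy_score computes exactly the subset accuracy defined above:

import numpy as np
from sklearn.metrics import (hamming_loss, accuracy_score,
                             precision_score, recall_score, f1_score)

# toy data: rows = instances, columns = labels (made-up values)
y_true = np.array([[1, 0, 1, 1, 0, 0],
                   [0, 1, 0, 0, 1, 0]])
y_pred = np.array([[1, 1, 0, 1, 1, 0],
                   [0, 1, 0, 0, 1, 0]])

print("Hamming loss:   ", hamming_loss(y_true, y_pred))    # fraction of wrong label bits: 3/12
print("Subset accuracy:", accuracy_score(y_true, y_pred))  # exact-match ratio: 1/2
print("Micro precision:", precision_score(y_true, y_pred, average="micro"))
print("Micro recall:   ", recall_score(y_true, y_pred, average="micro"))
print("Micro F1:       ", f1_score(y_true, y_pred, average="micro"))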
An English Essay on Music Classification

Music is a universal language that speaks to the soul. It is an art form that has been around for centuries, evolving and adapting to different cultures and societies. The classification of music is a complex topic, as it encompasses a wide range of genres, styles, and periods. Here, we will explore some of the major categories of music and their characteristics.

Classical Music
Classical music is a genre that has its roots in Western culture, dating back to the medieval period. It is characterized by its structured compositions, often featuring symphonies, concertos, and operas. Composers like Mozart, Beethoven, and Bach are synonymous with classical music, known for their intricate melodies and harmonies.

Jazz
Jazz originated in the African-American communities of New Orleans in the late 19th and early 20th centuries. It is a genre that emphasizes improvisation, syncopated rhythms, and a distinctive swing feel. Jazz has many subgenres, including bebop, swing, and fusion, and is often associated with great musicians such as Louis Armstrong, Duke Ellington, and Miles Davis.

Rock and Roll
Rock and roll emerged in the mid-20th century, blending elements of blues, country, and gospel music. It is characterized by a strong backbeat and a focus on electric guitars, bass, and drums. The Beatles, The Rolling Stones, and Elvis Presley are some of the pioneers of rock and roll who have shaped its sound and style.

Hip-Hop
Hip-hop is a cultural movement that originated in the Bronx, New York in the 1970s. It includes rapping, DJing, breakdancing, and graffiti art. The music is known for its rhythmic speech, often called rap, and beats produced by sampling and digital instruments. Artists like Tupac Shakur, Notorious B.I.G., and Jay-Z have been influential in the development of hip-hop.

Electronic Dance Music (EDM)
EDM is a broad range of percussive electronic music genres made largely for nightclubs, festivals, and raves. It is characterized by synthesized sounds, repetitive beats, and a focus on creating an immersive dance experience. Genres within EDM include techno, house, trance, and dubstep, with DJs like Calvin Harris, Martin Garrix, and David Guetta leading the scene.

World Music
World music is a term used to describe music that originates from various regions around the globe, often incorporating traditional instruments and styles. It can include folk music, indigenous music, and music from non-Western cultures. This category is vast and diverse, encompassing everything from African drumming circles to traditional Chinese opera.

Indie
Indie, short for independent, refers to music produced outside of the mainstream commercial music industry. Indie music often emphasizes creativity and artistic expression over commercial success. It can encompass a wide range of genres and styles, from indie rock to indie folk. Bands like Arcade Fire, Vampire Weekend, and Florence + The Machine are known for their indie sound.

Each category of music has its unique characteristics and history, contributing to the rich tapestry of the musical world. As we continue to explore and appreciate the diversity of music, we enrich our cultural understanding and our own personal experiences.
The Classification (cls) Loss for Multi-Label Classification

Multi-label classification is a machine learning task in which each sample can belong to multiple categories at the same time. This distinguishes it from traditional binary and multi-class classification, where each sample receives exactly one label.
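As an illustration (not from the original text), label sets are usually encoded as a 0/1 indicator matrix before any loss is computed; scikit-learn's MultiLabelBinarizer performs this encoding (the label names here are hypothetical):

from sklearn.preprocessing import MultiLabelBinarizer

# hypothetical label sets: each sample belongs to one or more categories
samples = [{"politics", "economy"}, {"sports"}, {"economy", "culture"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(samples)   # one 0/1 row per sample
print(mlb.classes_)              # column order of the labels
print(Y)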
A commonly used loss function for multi-label classification is BCELoss (binary cross-entropy loss). For a multi-label task with q classes, it is defined as follows:

L = -\frac{1}{q} \sum_{j=1}^{q} \left[ y_{ij} \log(p_{ij}) + (1 - y_{ij}) \log(1 - p_{ij}) \right]

where:
y_{ij} = 1 if sample i belongs to class j, and y_{ij} = 0 otherwise;
p_{ij} is the model's predicted probability that sample i belongs to class j.
When BCELoss is used as the loss function, the model's output should be a predicted probability for each class rather than a single predicted class per sample. In other words, the model must give an independent probability estimate for every class. Note, however, that in some situations other loss functions, such as hinge loss or Hamming-loss-based surrogates, may be more suitable; which loss to choose depends on the specific task and the characteristics of the dataset.
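As a quick illustration (added here, not part of the original text), the formula above can be written directly in NumPy; the numbers are made up:

import numpy as np

def multilabel_bce(y_true, p_pred, eps=1e-7):
    # mean binary cross-entropy over the labels of one sample
    p = np.clip(p_pred, eps, 1 - eps)          # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y = np.array([1., 0., 1.])                     # sample belongs to classes 1 and 3
p = np.array([0.8, 0.1, 0.6])                  # predicted per-class probabilities
print(multilabel_bce(y, p))                    # ≈ 0.28: predictions match the labels well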
Multi-Label Image Classification with Transfer Learning

This article applies a pretrained VGG16 model to a multi-label image classification problem via transfer learning. The data for the project come from …, and each image can belong to multiple labels at the same time.
Model accuracy is quantified with the F score, based on the confusion-matrix counts below:

                         Predicted Positive (1)   Predicted Negative (0)
True label Positive (1)           TP                       FN
True label Negative (0)           FP                       TN

For example, if the true labels are (1,0,1,1,0,0) and the predicted labels are (1,1,0,1,1,0), then TP=2, FN=1, FP=2, TN=1.

Precision = \frac{TP}{TP+FP}, \quad Recall = \frac{TP}{TP+FN}, \quad F_{\beta} = \frac{(1+\beta^2) \cdot Precision \cdot Recall}{\beta^2 \cdot Precision + Recall}

The smaller \beta is, the more the F score weights precision; at \beta = 0 it reduces to precision. The larger \beta is, the more it weights recall; as \beta \to \infty it tends to recall.
This metric can be defined as a custom function in Keras (y_pred denotes the predicted probabilities):

from tensorflow.keras import backend

# calculate fbeta score for multi-label classification
def fbeta(y_true, y_pred, beta=2):
    # clip predictions
    y_pred = backend.clip(y_pred, 0, 1)
    # per-sample counts of true positives, false positives and false negatives
    tp = backend.sum(backend.round(backend.clip(y_true * y_pred, 0, 1)), axis=1)
    fp = backend.sum(backend.round(backend.clip(y_pred - y_true, 0, 1)), axis=1)
    fn = backend.sum(backend.round(backend.clip(y_true - y_pred, 0, 1)), axis=1)
    # calculate precision and recall
    p = tp / (tp + fp + backend.epsilon())
    r = tp / (tp + fn + backend.epsilon())
    # calculate fbeta, averaged across samples
    bb = beta ** 2
    fbeta_score = backend.mean((1 + bb) * (p * r) / (bb * p + r + backend.epsilon()))
    return fbeta_score

The loss function also differs between multi-label and multi-class classification: a multi-label model is trained with binary_crossentropy. Suppose a sample's true labels are (1,0,1,1,0,0) and the predicted probabilities are (0.2, 0.3, 0.4, 0.7, 0.9, 0.2). Then

binary_crossentropy loss = -(\ln 0.2 + \ln 0.7 + \ln 0.4 + \ln 0.7 + \ln 0.1 + \ln 0.8)/6 \approx 0.96

Finally, the output layer of a multi-label classifier uses a sigmoid activation rather than softmax.
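The arithmetic of this worked example can be checked against Keras itself; this short snippet (illustrative, using the same numbers as above) should print approximately 0.9609:

import numpy as np
import tensorflow as tf

y_true = np.array([[1., 0., 1., 1., 0., 0.]])
y_pred = np.array([[0.2, 0.3, 0.4, 0.7, 0.9, 0.2]])

# Keras averages the per-label binary cross-entropy over the 6 labels
loss = tf.keras.losses.binary_crossentropy(y_true, y_pred)
print(float(loss.numpy()[0]))   # ≈ 0.9609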
Title: Evaluation Criteria for Multi-Label Classification Models

Abstract: This article discusses evaluation criteria for multi-label classification models. By introducing and analyzing a range of evaluation metrics and discussing a practical case, it aims to help readers better understand and apply multi-label classification models.

I. Introduction

Multi-label classification means that a single sample can belong to several categories at once; a news article, for example, may fall under politics, economics, and society simultaneously. Multi-label models are widely used in practice, in areas such as text classification and music recommendation. How to judge the quality of a multi-label model, however, has long been a question of considerable interest. This article discusses evaluation criteria for multi-label classification models from several angles.
II. Evaluation Metrics

1. Accuracy: the proportion of correctly classified samples among all samples. In multi-label classification, accuracy comes in macro-averaged and micro-averaged variants: macro-averaging computes the metric per label and then averages across labels, while micro-averaging pools the predictions of all labels and samples into global counts before computing the ratio.

2. Precision: the fraction of samples predicted positive that are truly positive. Macro- and micro-averaged variants exist here as well.

3. Recall: the fraction of truly positive samples that the model predicts as positive, again with macro and micro variants.

4. F1 score: the harmonic mean of precision and recall, a single figure that balances the two.

5. Coverage: a ranking-based measure of how far down the model's ranked label list one must go, on average, to cover all of a sample's true labels (lower is better).

6. Label consistency: whether the labels the model predicts for the same sample are mutually consistent.

7. Diversity: the degree to which the label sets predicted for different samples differ from one another.

These are the common evaluation metrics for multi-label classification models. Different metrics emphasize different aspects of performance, so an appropriate combination should be chosen for the business scenario at hand; the sketch below shows how the macro- and micro-averaged variants are computed in practice.
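As an illustration (not part of the original article), the macro- and micro-averaged variants map directly onto scikit-learn's average parameter, and coverage is available as coverage_error; the label matrices below are made-up toy data:

import numpy as np
from sklearn.metrics import (precision_score, recall_score,
                             f1_score, coverage_error)

# toy indicator matrices: rows = samples, columns = labels
y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0]])

for avg in ("macro", "micro"):
    p = precision_score(y_true, y_pred, average=avg, zero_division=0)
    r = recall_score(y_true, y_pred, average=avg, zero_division=0)
    f = f1_score(y_true, y_pred, average=avg, zero_division=0)
    print(avg, round(p, 2), round(r, 2), round(f, 2))

# coverage is computed from ranking scores rather than hard 0/1 predictions
y_scores = np.array([[0.9, 0.2, 0.4], [0.1, 0.8, 0.7], [0.6, 0.5, 0.2]])
print("coverage:", coverage_error(y_true, y_scores))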
III. Case Discussion

Take text classification as an example, and suppose we have a multi-label news classification model. How should we judge its quality?

1. Accuracy: although accuracy is a commonly used metric, plain accuracy is a poor fit for multi-label classification: a sample may carry several labels at once, so a single right-or-wrong ratio cannot properly reflect the model's performance.
How to Approach Multi-Class Classification in Machine Learning

Multi-class classification problems arise frequently in applied machine learning. In such problems, data instances must be assigned to three or more distinct classes; in image classification, for instance, images may need to be sorted into categories such as animals, vehicles, and food. This article surveys common methods and techniques for tackling multi-class classification.

First, it helps to understand what makes multi-class classification distinctive and challenging. Compared with binary classification, a multi-class problem involves more classes and more complex decision boundaries. For every class, the model must learn a specific set of features and regularities so that new instances can be classified accurately. Multi-class classification therefore calls for more elaborate and finer-grained models.
Several methods and techniques can help:

1. One-vs-Rest: a common approach that converts the multi-class problem into several binary problems. For each class, a classifier is trained to separate that class from all others. At prediction time, every classifier scores the new instance, and the class with the highest probability is chosen as the final prediction (a code sketch follows this list). Although simple and intuitive, the scheme can suffer from class imbalance, since each classifier only sees one class against all the rest.

2. Multi-label classification: this formulation converts the problem into several binary problems, each representing the presence or absence of one class. Unlike one-vs-rest, it allows an instance to belong to several classes at once. A binary classifier is trained per class, and at prediction time one or more classes can be selected as the final result. This suits settings where classes overlap, for example when several objects appear in the same image.

3. Multiple classifier systems: several classifiers are used together, each responsible for part of the classes, and their predictions are combined by ensembling or voting to produce the final result. This lowers the complexity of each individual classifier and raises overall accuracy. Common ensemble methods in this spirit include random forests and AdaBoost.

4. Neural networks: neural networks also perform well on multi-class classification problems.
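A minimal scikit-learn sketch of the one-vs-rest scheme from item 1, on made-up toy data; OneVsRestClassifier trains one binary classifier per class and picks the class with the highest score at prediction time:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# toy 3-class problem
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# one binary logistic regression per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X, y)
print(ovr.predict(X[:5]))       # predicted class ids for the first 5 instances
print(ovr.score(X, y))          # training accuracy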
A Survey of Multi-Label Classification

Motivation. Online news is often semantically rich: one article can belong to both "economics" and "culture". Tagging news with multiple labels reflects an article's true meaning better and makes later categorization and use easier.

Challenges. (1) The number of labels per sample is not fixed: some samples carry a single label, others dozens or even hundreds. (2) Labels depend on one another: a sample tagged "blue sky" very likely also contains "white clouds", and handling such dependencies is a major difficulty. (3) Multi-label training sets are comparatively hard to obtain.

Methods. Many multi-label learning algorithms exist. By the angle from which they attack the problem, they fall into two families: problem transformation methods and algorithm adaptation methods. Problem transformation methods recast the data so that existing algorithms can be applied; algorithm adaptation methods extend a specific algorithm so that it can handle multi-label data directly.
Problem transformation methods. Some of these methods model the correlations between labels and some do not. The simplest correlation-free approach treats every label as an independent single-label problem and applies an ordinary classifier to each. Concretely, in traditional machine learning one trains a binary classifier per label, using algorithms such as SVMs, decision trees, Naïve Bayes, or XGBoost; in deep learning, one trains a text classification model (e.g., textCNN or textRNN) per label.

To exploit label correlations, the label predicted at one step can be fed into the classifier for the next label. In traditional machine learning this is the classifier chain: the first classifier is trained on the input data alone, and each subsequent classifier is trained on the input space plus the outputs of all earlier classifiers in the chain.
Let us make this concrete with an example. Take X as the input space and Y as a set of four labels. A classifier chain converts this into 4 separate single-label problems; in the original illustration, the yellow portion marks the input space, which grows by one label column at each step, and the white portion marks the target variable.

In deep learning, a sequential model is added at the output layer, so that the label emitted at the previous time step is fed into the input at the current step.
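A minimal sketch of the classifier-chain idea with scikit-learn's ClassifierChain on synthetic data (the base learner and the chain order here are arbitrary illustrative choices):

import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain

# synthetic multi-label data with 4 labels (Y is a 0/1 indicator matrix)
X, Y = make_multilabel_classification(n_samples=200, n_features=10,
                                      n_classes=4, random_state=0)

# classifier i in the chain is trained on X plus the labels of positions 0..i-1
chain = ClassifierChain(LogisticRegression(max_iter=1000),
                        order=[0, 1, 2, 3], random_state=0)
chain.fit(X, Y)
print(chain.predict(X[:3]))     # one 0/1 label vector per instance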
Work Studies

Optimizing the Classification of Musical Score Literature in the Chinese Library Classification (5th Edition)

Liu Ying (Library of the Zhejiang Conservatory of Music, Hangzhou, Zhejiang 310024)

Abstract: Musical scores are an important part of the collections of professional music conservatory libraries, and optimizing their classification matters greatly for building literature resources and serving readers. By analyzing the score-related categories in the Chinese Library Classification (5th edition), this article identifies where they diverge from the conceptual categories of the music discipline, proposes an optimization plan, and recommends adding specialized shelving marks in conservatory libraries, so as to better fit how conservatory readers use scores and to advance subject services in conservatory libraries.

Keywords: music score literature; Chinese Library Classification; classification; music major; subject service
CLC number: G254.1    Document code: A

Optimization of the Classification of Musical Score Literature in the Chinese Library Classification (5th edition)

Abstract: Music scores are an important collection in the libraries of professional music colleges, and the optimization of music score classification is of great significance to the construction of literature resources and reader services. The research analyzes the setting of music score-related categories in the Chinese Library Classification (5th edition) and identifies the problems of its incompatibility with the conceptual categories of the music discipline. The study proposes an optimization plan and suggests the addition of specialized shelf markings in music college libraries. Our goal is to better meet the needs of music colleges in the use of musical scores and to promote the development of library subject services in music colleges.

Key words: music score literature; Chinese Library Classification; classification; music major; subject service

* This paper is a product of the 2021 Zhejiang Conservatory of Music theoretical research project "Research on the Optimized Classification of Musical Score Literature on the Basis of the Chinese Library Classification" (project no. 2021KL008).
Multilabel Classification

Multilabel classification is a machine learning technique for establishing the relationship between "labels" and "classification". It helps analyze data and provides useful guidance and information for a business. The concept can be traced back to the 19th century, but it was not widely adopted until the end of the 20th century.

The basic idea of multilabel classification is to sort the items in the data into a set of labels that can be analyzed. Simply put, each category of documents is assigned a set of labels, each expressing a different piece of information. Labels can be related words, keywords, attributes, categories, timestamps, and so on. Multilabel classification only works when enough labels are available under each category. It can also be extended indefinitely, for example by designing richer label combinations in a hierarchical structure.

The applications of multilabel classification are very broad, spanning computer vision, natural language processing, text mining, social media, biology, and many other fields. In text mining, for example, it can assign labels to documents or comments; in computer vision it can be used for image classification; in natural language processing it supports semantic analysis; and in social media it can be applied to collections of tweets.

There are two main families of methods: one based on decision trees and one based on support vector machines. A decision tree is a decision-based machine learning technique; building a decision tree model makes the interrelations of a set of features interpretable, so that judgments can be made about the data. A support vector machine, by contrast, looks for a hyperplane in the data space that separates the samples into two classes; the wider the margin around the hyperplane, the easier the samples are to tell apart, and different kernel functions can be used to fit multi-dimensional data and achieve the desired classification.
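As an aside not in the original text, scikit-learn's decision trees accept a multi-label 0/1 indicator matrix directly, which gives a compact illustration of the tree-based route on synthetic data:

import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.tree import DecisionTreeClassifier

# synthetic data: Y is an n_samples x n_labels indicator matrix
X, Y = make_multilabel_classification(n_samples=200, n_features=8,
                                      n_classes=5, random_state=1)

# sklearn decision trees support multi-label targets natively
tree = DecisionTreeClassifier(max_depth=5, random_state=1)
tree.fit(X, Y)
print(tree.predict(X[:3]))      # one 0/1 vector of 5 labels per sample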
Multilabel classification also has room to develop further, including higher-accuracy classification performance and the exploration of new labeled datasets. As with traditional classification, its errors can be measured in terms of dataset size, model accuracy, and other factors, and several avenues of improvement exist, such as data augmentation, model ensembling, and transfer learning. As the technology advances, multilabel classification is likely to be applied in ever more scenarios.

In short, multilabel classification is an important machine learning technique: it sorts data into sets of labels, providing useful guidance and information for a business.
MULTI-LABEL CLASSIFICATION OF MUSIC INTO EMOTIONS

Konstantinos Trohidis, Dept. of Journalism & Mass Communication, Aristotle University of Thessaloniki (trohidis2000@)
Grigorios Tsoumakas, Dept. of Informatics, Aristotle University of Thessaloniki (greg@csd.auth.gr)
George Kalliris, Dept. of Journalism & Mass Communication, Aristotle University of Thessaloniki (gkal@auth.gr)
Ioannis Vlahavas, Dept. of Informatics, Aristotle University of Thessaloniki (vlahavas@csd.auth.gr)

ABSTRACT

In this paper, the automated detection of emotion in music is modeled as a multilabel classification task, where a piece of music may belong to more than one class. Four algorithms are evaluated and compared in this task. Furthermore, the predictive power of several audio features is evaluated using a new multilabel feature selection method. Experiments are conducted on a set of 593 songs with 6 clusters of music emotions based on the Tellegen-Watson-Clark model. Results provide interesting insights into the quality of the discussed algorithms and features.

1 INTRODUCTION

Humans, by nature, are emotionally affected by music. Who can argue against the famous quote of the German philosopher Friedrich Nietzsche, who said that "without music, life would be a mistake". As music databases grow in size and number, the retrieval of music by emotion is becoming an important task for various applications, such as song selection in mobile devices [13], music recommendation systems [1], TV and radio programs, and music therapy.

Past approaches towards automated detection of emotions in music modeled the learning problem as a single-label classification [9, 20], regression [19], or multilabel classification [6, 7, 17] task. Music may evoke more than one different emotion at the same time. We would like to be able to retrieve a piece of music based on any of the associated (classes of) emotions. Single-label classification and regression cannot model this multiplicity. Therefore, the focus of this paper is on multilabel classification methods.

A secondary contribution of this paper is a new multilabel dataset with 72 music features for 593 songs categorized into one or more out of 6 classes of emotions. The dataset is released to the public (http://mlkd.csd.auth.gr/multilabel.html), in order to allow comparative experiments by other researchers. Publicly available multilabel datasets are rare, hindering the progress of research in this area.

The primary contribution of this paper is twofold:

• A comparative experimental evaluation of four multilabel classification algorithms on the aforementioned dataset using a variety of evaluation measures. Previous work experimented with just a single algorithm. We attempt to raise the awareness of the MIR community on some of the recent developments in multilabel classification and show which of those algorithms perform better for musical data.

• A new multilabel feature selection method. The proposed method is experimentally compared against two other methods of the literature. The results show that it can improve the performance of a multilabel classification algorithm that doesn't take feature importance into account.

The remainder of this paper is structured as follows. Sections 2 and 3 provide background material on multilabel classification and emotion modeling respectively. Section 4 presents the details of the dataset used in this paper. Section 5 presents experimental results comparing the four multilabel classification algorithms and Section 6 discusses the new multilabel feature selection method. Section 7 presents related work and, finally, conclusions and future work are drawn in Section 8.
2 MULTILABEL CLASSIFICATION

Traditional single-label classification is concerned with learning from a set of examples that are associated with a single label λ from a set of disjoint labels L, |L| > 1. In multilabel classification, the examples are associated with a set of labels Y ⊆ L.

2.1 Learning Algorithms

Multilabel classification methods can be categorized into two different groups [14]: i) problem transformation methods, and ii) algorithm adaptation methods. The first group contains methods that are algorithm independent. They transform the multilabel classification task into one or more single-label classification, regression or ranking tasks. The second group contains methods that extend specific learning algorithms in order to handle multilabel data directly.

2.2 Evaluation Measures

Multilabel classification requires different evaluation measures than traditional single-label classification. A taxonomy of multilabel classification evaluation measures is given in [15], which considers two main categories: example-based and label-based measures. A third category of measures, which is not directly related to multilabel classification but is often used in the literature, is ranking-based measures, which are nicely presented in [21] among other publications.

3 MUSIC AND EMOTION

Hevner [4] was the first to study the relation between music and emotion. She discovered 8 clusters of adjective sets describing music emotion and created an emotion cycle of these categories. Hevner's adjectives were refined and regrouped into ten groups by Farnsworth [2].

Figure 1 shows another emotion model, called Thayer's model of mood [12], which consists of 2 axes. The horizontal axis describes the amount of stress and the vertical axis the amount of energy.

Figure 1. Thayer's model of mood

The model depicted in Figure 2 extends Thayer's model with a second system of axes, which is rotated by 45 degrees compared to the original axes [11]. The new axes describe (un)pleasantness versus (dis)engagement.

Figure 2. The Tellegen-Watson-Clark model of mood (figure reproduced from [18])

4 DATASET

The dataset used for this work consists of 100 songs from each of the following 7 different genres: Classical, Reggae, Rock, Pop, Hip-Hop, Techno and Jazz. The collection was created from 233 musical albums, choosing three songs from each album. From each song a period of 30 seconds after the initial 30 seconds was extracted. The resulting sound clips were stored and converted into wave files of 22050 Hz sampling rate, 16-bit per sample and mono. The following subsections present the features that were extracted from each wave file and the emotion labeling process.

4.1 Feature Extraction

For the feature extraction process, the Marsyas tool [16] was used. The extracted features fall into two categories: rhythmic and timbre.

4.1.1 Rhythmic Features

The rhythmic features were derived by extracting periodic changes from a beat histogram. An algorithm that identifies peaks using autocorrelation was implemented. We selected the two highest peaks and computed their amplitudes, their BPMs (beats per minute) and the high-to-low ratio of their BPMs. In addition, 3 features were calculated by summing the histogram bins between 40-90, 90-140 and 140-250 BPMs respectively. The whole process led to a total of 8 rhythmic features.

4.1.2 Timbre Features

Mel Frequency Cepstral Coefficients (MFCCs) are used for speech recognition and music modeling [8]. To derive MFCC features, the signal was divided into frames and the amplitude spectrum was calculated for each frame.
Next, its logarithm was taken and converted to the Mel scale. Finally, the discrete cosine transform was applied. We selected the first 13 MFCCs. Another set of 3 features relating to timbre texture was extracted from the Short-Term Fourier Transform (FFT): spectral centroid, spectral rolloff and spectral flux. For each of the 16 aforementioned features (13 MFCCs, 3 FFT) we calculated the mean, standard deviation (std), mean standard deviation (mean std) and standard deviation of standard deviation (std std) over all frames. This led to a total of 64 timbre features.

4.2 Emotion Labeling

The Tellegen-Watson-Clark model was employed for labeling the data with emotions. We decided to use this particular model because the emotional space of music is abstract with many emotions, and a music application based on mood should combine a series of moods and emotions. To achieve this goal without using an excessive number of labels, we reached a compromise, retaining only 6 main emotional clusters from this model. The corresponding labels are presented in Table 1.

Label  Description       #Examples
L1     amazed-surprised  173
L2     happy-pleased     166
L3     relaxing-calm     264
L4     quiet-still       148
L5     sad-lonely        168
L6     angry-fearful     189

Table 1. Description of emotion clusters

The sound clips were annotated by three male experts of age 20, 25 and 30 from the School of Music Studies in our University. Only the songs with completely identical labeling from all experts were kept for subsequent experimentation. This process led to a final annotated dataset of 593 songs. Potential reasons for this unexpectedly high agreement of the experts are the short track length and their common background. The last column of Table 1 shows the number of examples annotated with each label.

5 EMPIRICAL COMPARISON OF ALGORITHMS

5.1 Multilabel Classification Algorithms

We compared the following multilabel classification algorithms: binary relevance (BR), label powerset (LP), random k-labelsets (RAKEL) [15] and multilabel k-nearest neighbor (MLkNN) [21]. The first three are problem transformation methods, while the last one is an algorithm adaptation method. The first two approaches were selected as they are the most basic approaches for multilabel classification tasks. BR considers the prediction of each label as an independent binary classification task, while LP considers the multi-class problem of predicting each member of the powerset of L that exists in the training set (see [15] for a more extensive presentation of BR and LP). RAKEL was selected as a recent method that has been shown to be more effective than the first two [15]. Finally, MLkNN was selected as a recent high-performance representative of algorithm adaptation methods [21]. Apart from BR, none of the other algorithms have been evaluated on music data in the past, to the best of our knowledge.
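All four algorithms have open-source implementations; the sketch below shows one possible way to run them with the scikit-multilearn library, using synthetic data as a stand-in for the emotions dataset and default hyperparameters rather than the tuned settings of Section 5.2 (the library choice and the data are illustrative assumptions, not part of the paper):

import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.metrics import hamming_loss
from sklearn.svm import SVC
from skmultilearn.adapt import MLkNN
from skmultilearn.ensemble import RakelD
from skmultilearn.problem_transform import BinaryRelevance, LabelPowerset

# synthetic stand-in for the 593-song, 72-feature, 6-label dataset
X, Y = make_multilabel_classification(n_samples=593, n_features=72,
                                      n_classes=6, random_state=0)

models = {
    "BR": BinaryRelevance(classifier=SVC(kernel="linear", C=1.0)),
    "LP": LabelPowerset(classifier=SVC(kernel="linear", C=1.0)),
    "RAKEL": RakelD(base_classifier=SVC(kernel="linear", C=1.0), labelset_size=3),
    "MLkNN": MLkNN(k=10),
}
for name, model in models.items():
    model.fit(X, Y)
    pred = model.predict(X)                   # skmultilearn returns a sparse matrix
    print(name, round(hamming_loss(Y, pred.toarray()), 4))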
5.2 Experimental Setup

LP, BR and RAKEL were run using a support vector machine (SVM) as the base classifier. The SVM was trained with a linear kernel and the complexity constant C equal to 1. The one-against-one strategy is used for dealing with multi-class tasks in the case of LP and RAKEL. The number of neighbors in MLkNN was set to 10. RAKEL has three parameters that need to be selected prior to training the algorithm: a) the subset size, b) the number of models and c) the threshold for the final output. We used an internal 5-fold cross-validation on the training set in order to automatically select these parameters. The subset size was varied from 2 to 5, the number of models from 1 to 100 and the threshold from 0.1 to 0.9 with a 0.1 step. 10 different 10-fold cross-validation experiments were run for evaluation. The results that follow are averages over these 100 runs of the different algorithms.

5.3 Results

Table 2 shows the predictive performance of the 4 competing multilabel classification algorithms using a variety of measures. We notice that RAKEL dominates the other algorithms in almost all measures.

                 BR      LP      RAKEL   MLkNN
Hamming Loss     0.1943  0.1964  0.1845  0.2616
Micro F1         0.6526  0.6921  0.7002  0.4741
Micro AUC        0.7465  0.7781  0.8237  0.7540
Macro F1         0.6002  0.6782  0.6766  0.3716
Macro AUC        0.7344  0.7717  0.8115  0.7185
One-error        0.3038  0.2957  0.2669  0.3894
Coverage         2.4378  2.226   1.9974  2.2715
Ranking Loss     0.4517  0.3638  0.2635  0.2603
Avg. Precision   0.7378  0.7669  0.7954  0.7104

Table 2. Performance results

Table 3 shows the cpu time in seconds consumed during the training, parameter selection and testing phases of the algorithms. We notice that BR and MLkNN require very little training time, as their complexity is linear with respect to the number of labels. The complexity of LP depends on the number of distinct label subsets that exist in the training set, which is typically larger than the number of labels. While the training complexity of RAKEL is bound by the subset size parameter, its increased time comes from the multiple models that it builds, since it is an ensemble method. RAKEL further requires a comparatively significant amount of time for parameter selection. However, this time is still affordable (2.5 minutes), as it is only run offline.

                     BR    LP    RAKEL   MLkNN
Training             0.77  3.07  6.66    0.51
Parameter selection  0     0     151.59  0
Testing              0.00  0.02  0.03    0.06

Table 3. Average training, parameter selection and testing cpu time in seconds

Concerning the test time, we notice that BR is the fastest algorithm, followed by LP and RAKEL. MLkNN is the most time-consuming algorithm during testing, as it must calculate the k nearest neighbors online after the query.

Table 4 shows the classification accuracy of the algorithms for each label (as if they were independently predicted), along with the average accuracy in the last column. We notice that, based on the ease of predictions, we can rank the labels in the following descending order: L4, L6, L5, L1, L3, L2. L4 is the easiest with a mean accuracy of approximately 87%, followed by L6, L5 and L1 with mean accuracies of approximately 80%, 79% and 78% respectively. The hardest labels are L2 and L3 with a mean accuracy of approximately 73% and 76% respectively.

      BR      LP      RAKEL   MLkNN   Avg
L1    0.7900  0.7906  0.7982  0.7446  0.7809
L2    0.7115  0.7380  0.7587  0.7195  0.7319
L3    0.7720  0.7705  0.7854  0.7221  0.7625
L4    0.8997  0.8992  0.9031  0.7969  0.8747
L5    0.8287  0.8093  0.8236  0.7051  0.7917
L6    0.8322  0.8142  0.8238  0.7422  0.8031

Table 4. Accuracy per label

Based on the results, one can see that the classification model performs better for emotional labels such as label 4 (quiet) than for label 2 (happy). This can be explained by the fact that emotions such as quietness are easily perceived and classified by humans in a musical context, being more objective than difficult and abstract emotional labels such as happiness, which is more subjective.
6 EMPIRICAL FEATURE EVALUATION

Multilabel feature selection has been mainly applied to the domain of text categorization, due to the typically high dimensionality of textual data. The standard approach is to consider each label separately, use an attribute evaluation statistic (such as χ², gain ratio, etc.) for each label, and then combine the results using an averaging approach. Two averaging approaches that appear in the literature are max, which considers the maximum score for each feature across all labels, and avg, which considers the average of the score of each feature across all labels, weighted by the prior probability of each label. The disadvantage of these approaches, similarly to BR, is that they do not consider label correlations.

We propose a different approach in this paper, which has not been discussed in the literature before, to the best of our knowledge. At a first step, we apply the transformation of the LP method in order to produce a single-label classification dataset. Then, we apply a common attribute evaluation statistic. We argue that this approach could be more beneficial than the others, as it considers label correlations.

In order to evaluate our hypothesis, we compare the Hamming loss of the MLkNN algorithm (known to suffer from the curse of dimensionality) using the best 1 to 71 features according to the three feature selection approaches using the χ² statistic. Figure 3 shows the results.

Figure 3. Hamming loss of the MLkNN classifier using the best 1 to 71 features as ordered by the χ² feature selection method, using the max and avg averaging approaches and the proposed method.

We notice that all approaches have similar performance for most of the horizontal axis (number of features retained), apart from the section from 36 to 50 features, where the proposed method leads to better results. It is in this area that the best result is achieved for MLkNN, which is a Hamming loss of approximately 0.235. This is an indication that taking label correlations into account may be fruitful, especially for selecting important features beyond those that are correlated directly with the labels.
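The proposed ranking is easy to sketch: apply the LP transformation by mapping each distinct label subset to a single class id, then score every feature against that class variable with the χ² statistic. The snippet below is an illustrative sketch on synthetic data (the paper does not prescribe an implementation); χ² requires non-negative features, hence the scaling step:

import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.feature_selection import chi2
from sklearn.preprocessing import MinMaxScaler

X, Y = make_multilabel_classification(n_samples=593, n_features=72,
                                      n_classes=6, random_state=0)

# LP transformation: each distinct label subset becomes one class id
_, y_lp = np.unique(Y, axis=0, return_inverse=True)

# chi2 expects non-negative feature values
X_pos = MinMaxScaler().fit_transform(X)
scores, _ = chi2(X_pos, y_lp)

ranking = np.argsort(scores)[::-1]    # feature indices, highest score first
print(ranking[:10])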
7 RELATED WORK

We discuss past efforts on emotion detection in music, mainly in terms of the emotion model, the extracted features and the kind of modeling of the learning problem: a) single-label classification, b) regression, and c) multilabel classification.

7.1 Single-label Classification

The four main emotion classes of Thayer's model were used as the emotion model in [9]. Three different feature sets were adopted for music representation, namely intensity, timbre and rhythm. Gaussian mixture models were used to model each of the four classes. An interesting contribution of this work was a hierarchical classification process, which first classifies a song into high/low energy (the vertical axis of Thayer's model), and then into one of the two high/low stress classes.

The same emotion classes were used in [20]. The authors experimented with two fuzzy classifiers, using the 15 features proposed in [10]. They also experimented with a feature selection method, which improved the overall accuracy (around 78%), but they do not mention which features were selected.

The classification of songs into a single cluster of emotions was a new category in the 2007 MIREX (Music Information Retrieval Evaluation eXchange) competition. The top two submissions of the competition³ were based on support vector machines. The model of mood that was used in the competition was 5 clusters of moods proposed in [5], which was compiled based on a statistical analysis of the relationship of mood with genre, artist and usage metadata. Among the many interesting conclusions of the competition was the difficulty of discerning between certain clusters of moods, due to their semantic overlap. A multilabel classification approach could overcome this problem, by allowing the specification of multiple finer-grain emotion classes.

³ …/mirex/2007

7.2 Regression

Emotion recognition is modeled as a regression task in [19]. Volunteers rated a training collection of songs in terms of arousal and valence on an ordinal scale of 11 values from -1 to 1 with a 0.2 step. The authors then trained regression models using a variety of algorithms (again SVMs perform best) and a variety of extracted features. Finally, a user could retrieve a song by selecting a point in the two-dimensional arousal and valence mood plane of Thayer.

Furthermore, the authors used a feature selection algorithm, leading to an increase of the predictive performance. However, it is not clear if the authors ran the feature selection process on all input data or on each fold of the 10-fold cross-validation used to evaluate the regressors. If the former is true, then their results may be optimistic, as the feature selection algorithm had access to the test data. A similar pitfall of feature selection in music classification is discussed in [3].

7.3 Multilabel Classification

Both regression and single-label classification methods suffer from the same problem: no two different (clusters of) emotions can be simultaneously predicted. Multilabel classification allows for a natural modeling of this issue.

Li and Ogihara [6] used two emotion models: a) the 10 adjective clusters of Farnsworth (extended with 3 clusters of adjectives proposed by the labeler) and b) a further clustering of those into 6 super-clusters. They only experimented with the BR multilabel classification method using SVMs as the underlying base single-label classifier. In terms of features, they used Marsyas [16] to extract 30 features related to timbral texture, rhythm and pitch. The predictive performance was low for the clusters and better for the super-clusters. In addition, they found evidence that genre is correlated with emotions.

In an extension of their work, Li and Ogihara [7] considered 3 bipolar adjective pairs: (Cheerful vs Depressing), (Relaxing vs Exciting), and (Comforting vs Disturbing). Each track was initially labeled using a scale ranging from -4 to +4 by two subjects and then converted to a binary (positive/negative) label. The learning approach was the same as in [6]. The feature set was expanded with a new extraction method, called Daubechies Wavelet Coefficient Histograms.
The authors report an accuracy of around 60%.

The same 13 clusters as in [6] were used in [17], where the authors modified the k Nearest Neighbors algorithm in order to handle multilabel data directly. They found that the predictive performance was low, too.

Compared to our work, none of the three aforementioned approaches discusses feature selection from multilabel data, compares different multilabel classification algorithms or uses a variety of multilabel evaluation measures in its empirical study.

8 CONCLUSIONS AND FUTURE WORK

The task of multi-label mapping of music into emotions was investigated. An evaluation of four multi-label classification algorithms was performed on a collection of 593 songs. Among these algorithms, RAKEL was the most effective and is proposed for emotion categorization. The overall predictive performance was high and encourages further investigation of multilabel methods. The performance per each different label varied. The subjectivity of the label may be influencing the performance of its prediction.

In addition, a new multilabel feature ranking method was proposed, which seems to perform better than existing methods in this domain. Feature ranking may assist researchers working on feature extraction by providing feedback on the predictive performance of current and newly designed individual features. It also improves the performance of multilabel classification algorithms, such as MLkNN, that don't take feature importance into account.

Multilabel classifiers such as RAKEL could be used for the automated annotation of large musical collections with multiple emotions. This in turn would support the implementation of music information retrieval systems that query music collections by emotion. Such a querying capability would be useful for song selection in various applications.

Future work will explore the effectiveness of new features based on time-frequency representation of music and lyrics, as well as the hierarchical multilabel classification approach, which we believe has great potential in this domain.

9 REFERENCES

[1] Rui Cai, Chao Zhang, Chong Wang, Lei Zhang, and Wei-Ying Ma. Musicsense: contextual music recommendation using emotional allocation modeling. In MULTIMEDIA '07: Proceedings of the 15th International Conference on Multimedia, pages 553–556, 2007.

[2] P. Farnsworth. The Social Psychology of Music. The Dryden Press, 1958.

[3] R. Fiebrink and I. Fujinaga. Feature selection pitfalls and music classification. In Proceedings of the International Conference on Music Information Retrieval (ISMIR 2006), pages 340–341, 2006.

[4] K. Hevner. Experimental studies of the elements of expression in music. American Journal of Psychology, 48:246–268, 1936.

[5] X. Hu and J. S. Downie. Exploring mood metadata: relationships with genre, artist and usage metadata. In Proceedings of the 8th International Conference on Music Information Retrieval (ISMIR 2007), pages 67–72, 2007.

[6] T. Li and M. Ogihara. Detecting emotion in music. In Proceedings of the International Symposium on Music Information Retrieval, pages 239–240, Washington D.C., USA, 2003.

[7] T. Li and M. Ogihara. Toward intelligent music information retrieval. IEEE Transactions on Multimedia, 8(3):564–574, 2006.

[8] B. Logan. Mel frequency cepstral coefficients for music modeling. In Proceedings of the 1st International Symposium on Music Information Retrieval (ISMIR 2000), Plymouth, Massachusetts, 2000.

[9] L. Lu, D. Liu, and H.-J. Zhang. Automatic mood detection and tracking of music audio signals. IEEE Transactions on Audio, Speech, and Language Processing, 14(1):5–18, January 2006.
[10] E. Schubert. Measurement and Time Series Analysis of Emotion in Music. PhD thesis, University of New South Wales, 1999.

[11] A. Tellegen, D. Watson, and L. A. Clark. On the dimensional and hierarchical structure of affect. Psychological Science, 10(4):297–303, July 1999.

[12] R. E. Thayer. The Biopsychology of Mood and Arousal. Oxford University Press, 1989.

[13] M. Tolos, R. Tato, and T. Kemp. Mood-based navigation through large collections of musical data. In 2nd IEEE Consumer Communications and Networking Conference (CCNC 2005), pages 71–75, 3-6 Jan. 2005.

[14] G. Tsoumakas and I. Katakis. Multi-label classification: An overview. International Journal of Data Warehousing and Mining, 3(3):1–13, 2007.

[15] G. Tsoumakas and I. Vlahavas. Random k-labelsets: An ensemble method for multilabel classification. In Proceedings of the 18th European Conference on Machine Learning (ECML 2007), pages 406–417, Warsaw, Poland, September 17-21, 2007.

[16] G. Tzanetakis and P. Cook. Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5):293–302, July 2002.

[17] A. Wieczorkowska, P. Synak, and Z. W. Ras. Multi-label classification of emotions in music. In Proceedings of the 2006 International Conference on Intelligent Information Processing and Web Mining (IIPWM '06), pages 307–315, 2006.

[18] D. Yang and W. Lee. Disambiguating music emotion using software agents. In Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR '04), Barcelona, Spain, 2004.

[19] Y.-H. Yang, Y.-C. Lin, Y.-F. Su, and H.-H. Chen. A regression approach to music emotion recognition. IEEE Transactions on Audio, Speech and Language Processing (TASLP), 16(2):448–457, February 2008.

[20] Y.-H. Yang, C.-C. Liu, and H.-H. Chen. Music emotion classification: A fuzzy approach. In Proceedings of ACM Multimedia 2006 (MM '06), pages 81–84, Santa Barbara, CA, USA, 2006.

[21] M.-L. Zhang and Z.-H. Zhou. ML-kNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7):2038–2048, 2007.