
Audio Classification (Test/Train) tasks

Description

Many tasks in music classification can be characterized as a two-stage process: training classification models using labeled data and testing the models using new/unseen data. Therefore, we propose this "meta" task, which includes various audio classification tasks that follow this Train/Test process. For MIREX 2011, the following classification sub-tasks are included:

Audio Classical Composer Identification
Audio US Pop Music Genre Classification
Audio Latin Music Genre Classification
Audio Mood Classification

All of these classification tasks were conducted in previous MIREX runs (please see the links to previous results at the end of this page). This page presents the evaluation of these tasks, including the datasets as well as the submission rules and formats.

Task specific mailing list

In the past we have used a specific mailing list for the discussion of this task and related tasks. This year, however, we are asking that all discussions take place on the MIREX "EvalFest" list. If you have a question or comment, simply include the task name in the subject heading.

Data

Audio Classical Composer Identification

This dataset requires algorithms to classify music audio according to the composer of the track (drawn from a collection of performances of a variety of classical music genres). The collection used at MIREX 2009 will be re-used.

Collection statistics:

2772 30-second 22.05 kHz mono wav clips
11 "classical" composers (252 clips per composer), including:
- Bach
- Beethoven
- Brahms
- Chopin
- Dvorak
- Handel
- Haydn
- Mendelssohn
- Mozart
- Schubert
- Vivaldi

Audio US Pop Music Genre Classification

This dataset requires algorithms to classify music audio according to the genre of the track (drawn from a collection of US Pop music tracks). The MIREX 2007 Genre dataset will be re-used, which was drawn from the USPOP 2002 and USCRAP collections.

Collection statistics:

7000 30-second audio clips in 22.05kHz mono WAV format
10 genres (700 clips from each genre), including:
- Blues
- Jazz
- Country/Western
- Baroque
- Classical
- Romantic
- Electronica
- Hip-Hop
- Rock
- HardRock/Metal

Audio Latin Music Genre Classification

This dataset requires algorithms to classify music audio according to the genre of the track (drawn from a collection of Latin popular and dance music, sourced from Brazil and hand labeled by music experts). Carlos Silla's (cns2 (at) kent (dot) ac (dot) uk) Latin popular and dance music dataset [1] will be re-used. This collection is likely to contain a greater number of styles of music that will be differentiated by rhythmic characteristics than the MIREX 2007 dataset.

Collection statistics:

3,227 audio files in 22.05kHz mono WAV format
10 Latin music genres, including:
- Axe
- Bachata
- Bolero
- Forro
- Gaucha
- Merengue
- Pagode
- Sertaneja
- Tango

Audio Mood Classification

This dataset requires algorithms to classify music audio according to the mood of the track (drawn from a collection of production music sourced from the APM collection [2]). The MIREX 2007 Mood Classification dataset [3] will be re-used.

Collection statistics:

600 30-second audio clips in 22.05kHz mono WAV format selected from the APM collection [4], and labeled by human judges using the Evalutron6000 system.
5 mood categories [5] each of which contains 120 clips:

Cluster_1: passionate, rousing, confident, boisterous, rowdy
Cluster_2: rollicking, cheerful, fun, sweet, amiable/good natured
Cluster_3: literate, poignant, wistful, bittersweet, autumnal, brooding
Cluster_4: humorous, silly, campy, quirky, whimsical, witty, wry
Cluster_5: aggressive, fiery, tense/anxious, intense, volatile, visceral


Cluster Set: Many albums and songs appear in multiple mood label lists. This overlap can be exploited to group similar mood labels into several mood clusters. Clustering condenses the data distribution and gives us a more concise, higher-level view of the mood "space". The set of albums and songs assigned to the mood labels in the mood clusters forms our third dataset (described below).

Audio Formats

For all datasets, participating algorithms will have to read audio in the following format:

Sample rate: 22.05 kHz
Sample size: 16 bit
Number of channels: 1 (mono)
Encoding: WAV
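
As a rough illustration of the format requirements above, the following sketch checks a file with Python's standard `wave` module. It assumes the sample rate means the 22.05 kHz (22050 Hz) quoted in the collection statistics; the function name `check_wav_format` is hypothetical and not part of any MIREX tooling.

```python
import wave

# Expected properties, taken from the format list above.
# Assumption: the sample rate is exactly 22050 Hz.
EXPECTED_RATE = 22050
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16 bit
EXPECTED_CHANNELS = 1      # mono


def check_wav_format(path):
    """Return a list of problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample size is {8 * wav.getsampwidth()} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"{wav.getnchannels()} channels")
    return problems
```

A submission could run such a check once during feature extraction and fail early with a clear message instead of producing features from unexpected input.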

Evaluation

This section first describes evaluation methods common to all the datasets, then specifies settings unique to each of the tasks.

Participating algorithms will be evaluated with 3-fold cross validation. For Artist Identification and Classical Composer Classification, album filtering will be applied to the test and training splits, i.e. training and test sets will contain tracks from different albums. For US Pop Genre Classification (which appears to correspond to the Mixed Set genre classification of earlier years) and Latin Genre Classification, artist filtering will be applied to the test and training splits, i.e. training and test sets will contain different artists.
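
For local experiments, album or artist filtering can be approximated by assigning whole groups to folds. The sketch below is only an assumption about how such a split might be built, not the procedure used by the organizers; the names `grouped_folds` and `train_test_splits` are hypothetical, and the greedy balancing ignores class proportions.

```python
from collections import defaultdict


def grouped_folds(tracks, n_folds=3):
    """Split (path, label, group) tuples so that no group spans two folds.

    `group` is the album or artist identifier used for filtering.
    """
    by_group = defaultdict(list)
    for path, label, group in tracks:
        by_group[group].append((path, label))

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: largest groups first, each into the smallest fold.
    for group in sorted(by_group, key=lambda g: (-len(by_group[g]), g)):
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(by_group[group])
    return folds


def train_test_splits(folds):
    """Yield (train, test) pairs, using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, test
```

Because every track of a group lands in the same fold, a model never sees the test album (or artist) during training, which is the point of the filtering.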

The raw classification (identification) accuracy, standard deviation and a confusion matrix for each algorithm will be computed.
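
The measures named above can be sketched in a few lines. This is an illustrative reading, not the official evaluation code: the function names are hypothetical, labels are held in dicts keyed by audio path, and the choice of population standard deviation across folds is an assumption.

```python
from collections import Counter
from statistics import mean, pstdev


def accuracy(truth, predicted):
    """Fraction of tracks whose predicted label equals the true label."""
    correct = sum(1 for path, label in truth.items() if predicted.get(path) == label)
    return correct / len(truth)


def confusion_matrix(truth, predicted, classes):
    """Return counts[true_class][predicted_class] as nested dicts."""
    counts = Counter((label, predicted.get(path)) for path, label in truth.items())
    return {t: {p: counts[(t, p)] for p in classes} for t in classes}


def summarize(fold_accuracies):
    """Mean and standard deviation of per-fold accuracies.

    Assumption: population standard deviation; the task description
    does not say which variant is reported.
    """
    return mean(fold_accuracies), pstdev(fold_accuracies)
```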

Classification accuracies will be tested for statistically significant differences using Friedman's ANOVA with Tukey-Kramer honestly significant difference (HSD) tests for multiple comparisons. This test will be used to rank the algorithms and to group them into sets of equivalent performance.
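
To make the ranking step more concrete, the sketch below computes the textbook Friedman chi-square statistic and the mean rank of each algorithm in plain Python. It is a simplified illustration under several assumptions: each row of `scores` is one block (for example a fold), no tie correction is applied, and the Tukey-Kramer HSD comparison that follows in the actual evaluation is not shown. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores[b][a] is the accuracy of algorithm a on block b.

    Returns (chi_square, mean_rank_per_algorithm), without tie correction.
    """
    n = len(scores)      # number of blocks
    k = len(scores[0])   # number of algorithms
    rank_sums = [0.0] * k
    for row in scores:
        for a, r in enumerate(average_ranks(row)):
            rank_sums[a] += r
    chi_square = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi_square, [r / n for r in rank_sums]
```

With accuracies as scores, a higher mean rank indicates a better algorithm; the statistic is compared against a chi-square distribution with k - 1 degrees of freedom.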

In addition, computation times for feature extraction and training/classification will be measured.

Submission Format

File I/O Format

The audio files to be used in these tasks will be specified in a simple ASCII list file. The formats for the list files are specified below:

Feature extraction list file

The list file passed for feature extraction will be a simple ASCII list file. This file will contain one path per line with no header line.

I.e.

<example path and filename>

E.g.

/path/to/track1.wav
/path/to/track2.wav
...

Training list file

The list file passed for model training will be a simple ASCII list file. This file will contain one path per line, followed by a tab character and the class (artist, genre or mood) label, again with no header line.

I.e.

<example path and filename>\t<class label>

E.g.

/path/to/track1.wav	rock
/path/to/track2.wav	blues
...
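
A training list in this format can be read with a few lines of Python. This is a minimal sketch; the function name `read_training_list` is hypothetical, and skipping blank lines is a convenience assumption rather than part of the specification.

```python
def read_training_list(list_path):
    """Return (audio_path, class_label) pairs from a training list file."""
    pairs = []
    with open(list_path, "r", encoding="ascii") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # tolerate blank lines, e.g. at the end of the file
            audio_path, sep, label = line.partition("\t")
            if not sep or not label:
                raise ValueError(f"{list_path}:{line_number}: expected <path>\\t<label>")
            pairs.append((audio_path, label))
    return pairs
```

Splitting on the first tab only keeps labels such as "Country/Western" or "HardRock/Metal" intact.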

Test (classification) list file

The list file passed for testing classification will be a simple ASCII list file identical in format to the Feature extraction list file. This file will contain one path per line with no header line.

I.e.

<example path and filename>

E.g.

/path/to/track1.wav
/path/to/track2.wav
...

Classification output file

Participating algorithms should produce a simple ASCII list file identical in format to the Training list file. This file will contain one path per line, followed by a tab character and the predicted class label, again with no header line.

I.e.

<example path and filename>\t<class label>

E.g.

/path/to/track1.wav	classical
/path/to/track2.wav	blues
...
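
Writing the output file is the mirror image of reading the training list. The following sketch assumes predictions are available as (path, label) pairs; the function name `write_classification_output` is hypothetical.

```python
def write_classification_output(output_path, predictions):
    """Write (audio_path, class_label) pairs, one per line, tab-separated."""
    with open(output_path, "w", encoding="ascii", newline="\n") as handle:
        for audio_path, label in predictions:
            # A stray tab would make the line ambiguous for the evaluator.
            if "\t" in audio_path or "\t" in label:
                raise ValueError("paths and labels must not contain tab characters")
            handle.write(f"{audio_path}\t{label}\n")
```

Paths should be written exactly as they appeared in the test list file, since that is the only key the evaluation has for matching predictions to tracks.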

Submission calling formats

Algorithms should divide their feature extraction and training/classification into separate runs. This will facilitate a single feature extraction step for the task, while training and classification can be run for each cross-validation fold.

Hence, participants should provide two executables or command line parameters for a single executable to run the two separate processes.

Executables will have to accept the paths to the aforementioned list files as command line parameters.

Scratch folders will be provided for all submissions for the storage of feature files and any model files to be produced. Executables will have to accept the path to their scratch folder as a command line parameter. Executables will also have to track which feature files correspond to which audio files internally. To facilitate this process, unique file names will be assigned to each audio track.
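
One simple way to track which feature file belongs to which audio file is to derive the feature file name from the audio file name, relying on the statement above that file names are unique. The sketch below is one possible convention, not a requirement; the function name and the `.features` suffix are assumptions.

```python
import os


def feature_path(scratch_dir, audio_path, suffix=".features"):
    """Map an audio file to its feature file inside the scratch folder."""
    stem = os.path.splitext(os.path.basename(audio_path))[0]
    return os.path.join(scratch_dir, stem + suffix)
```

For example, `feature_path("/path/to/scratch/folder", "/path/to/track1.wav")` points at `track1.features` inside the scratch folder, so the training and classification runs can locate features without a separate index file.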

Example submission calling formats

 extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt
 TrainAndClassify.sh /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputListFile.txt

 extractFeatures.sh /path/to/scratch/folder /path/to/featureExtractionListFile.txt
 Train.sh /path/to/scratch/folder /path/to/trainListFile.txt
 Classify.sh /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputListFile.txt

 myAlgo.sh -extract /path/to/scratch/folder /path/to/featureExtractionListFile.txt
 myAlgo.sh -train /path/to/scratch/folder /path/to/trainListFile.txt
 myAlgo.sh -classify /path/to/scratch/folder /path/to/testListFile.txt /path/to/outputListFile.txt

Multi-processor compute nodes will be used to run this task. However, we ask that submissions use no more than 4 cores (as we will be running a lot of submissions and will need to run some in parallel). Ideally, the number of threads to use should be specified as a command line parameter. Alternatively, implementations may be provided in hard-coded 1, 2 or 4 thread/core configurations.

 extractFeatures.sh -numThreads 4 /path/to/scratch/folder /path/to/featureExtractionListFile.txt
 TrainAndClassify.sh -numThreads 4 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputListFile.txt

 myAlgo.sh -extract -numThreads 4 /path/to/scratch/folder /path/to/featureExtractionListFile.txt
 myAlgo.sh -TrainAndClassify -numThreads 4 /path/to/scratch/folder /path/to/trainListFile.txt /path/to/testListFile.txt /path/to/outputListFile.txt
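
If the algorithm behind such a wrapper script is written in Python, the single-executable calling format with a thread count could be parsed roughly as follows. This is a sketch using the standard `argparse` module; the flag names mirror the examples above, while the function names, the default of one thread, and the restriction of `-numThreads` to 1, 2 or 4 are assumptions.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="myAlgo")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-extract", action="store_true")
    mode.add_argument("-TrainAndClassify", action="store_true")
    # Assumption: only the 1, 2 and 4 thread configurations are accepted.
    parser.add_argument("-numThreads", type=int, choices=[1, 2, 4], default=1)
    parser.add_argument("scratch_dir")
    parser.add_argument("list_files", nargs="+")
    return parser


def parse_args(argv):
    """Parse the arguments and check the number of list file paths."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # -extract takes one list file; -TrainAndClassify takes train, test, output.
    expected = 1 if args.extract else 3
    if len(args.list_files) != expected:
        parser.error(f"expected {expected} list file path(s), got {len(args.list_files)}")
    return args
```

In a script, `parse_args(sys.argv[1:])` would be called once at startup; rejecting a wrong number of list files early gives a clearer error than failing halfway through a fold.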

Packaging submissions

All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed). IMIRSEL should be notified of any dependencies that you cannot include with your submission at the earliest opportunity (in order to give them time to satisfy the dependency).
Be sure to follow the Best Coding Practices for MIREX
Be sure to follow the MIREX 2011 Submission Instructions

All submissions should include a README file containing the following information:

Command line calling format for all executables including examples
Number of threads/cores used or whether this should be specified on the command line
Expected memory footprint
Expected runtime
Approximately how much scratch disk space will the submission need to store any feature/cache files?
Any required environments/architectures (and versions) such as Matlab, Java, Python, Bash, Ruby etc.
Any special notes regarding running your algorithm

Note that the information that you place in the README file is extremely important in ensuring that your submission is evaluated properly.

Time and hardware limits

Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions will be imposed.

A hard limit of 24 hours will be imposed on feature extraction times.

A hard limit of 48 hours will be imposed on the 3 training/classification cycles, leading to a total runtime limit of 72 hours for each submission.

Submission opening date

Friday August 5th 2011

Submission closing date

Friday August 26th 2011

Potential Participants

name / email

Participation in previous years and Links to Results

Year	Participating Algorithms	URL

2010	27	http://nema.lis.illinois.edu/nema_out/4ffcb482-b83c-4ba6-bc42-9b538b31143c/results/evaluation/
	24	http://nema.lis.illinois.edu/nema_out/6731c97a-240c-4d3d-8be9-90d715ea04e1/results/evaluation/
	24	http://nema.lis.illinois.edu/nema_out/2b5839b3-3012-4f76-8807-31823588ae25/results/evaluation/
	36	http://nema.lis.illinois.edu/nema_out/9b11a5c8-9fcf-4029-95eb-51ed561cfb5f/results/evaluation/
2009	30	http://www.music-ir.org/mirex/wiki/2009:Audio_Classical_Composer_Identification_Results
	33	http://www.music-ir.org/mirex/wiki/2009:Audio_Genre_Classification_%28Latin_Set%29_Results
	31	http://www.music-ir.org/mirex/wiki/2009:Audio_Genre_Classification_%28Mixed_Set%29_Results
	33	http://www.music-ir.org/mirex/wiki/2009:Audio_Music_Mood_Classification_Results
2008	11	http://www.music-ir.org/mirex/wiki/2008:Audio_Artist_Identification_Results
	11	http://www.music-ir.org/mirex/wiki/2008:Audio_Classical_Composer_Identification_Results
	13	http://www.music-ir.org/mirex/wiki/2008:Audio_Genre_Classification_Results
	13	http://www.music-ir.org/mirex/wiki/2008:Audio_Music_Mood_Classification_Results
2007	7	http://www.music-ir.org/mirex/wiki/2007:Audio_Artist_Identification_Results
	7	http://www.music-ir.org/mirex/wiki/2007:Audio_Classical_Composer_Identification_Results
	7	http://www.music-ir.org/mirex/wiki/2007:Audio_Genre_Classification_Results
	9	http://www.music-ir.org/mirex/wiki/2007:Audio_Music_Mood_Classification_Results
2005	7	http://www.music-ir.org/evaluation/mirex-results/audio-artist/index.html
	13	http://www.music-ir.org/evaluation/mirex-results/audio-genre/index.html
