
Southall, C., Stables, R., Hockman, J., 2018.

Player vs transcriber: A game approach to data manipulation for automatic drum transcription

Output Type: Conference paper
Publication: Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018
Pagination: pp. 58-65

State-of-the-art automatic drum transcription (ADT) approaches utilise deep learning methods that rely on time-consuming manual annotations and require congruence between training and testing data. When these conditions do not hold, such systems often fail to generalise. We propose a game approach to ADT, termed player vs transcriber (PvT), in which a player model aims to reduce the transcription accuracy of a transcriber model by manipulating the training data in two ways. First, existing data may be augmented, allowing the transcriber to be trained using recordings with modified timbres. Second, additional individual recordings from sample libraries are included to generate rare combinations. We present three versions of the PvT model: AugExist, which augments pre-existing recordings; AugAddExist, which adds samples of individual drum hits to the AugExist system; and Generate, which generates training examples exclusively from individual drum hits taken from sample libraries. The three versions are evaluated alongside a state-of-the-art deep learning ADT system using two evaluation strategies. The results demonstrate that including the player network improves ADT performance and suggest that this is due to improved generalisability. The results also indicate that, although the Generate model achieves relatively low results, it is a viable choice when annotations are not accessible.
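The adversarial idea described in the abstract, a player that manipulates training data to lower the transcriber's accuracy, followed by a transcriber that retrains on the result, can be illustrated with a deliberately simplified sketch. This is not the authors' implementation: the paper uses neural networks, whereas here the transcriber is a hypothetical threshold detector, the "timbre modification" is a plain gain change, and every function name and value is an assumption made for illustration.

```python
# Toy sketch of one player-vs-transcriber round. Everything here is
# hypothetical: the paper uses neural networks, not a threshold detector
# or a search over gains.

def transcribe(frames, threshold):
    """Return the indices of frames whose energy reaches the threshold."""
    return {i for i, energy in enumerate(frames) if energy >= threshold}

def f_measure(predicted, reference):
    """F-measure between two sets of onset frame indices."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)

def augment(frames, gain):
    """Stand-in for a timbre modification: scale every frame by a gain."""
    return [energy * gain for energy in frames]

def player_move(frames, reference, threshold, gains):
    """Player: choose the gain that gives the transcriber its lowest score."""
    def score(gain):
        return f_measure(transcribe(augment(frames, gain), threshold), reference)
    return min(gains, key=score)

def transcriber_move(examples, reference, thresholds):
    """Transcriber: choose the threshold with the best mean score."""
    def mean_score(threshold):
        scores = [f_measure(transcribe(x, threshold), reference) for x in examples]
        return sum(scores) / len(scores)
    return max(thresholds, key=mean_score)

# Assumed toy data: frame energies with drum onsets at frames 1, 3 and 5.
frames = [0.1, 0.9, 0.2, 0.8, 0.1, 0.7]
reference = {1, 3, 5}
gains = [0.5, 1.0, 2.0]
thresholds = [0.2, 0.3, 0.5, 0.6]

threshold = 0.5
hardest_gain = player_move(frames, reference, threshold, gains)
hard_example = augment(frames, hardest_gain)
new_threshold = transcriber_move([frames, hard_example], reference, thresholds)
```

In this toy setting the player selects the gain that the current transcriber handles worst, and the transcriber then selects a threshold that works on both the original and the manipulated example. The sketch only mirrors the structure of the game; it says nothing about the actual architectures, augmentations or results reported in the paper.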