Fine-Grained Classification of Metal and Hardcore Music Using a Hybrid CNN–GRU Framework
DOI: 10.25236/iwmecs.2025.023
Corresponding Author
Luyao Yang
Abstract
The overlapping stylistic boundaries of Metal and Hardcore make automatic classification of extreme music difficult. The two genres share similar rhythmic structures and vocal textures, so traditional audio models and recommendation systems struggle to tell them apart. To address this, we propose a fine-grained audio classification model that combines convolutional neural networks (CNNs) with bidirectional gated recurrent units (BiGRUs). The CNNs learn local spectrotemporal patterns from Mel-spectrograms, whereas the BiGRUs capture long-term rhythmic dependencies. We use a dataset of 216 tracks (100 Metal and 116 Hardcore) recorded between 1980 and 2020, divided into 12,960 clips and represented as 64-bin Mel-spectrograms. The proposed CNN-GRU model achieves a classification accuracy of 92.40%, outperforming both traditional and single-architecture deep learning baselines. The method makes the classification of extreme music more precise and offers a scalable framework for genre analysis in music information retrieval.
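The dataset figures quoted above can be put in context with a little arithmetic. The sketch below is illustrative only and not part of the authors' pipeline. It assumes that every track contributes the same number of clips and that clip labels follow the track-level class ratio; the abstract states neither. The helper name `summarize` is hypothetical.

```python
from fractions import Fraction

# Figures quoted in the abstract.
N_METAL, N_HARDCORE = 100, 116
N_TRACKS = N_METAL + N_HARDCORE          # 216 tracks in total
N_CLIPS = 12_960
REPORTED_ACCURACY = Fraction(9240, 10000)  # 92.40%

# Assumption (not stated in the abstract): clips are spread evenly over tracks.
clips_per_track, remainder = divmod(N_CLIPS, N_TRACKS)

# Assumption: clip labels follow the track-level class ratio, so a classifier
# that always predicts the larger class (Hardcore) scores this accuracy.
majority_baseline = Fraction(max(N_METAL, N_HARDCORE), N_TRACKS)

# Absolute improvement of the reported accuracy over that trivial baseline.
gain_over_majority = REPORTED_ACCURACY - majority_baseline


def summarize() -> str:
    """Return a one-line, human-readable summary of the derived figures."""
    return (
        f"{clips_per_track} clips/track (remainder {remainder}), "
        f"majority baseline {float(majority_baseline):.1%}, "
        f"gain {float(gain_over_majority):.1%}"
    )


if __name__ == "__main__":
    print(summarize())
```

Under these assumptions the clip count divides evenly across tracks, and the near-balanced class split means the reported accuracy sits well above what a trivial majority-class predictor would reach.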
Keywords
Music Genre Classification; Gated Recurrent Units; Metal; Hardcore