The Chordinator: Modeling Music Harmony By Implementing Transformer Networks and Token Strategies
Document type :
Conference paper with proceedings
Title :
The Chordinator: Modeling Music Harmony By Implementing Transformer Networks and Token Strategies
Author(s) :
Dalmazzo, David [Author]
Department of Speech, Music and Hearing [KTH Stockholm] [KTH TMH]
Deguernel, Ken [Author]
Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189 [CRIStAL]
Sturm, Bob [Author]
Department of Speech, Music and Hearing [KTH Stockholm] [KTH TMH]
Conference title :
EvoMUSART
City :
Aberystwyth
Country :
United Kingdom
Start date of the conference :
2024
Book title :
Proceedings of EvoMUSART 2024
English keyword(s) :
Chord progressions
Transformer Neural Networks
Music Generation
HAL domain(s) :
Computer Science [cs]/Artificial Intelligence [cs.AI]
Computer Science [cs]/Human-Computer Interaction [cs.HC]
Humanities and Social Sciences/Music, musicology and performing arts
Statistics [stat]/Machine Learning [stat.ML]
English abstract : [en]
This paper compares two tokenization strategies for modeling chord progressions with an encoder transformer architecture trained on a large dataset of chord progressions in a variety of styles. The first strategy treats every distinct chord as a unique element, resulting in a vocabulary of 5202 independent tokens. The second strategy expresses chords as a dynamic tuple describing root, nature (e.g., major, minor, diminished), and extensions (e.g., additions or alterations), producing a vocabulary of 59 chord-related tokens plus 75 tokens for style, bars, form, and format. In the second approach, MIDI embeddings, an array of eight values corresponding to the notes forming each chord, are added into the positional embedding layer of the transformer architecture. We propose a trigram analysis of the dataset to compare the generated chord progressions with the training data, revealing common progressions and the extent to which a sequence is duplicated. We analyze progressions generated by the models using HITS@k metrics and a human evaluation with 10 participants, who rated the plausibility of the progressions as potential music compositions from a musical perspective. The second model reported lower validation loss, better metrics, and more musical consistency in the suggested progressions.
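To illustrate the tuple-based tokenization and the trigram comparison described in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the chord-symbol parsing rules, token names, and helper functions (`tokenize_chord`, `chord_trigrams`, `duplication_rate`) are assumptions for illustration only, and the paper's actual 59-token chord vocabulary, 75 structural tokens, and MIDI embedding scheme may be defined differently.

```python
import re
from collections import Counter

# Hypothetical parser for the tuple-based strategy: split a chord symbol
# into root, nature, and extension tokens. The paper's real token
# inventory (59 chord tokens) may be defined differently.
CHORD_RE = re.compile(r"^([A-G][#b]?)(maj|min|m|dim|aug)?(.*)$")

def tokenize_chord(symbol: str) -> list[str]:
    """Return e.g. ['root:A', 'nat:m', 'ext:7'] for 'Am7'."""
    match = CHORD_RE.match(symbol)
    if match is None:
        raise ValueError(f"Unrecognized chord symbol: {symbol}")
    root, nature, ext = match.groups()
    # Simplification: an unspecified nature defaults to major.
    tokens = [f"root:{root}", f"nat:{nature or 'maj'}"]
    if ext:
        tokens.append(f"ext:{ext}")
    return tokens

def chord_trigrams(progression: list[str]) -> Counter:
    """Count consecutive 3-chord windows in a progression."""
    return Counter(
        tuple(progression[i:i + 3]) for i in range(len(progression) - 2)
    )

def duplication_rate(generated: list[str], training: Counter) -> float:
    """Fraction of generated trigrams that also occur in the training corpus."""
    gen = chord_trigrams(generated)
    total = sum(gen.values())
    if total == 0:
        return 0.0
    shared = sum(count for trigram, count in gen.items() if trigram in training)
    return shared / total

if __name__ == "__main__":
    training = chord_trigrams(["C", "Am7", "Dm7", "G7", "C", "Am7", "Dm7", "G7"])
    generated = ["C", "Am7", "Dm7", "G7", "Em7"]
    print(tokenize_chord("Am7"))               # ['root:A', 'nat:m', 'ext:7']
    print(duplication_rate(generated, training))  # ~0.67: two of three trigrams are in the training set
```

In this sketch, a high duplication rate indicates that the model mostly reproduces progressions already present in the training data, which is the kind of overlap the trigram analysis in the paper is meant to expose.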
Language :
English
Peer reviewed article :
Yes
Audience :
International
Popular science :
No
Files :
- TheChordinator_Evomusart-3.pdf (Open access)