Music Information Retrieval (MIR) is an essential part of what I am and what I do. My work in MIR spans multiple domains, from rhythm and pitch analysis to synthesis techniques, contributing to the intersection of machine learning and musicology. Below, I summarize the key themes and contributions of my research in MIR, which encompasses generative models, singing style transfer, electric guitar modeling, and the study of traditional music from various cultures.
Pitch and Rhythm Analysis in Contemporary Music
A significant portion of my research focuses on analyzing pitch and rhythm across various music genres. One line of research examines pitch variability in modern vocal styles and how it interacts with the dynamic rhythmic properties of genres like rap. By exploring pitch variation in vocals, we aim to uncover evolving musical trends that have implications for the creation of generative music models.
Analyzing Traditional Music and Tonal Systems
Another line of research explores the tonality and melodic structures of traditional music, particularly within Sub-Saharan African music. My latest study examines the pitch content of seperewa songs, a genre built around the seperewa, a traditional Akan harp-lute. Using field recordings from the mid-twentieth century, we applied Gaussian Mixture Models to analyze and model the pitch scales. This work not only advances the study of tonality in traditional music but also reveals the limitations of source-separation models like Demucs on non-Western music, indicating areas for future improvement in MIR tools.
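As a rough illustration of the modeling step, the sketch below fits a Gaussian Mixture Model to frame-level pitch estimates (in cents) and reads the sorted component means as scale degrees. The synthetic data, random seed, and five-component choice are illustrative assumptions for the sketch, not details of the actual study, which worked from pitch estimates extracted from the field recordings.

```python
# Hypothetical sketch: estimating a scale from pitch data with a GMM.
# Component means approximate scale degrees; mixture weights would
# reflect how often each degree is used.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for frame-level pitch estimates (in cents relative
# to an arbitrary reference); real work would use a pitch tracker's output.
scale_degrees = np.array([0.0, 200.0, 400.0, 700.0, 900.0])
pitches = np.concatenate(
    [rng.normal(loc=d, scale=15.0, size=200) for d in scale_degrees]
)

gmm = GaussianMixture(n_components=5, random_state=0).fit(pitches.reshape(-1, 1))
estimated_degrees = np.sort(gmm.means_.ravel())
print(np.round(estimated_degrees))
```

In practice the number of components is itself a question (it can be selected with an information criterion such as BIC), and the component variances say something about how stable each scale degree is across performances.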
Electric Guitar Modeling and Robustness in Transcription
In the area of instrument modeling, one of my contributions lies in improving the robustness of guitar tablature transcription. Transcribing solo guitar performances, particularly across diverse tones and effects, is essential for applications in education and musicological research. My research demonstrates that incorporating real electric guitar tones processed with various audio effects into synthetic training datasets makes transcription models more robust. By leveraging both real and synthetic data, this work bridges the gap between traditional instrument modeling and contemporary machine learning techniques.
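To make the augmentation idea concrete, here is a minimal sketch of one such audio effect, a tanh soft-clipping distortion applied to a mono signal. The function name and drive parameter are hypothetical, and a real augmentation pipeline would chain several effects (overdrive, reverb, delay) over recorded guitar tones rather than a test tone.

```python
import numpy as np

def soft_clip(audio: np.ndarray, drive: float = 5.0) -> np.ndarray:
    """Tanh soft clipping, a crude stand-in for a distortion pedal.

    `drive` controls how hard the signal is pushed into saturation;
    the output is renormalized so a full-scale input stays full scale.
    """
    return np.tanh(drive * audio) / np.tanh(drive)

# Example: distort a synthetic 440 Hz tone before adding it to a
# training set (a real pipeline would operate on recorded guitar audio).
sr = 16000
t = np.arange(sr) / sr
tone = 0.8 * np.sin(2 * np.pi * 440.0 * t)
augmented = soft_clip(tone, drive=8.0)
```

The design point is that the effect is applied at training time only; the transcription targets (the tablature) are unchanged, so the model learns to be invariant to tone and effect chains.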
Exploring Approaches to Sound Synthesis
In the realm of sound synthesis, my work investigates multi-task approaches to synthesizer programming, specifically focusing on transforming audio signals into parameters for various virtual instruments. This research contributes to advancements in automatic synthesizer programming, with applications in music production and audio synthesis.
Generative Singing Style Transfer Across Genres
Voice style transfer has been extensively studied in the context of speech, yet singing style, independent of the singer's identity, remains underexplored. In ongoing work, I am developing SingStyleTransfer, a generative VAE-GAN model that performs singing style transfer across genres.
Each of these research themes aims to deepen our understanding of music through computational models while also providing practical applications for music production, musicology, and audio technology. By integrating approaches from signal processing, machine learning, and cognitive neuroscience, my work seeks to uncover the underlying patterns that define human musical expression, and in turn, to enable machines to generate and interpret music more naturally.
You can find papers related to these topics (and more) on my Google Scholar page.
© 2024 Iran R. Roman