Drum Synthesis and Rhythmic Transformation with Adversarial Autoencoders
Audio examples accompanying the paper for the ACM International Conference on Multimedia (ACM MM) 2020.
Work conducted during an internship at the Media Interaction Group, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan.
1. Audio synthesis with trained generator (G)
We demonstrate reconstruction of bar-length drum patterns by the generator model trained on real drum recordings. Examples at a 22.05 kHz sample rate are recreated with the Griffin-Lim algorithm and presented together with their corresponding outputs from the proposed AAE-GM model. More detailed information about the data used here can be found in Section 3.1 of the paper.
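For readers unfamiliar with Griffin-Lim, the following is a minimal NumPy sketch of the general technique: it estimates a waveform from a magnitude spectrogram by alternating between the time domain and the STFT domain while keeping the magnitudes fixed. It is not the code used for these examples. The function names, FFT size, hop size, iteration count, and the synthetic test tone are all illustrative assumptions; only the 22.05 kHz sample rate comes from this page.

```python
import numpy as np

SAMPLE_RATE = 22050  # sample rate stated on this page


def stft(signal, n_fft, hop):
    """Hann-windowed STFT without padding; returns (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft, hop):
    """Least-squares inverse STFT using windowed overlap-add."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop : i * hop + n_fft] += frames[i] * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    # Guard against division by zero where the window sum vanishes.
    return out / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_fft=1024, hop=256, n_iter=32, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    # Start from random phase (an assumption; other initialisations are possible).
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop)
        rebuilt = stft(signal, n_fft, hop)
        # Keep the given magnitudes, adopt the phase of the re-analysed signal.
        phase = np.exp(1j * np.angle(rebuilt))
    return istft(magnitude * phase, n_fft, hop)


# Hypothetical input: half a second of a 220 Hz tone stands in for a drum recording.
n_fft, hop = 1024, 256
t = np.arange(SAMPLE_RATE // 2) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 220.0 * t)
magnitude = np.abs(stft(tone, n_fft, hop))
audio = griffin_lim(magnitude, n_fft=n_fft, hop=hop)
```

The spectrogram representation and analysis parameters used in the paper may differ (for example, a mel-scaled spectrogram would first need to be mapped back to a linear frequency scale), so this sketch only illustrates the phase-estimation step.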
2. Latent Space Interpolation
The proposed model performs rhythmic transformation of bar-length drum patterns as follows:
- Generator reconstruction of source input
- Transformation into an intermediate rhythmic pattern
- Resulting output transformation
The user is free to manipulate the rhythmic structure within a bar as a continuous transformation, without relying on discrete identification of rhythmic boundaries.
- Interpolations in the latent space allow for the mixing of two different drum patterns
- A gradual change is achievable from the source rhythmic pattern to the target pattern
- The intermediate latent codes are produced using a linear interpolation between source and target latent codes
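The linear interpolation described in the list above can be sketched as follows. This is a minimal illustration, assuming latent codes are plain vectors; the function name `interpolate_latents`, the latent dimensionality, and the number of steps are hypothetical and not taken from the paper. In practice, each interpolated code would be passed to the trained generator to produce the corresponding drum pattern.

```python
import numpy as np


def interpolate_latents(z_source, z_target, n_steps=5):
    """Return n_steps latent codes moving linearly from z_source to z_target."""
    z_source = np.asarray(z_source, dtype=float)
    z_target = np.asarray(z_target, dtype=float)
    # alpha = 0 gives the source code, alpha = 1 gives the target code.
    alphas = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1.0 - a) * z_source + a * z_target for a in alphas])


# Hypothetical 4-dimensional latent codes for two drum patterns.
z_a = np.array([0.0, 1.0, -1.0, 2.0])
z_b = np.array([1.0, 0.0, 1.0, 0.0])
codes = interpolate_latents(z_a, z_b, n_steps=5)
```

The first and last rows of `codes` equal the source and target codes, and the rows in between correspond to the intermediate rhythmic patterns.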