You cannot pass a length that is not a power of 2, so I'd suggest padding the cardiograms with zeros to a length of 8192. Since the sequence is very short, you can then use a smaller model (~70M params), something like this:

```python
import torch
from audio_diffusion_pytorch import AudioDiffusionModel

model = AudioDiffusionModel(
    in_channels=12,
    patch_size=4,
    kernel_sizes_init=[1, 3, 7],
    multipliers=[1, 2, 4, 4, 4],
    factors=[4, 2, 2, 2],
    num_blocks=[2, 2, 2, 2],
    attentions=[False, True, True, True],
)

# Train model with cardiogram sources
x = torch.randn(1, 12, 8192)
loss = model(x)
loss.backward()  # Do this many times

# Sample 2 cardiograms given start noise
noise = torch.randn(2, 12, 8192)
sampled = model.sample(
    noise=noise,
    num_steps=10  # Suggested range: 2-50
)  # [2, 12, 8192]
```
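For the padding step itself, here is a minimal sketch, assuming the recordings are already loaded as a `[batch, 12, 5000]` float tensor. The helper names `next_power_of_two`, `pad_ecg`, and `crop_ecg` are hypothetical (they are not part of `audio_diffusion_pytorch`), and padding on the right is just one reasonable choice:

```python
import torch
import torch.nn.functional as F


def next_power_of_two(n: int) -> int:
    """Smallest power of 2 that is >= n (hypothetical helper)."""
    return 1 << (n - 1).bit_length()


def pad_ecg(x: torch.Tensor, target_len: int) -> torch.Tensor:
    """Right-pad the last (time) dimension with zeros up to target_len."""
    length = x.shape[-1]
    if length > target_len:
        raise ValueError(f"length {length} exceeds target {target_len}")
    # (0, k) pads nothing on the left and k zeros on the right of the last dim
    return F.pad(x, (0, target_len - length))


def crop_ecg(x: torch.Tensor, orig_len: int) -> torch.Tensor:
    """Drop the padded tail, e.g. from sampled outputs."""
    return x[..., :orig_len]


orig_len = 5000                            # 500 Hz * 10 s
target_len = next_power_of_two(orig_len)   # 8192

batch = torch.randn(4, 12, orig_len)       # stand-in for real ECG data
padded = pad_ecg(batch, target_len)        # [4, 12, 8192]
restored = crop_ecg(padded, orig_len)      # [4, 12, 5000]
```

The same `crop_ecg` call could be applied to the output of `model.sample(...)` to get back to 5000 samples, assuming the padding was always added on the right during training.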
Hello, thank you for the great repo for audio diffusion!
I just wanted to ask if you had any experience with multi-channel audio diffusion?
This might be out of scope for this repo, but I am currently trying to train a diffusion model on 12-lead electrocardiograms sampled at 500 Hz for 10 seconds ([12, 5000] shape). However, I am having difficulty training the model, so I was wondering if you might have any insight, from your intuition or experience, into which parameters I should try.
Thank you!