Great work! Can this technique also be used to run image diffusion models on low...

GTP · 2025-09-23T15:01:34

I'm not an expert in machine learning, but AFAIK diffusion models use a completely different architecture, so you can't use the same code to run optimized versions of both. The core ideas could perhaps be adapted to diffusion, though.

anuarsh · 2025-09-23T21:52:01

Thanks! I don't have much experience with diffusion models, but technically any multi-layer model could benefit from loading its weights one layer at a time.
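
To make the "one layer at a time" idea concrete, here is a minimal toy sketch. It is not the project's actual code: the helper names (`save_layers`, `forward_streaming`, `forward_resident`) and the tiny two-layer network are hypothetical, and it uses plain Python lists and `pickle` files where a real model would use tensors and a proper checkpoint format. Each layer's weights live in their own file, and the forward pass holds only one layer in memory at any moment.

```python
import os
import pickle
import tempfile


def save_layers(layers, directory):
    """Write each layer's weight matrix to its own file and return the paths."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def forward_streaming(paths, x):
    """Run the network while keeping only one layer's weights loaded."""
    for path in paths:
        with open(path, "rb") as f:
            weights = pickle.load(f)  # load just this layer from disk
        x = relu(matvec(weights, x))
        del weights  # drop the reference before the next layer is loaded
    return x


def forward_resident(layers, x):
    """Reference version: all layers are already in memory."""
    for weights in layers:
        x = relu(matvec(weights, x))
    return x


# Hypothetical toy network: two 2x2 layers.
layers = [
    [[1.0, -1.0], [0.5, 2.0]],
    [[2.0, 0.0], [-1.0, 1.0]],
]

with tempfile.TemporaryDirectory() as tmp:
    paths = save_layers(layers, tmp)
    streamed = forward_streaming(paths, [1.0, 2.0])

print(streamed)
```

The streamed result matches the all-in-memory result; the trade-off is extra disk reads in exchange for a peak memory footprint of roughly one layer plus the activations. Whether that pays off for a diffusion model, which runs the same network many times per image, is an open question here.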