Hacker News
addandsubtract
7 months ago
on:
Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s ...
Great work! Can this technique also be used to run image diffusion models on lower VRAM GPUs?
GTP
7 months ago
Not an expert in machine learning, but AFAIK diffusion models use a completely different architecture, so you can't use the same code to run optimized versions of both. But maybe the core ideas can be adapted to diffusion somehow.
anuarsh
7 months ago
Thanks! I don't have much experience with diffusion models, but technically any multi-layer model could benefit from loading its weights one layer at a time instead of all at once.
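To make the "one layer at a time" idea concrete, here is a toy sketch in plain Python. It is not the code from the project being discussed: the function names (`save_layers`, `streamed_forward`, `run_demo`) are made up for illustration, JSON files stand in for real checkpoint shards, and a tiny ReLU stack stands in for a real model. The only point is that each layer's weights are read from disk, used, and dropped before the next layer is loaded, so peak weight memory is roughly one layer rather than the whole model.

```python
import json
import os
import tempfile


def save_layers(layers, directory):
    """Write each layer's weight matrix to its own file; return the paths in order."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.json")
        with open(path, "w") as f:
            json.dump(weights, f)
        paths.append(path)
    return paths


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def streamed_forward(paths, x):
    """Forward pass that holds only one layer's weights in memory at a time."""
    for path in paths:
        with open(path) as f:
            weights = json.load(f)  # load just this layer from disk
        x = relu(matvec(weights, x))
        del weights  # drop the reference before loading the next layer
    return x


def run_demo():
    # Hypothetical 3-layer model: 2 -> 2 -> 2 -> 1
    layers = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[2.0, 0.0], [0.0, -1.0]],
        [[1.0, 1.0]],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        paths = save_layers(layers, tmp)
        return streamed_forward(paths, [3.0, 4.0])


if __name__ == "__main__":
    print(run_demo())
```

The trade-off is the obvious one: every forward pass re-reads all the weights from storage, so throughput ends up bounded by disk or PCIe bandwidth rather than by compute.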