
Don't have access to DALL-E 2 or Imagen, but I do have [1] and [2] locally, and they produced [3] with that prompt.

[1] https://github.com/nerdyrodent/VQGAN-CLIP.git [2] https://github.com/CompVis/latent-diffusion.git [3] https://imgur.com/a/dCPt35K



Nice. The latent-diffusion results have come out very traditional, but the VQGAN+CLIP ones are fairly original.


From my experiments, the LD one doesn't seem to have been trained on as large or as well-tagged a data set - there's a whole bunch of "in the style of X" prompts that the VQGAN knows about but the LD doesn't. That might have something to do with it.



