Show HN: Paint yourself in the style of classic art, in your browser (github.com/reiinakano)
266 points by reiinakano on Oct 2, 2017 | 99 comments


Cool work! We made this demo using the same style transfer network in TensorFire ( https://tenso.rs/demos/fast-neural-style/ ). Your size slider was a nice touch :)


You're lengstrom from GitHub! I have to let you know, your models are exactly what I used! I don't have a beefy GPU, so I was forced to use what you already had, haha.

For everyone reading, these guys coded up the neural network I used; all I did was port it to a different library. :)

Eagerly waiting for the TensorFire API!


Any news on the fast-neural-style release date?


When I see the style transfers, I'm reminded of the early days of Photoshop, and of whatever other photo editing applications were commonplace in the mid-90s (like Fractal Painter). It was fun to play with the idea of oil painting, or watercolor, or to simulate paper textures. How is style transfer an improvement? Undoubtedly, it's more sophisticated and requires more computational resources.

Can it be used for artistic purposes, or is it only skin deep like those old filters? It looks to me as if this is pure imitation: your images of a night with friends at a street cafe, or of stacking bales of hay, can look like a Van Gogh when you upload them to Instagram.

Am I missing something?


Sure, for example https://imgur.com/a/HUN9G. Just by playing around with drawing different textures, I was able to get the above with the wave model. I just see it as another digital tool like you do, but with creative use I could see some pretty cool applications in modern art.


After seeing this, and thinking through what I wrote yesterday, I have a bit of a different take. I don't like style transfers from a famous art piece onto a user-provided image. A skin-deep transfer of the look of Van Gogh, or Katsushika Hokusai, or Renoir, or whoever doesn't make art (in my opinion).

I realize these tools are just proofs of concept, which is something I forgot yesterday. What I'd like to be able to do is style transfer between arbitrary images, whether it's my own work or that of others. That would open up the creative possibilities.


Fast Neural Style transfer is a variation of the "regular" Neural Style algorithm.

With regular Neural Style, you can do style transfer between any two arbitrary images. The disadvantage is that it takes much longer, so doing it in the browser is probably infeasible for now. See https://github.com/anishathalye/neural-style

Anyway, even with Fast Neural Style, you can use any arbitrary image as a style, but you'll have to train a model on it first (4-6 hours on a GPU).
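
To give a feel for why the regular version is too slow for the browser: it doesn't run a trained network once, it runs an optimization loop over the output pixels themselves, with a full CNN forward and backward pass per step. Here's a rough Python sketch, where extract_features, content_loss, style_loss, and gradient_of are hypothetical placeholders standing in for the real CNN code (not any actual library's API):

    import numpy as np

    # extract_features, content_loss, style_loss, gradient_of are placeholders
    # for illustration only; they are not functions from a real library.
    def slow_neural_style(content_img, style_img, steps=500, lr=1.0):
        """Gatys-style transfer: optimize the image itself, hence minutes per image."""
        img = content_img.astype(np.float32).copy()      # start from the content photo
        content_target = extract_features(content_img)   # CNN activations (placeholder)
        style_target = extract_features(style_img)
        for _ in range(steps):                           # hundreds of forward+backward passes
            feats = extract_features(img)
            loss = content_loss(feats, content_target) + style_loss(feats, style_target)
            img -= lr * gradient_of(loss, wrt=img)       # gradient descent on the pixels
        return np.clip(img, 0, 255)

Fast Neural Style spends those 4-6 GPU-hours up front to train a feedforward network instead, so at run time a single forward pass replaces this whole loop.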


Wow, that is actually super cool!


As you can see from the output, the model can simplify shapes, alter geometry, and put a texture on an object (projecting a texture like UV mapping) without being explicitly programmed to do so. It is inherently different from Photoshop filters, which apply a fixed transformation to the pixels. The fundamental difference is that the model is able to abstract away from the pixel representation of the image and can decorrelate content and style. Also, it is not necessarily more computationally intensive in the generation/forward phase.
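
For contrast, here is roughly all a classic "painterly" Photoshop-style filter amounts to: one hand-written kernel convolved with the pixels, identical for every image and blind to what the pixels depict. A minimal NumPy sketch (an illustrative emboss kernel, not any particular product's filter):

    import numpy as np

    def fixed_filter(gray_img, kernel):
        """Apply one fixed convolution kernel to a grayscale image.
        Every image receives exactly the same pixel-space transformation."""
        kh, kw = kernel.shape
        ph, pw = kh // 2, kw // 2
        padded = np.pad(gray_img.astype(np.float32), ((ph, ph), (pw, pw)), mode="edge")
        out = np.zeros(gray_img.shape, dtype=np.float32)
        for y in range(gray_img.shape[0]):
            for x in range(gray_img.shape[1]):
                out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * kernel)
        return np.clip(out, 0, 255)

    emboss = np.array([[-2, -1, 0],
                       [-1,  1, 1],
                       [ 0,  1, 2]], dtype=np.float32)

A style network instead decides what to paint where based on features it extracts from the image, which is why it can change geometry and texture rather than just recolor pixels.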


It still looks like Poor Photoshop Art Imitation++ though.

Outside of the developer community, no one cares how this works. They only care about whether or not they connect with the result.

And a good historical hint is that no one uses those old Photoshop filters now, because they're uniformly cheesy and terrible.

Artistically, this is similar.

The Deep Dream output was an interesting novelty, for a while, but to most people this style transfer process is going to be Just Another Photo Filter Effect.

Technically, it's not abstracting shape and texture in an intelligent way. It's applying a mechanical effect in a mechanical way with absolutely no insight into texture or spatial geometry.

Even in abstract art, shapes mean something. They're not just a set of coordinates. This process doesn't understand that meaning - and generally, developers who think "art" means "I made an image with my computer" don't understand it either.

Sometimes this process gets lucky and something passable falls out, but mostly the results are mediocre.


I mostly agree that the results are not always good. As someone who read the original style transfer paper and implemented it, I disagree that it is applying a mechanical effect. Maybe I'm biased, but the algorithm uses a CNN pre-trained on object recognition. The CNN develops a representation of the image that becomes more and more abstract along the layers of the network. You can visualize the information in each layer by reconstructing the image from the feature maps alone, and when you do this you will see that the network learns the objects and textures of the image rather than its raw pixel values (pixels -> edges -> regions -> objects, etc.).
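
To make the "textures" part concrete: the style representation used by these algorithms is typically the Gram matrix of a layer's feature maps, which records which features co-occur but discards where in the image they occurred. That is what lets it capture brushwork and palette without copying the scene layout. A small NumPy sketch (the activations would come from the pre-trained CNN mentioned above):

    import numpy as np

    def gram_matrix(features):
        """features: (H, W, C) activations of one CNN layer for one image.
        Returns a (C, C) matrix of feature co-occurrences, independent of position."""
        h, w, c = features.shape
        f = features.reshape(h * w, c)
        return f.T @ f / (h * w * c)

The style loss is then just the squared difference between the Gram matrices of the generated image and the style image, summed over a few layers, while the content loss compares raw activations and so keeps the spatial arrangement.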


Dunno about you, but when I first saw some of the results, I was amazed. Here are a few more examples:

https://github.com/jcjohnson/fast-neural-style

https://github.com/lengstrom/fast-style-transfer

Admittedly, though, I haven't seen the Photoshop filters you speak of. Could you link to some of them that show these same effects?


I think there's something of a human selection effect with the more impressive outputs they've opted to show; much as you can throw a far dumber algorithm at far less selective samples of Shakespeare and end up with the non-incoherent examples seeming wonderfully poetic

Take the example of the woman's face and Munch's The Scream. Ask an art student to paint a woman in the style of The Scream and you'll probably get her with a simplified, harrowed expression, surrounded by swirly waves and skies. Run this through the algorithm and you'll get a photograph of a woman's face with brushwork superimposed over it including a choppy, vibrant orange forehead instead of a choppy, vibrant orange sky. It's not particularly visually interesting and is obviously the work of a dumb algorithm[1]

On the other hand, I combined the woman's face with Picabia's Udnie and the results were very pleasing, especially the way the contrast of black and white and sharpened edges happened to emphasise the model's cheekbone structure and eyes, and it looks like something a human might draw and sell as an original semi-abstracted portrait (albeit something an art teacher setting the "draw her in the style of Picabia" assignment would probably still frown at and say they'd missed the point of the exercise in producing representative art and not really grasped Picabia's massing or shading either)

[1]And I'm sure it's not the first time I've critiqued computer generated art based on combining Munch's the Scream because it treats the top of the image as vivid orange sky regardless of what it actually is...


What do you mean? It is what it is. You can use it for artistic purposes if you can think of some creative way of using it, but if you're just uploading images and spitting out altered versions I don't suppose many people would consider that to be particularly artistic


Of course, computers can't be creative, so it is just imitation. What it creates is not novel. It's just doing what the programmer told it to do. Blah blah blah.

In my opinion, it is one step -- and an impressive step -- in the direction of computers being able to actually be creative. The set of things humans can do that computers can't is shrinking. You can dismiss it if you want, but at what point will you acknowledge that what the computer is doing isn't simply imitation? I guess you can keep moving the goalposts, but to me, this is pretty impressive.


I think the definition of "creativity" gets very vague when we try to explain it in terms of computers or AI. Generative artwork/systems that are highly random tend to appear "creative" rather than just "imitation".


> Of course, computers can't be creative, so it is just imitation.

I take it that you've never worked in IT support :)


This assumes creativity is anything more than copying from multiple sources.


If it's not more, how come there is anything to copy from?


Copying from nature with tiny little variations I guess?


Little compared to what? They're infinitely big compared to no variations. Ignoring that part because it's unclear is like throwing out the baby and keeping the bathwater to me.


What is your definition of "creative"?


Having the ability to make something original, I suppose. But then you can question what is original.

I tend to think everything is somewhat derivative. But some things more than others. The things that are the least derivative are the most creative.

And what this program does is more creative than most things computer programs do, and probably more creative than lots of things that, if a human did it, you'd call "creative."


I don't know if the un-repurposed output is artistic, but surely creating these algorithms is.


I don't know if it's my machine or whether my expectations are too high, but every generated image looks like the same blurry version of the original but with a different colour palette. I certainly don't get anything as beautiful as the output examples.

This happens regardless of the source content I use (any of the examples or even my own uploaded files) or which reference style I pick.

User agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36

(Chromium running on ArchLinux AMD64)


Looks like I'm getting the same results with Firefox on the same host PC as well.

Screenshot showing the same input values used as the one in your example:

https://armyofcrabs.com/rand/Screenshot_20171002_170417.png


These definitely look wrong. Unfortunately, it's super hard to debug since I can't reproduce. :(

My only advice I guess is to try it on a different machine when you have one available. Sorry, if this is a problem with the library I used, I can't really do anything to fix it. :/


Not sure if this helps, but I've tried the project you based this on[1], and that produces images correctly (or at least images that look more like my expectations).

(I say "based", I'm not sure just how much of their code - if any - you used. But you did compliment them earlier in this discussion)

[1] https://tenso.rs/demos/fast-neural-style/


Didn't really use any of their code. I reimplemented their model in a new library called deeplearn.js.

If the issue is indeed with the library, there's little I can do.

I will still try to find what the issue is but can't make promises. Sorry for the letdown!


To everyone facing this issue, please leave a comment here with your browser and system specs:

https://github.com/reiinakano/fast-style-transfer-deeplearnj...


Demo website: https://reiinakano.github.io/fast-style-transfer-deeplearnjs

The deeplearn.js library doesn't work on mobile yet; sorry to folks on the go!


Is the aspect ratio important? I'm choosing a roughly square image and it only calculates half an image.


Aspect ratio shouldn't matter; the output should still be the same size as the content image. Mind sharing the pic?



I was able to stylize it fine. Maybe refresh and try again with a smaller size?


I'm getting a 404 on the JS bundle, so the page won't load.


Fixed now. Sorry!


Might want to try some basic HTTP caching settings for the model. This way the browser doesn't need to re-download everything every time.

This can probably be configured on the server rather than in the application.
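
If the site is on GitHub Pages you can't set custom headers there, but anywhere you do control the server, the idea is just a Cache-Control header on the heavy files. A minimal sketch with Python's built-in http.server (the file extensions are guesses about how the model files are named):

    from http.server import HTTPServer, SimpleHTTPRequestHandler

    class CachingHandler(SimpleHTTPRequestHandler):
        def end_headers(self):
            # Let browsers reuse the multi-megabyte model files for a day.
            if self.path.endswith((".js", ".json", ".bin", ".png")):
                self.send_header("Cache-Control", "public, max-age=86400")
            super().end_headers()

    if __name__ == "__main__":
        HTTPServer(("", 8000), CachingHandler).serve_forever()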


I think most CDNs will add reasonable cache-control headers as well - IIRC Cloudflare does this with sane defaults out of the box.


Stupid question: Is CloudFlare free for CDN purposes? Right now I'm serving everything from GitHub Pages. I don't expect there to be much traffic after this initial HN spike dies down, so if it isn't free, it won't be worth it for me to pay for it.


Yep[1]! I used it for my girlfriend's small business site (Wordpress) and was very happy with it.

I think CSS, JS, and images are cached - HTML is not.

1: https://www.cloudflare.com/plans/


tl;dr: Impressive Prisma clone with six filters, but running all calculations locally in the browser. Compared to Prisma it feels as slow or slightly slower; one filter almost froze my MBP 4@2.7.


It is currently VERY slow since there are a lot of non-optimized (non-GPU) calls being made. Once those are implemented in deeplearn.js, things should run much faster. In fact, one of the creators of deeplearn.js says it might be feasible to do this in real time on a webcam (although not at 60 FPS).

This was a super quick PoC demo I hacked together in a week on a very new library. :)


Weirdly, something about the generated images (colour range?) breaks Twitter's profile photo update. I actually took screenshots of both the original taken with my webcam and the style-transferred version; the former worked for my Twitter profile photo, and the latter resulted in an error. I tried different resolutions too, and it always broke.

Shame, I really like the effect!


You took screenshots and it still broke? That's... very odd, haha. I will look into it and see if it outputs out-of-range colors.

EDIT: My bad, there are indeed out-of-range colors! I will update here when fixed.


Updated!

There should no longer be a problem with Twitter. The way I saved the styled photo was Right click > Save image in Chrome.

Thanks for reporting!


The demo site doesn't appear to work on my end -- it gets a 404 attempting to load "bundle.js"



It appears to be working again (for me)


Yep, it's working now.


Works now. Sorry guys. I tried to hotfix something into production and screwed things up. This is a lesson I will not forget... Lol.


Despite being able to fetch https://raw.githubusercontent.com/reiinakano/fast-style-tran... I'm still getting 404s for https://reiinakano.github.io/fast-style-transfer-deeplearnjs... (but I can get https://reiinakano.github.io/fast-style-transfer-deeplearnjs...)

Maybe throw ?<random> on your linked URLs? GitHub's CDN appears to be in terrible shape.

¯\_(ツ)_/¯


Maybe clear the cache? Sorry, not sure how to debug, as it's up for me on all my browsers + curl.


Clearing cache doesn't fix it. So far the only way I can get it to load is by putting a breakpoint inside initializePolymerPage just before it appends bundleScript to head, and setting bundleScript.src = 'bundle.js?<anything>' and then resuming execution. Like I said, ¯\_(ツ)_/¯


I guess this is GitHub's issue then? Things should settle down in a few days; it might just be having trouble with the HN traffic.

I'm not very well versed in web stuff. What does 'bundle.js?<anything>' do? Will adding that to my code solve the problem?


A question mark indicates the beginning of a query string ( https://en.wikipedia.org/wiki/Query_string ). Most web servers ignore it, and anything that comes after it, if they're not expecting to process any query parameters, but it has one useful bonus: query string parameters make the request unique, so any intermediate address caching doesn't interfere with getting the latest version of the URL's content. If you append something like "?" + Math.random() to any file fetch request, it doesn't really matter what comes after the question mark; as long as it's different from a previous request, it will almost always bypass any caching layers.

(fwiw, the CDN has finally resolved itself and the page now loads for me too)


Ah, that makes sense. Thanks a lot for the detailed explanation!


I don't even get that far. It just doesn't load anything.


Cool. I'm getting some great results. Would be awesome if you could train your own model in the browser too.

By the way: How does a NN learn from a single image? I understand how a NN can learn to classify images. But how can it learn a style from a single image?


Fast Neural Style Transfer works like this: a separate neural network is trained for each style image, outside of the browser (using Python on TensorFlow), for 6+ hours on a GPU.

The output is a neural network that can take any content image and output it in the style it was trained on.

The browser then does this: 1. Downloads the model (~6.6 MB) for the particular style. 2. Does one pass over the content image and outputs the styled image.

No training is done in the browser whatsoever.
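
If it helps, the offline training stage looks roughly like this: the transform network is optimized over a large photo dataset (MS-COCO in the original work) against one fixed style image, using the content and style losses from the papers linked elsewhere in this thread. A rough Python sketch where TransformNet, extract_features, content_loss, style_loss, and gradient_of are hypothetical placeholders for the real TensorFlow code:

    # TransformNet, extract_features, content_loss, style_loss, gradient_of are
    # placeholders for illustration; they are not functions from a real library.
    def train_fast_style(style_img, photo_dataset, style_weight=100.0, epochs=2):
        style_target = extract_features(style_img)       # computed once, reused all run
        net = TransformNet()                             # becomes the ~6.6 MB model you download
        for _ in range(epochs):
            for batch in photo_dataset:                  # ordinary photos, e.g. MS-COCO
                styled = net(batch)                      # one forward pass per photo
                feats = extract_features(styled)
                loss = (content_loss(feats, extract_features(batch))
                        + style_weight * style_loss(feats, style_target))
                net.update(gradient_of(loss, wrt=net.weights))   # the 6+ GPU-hours live here
        return net

At inference time the browser just evaluates styled = net(content_img) once, which is why it can run client-side at all.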


Yes, I am aware of that.

But I would like to know how the training works. How does a NN learn the style of an image?


I think you can consider one single image as a large corpus of data, in the same way that you could train a simple Markov model from a large paragraph of text. It doesn't matter to the model whether the original text was 500 words in one piece of text, or five 100-word texts.

It's mostly looking at local regions and seeing what guesses it can make about the next region given one region. (If I understand correctly).


Ahh I see. Here are some resources that should help you.

The original Neural Style Transfer paper should help you understand the loss functions involved: https://arxiv.org/abs/1508.06576

The paper introducing real-time style transfer describes basically the algorithm used here: https://arxiv.org/abs/1603.08155


Is it using the same tech as the Android/iOS app Deep Art Effects?

https://www.deeparteffects.com/

Results look very similar


Yes, most likely.

At most they probably tweaked things a little bit for some minor improvements, but the underlying idea and algorithm is very likely the same.

Can't confirm this, though. :)


Hey, it looks very cool. However, it's hanging my browser. Haven't looked at the source, but maybe this should be implemented with web workers to increase perceived performance.

Anyway, great work. Congrats!


Things are currently quite inefficient as there are some non-optimized (non-WebGL) calls being made. deeplearn.js is a new library, but once optimized functions are available, I'll update the site. :)


Uncaught ReferenceError: dialogPolyfill is not defined at HTMLDocument.<anonymous> ((index):22672) at (index):24


Okay, I looked into this a bit, and that piece of code only runs if an unsupported browser is detected, which means one of three cases: mobile, Safari, or a non-WebGL-enabled browser.

I guess if you're not on mobile or Safari, then your browser doesn't support WebGL and you wouldn't be able to run it either. Sorry!


This doesn't happen when I visit the site on Chrome on Linux. Do you mind sharing your system + browser?


Sure! Running Chrome (Version 61.0.3163.100 (Official Build) (64-bit)) on Windows 7. Let me know if you need anything else.


Same with Chrome/63.0.3223.8 on Intel Mac OS X 10_11_6

It’s the call to dialogPolyfill.registerDialog(dialog);


Hi, thanks for the report, but I'm not really a JS developer and have no idea how to work around this. Would you happen to know if there are any fixes I should look into?


Am I the only one that gets "the stuff of nightmares" kind of results?

Ubuntu 17.04 FF 57 & Chrome 59 with face and Udnie:

https://screenshots.firefox.com/4bMiETLnl1yx8NOx/reiinakano....


To everyone facing this issue, please leave a comment here with your browser and system specs:

https://github.com/reiinakano/fast-style-transfer-deeplearnj...


I've seen similar results in my open source work when trying to deploy models using batch normalization. I wonder if the author is using this operation.


Nope, same results on Ubuntu 16.04, FF 57

Edit: Looks really cool though! Maybe I can find a Windows machine to give it a try on.


I get the same nightmare results on Linux in Firefox 59 and Chromium 58.


I don't know if it's loading or what but it's just blank on my phone's chrome browser.


Doesn't support mobile :) Please use desktop (Chrome suggested)


The chat program on Line, Rina, does this as well if you send a picture of a person to her.


What size images can it generate? I've used other libraries that can't go past 800x600.


Theoretically, for this algorithm, there is no size limit.

Practically, if you select something too high, your browser will die for lack of memory or you will grow old waiting for it to finish. My crappy 4GB RAM laptop can only handle around 300 x 500. Try setting the size to maximum, and if your computer can handle it, I'd be glad to guide you in using bigger images.
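
As a back-of-envelope for why memory blows up: the intermediate feature maps are float32 and there are many channels of them, so they dwarf the image itself. A very rough estimate (the channel counts are assumptions about a typical transform net, not measured from this one):

    def rough_activation_mb(width, height, channels_per_layer=(32, 64, 128)):
        # float32 = 4 bytes; pretend every layer's activations sit at full input
        # resolution, which overestimates downsampled layers but gives the right
        # order of magnitude.
        return sum(width * height * c * 4 for c in channels_per_layer) / 1e6

    print(rough_activation_mb(500, 300))    # ~134 MB of activations
    print(rough_activation_mb(1920, 1080))  # ~1.9 GB, where browsers start dying

So doubling both dimensions roughly quadruples the memory, which is why the jump from "works fine" to "tab crashes" feels so sudden.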


How do I input an image? There's no option to do that on Firefox 58.


"Upload from file" option not appearing?


No controls at all were appearing. Now, from a different computer, but still on Firefox 58, everything is working fine (maybe you've fixed it in the meantime, maybe not).


Error: Failed to link vertex and fragment shaders.

I am on Windows 7 using Firefox 55.0.3.


Do you mind trying Chrome? deeplearn.js is a new library and there are bound to be weird bugs like this that I have no hope of debugging. I believe the best-supported browser is desktop Chrome.


Mobile not supported - please post some screenshots of the results.


You can see one result in the repository README! :)


Looks like it does not work on mobile chrome


deeplearn.js does not work on mobile at all, unfortunately. It's a new library; they'll get there. :)


That makes sense. An error message would have been nice.


You should probably put up a warning message for mobile users to let them know the website doesn't work on mobile.


I get "Error: Unable to create WebGLTexture." every time I try and run a sample. I got a 780ti, so its no graphics hardware problem...?


You probably want to file that on the issue tracker, not here.


Why not here? If someone is going to put something "in my face" that does not work (for me), I don't feel it's inappropriate to respond in the same forum.


Demo doesn't load anything.


Fixed now. Sorry!



