
To me, this answers the question "How well can neural networks crack sudoku if we forget all of our domain knowledge, and just try to brute force our way through with insufficient training data and a neural network that's poorly sized?".

I suggest an alternate approach that would use a training set consisting of real-world data of sudoku puzzles that are in the process of being solved: the grid before a cell is filled in, and then after it's filled in. This would teach the network to solve the puzzle one step at a time, instead of all at once.
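One way to sketch that kind of dataset (assuming puzzles and solutions are strings with '0' for blanks -- a hypothetical format, as is the function name): take a (puzzle, solution) pair and emit one before/after state per filled cell.

```python
import random

def stepwise_pairs(puzzle, solution, seed=0):
    """Yield (state_before, state_after) training pairs, filling one blank
    cell at a time in a random order, so the network sees single-step moves."""
    state = list(puzzle)
    blanks = [i for i, ch in enumerate(state) if ch == "0"]
    random.Random(seed).shuffle(blanks)  # the fill order is arbitrary here
    for i in blanks:
        before = "".join(state)
        state[i] = solution[i]
        yield before, "".join(state)
```

Different shuffle seeds give different fill orders, so one (puzzle, solution) pair can be expanded into many distinct step-by-step trajectories.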

If you insist on only using the as-received puzzle and the solution for training, my intuition is that you'll need more layers than 10 in the NN, and much much more data.

A taxonomy of the complex solving strategies that would need to be learned already exists: see e.g. http://www.sudokuwiki.org/y_wing_strategy and the categories "basic", "tough", "diabolical", and "extreme". My guess is that the NN (using the original author's strategy, but with more layers and lots more data) will eventually do well on "basic" and "tough", but will start to miss some of the harder categories, depending on the training set and how the network is set up.



To me, it's an interesting question to ask how well NNs can perform without hand-holding, i.e. can they figure out all of the complex strategies that humans have figured out, starting from a blank slate?

Cf. AlphaGo, which invented new strategies that proved interesting to human players.

Of course, if you're training your NN to solve a real-world problem, you wouldn't choose to train it in this way, you would do as you suggested and feed it as much relevant training data as possible, to give it a head-start.

(Your points re: sizing and quantity of training data sound reasonable and are outside my expertise, so nothing to add there.)


There is a very simple strategy that solves all Sudoku puzzles.

Pick one open square, try a number that is possible for it, and backtrack if you get stuck.

For better results, pick the open square with the fewest possible choices.

Human solvers might try something more efficient, but the above strategy is not at all bad for computer implementation.
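A minimal sketch of that strategy in Python (assuming the grid is a 9x9 list of lists with 0 for open squares; the function names are my own):

```python
def candidates(grid, r, c):
    """Digits that can still legally go in open square (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[br + i][bc + j] for i in range(3) for j in range(3)}
    return [n for n in range(1, 10) if n not in used]

def solve(grid):
    """Backtracking search, branching on the open square with the
    fewest possible choices. Solves in place; returns True on success."""
    blanks = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not blanks:
        return True  # no open squares left: solved
    r, c = min(blanks, key=lambda rc: len(candidates(grid, *rc)))
    for n in candidates(grid, r, c):
        grid[r][c] = n
        if solve(grid):
            return True
        grid[r][c] = 0  # dead end: backtrack
    return False
```

The fewest-choices heuristic keeps the search tree small on typical puzzles; plain left-to-right backtracking also works but can blow up on adversarially constructed grids.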

It's very much a solved problem with existing technology, see

https://en.wikipedia.org/wiki/Satisfiability_modulo_theories

adding a neural net to it doesn't help in any way.


Yeah, yeah, sudoku is easy for computers. That's not the point. The point is learning about neural networks. So yeah, "adding a neural net" does help with that.


In which case the hard part is finding a suitable problem; you might be better off learning about SMT solvers, because you don't have Mark Cuban telling everybody to do it.


We don't really know if it's suitable until we try it.


I just feel bad about this: "my intuition is that you'll need more layers than 10 in the NN, and much much more data."

I still see current AI as basically a filter: image to text, speech to text, etc.

A smart AI would be able to figure out the rules of sudoku from a very small sample of games; it would then figure out a rudimentary backtracking algorithm. It would then sleep on it, figure out more patterns like naked singles, etc., and build a really fast sudoku solver with 100% accuracy. All of this in a matter of seconds.


When a pig suddenly starts to dance, the remarkable thing is not that it cannot do a tango, but that it can dance at all.


Pigs dance all the time, for suitable definitions of dance and time.

I don't think optimizing a rat's nest of functions with SGD is that impressive. If a model lacks explanatory power, what use is it to people?

In this case, when the solution from the NN is incorrect, what recourse do you have? The actual solution might be an arbitrary permutation of what the NN gave you, and there is no way to tell which rows and columns would have to be reshuffled to get the actual solution, or even whether a solution exists. The constraints might be inconsistent and you would never know.
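You can at least detect that the output is wrong, even if you can't repair it: a cheap validity check against the given clues and the sudoku rules. A sketch (assuming 9x9 lists of lists with 0 for blanks in the puzzle; function name my own):

```python
def is_valid_solution(puzzle, candidate):
    """Check a proposed 9x9 solution: it must keep every given clue
    and have each digit 1-9 exactly once per row, column, and box."""
    for r in range(9):
        for c in range(9):
            if puzzle[r][c] and puzzle[r][c] != candidate[r][c]:
                return False  # the NN changed a given clue
    units = []
    units += [[(r, c) for c in range(9)] for r in range(9)]        # rows
    units += [[(r, c) for r in range(9)] for c in range(9)]        # columns
    units += [[(br + i, bc + j) for i in range(3) for j in range(3)]
              for br in (0, 3, 6) for bc in (0, 3, 6)]             # boxes
    return all(sorted(candidate[r][c] for r, c in u) == list(range(1, 10))
               for u in units)
```

This turns a silent failure into a detectable one; whether a verified-wrong answer is of any use is another matter.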


I suspect there may be a method by which one could convert solved sudokus into numerous candidate puzzles.

I imagine strategically removing numbers from solved puzzles, so as to reinforce the neural connections for solutions and filter out possible 'noise'.


My immediate thought was to generate training data by making up solved puzzles, and then removing numbers one at a time (while ensuring there's still a unique solution), with a few different branches at each step from an initial solution.

You could generate massive datasets that way -- it would be pretty easy to generate a few billion pre-images of a solution. I mean, a solution has 81 numbers and a puzzle about 20. 81 choose 20 is about 5x10^18, or a few billion billion.
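A sketch of that pipeline -- with a trivially patterned solved grid standing in for a properly randomized one, and a brute-force uniqueness check; both are simplifications of mine, as are the function names:

```python
import random
from math import comb  # e.g. comb(81, 20) for the count estimate above

def count_solutions(grid, limit=2):
    """Count completions of grid by backtracking, stopping at `limit`."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                total = 0
                for n in range(1, 10):
                    if all(n != grid[r][j] for j in range(9)) and \
                       all(n != grid[i][c] for i in range(9)) and \
                       all(n != grid[3*(r//3)+i][3*(c//3)+j]
                           for i in range(3) for j in range(3)):
                        grid[r][c] = n
                        total += count_solutions(grid, limit - total)
                        grid[r][c] = 0
                        if total >= limit:
                            return total
                return total
    return 1  # no blanks left: exactly one completion

def make_puzzle(removals=40, seed=0):
    """Start from a valid solved grid and delete clues one at a time,
    keeping only deletions that preserve a unique solution."""
    grid = [[(r * 3 + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
    rng = random.Random(seed)
    cells = [(r, c) for r in range(9) for c in range(9)]
    rng.shuffle(cells)
    removed = 0
    for r, c in cells:
        if removed == removals:
            break
        n, grid[r][c] = grid[r][c], 0
        if count_solutions([row[:] for row in grid]) == 1:
            removed += 1
        else:
            grid[r][c] = n  # deletion broke uniqueness; put it back
    return grid
```

Branching the removal order (different seeds) from the same solved grid gives many distinct puzzles per solution, which is where the huge pre-image count comes from.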


Well, sudoku is a solved problem. This would amount to teaching a neural net graph coloring.


There's a very elegant solution using dancing links; I wonder if you could teach a NN to essentially do that, somehow.
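Dancing links (DLX) is Knuth's linked-list technique for running Algorithm X on the exact-cover formulation of sudoku. A dict-and-set sketch of the same exact-cover search (the compact Python rendering often attributed to Ali Assaf -- not true DLX, but the same algorithm the NN would have to emulate):

```python
from itertools import product

def build_matrix():
    """Sudoku as exact cover: choice (r, c, n) satisfies four constraints:
    cell (r, c) is filled, n appears in row r, in column c, and in box b."""
    Y = {}
    for r, c, n in product(range(9), range(9), range(1, 10)):
        b = (r // 3) * 3 + c // 3
        Y[(r, c, n)] = [("cell", (r, c)), ("row", (r, n)),
                        ("col", (c, n)), ("box", (b, n))]
    X = {}
    for choice, constraints in Y.items():
        for con in constraints:
            X.setdefault(con, set()).add(choice)
    return X, Y

def algorithm_x(X, Y, solution):
    """Knuth's Algorithm X; DLX is its doubly-linked-list implementation."""
    if not X:
        yield list(solution)
        return
    con = min(X, key=lambda c: len(X[c]))  # most-constrained column first
    for choice in list(X[con]):
        solution.append(choice)
        removed = select(X, Y, choice)
        yield from algorithm_x(X, Y, solution)
        deselect(X, Y, choice, removed)
        solution.pop()

def select(X, Y, choice):
    removed = []
    for con in Y[choice]:
        for other in X[con]:
            for con2 in Y[other]:
                if con2 != con:
                    X[con2].remove(other)
        removed.append(X.pop(con))
    return removed

def deselect(X, Y, choice, removed):
    for con in reversed(Y[choice]):
        X[con] = removed.pop()
        for other in X[con]:
            for con2 in Y[other]:
                if con2 != con:
                    X[con2].add(other)

def solve_exact_cover(grid):
    """Solve a 9x9 grid (0 = blank) in place via exact cover."""
    X, Y = build_matrix()
    for r, c in product(range(9), range(9)):
        if grid[r][c]:
            select(X, Y, (r, c, grid[r][c]))  # commit the given clues
    for solution in algorithm_x(X, Y, []):
        for r, c, n in solution:
            grid[r][c] = n
        return True
    return False
```

The "undo" in deselect is exactly what the dancing-links pointer trick makes O(1) per step; whether a NN could learn to mimic that kind of precise backtracking state is the open question.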



