
My guess is that scientific computation is a lot of fun. Plenty of clean mathematics. Probably not too much dingy boiler room stuff. I can see how one would be attracted to using languages like Python (and maybe Scheme too) for such tasks.


There is lots and lots of "dingy boiler room stuff" in scientific computing. What the math is doing is clean. Doing that as efficiently as possible in a computer is generally not.


And there is all the data handling/mangling that needs to be done.


Oh man. Nonono.

It's huge amounts of glue and connect-this-program-to-that-program with a pile of regexes. Most of the number-crunching is in hand-optimized C or F90...

During my PhD, I designed/wrote a transition-state prediction algorithm in Python, hooking it up to atomistics codes written in Fortran. One of my colleagues, now one of my cofounders at Timetric, wrote a standards-compliant XML library in ANSI F95 - http://uszla.me.uk/FoX/ - which, believe it or not, is one of the rare cases where XML made life a lot better!
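
To give a flavour of the regex glue described above, here is a minimal sketch in Python. The log layout, the `parse_energies` and `fortran_float` names, and the numbers are all invented for illustration; they do not come from any particular atomistic code. The one real-world detail is that Fortran often writes double-precision exponents with a `D`, which Python's `float()` rejects.

```python
import re

# Hypothetical log excerpt; the layout is made up for this sketch.
SAMPLE_LOG = """\
 step    1   total energy =  -0.12345678D+03 eV
 step    2   total energy =  -0.12345999D+03 eV
 WARNING: SCF not converged
 step    3   total energy =  -1.2346E+02 eV
"""

# Capture the step number and a real number whose exponent marker may be
# either E or D (Fortran double-precision output commonly uses D).
ENERGY_LINE = re.compile(
    r"step\s+(\d+)\s+total energy\s*=\s*"
    r"([-+]?\d*\.?\d+(?:[DdEe][-+]?\d+)?)"
)


def fortran_float(text):
    """Convert a Fortran-style real such as '-0.5D+03' to a Python float."""
    return float(text.replace("D", "E").replace("d", "e"))


def parse_energies(log_text):
    """Return (step, energy) pairs for every matching line; skip the rest."""
    return [
        (int(m.group(1)), fortran_float(m.group(2)))
        for m in ENERGY_LINE.finditer(log_text)
    ]


if __name__ == "__main__":
    for step, energy in parse_energies(SAMPLE_LOG):
        print(step, energy)
```

Multiply that by every program in the pipeline, each with its own output quirks, and you get the pile.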


On a computer, you have to also be concerned about error propagation and efficiency, and that can take what might be a very beautiful, simple solution on paper and explode it into a big ugly mess in code.

That doesn't mean it isn't fun, but it's a lot less clean than you might think.
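
A small sketch of what that looks like in practice, using sample variance as the example. The data values are made up, and Welford's algorithm is my choice of a standard numerically stable alternative, not something specific to any code mentioned in this thread. The textbook one-pass formula subtracts two huge, nearly equal numbers and loses the answer to rounding; the stable version is longer and less obvious.

```python
def naive_variance(xs):
    """Textbook formula: (sum(x^2) - (sum x)^2 / n) / (n - 1).

    Fine on paper, but the two terms are nearly equal when the mean is
    large relative to the spread, so the subtraction cancels catastrophically.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_sq = sum(x * x for x in xs)
    return (sum_sq - sum_x * sum_x / n) / (n - 1)


def welford_variance(xs):
    """Welford's one-pass algorithm: update a running mean and a running
    sum of squared deviations (m2) without forming huge intermediates."""
    count = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / (count - 1)


# Made-up data: a small spread sitting on top of a large offset.
# The exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

if __name__ == "__main__":
    print("naive:  ", naive_variance(data))    # far from 30 in double precision
    print("welford:", welford_variance(data))  # 30.0
```

Same mathematics, but only one of them survives contact with floating point.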


In NLP there is a large amount of boilerplate and preprocessing code. The barrier to entry is actually quite high. New machine learning model for machine translation? "Sorry, I'm not convinced your model would work with a sophisticated multitext grammar using available translation lexicons etc etc".

Building a convincing baseline is hard, which makes it difficult to show that your approach works in general.


The Python NLTK helps a fair bit with some of that, especially at an introductory level.

Of course, I hand-wrote a lot of my NLP algorithms in Perl, back in the day. Now that's a good use of a time machine...


For beginners there is also an on-line book:

http://www.nltk.org/book


As well as a printed version, published by O'Reilly, for those of you who prefer dead trees to pixels (like me).



