
> How did you train your search engine?

I used some pretty simple techniques: one Bayesian filter to separate jobs from other posts in mailing lists etc., and another Bayesian filter to score jobs based on keywords. Both filters were trained on feedback from the console UI. The main problem is excluding site-specific keywords that distort the scoring (e.g. if a site with mostly crappy jobs includes its own name in every listing, then even the good jobs will score low by association). A lot of job sites have manky markup, so I also had a different scraping script for each site to extract the text. All in all it's only a couple of hours' work. I've been thinking recently about extending it and adding a simple web UI, since finding freelance work is pretty time-consuming.
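A minimal sketch of what such a filter could look like (this is an illustration, not the author's actual code): a multinomial naive Bayes classifier with Laplace smoothing, using only the standard library. The same class could be trained once to separate "job" from "other" posts, and a second instance trained on "good"/"bad" feedback to score keywords.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of docs

    def train(self, words, label):
        """Record one labelled document (a list of tokens)."""
        self.label_counts[label] += 1
        self.word_counts[label].update(words)

    def classify(self, words):
        """Return the label with the highest log posterior probability."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for w in words:
                # smoothed log likelihood of each token under this label
                score += math.log((counts[w] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical training data standing in for console-UI feedback
nb = NaiveBayes()
nb.train(["senior", "python", "developer", "remote"], "job")
nb.train(["hiring", "engineer", "salary"], "job")
nb.train(["meeting", "agenda", "minutes"], "other")
nb.train(["newsletter", "digest", "weekly"], "other")
print(nb.classify(["python", "engineer", "hiring"]))  # → job
```

The site-specific-keyword problem mentioned above would show up here as the site's own name dominating the smoothed counts for one label; stripping such tokens per-site before training is one simple workaround.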

> What sort of collaborative filtering techniques?

I didn't have any specific technique in mind, but there are plenty of good machine learning books that cover the different techniques. If you don't already have a background in maths, 'Programming Collective Intelligence' is a good book to start with. 'The Elements of Statistical Learning' goes into a lot more detail but requires some basic maths.
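For a flavour of the simplest kind of collaborative filtering (the sort 'Programming Collective Intelligence' opens with), here is a hedged sketch of user-based filtering with cosine similarity: predict scores for items a user hasn't rated from the similarity-weighted ratings of other users. All the data here is made up for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    shared = set(a) & set(b)
    num = sum(a[k] * b[k] for k in shared)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def recommend(target, others, n=3):
    """Rank items the target user hasn't rated by similarity-weighted scores."""
    scores, weights = {}, {}
    for other in others:
        sim = cosine(target, other)
        if sim <= 0:
            continue
        for item, rating in other.items():
            if item in target:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:n]]

# Hypothetical users rating hypothetical job listings 1-5
alice = {"job_a": 5, "job_b": 3}
bob   = {"job_a": 5, "job_b": 3, "job_c": 4}
carol = {"job_a": 1, "job_c": 1}
print(recommend(alice, [bob, carol]))  # → ['job_c']
```

Since alice's ratings look much more like bob's than carol's, bob's high rating for job_c dominates and it comes out on top.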

> Would love to chat further via email.

Email is in my profile.


