
Here's the problem I have with Google's claims about Bing: Google suggests that Bing is cheating but hasn't demonstrated that Bing's behavior somehow deviates from what is optimal in a machine-learning sense. Thus, the real question is whether Bing should cripple its machine-learning algorithms if they infer (correctly) that Google-suggested results are likely to be relevant.

That is, if any learning system is observing click-stream behavior from users and mining it for relevance evidence, I'd expect it to ultimately home in on the true weight of each piece of evidence in that click-stream data. Since Google's contributions to that data are likely to be highly relevant, any good machine learning system is going to learn that they are relevant and start recommending them.
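
To make the "home in on the true weight of each piece of evidence" idea concrete, here is a minimal sketch. It is not a description of how Bing or any real search engine works. The source names, the toy click-stream, and the `learn_source_weights` function are all hypothetical, and the "learning" is nothing more than a Laplace-smoothed success rate per evidence source.

```python
from collections import defaultdict


def learn_source_weights(observations, prior_hits=1.0, prior_misses=1.0):
    """Estimate, per evidence source, how often its suggestions satisfied the user.

    observations: iterable of (source, satisfied) pairs, where `satisfied`
    is True when the user appeared happy with the page they clicked.
    Returns a dict mapping each source to a smoothed success rate in (0, 1).
    """
    hits = defaultdict(float)
    totals = defaultdict(float)
    for source, satisfied in observations:
        totals[source] += 1
        if satisfied:
            hits[source] += 1
    return {
        source: (hits[source] + prior_hits)
        / (totals[source] + prior_hits + prior_misses)
        for source in totals
    }


# Hypothetical toy click-stream: three evidence sources of differing quality.
observations = (
    [("search_engine_a", True)] * 9 + [("search_engine_a", False)] * 1
    + [("link_directory", True)] * 5 + [("link_directory", False)] * 5
    + [("random_banner", True)] * 1 + [("random_banner", False)] * 9
)

weights = learn_source_weights(observations)
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

Nothing in the sketch singles out any source by name. The most reliable source ends up with the largest weight only because the observed clicks say so, which is the point of the argument above.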

In effect, then, what Google is arguing is that if Bing's machine-learning algorithms are correctly inferring that results that happen to come from Google are highly relevant, Bing should blind itself to that knowledge.

I'm not sure that's good for anyone except Google.

(For a completely uninformed guess about why Google might be interested in raising copycat claims about Bing, I go out on a limb here: http://blog.moertel.com/articles/2011/02/02/the-google-micro...)



But that means you have not created an actual search program. You are just using Google's data.

Or, to put it another way: if Google did not exist, would you still have a (good) search program? If the answer is no, then there is no reason for it to exist.


> But that means that you have not created an actual search program. You are just using Google's data.

No. If your learning system is working properly, you should be using Google's data only to the extent that it is legitimately observable and more relevant than anything else you're feeding your system. And, for lots of searches, Google's data leave much room for other sources to be more relevant. In the limiting case, when you're feeding your system everything that Google is feeding its own, you should almost never return Google-derived knowledge, because your system should almost always be able to come up with greater relevance from knowledge derived from primary sources.
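
As a rough illustration of "only to the extent it is more relevant than anything else", here is a sketch under invented assumptions. The queries, URLs, relevance scores, and the source labels `primary_crawl` and `observed_clicks` are all hypothetical, and `pick_best` simply returns whichever candidate has the highest relevance estimate.

```python
def pick_best(candidates):
    """Return the (source, url, relevance) tuple with the highest relevance."""
    return max(candidates, key=lambda candidate: candidate[2])


# Hypothetical relevance estimates for two queries. "primary_crawl" stands
# for knowledge derived from primary sources; "observed_clicks" stands for
# evidence mined from click-stream data.
queries = {
    "common query": [
        ("primary_crawl", "example.org/a", 0.90),
        ("observed_clicks", "example.org/b", 0.80),
    ],
    "rare misspelling": [
        ("primary_crawl", "example.org/c", 0.20),
        ("observed_clicks", "example.org/d", 0.70),
    ],
}

# For each query, record which kind of evidence supplied the winning result.
winners = {query: pick_best(cands)[0] for query, cands in queries.items()}
print(winners)
```

In this toy setup the click-derived result wins only where the primary-source estimate is weak. As the primary-source estimates improve, it wins less and less often, which is the limiting case described above.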

Your final question is on the right track, but I'd suggest a small tweak: If Google didn't exist, would the system still offer highly relevant results and, if Google did exist, would the system be able to learn from Google-supplied knowledge, to the extent allowed by law and terms of service and so forth, to offer results at least as relevant?



