
Taking dumps of analytics logs and pulling out relevant info for our customers on app usage


This is the `grep/awk` use case. The nice thing about Hadoop's streaming MapReduce interface (which calls external programs) is that you can literally take your grep/awk workflow and move it to the cluster. Keeping records line-oriented is a huge step toward a portable data processing workflow.
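To make that concrete, here is a rough sketch of what such a job might look like as a single script that reads stdin and writes stdout. Everything specific here is an assumption for illustration: the log format (`<timestamp> <user_id> <event>`), the `app_open` event name, and the function names are all hypothetical. The map step plays the role of grep plus an awk field pick, and the reduce step sums counts per key.

```python
#!/usr/bin/env python3
"""Sketch of a grep/awk-style job for Hadoop Streaming (hypothetical log format)."""
import re
import sys
from itertools import groupby

# Assumed line format: "<timestamp> <user_id> <event>", whitespace-separated.
PATTERN = re.compile(r"\bapp_open\b")  # the "grep" part; hypothetical event name


def map_lines(lines):
    """Keep matching lines and emit the second field with a count of 1."""
    for line in lines:
        if not PATTERN.search(line):
            continue
        fields = line.split()
        if len(fields) >= 2:
            yield f"{fields[1]}\t1"  # key<TAB>value, still one record per line


def reduce_lines(lines):
    """Sum counts per key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"


# Only run as a filter when a step name ("map" or "reduce") is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in step(sys.stdin):
        print(out)
```

Saved as, say, `job.py` (a made-up name), the same script can be tested locally with `cat access.log | python3 job.py map | sort | python3 job.py reduce`, then handed to the Hadoop streaming jar via its `-mapper` and `-reducer` options. The local `sort` stands in for the shuffle that groups keys between the two steps on the cluster.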



