
I'm the author. People are asking how this thing is faster than ack or grep. Here's how:

- Literal matches use Boyer-Moore-Horspool strstr.[1]

- Files are mmap()ed instead of read into a buffer.

- If you're building with PCRE 8.21 or greater, regex searches use the JIT compiler.[2] Also I call pcre_study() before executing the regex on a jillion files.

- Ag reads your .gitignore and .hgignore files to ignore code you don't care about.

- Instead of calling fnmatch() on every pattern in your ignore files, non-regex patterns are loaded into an array and binary searched.
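To make the last point concrete, here is a rough sketch of the idea in Python (Ag itself is written in C, and this is not its actual code). The function names and the rule "a pattern with no `*`, `?` or `[` counts as a literal" are assumptions made for illustration. Literal patterns go into a sorted list and are looked up by binary search; only the remaining glob patterns fall back to fnmatch.

```python
import bisect
import fnmatch

# Assumption for this sketch: any pattern containing one of these
# characters is treated as a glob; everything else is a plain literal.
GLOB_CHARS = set("*?[")


def load_ignore_patterns(lines):
    """Split ignore-file lines into (sorted literals, glob patterns)."""
    literals, globs = [], []
    for line in lines:
        pattern = line.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        if GLOB_CHARS & set(pattern):
            globs.append(pattern)
        else:
            literals.append(pattern)
    literals.sort()  # sorted once so lookups can binary search
    return literals, globs


def is_ignored(name, literals, globs):
    """Check literals by binary search, then fall back to fnmatch for globs."""
    i = bisect.bisect_left(literals, name)
    if i < len(literals) and literals[i] == name:
        return True
    return any(fnmatch.fnmatchcase(name, g) for g in globs)


literals, globs = load_ignore_patterns(["node_modules", "*.o", "# comment", "build"])
print(is_ignored("build", literals, globs))
print(is_ignored("main.c", literals, globs))
```

With many literal entries, each lookup costs O(log n) string comparisons instead of one fnmatch call per pattern.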

I wrote a couple of blog posts about profiling The Silver Searcher and improving performance. http://geoff.greer.fm/2012/01/23/making-programs-faster-prof... is the most informative one, IMO.

1. http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Hor...

2. http://sljit.sourceforge.net/pcre.html
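For anyone who doesn't want to click through to [1], here is a minimal Python sketch of Boyer-Moore-Horspool, run over a memory-mapped file to illustrate the first two points together. It is an illustration only, not Ag's C implementation, and the names `build_skip_table`, `bmh_find` and `find_in_file` are made up for this example.

```python
import mmap
import os


def build_skip_table(needle):
    """Horspool bad-character table: shift distance for each byte value."""
    m = len(needle)
    skip = [m] * 256  # bytes not in the needle allow a full-length shift
    for i in range(m - 1):  # the last byte of the needle is excluded
        skip[needle[i]] = m - 1 - i
    return skip


def bmh_find(haystack, needle, start=0):
    """Return the index of the first match at or after start, or -1."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return start if start <= n else -1
    skip = build_skip_table(needle)
    last = m - 1
    pos = start
    while pos + m <= n:
        # Compare the window against the needle from right to left.
        i = last
        while i >= 0 and haystack[pos + i] == needle[i]:
            i -= 1
        if i < 0:
            return pos
        # Shift by the table entry for the byte under the window's last slot.
        pos += skip[haystack[pos + last]]
    return -1


def find_in_file(path, needle):
    """Search a file through mmap instead of reading it into a buffer."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return -1  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return bmh_find(mm, needle)


print(bmh_find(b"hello world", b"world"))
```

The point of the skip table is that on a mismatch the search can usually jump ahead several bytes at once, so most bytes of the file are never examined. The longer the needle, the bigger the jumps.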



Oooh, I forgot about your project when I did the betterthangrep.com website overhaul. I just added it to http://betterthangrep.com/more-tools/


FWIW, git grep also uses threads in some circumstances to get better performance.

Also, obligatory link whenever Boyer-Moore is mentioned: http://ridiculousfish.com/blog/posts/old-age-and-treachery.h...


That's a cool blog post. I've tried not to look at the grep source code until I've written my own solutions, so I didn't know grep made that tradeoff.



