Masscan: scan the entire Internet in under 6 minutes, 10 million packets/second (github.com/robertdavidgraham)
138 points by nvk on Sept 16, 2013 | 30 comments


What's impressive about this or ZMap? They do not magically invent bandwidth. ZMap requires a connection of 1 Gbps or more; this will require 10x the bandwidth. These projects are basically dumb packet generators. What I find really impressive is the Internet routing infrastructure that can accept their output.


Having bandwidth is one thing, utilizing it to its fullest potential is another thing. Just the problem of randomizing the target IP and port in real time, without hurting the throughput of the scan in the process, is a tough nut to crack. I suggest that you read the README before passing judgment on the complexity of this project.
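
For anyone wondering what "randomizing in real time" can look like, here is a minimal Python sketch. It is not masscan's actual algorithm (see the README for that); it uses a plain affine permutation as a stand-in, which is a weak scrambler, and the names `scan_order`, `num_ips`, `num_ports` and `seed` are made up for illustration. The point is only that each (IP index, port index) pair is computed from a counter, so nothing has to be shuffled or stored up front.

```python
from math import gcd


def scan_order(num_ips, num_ports, seed=7):
    """Yield every (ip_index, port_index) pair exactly once, in a scrambled order.

    Illustrative stand-in only: an affine map i -> (a*i + b) % total.
    """
    total = num_ips * num_ports
    if total < 1:
        return
    # The multiplier must be coprime to `total`, otherwise the map is not a
    # bijection on range(total) and some targets would be skipped.
    a = seed | 1
    while gcd(a, total) != 1:
        a += 2
    b = seed % total
    for i in range(total):
        x = (a * i + b) % total
        # Split the scrambled index back into an address part and a port part.
        yield x // num_ports, x % num_ports


if __name__ == "__main__":
    for ip_index, port_index in scan_order(4, 3):
        print(ip_index, port_index)
```

Note the division and modulo on every target: with a range size that is only known at run time, that per-packet arithmetic is part of the cost being discussed here.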


>Having bandwidth is one thing, utilizing it to its fullest potential is another thing.

It's the same thing if you stop coding in JavaScript.


The README is very detailed and well laid out. Reading it felt like having a conversation with the author. I wish more repos had that.


> To get beyond 2 million packets/second, you need an Intel 10-gbps Ethernet adapter and a special driver known as "PF_RING DNA" from http://www.netop.org.


Robert Graham (masscan's author) gave a talk on scaling to 10 million packets per second at Shmoocon 2013 - C10M Defending The Internet At Scale [1, 2]. You basically bypass the kernel's network stack and its robust features, and use a special purpose-built driver (i.e. "PF_RING DNA"), which you heavily customize. You also need to write your application so that it can assign CPU and memory to your tasks. This is not just a matter of running X and getting 10 million packets per second; it is a very carefully planned exercise.

[1] http://www.youtube.com/watch?v=73XNtI0w7jA

[2] http://c10m.robertgraham.com/


So are you telling me this does not have a wonderful GUI with a big massive button that says "SCAN INTERNET"?


I'm sure it wouldn't be too hard to have another server with the big button that would then send the command to initiate the scanning system. I wouldn't be surprised if you could even add a progress bar of sorts with big text reading "downloading the internet". I have good reason to believe it looks like this in its finished wireless form: https://lh5.googleusercontent.com/-T-ec3sb6Xgc/ThXz2ApbsjI/A...


I'll create a GUI interface using Visual Basic. See if I can track all the IP addresses.


Then you can zoom, rotate (in 3D), and enhance the results!


Similar to: https://zmap.io/

And the recent HN discussion on ZMap: https://news.ycombinator.com/item?id=6226105


I doubt you can scan the entirety of IPv6 address space in under 6 minutes ...


I'd really like to know what the answer would be, so I thought I'd do the math. So, anybody please correct me, but my math comes up to something like 3.4x10^21 times the age of the universe given the rate above.


2^128 IP addresses / 10 million packets a second = 7.8 * 10^13 times the age of the universe [1]

To actually scan the entirety of the IPv6 address space in under 6 minutes, you would need to send 1 billion billion billion billion packets a second [2], or 100 octillion times faster than 10 million packets a second. [3]

[1] http://www.wolframalpha.com/input/?i=2%5E128%2F10%5E7+second...

[2] http://www.wolframalpha.com/input/?i=2%5E128%2F360

[3] http://www.wolframalpha.com/input/?i=10%5E36%2F10%5E7
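
The same arithmetic in Python, for anyone who wants to poke at it. The 13.8 billion year age of the universe is an assumed round figure on my part; everything else comes from the numbers in this comment.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9      # assumption: commonly quoted round figure
IPV6_ADDRESSES = 2**128
PACKETS_PER_SECOND = 10_000_000     # the rate from the headline
SCAN_SECONDS = 360                  # "under 6 minutes"

# How long a full IPv6 sweep would take at 10 million packets/second.
scan_seconds = IPV6_ADDRESSES / PACKETS_PER_SECOND
universe_ages = scan_seconds / (AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR)

# What rate would be needed to finish in six minutes instead.
required_pps = IPV6_ADDRESSES / SCAN_SECONDS
speedup = required_pps / PACKETS_PER_SECOND

print(f"scan time: {universe_ages:.2e} times the age of the universe")
print(f"required rate: {required_pps:.2e} packets/second")
print(f"speedup over 10 Mpps: {speedup:.2e}x")
```

The results land on roughly 7.8 * 10^13 universe ages, about 10^36 packets a second, and a speedup of about 10^29 (100 octillion), in line with the figures above.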


Assuming ideal conditions, a 10GbE adapter uses about 20 W to send at most about 15 million packets per second. Assuming no improvements in efficiency (unlikely), a network adapter that could do this would draw 3500 times the power output of the Sun for those six minutes [1], an amount of energy comparable to the kinetic energy of the Earth orbiting the Sun (about 1/6th of it) [2]. That would be more than enough to boil the oceans [3]; it would practically liquefy the Earth. According to some sources, it would be enough energy to ionize a ball of water the size of the Earth into plasma, but I'm skeptical that 10000 Kelvin water at that volume would remain a plasma for long.

tl;dr: We won't be scanning the IPv6 address space any time soon. And hopefully not on Earth.

[1] http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+w...

[2] http://www.wolframalpha.com/input/?i=2%5E128%2F360+*+%2820+w...

[3] http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%282...

[4] http://www.wolframalpha.com/input/?i=%282%5E128%2F360+*+%282...
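
Redoing that estimate in Python, as a sanity check rather than a definitive figure. The 20 W and 15 million packets/second come from the comment above; the solar luminosity, Earth mass and orbital speed are rounded reference values plugged in as assumptions, so expect the same ballpark rather than the exact numbers quoted.

```python
SCAN_SECONDS = 360                    # "under 6 minutes"
WATTS_PER_ADAPTER = 20.0              # from the comment above
PPS_PER_ADAPTER = 15e6                # from the comment above
SUN_LUMINOSITY_W = 3.828e26           # assumption: nominal solar luminosity
EARTH_MASS_KG = 5.972e24              # assumption: rounded reference value
EARTH_ORBITAL_SPEED_M_S = 29_780.0    # assumption: mean orbital speed

# Packet rate needed to cover 2^128 addresses in six minutes.
required_pps = 2**128 / SCAN_SECONDS

# Scale one adapter's power draw linearly up to that rate.
power_w = required_pps / PPS_PER_ADAPTER * WATTS_PER_ADAPTER
suns = power_w / SUN_LUMINOSITY_W

# Total energy over the scan versus Earth's orbital kinetic energy (1/2 m v^2).
energy_j = power_w * SCAN_SECONDS
earth_orbital_ke_j = 0.5 * EARTH_MASS_KG * EARTH_ORBITAL_SPEED_M_S**2
fraction_of_orbital_ke = energy_j / earth_orbital_ke_j

print(f"power: {power_w:.2e} W, about {suns:.0f} Suns")
print(f"energy: {energy_j:.2e} J, {fraction_of_orbital_ke:.2f} of Earth's orbital KE")
```

With these inputs the power comes out at a few thousand Suns and the energy at roughly a sixth of the Earth's orbital kinetic energy.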



Plus the original story addressing the abuse complaints caused by scanning the entire internet, which also included the link to masscan on GitHub: http://news.ycombinator.com/item?id=6383562


I haven't checked what exactly masscan is doing to randomize the IP and port sequences, but if reduction modulo some runtime constant is that much of a problem (according to the README at least), perhaps you should consider replacing the modulo and division operations with multiplications?

The canonical reference is Granlund and Montgomery [1]. Luckily, there are ready-made libraries for this, like libdivide [2], which would probably lower the reported 90 cycles into something more palatable (and pipelineable).

[1] http://gmplib.org/~tege/divcnst-pldi94.pdf

[2] https://github.com/ridiculousfish/libdivide
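
A rough Python sketch of the multiply-and-shift idea, for readers who haven't seen it. This is neither libdivide's API nor masscan's code: `magic_for` and `fast_divmod` are hypothetical names, and Python's big integers hide the overflow handling that a real C implementation has to deal with. Since the divisor is a runtime constant, the multiplier and shift are computed once and then reused for every value.

```python
def magic_for(d, nbits=32):
    """Precompute (multiplier, shift) for an unsigned divisor d >= 1.

    For every 0 <= n < 2**nbits: n // d == (n * multiplier) >> shift.
    """
    if d < 1:
        raise ValueError("divisor must be a positive integer")
    log2_ceil = (d - 1).bit_length()            # ceil(log2(d))
    shift = nbits + log2_ceil
    multiplier = -(-(1 << shift) // d)          # ceil(2**shift / d)
    return multiplier, shift


def fast_divmod(n, d, multiplier, shift):
    """Return (n // d, n % d) using one multiply and one shift for the quotient."""
    q = (n * multiplier) >> shift
    return q, n - q * d


if __name__ == "__main__":
    m, s = magic_for(641)
    print(fast_divmod(4_000_000_000, 641, m, s), divmod(4_000_000_000, 641))
```

The remainder then costs one more multiply and a subtract, with no hardware divide in the per-value path.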


This is impressive, but is there any reason someone would want to scan the entire Internet?

In other words, is that a feature, or is it just a performance metric?


There are plenty of reasons to want to perform that kind of scan on the internet. Sometimes, if all you're looking for is a sample of how many people are running version X of a product versus version Y, you really don't need to map the whole internet. You can just sweep for your port of interest across a few million random IPs and call it a day.

Sometimes, though, if you are building a directory, or if you just want a really fantastically beautiful map, you scan the whole shebang.

Doing it all in a few minutes is a little unnecessary, though, even if it is neat how much better scanners have been getting lately. A scan that merely takes a few days is plenty fast.


The shorter time scale means that it is now getting easy enough for everybody to do it. With that in mind, I feel the responsible thing is to release the source only as an educational resource and also maintain a public data store of any results gathered by this tool, so that people can perform analysis without having to fire so many packets at every machine on the Internet. Not providing a data store in tarball format to mitigate actual use is a tad irresponsible.


It's really not that easy for someone to do it. The problem is getting someone to give you a connection that can actually put that many packets onto the Internet. Even the author has issues on this point. See http://blog.erratasec.com/2013/09/masscan-entire-internet-in...

> The problem is that I don't have a 10-gbps network to test on. My ISP let's me go out to 100,000 packets/second as long as I deal with the abuse complaints, but that's around 44-mbps.
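
A quick check of those numbers in Python. The 84-byte figure is my assumption for a minimum-size Ethernet frame on the wire (64-byte frame plus 8 bytes of preamble and 12 bytes of inter-frame gap), and the function names are made up.

```python
def bytes_per_packet(mbps, pps):
    """Average packet size implied by a bandwidth (Mbit/s) and a packet rate."""
    return mbps * 1e6 / 8 / pps


def max_pps(link_bps, wire_bytes_per_frame=84):
    """Upper bound on packets/second for a link sending minimum-size frames."""
    return link_bps / (wire_bytes_per_frame * 8)


print(f"implied packet size: {bytes_per_packet(44, 100_000):.0f} bytes")
print(f"1 GbE ceiling:  {max_pps(1e9) / 1e6:.2f} million packets/second")
print(f"10 GbE ceiling: {max_pps(10e9) / 1e6:.2f} million packets/second")
```

So the quoted 44 Mbps works out to roughly 55 bytes per packet, and a saturated 10GbE link tops out a bit under 15 million minimum-size frames per second, which is why 10 million packets/second needs that class of hardware.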


I believe they (Errata Security) are using it for surveys of the internet landscape (e.g. what server software is running on open ports for web and SSH). But there are many other possibilities if you're creative. What about sending a GET request to every port 80 and indexing the results? A new form of web crawler.


You wouldn't see much: without the correct Host header, you won't see the many sites that an IP could be serving.
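
To make the Host header point concrete, here is a small Python sketch that only builds request bytes and sends nothing. `build_get` is a made-up helper, 192.0.2.10 is a documentation address, and example.com / example.org are placeholder names.

```python
def build_get(ip, host=None, path="/"):
    """Build a raw HTTP/1.1 GET request; the Host header defaults to the bare IP."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host or ip}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# All three requests would go to the same IP and port; only the Host line
# differs, and that line is what a name-based virtual host setup keys on.
for name in (None, "example.com", "example.org"):
    print(build_get("192.0.2.10", host=name).decode("ascii"))
```

A scanner that only knows the IP can produce the first form; the other two need hostnames that a port scan has no way to learn.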


True, its purpose is a port scanner, not a website scanner.


Well written, well documented code. Kudos.


This could be handy for joining a fledgling peer-to-peer network, where there are connected peers forming a network, but for one reason or another new nodes cannot find them - just do a masscan of the default listen port to find IPs to attempt connections to.


Impressive engineering; I'm learning a lot going through the source -- thanks!


Wasn't there a similar project on HN earlier this month?


zmap!



