Bleichenbacher's Ghost

Hey man, computer security is hard.

Understanding the Mirai Botnet

Posted on — Mar 30, 2020;   Reading Time — 8 minutes

Understanding the Mirai Botnet
Antonakakis, Manos, Tim April, Michael Bailey, Matt Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, et al.
USENIX Security 17

Mirai is an interesting botnet, for many reasons.

  1. It attacked KrebsOnSecurity.
  2. It was very powerful: the attack against KrebsOnSecurity was ~620 Gbps. [1]
  3. It was very stupid.
    1. It did not use DDoS 101 techniques like reflection and amplification.
    2. It used a dumb propagation method: brute-forcing Telnet credentials.

The source code of the botnet is available. It was first posted on hackforums.net, [2] but you can also find it on GitHub. As Krebs points out, the author was not being charitable; releasing the code was a way to gain plausible deniability. Regardless of the author’s intentions, the source code and accompanying forum post are worth a skim. If you can get past the sneers, they reveal how the botnet works.

[Figure: An overview of real-time-load, as described in the Mirai forum post.]

That is enough context; let’s jump into the paper.

How did Mirai work?

The functioning of the botnet can be broken into seven main steps.

Step 1: Rapid Scanning

Each bot started out in a rapid scanning phase, in which it pseudorandomly probed hosts on the Telnet ports TCP/23 and TCP/2323. If a host responded, the bot tried to log in with 10 credentials picked at random from a hardcoded list of 62, which includes the following 6 combinations.

Username Password
root     admin
user     user
admin    (none)
666666   666666
mother   fucker
admin    7ujMko0admin
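
To get a feel for what "10 out of 62" means, here is a back-of-the-envelope sketch. It assumes the 10 credentials are drawn uniformly and without repeats, and that each visit to a device is independent; the real scanner may weight or repeat its picks, so treat the numbers as rough. The function names are mine, not from the paper or the source code.

```python
from fractions import Fraction

LIST_SIZE = 62        # credentials in the hardcoded list (from the paper)
TRIES_PER_VISIT = 10  # credentials tried per responding host (from the paper)


def hit_probability(accepted_pairs: int = 1) -> Fraction:
    """Chance that a single visit tries at least one pair the device accepts.

    Assumption: the 10 picks are uniform and without replacement.
    """
    miss = Fraction(1)
    for i in range(TRIES_PER_VISIT):
        # Probability that pick i also misses, given all earlier picks missed.
        miss *= Fraction(LIST_SIZE - accepted_pairs - i, LIST_SIZE - i)
    return 1 - miss


def hit_within(visits: int, accepted_pairs: int = 1) -> float:
    """Chance that at least one of `visits` independent visits succeeds."""
    p = hit_probability(accepted_pairs)
    return float(1 - (1 - p) ** visits)


print(hit_probability())   # 5/31, i.e. about 16% per visit
print(hit_within(10))      # chance of success within 10 visits
```

Under these assumptions a single visit has only about a one-in-six chance of guessing right, which is why the sheer volume of scanning matters more than the cleverness of any one attempt.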

Step 2: Report Successes

If a bot managed to log in this way, it reported the victim’s IP and the working credentials to the server listening for scan results (the reporting server). The reporting server’s IP is known to the bot, and the port is fixed at 48101.

Step 3: Dispatch to Loader

The reporting server runs the Go script scanListen.go, which listens on port 48101 and dispatches the results to the loader.

Why have a server in between? Why not let the bot talk to the loading server directly? The idea is that you will have many loading servers, typically high-performance dedicated machines, while the server in the middle is a low-end machine running a simple script (in Mirai’s case, scanListen.go). The middle server thus acts as a kind of load balancer. Also, loading is fairly bandwidth-intensive, so in a global botnet it can make sense to distribute the loading servers geographically.

Step 4: Load Malware

The loader logs in, checks the architecture (MIPS, ARM, …), and loads an architecture-specific binary. After loading, the malware file is deleted. This makes the infection non-persistent; i.e., if the device restarts, the malware is gone. The binary also removes other infections like qbot.

Step 5: Issue Command

The attacker sends a command to the Command and Control (C2) server.

Step 6: Relay to Bot

The C2 server relays the command to the bots.

Step 7: Attack

The bots perform the attack against the target. The attacks will be described in a bit.

How did Mirai Spread?

Bootstrapping

Who owns a bot? Its original owner, of course. But in a botnet, the bot is commanded by the botnet’s C2 server. How does the bot know where that is? The C2 location is hardcoded into the infection, so once a bot is loaded, it listens to the C2 for instructions. The authors of the paper recovered 62 C2 domains from the infection binaries.

Now, to get a botnet started, you need to have a first infection which infects more devices, which infects more devices, and so on. The authors of the paper call this bootstrapping.

The first scan came from a DataWagon IP address. (DataWagon is a well-known bulletproof host; [3] see Krebs’s post for more.) Within 20 hours, 64,500 devices were infected. To put this in perspective, the attack on KrebsOnSecurity used only about 24,000 devices!

Eventually the botnet reached a steady state of about 300k devices. The source code was then released, and a variant (one that used a CWMP exploit) blew the number up to a peak of 600k devices.
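
The numbers in this section are easier to compare after a little arithmetic. The sketch below only uses the figures quoted above; the variable names are mine, and the hourly figure is a plain average, not a claim about the actual growth curve.

```python
# Figures quoted in this post (all approximate).
first_window_infections = 64_500  # devices infected in the first 20 hours
first_window_hours = 20
krebs_attack_devices = 24_000     # devices used against KrebsOnSecurity
steady_state_devices = 300_000
peak_devices = 600_000

avg_per_hour = first_window_infections / first_window_hours
vs_krebs = first_window_infections / krebs_attack_devices
peak_vs_steady = peak_devices / steady_state_devices

print(f"average new infections per hour: {avg_per_hour:,.0f}")     # 3,225
print(f"20-hour population vs. Krebs attack: {vs_krebs:.2f}x")     # 2.69x
print(f"peak vs. steady state: {peak_vs_steady:.1f}x")             # 2.0x
```

In other words, less than a day after the first scan the botnet was already more than two and a half times the size of what was later pointed at KrebsOnSecurity.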

Spread over Space

A disproportionate number of infected devices were located in South America and Southeast Asia. However, they were not clustered in a handful of ASes: [4] scans by the authors revealed that the top 10 ASes held 44.3% of the infections and the top 100 ASes held 78.6%.
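
If you have per-AS infection counts, a "top k ASes hold x% of infections" figure is just a sorted cumulative share. A minimal sketch follows; the AS labels and counts are made up for illustration and are not the paper's data.

```python
def top_k_share(infections_by_as: dict, k: int) -> float:
    """Fraction of all infections that fall in the k most-infected ASes."""
    total = sum(infections_by_as.values())
    if total == 0:
        return 0.0
    largest = sorted(infections_by_as.values(), reverse=True)[:k]
    return sum(largest) / total


# Hypothetical counts, purely for illustration.
sample = {"AS-A": 500, "AS-B": 300, "AS-C": 100, "AS-D": 60, "AS-E": 40}

print(f"top 2 ASes: {top_k_share(sample, 2):.1%}")  # 80.0%
print(f"top 5 ASes: {top_k_share(sample, 5):.1%}")  # 100.0%
```

The flatter this curve, the harder it is to clean up an infection by talking to a few network operators.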

Spread over Devices

Given the attack described above, it is no surprise that Mirai was not very successful against enterprise web servers. Indeed, the authors of Mirai knew this and aimed their malware at IoT devices by including passwords hardcoded by known IoT vendors, like 7ujMko0admin in the list above, which was commonly hardcoded in Dahua IP cameras. But intended targets don’t mean shit; what matters is which devices were actually infected. The paper’s authors studied device banners to identify the infected devices and were able to conclude that most Mirai infections were security cameras, DVRs, and consumer routers.

Spread over Bandwidth

IoT devices typically don’t have much bandwidth. The authors’ network telescope observed that most devices were scanning at a rate of about 250 bytes per second. Further, the authors note that there was no rate-limiting code in the infection, so the low rate likely reflects the devices themselves. This supports the hypothesis that Mirai consisted mostly of low-power IoT devices.

What did Mirai do?

Mirai’s Attack Kit

The authors’ analysis revealed that 39.8% of the attacks were TCP state exhaustion, 34.5% were application-layer attacks, and 32.8% were volumetric attacks. This is in stark contrast to other DDoS-for-hire services, which primarily use amplification attacks: data scraped from VDO (a major booter) by Karami, Park, and McCoy showed that over 72% of the attacks used amplification.

I have thrown a lot of terms at you, so let’s define them one by one.

A booter is a DDoS-for-hire service. These services are usually called booters or stressors and, to seem legitimate, are advertised to network admins as a way to stress-test their infrastructure. (Of course, there are legitimate uses for a stressor, but many of these services use botnets, and no legitimate company can use them without crossing ethical and legal boundaries.)

A TCP state exhaustion attack is something like a SYN flood, where the attacker floods the victim’s server with SYNs. Recall that the TCP handshake is SYN, SYN-ACK, and ACK: the server responds to a SYN with a SYN-ACK and stores state so that it remembers it has already received a SYN from this IP. At scale, this attack can exhaust the server’s “state”; in practice, this can be something like the number of ports available or the amount of memory available (suppose a server makes allocations on a SYN packet, assuming that the client will complete the connection). Mirai also used variations on this attack, like the ACK flood (stateless network devices like firewalls process all packets and this could exhaust their memory) and the ACK-STOMP flood.
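
To make "exhausting state" concrete, here is a toy model of a server's table of half-open connections. It does no networking at all; the class name, the capacity of 128, and the 30-second timeout are all assumptions picked for illustration, and real TCP stacks are more sophisticated (SYN cookies, for instance, avoid storing this state in the first place).

```python
from collections import OrderedDict


class SynBacklog:
    """Toy model of a half-open connection table (no real networking)."""

    def __init__(self, capacity: int, timeout: float):
        self.capacity = capacity
        self.timeout = timeout
        # Maps client -> SYN arrival time, oldest first.
        # Assumes events arrive in non-decreasing time order.
        self.half_open = OrderedDict()

    def _expire(self, now: float) -> None:
        # Drop entries whose handshake never completed within the timeout.
        while self.half_open:
            _, arrived = next(iter(self.half_open.items()))
            if now - arrived < self.timeout:
                break
            self.half_open.popitem(last=False)

    def on_syn(self, client: str, now: float) -> bool:
        """Return True if state was stored (SYN-ACK sent), False if dropped."""
        self._expire(now)
        if client in self.half_open:
            return True  # retransmitted SYN, state already held
        if len(self.half_open) >= self.capacity:
            return False  # table full: the SYN is dropped
        self.half_open[client] = now
        return True

    def on_ack(self, client: str, now: float) -> bool:
        """Complete the handshake and free the half-open slot."""
        self._expire(now)
        return self.half_open.pop(client, None) is not None


backlog = SynBacklog(capacity=128, timeout=30.0)

# 1,000 clients send a SYN and never answer the SYN-ACK.
stored = sum(backlog.on_syn(f"silent-{i}", now=0.0) for i in range(1000))
print(stored)                                     # 128
print(backlog.on_syn("legit-client", now=1.0))    # False: table is full
print(backlog.on_syn("legit-client", now=31.0))   # True: stale entries expired
```

The legitimate client is locked out not because the server is busy doing work, but because every slot is occupied by a handshake that will never finish.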

An application-layer attack is something like an HTTP flood, where the attacker floods the victim server with GET requests (typically for expensive assets like images, which require a lot of work from the victim server) or POST requests (these typically require a lot of work as well, because most servers do server-side validation of POST data). Mirai also used other variations like the GRE flood and DNS flood.

A volumetric attack is something like a UDP flood, where the attacker floods the victim with UDP packets. The intuition is that, for each packet, the server checks whether any application is listening on the destination port and, if not, replies with an ICMP Destination Unreachable message. If one can flood the server with enough such packets, it should be overwhelmed.

Mirai’s Targets

Studies by the authors showed that Mirai targeted around 5k victims, of which 4,730 were individual IPs, 196 were subnets, and 120 were domains. This included an attack on the /0 subnet, i.e., everyone (good for lolz, I guess). As expected, most of the targets were located in the United States. Looking at the port numbers for TCP attacks, the authors noticed that most attacks were on ports 80 (HTTP), 53 (DNS), 25565 (Minecraft), 443 (HTTPS), 20000 (DNP3), and 23594 (RuneScape). Also as expected, the targets included competing Mirai C2 servers (this was after the source code was released).
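
A quick sanity check on the victim breakdown, plus what a /0 subnet actually covers, using Python's standard ipaddress module. The counts are the ones quoted above; 8.8.8.8 is just an arbitrary address used to show membership.

```python
import ipaddress

# Victim counts quoted above.
targets = {"individual IPs": 4730, "subnets": 196, "domains": 120}
total = sum(targets.values())
print(total)  # 5046, i.e. "around 5k"

shares = {kind: count / total for kind, count in targets.items()}
for kind, share in shares.items():
    print(f"{kind}: {share:.1%}")

# A /0 prefix fixes zero bits of the address, so it matches every IPv4 address.
everyone = ipaddress.ip_network("0.0.0.0/0")
print(everyone.num_addresses == 2**32)               # True
print(ipaddress.ip_address("8.8.8.8") in everyone)   # True
```

So "subnet" here ranges from a handful of addresses all the way up to the whole IPv4 internet.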

The attack on KrebsOnSecurity mentioned above, which was clocked at ~620 Gbps, used only about 24k of the botnet’s devices. This is frightening because it sheds light on the potential of IoT attacks. Imagine the impact if an attack used all of the bots at Mirai’s peak, 600k devices, along with more sophisticated attack methods.
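
Here is the "imagine the impact" arithmetic spelled out. This is a naive sketch: it assumes every device contributes the same traffic and that the total scales linearly with the number of devices, which real devices with different uplinks almost certainly would not. The function names are mine.

```python
def per_device_mbps(attack_gbps: float, devices: int) -> float:
    """Average traffic per device, in megabits per second."""
    return attack_gbps * 1000 / devices


def naive_scaled_tbps(attack_gbps: float, devices: int, scaled_devices: int) -> float:
    """Linear extrapolation to a bigger botnet, in terabits per second."""
    return attack_gbps / devices * scaled_devices / 1000


krebs_gbps = 620        # ~620 Gbps attack on KrebsOnSecurity
krebs_devices = 24_000  # devices used in that attack
peak_devices = 600_000  # Mirai's peak size

print(f"{per_device_mbps(krebs_gbps, krebs_devices):.1f} Mbps per device")          # 25.8
print(f"{naive_scaled_tbps(krebs_gbps, krebs_devices, peak_devices):.1f} Tbps")     # 15.5
```

Roughly 26 Mbps per device is modest on its own; the scary part is the multiplier.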

While the attack on KrebsOnSecurity was bad, regular people didn’t really notice it. The attack on Dyn is what caught the world’s attention: it affected Twitter, Reddit, PayPal, GitHub, and the PlayStation Network. If the last item looks a little out of place, you’ll be even more surprised to learn that it (the PlayStation Network) was the only intended target! Reverse DNS queries and more sleuthing by the authors revealed that the target was probably ns<05-06>.playstation.net, which was managed by Dyn. The title of the corresponding Forbes article illustrates the absurdity quite well: Angry Gamer Blamed For Most Devastating DDoS Of 2016.

Finally, there was the attack on Liberia’s Lonestar Cell, which some people, including The Guardian, claimed had taken Liberia offline. As Krebs points out with tonnes of evidence, that seems unlikely.

Further Reading

Read the paper. I have skipped over many parts, including the entire methodology, which is fascinating.

Plugs

Support USENIX! They made this paper open-access and do a lot more awesome things.


This article is for informational purposes only and cannot be interpreted as advice, nor is it to be relied on in making a decision. In particular, please don't do illegal things.


  1. I am using networking terms: Gbps is gigabits per second. ↩︎

  2. hackforums.net is a weird place. It seems to have a lot of computer security beginners and people selling exploits. Before Mirai, apparently there used to be a DDoS-for-hire board. It seems to have a lot of people I would characterize as script kiddies (people who run exploits written by other people), which makes it a great place to dump the source code for a botnet. ↩︎

  3. A bulletproof host is one that does not handle abuse reports. For example, if you are port-scanning from a big host like AWS, they will probably receive an abuse complaint from someone and promptly kick you. Bulletproof hosts will not even read the abuse reports! If you want to do things that will lead to abuse reports (like scanning for telnet), you had better use a bulletproof host (THIS IS NOT ADVICE, this is my conversational tone). Typically, these hosts are located offshore in countries with lax restrictions on stuff like this. If you wanna do this, you probably also want a bulletproof domain registrar (if you are using domains instead of raw IPs), because a domain registrar could also kick you by neutering your DNS. ↩︎

  4. An autonomous system (AS) is a group of routers (usually under a single operator) with a clear routing policy. The internet is a group of ASes, with an intra-AS routing policy and an inter-AS routing policy. In practice, ISPs (e.g., Sprint) and large organizational networks (e.g., large universities like MIT) are ASes. ↩︎