SSH honeypot, deployed in the wild, collecting and sharing data

Setting Up a Pure Honeypot

6 Dec 2013 • 9 min read

In last week's blog post I discussed an ongoing language engineered brute-force attack on the honeypot, as well as duplicate attacks on an identical honeypot running under a different IP address.

This week the language engineered brute-force attack has fizzled out: Saturday 30th November 2013 saw a total of 663 brute-force attempts, and there have been no further attacks from the attacking IP since. The total number of attempts from the attacking IP was 4,341. Of those attempts, the top 10 passwords and usernames are shown in the two tables below:

Top 10 passwords from language engineered brute-force attack:

password count
P@ssw0rd 31
P@ssw0rd1 28
1qaz@WSX 26
123123 23
Password1 17
Password01 14
102030 12
123321 10
password1234 9
password12345 9

Top 10 usernames from language engineered brute-force attack:

username count
root 3648
ftpuser 35
admin 31
test 25
oracle 24
nagios 21
svn 20
www-data 18
user 17
www 15

The tables above show that the password "P@ssw0rd" was used 31 times while the username "root" was used 3,648 times. It's no surprise that "root" accounts for 84% of the brute-force attempts (3,648 of 4,341), since any lucky attacker who gets their hands on a root account has super-user access to the entire system. It's more interesting to see some commonly used Linux usernames such as "www" (which usually owns the Apache web server) and "nagios" (a systems and network monitoring tool).
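The 84% figure is simply the root count divided by the total number of attempts. A minimal sketch of that arithmetic, using counts copied from the tables above (the `share` helper is a name made up for this example):

```python
# Counts taken from the tables above; only a few rows are repeated here.
total_attempts = 4341
username_counts = {"root": 3648, "ftpuser": 35, "admin": 31}
password_counts = {"P@ssw0rd": 31, "P@ssw0rd1": 28}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(share(username_counts["root"], total_attempts))      # share of attempts using "root"
print(share(password_counts["P@ssw0rd"], total_attempts))  # share using the top password
```

The same helper shows how thinly the passwords are spread compared with the usernames: even the most popular password covers well under 1% of the attempts.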

Duplicate Attacks

The duplicate attacks have continued on the two identical honeypots running under different IP addresses. This supports the original theory, mentioned in the blog post Let The Hackers In, that attackers are probably scanning IP ranges and then blindly firing out automated brute-force attempts.

To illustrate these duplicate brute-force attacks I've included some logs below:

IP address 208.**.**.***

date ip username password honeypot
Sat 30 Nov 2013, 13:35:52 208.**.**.*** root root charlie
Sat 30 Nov 2013, 13:35:53 208.**.**.*** root toor charlie
Sat 30 Nov 2013, 13:35:54 208.**.**.*** root root alpha
Sat 30 Nov 2013, 13:35:55 208.**.**.*** root 1234 charlie
Sat 30 Nov 2013, 13:35:55 208.**.**.*** root toor alpha
Sat 30 Nov 2013, 13:35:56 208.**.**.*** root 12345 charlie
Sat 30 Nov 2013, 13:35:57 208.**.**.*** root 1234 alpha
Sat 30 Nov 2013, 13:35:57 208.**.**.*** root 123456 charlie
Sat 30 Nov 2013, 13:35:58 208.**.**.*** root 12345 alpha
Sat 30 Nov 2013, 13:36:00 208.**.**.*** root 123456 alpha

IP address 192.**.**.***

date ip username password honeypot
Fri 29 Nov 2013, 07:39:46 192.**.**.*** root aaaaaa alpha
Fri 29 Nov 2013, 07:39:48 192.**.**.*** root password alpha
Fri 29 Nov 2013, 07:39:49 192.**.**.*** root 111111 alpha
Fri 29 Nov 2013, 07:39:50 192.**.**.*** root 123456 alpha
Sat 30 Nov 2013, 18:14:34 192.**.**.*** root aaaaaa charlie
Sat 30 Nov 2013, 18:14:36 192.**.**.*** root password charlie
Sat 30 Nov 2013, 18:14:37 192.**.**.*** root 111111 charlie
Sat 30 Nov 2013, 18:14:38 192.**.**.*** root 123456 charlie
Sat 30 Nov 2013, 18:15:17 192.**.**.*** root aaaaaa alpha
Sat 30 Nov 2013, 18:15:19 192.**.**.*** root password alpha
Sat 30 Nov 2013, 18:15:20 192.**.**.*** root 111111 alpha
Sat 30 Nov 2013, 18:15:21 192.**.**.*** root 123456 alpha
Sun 1 Dec 2013, 17:35:51 192.**.**.*** root aaaaaa charlie
Sun 1 Dec 2013, 17:35:52 192.**.**.*** root password charlie
Sun 1 Dec 2013, 17:35:54 192.**.**.*** root 111111 charlie
Sun 1 Dec 2013, 17:35:55 192.**.**.*** root 123456 charlie

IP address 46.**.**.***

date ip username password honeypot
Wed 4 Dec 2013, 22:48:05 46.**.**.*** root root charlie
Wed 4 Dec 2013, 22:48:07 46.**.**.*** root password charlie
Wed 4 Dec 2013, 22:48:09 46.**.**.*** root 111111 charlie
Wed 4 Dec 2013, 22:48:11 46.**.**.*** root 123456 charlie
Wed 4 Dec 2013, 22:48:21 46.**.**.*** root root alpha
Wed 4 Dec 2013, 22:48:23 46.**.**.*** root password alpha
Wed 4 Dec 2013, 22:48:25 46.**.**.*** root 111111 alpha
Wed 4 Dec 2013, 22:48:27 46.**.**.*** root 123456 alpha
Thu 5 Dec 2013, 03:15:19 46.**.**.*** root root charlie
Thu 5 Dec 2013, 03:15:22 46.**.**.*** root password charlie
Thu 5 Dec 2013, 03:15:23 46.**.**.*** root 111111 charlie
Thu 5 Dec 2013, 03:15:26 46.**.**.*** root 123456 charlie
Thu 5 Dec 2013, 03:15:37 46.**.**.*** root root alpha
Thu 5 Dec 2013, 03:15:39 46.**.**.*** root password alpha
Thu 5 Dec 2013, 03:15:41 46.**.**.*** root 111111 alpha
Thu 5 Dec 2013, 03:15:43 46.**.**.*** root 123456 alpha

I've masked the IP addresses of the attacker(s) to keep identities anonymous and the honeypots are named alpha and charlie (there was previously a bravo honeypot but it hasn't produced much data so far).

These three log snapshots show that various attacking IP addresses (which could possibly be the same attacker using proxies to look like various different clients) are carrying out brute-force attacks with the same credentials on multiple IP addresses at the same time.

An example of this can be seen on Wednesday 4th December 2013: at 22:48:11 the IP address 46.**.**.*** attempts the username "root" and the password "123456" on honeypot charlie and then, 16 seconds later at 22:48:27, attempts the same username and password on honeypot alpha.
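Spotting these replays by eye gets tedious as the logs grow. The sketch below is one way it could be automated, assuming rows in exactly the layout shown in the logs above and passwords that contain no spaces. The function names `parse_row` and `replayed_credentials` are invented for this example and are not part of the honeypot code.

```python
from collections import defaultdict
from datetime import datetime


def parse_row(row):
    """Split one log row into (timestamp, ip, username, password, honeypot)."""
    date_part, rest = row.split(", ", 1)
    clock, ip, username, password, honeypot = rest.split()
    day_month_year = " ".join(date_part.split()[1:])  # drop the weekday name
    stamp = datetime.strptime(f"{day_month_year} {clock}", "%d %b %Y %H:%M:%S")
    return stamp, ip, username, password, honeypot


def replayed_credentials(rows):
    """Yield (ip, username, password, first_pot, second_pot, gap_seconds)
    whenever the same IP tries the same credentials on a different honeypot."""
    attempts = defaultdict(list)
    for row in rows:
        stamp, ip, username, password, honeypot = parse_row(row)
        attempts[(ip, username, password)].append((stamp, honeypot))
    for (ip, username, password), seen in attempts.items():
        seen.sort()  # order each credential's attempts by time
        for (t1, pot1), (t2, pot2) in zip(seen, seen[1:]):
            if pot1 != pot2:
                yield ip, username, password, pot1, pot2, (t2 - t1).total_seconds()


rows = [
    "Wed 4 Dec 2013, 22:48:11 46.**.**.*** root 123456 charlie",
    "Wed 4 Dec 2013, 22:48:27 46.**.**.*** root 123456 alpha",
]
for hit in replayed_credentials(rows):
    print(hit)
```

Fed the two rows from the example above, it reports the 16 second gap between the attempt on charlie and the attempt on alpha.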

Another interesting point to note from the logs is that one of the IP addresses which tried to upload a Trojan Horse via the emulated shell CLI (mentioned in the blog post Trojan Horse Uploaded) is still brute-forcing the honeypot for another username and password: there have been a total of 766 brute-force attempts from the IP address.

Also, since duplicating honeypot alpha, the client at that same IP address has executed exactly the same shell commands (trying to upload a Trojan Horse, as described in the blog post Trojan Horse Uploaded) on honeypot charlie. This also supports the theory that the attacker's automated upload of malicious software is attempted on every IP where a valid username and password is found.

The Pure Honeypot

The big accomplishment for the project this week was setting up a pure honeypot. A pure honeypot is basically a fully-fledged operating system with some sort of bug or tap installed (e.g. a keylogger), which means I can watch over the attacker's shoulder and see what they're doing on the system.

Setting up the pure honeypot wasn't easy and turned into one of those tasks that rolled over into a few days.

At first, I thought setting up a pure honeypot would be easy by using a tool called Sebek.

Sebek is described as: "...a data capture tool designed to capture attacker's activities on a honeypot, without the attacker (hopefully) knowing it. It has two components. The first is a client that runs on the honeypots, its purpose is to capture all of the attackers activities (keystrokes, file uploads, passwords) then covertly send the data to the server. The second component is the server which collects the data from the honeypots".

That sounds like the perfect tool for the job and exactly what I'm after. However, for this project I'm using Amazon EC2 powered servers. The main reasons for using EC2 are that:

  1. I'm taking advantage of their Free Usage Tier which keeps the cost of this project low (being a student)
  2. It's an easy way to set up a remote virtual machine (I need remotely accessible machines due to the nature of SSH honeypots).

This creates a slight complication: Sebek uses a system called Honeywall to log attackers' activities on the honeypot. Honeywall is: "...a bootable CD that installs onto a hard drive and comes with all the tools and functionality for you to implement data capture, control and analysis".

The problem is: I can't figure out a way to install Honeywall onto an Amazon EC2 instance because the ISO image for Honeywall requires local access to set up the system - which, as far as I can tell, EC2 instances don't provide.

So unfortunately the Sebek/Honeywall option came to somewhat of a dead end. However, some perseverance led me to a tool called syslog-ng. Developed by Balabit IT Security Ltd, syslog-ng is a system logging application that supports remote logging.

Following Hal Pomeranz's guide, Remote Logging with SSH and Syslog-NG, I was able to set up a system logger that sends logs remotely over a reverse SSH connection. A pretty neat setup.

So, the current setup: two remote systems (Amazon EC2 instances). One system is set up as a pure honeypot to capture an attacker's movements and the other is set up as a secure logging server to collect the data gathered from the pure honeypot.
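To make the data flow concrete: the honeypot writes its log messages to a local TCP port, and the SSH tunnel carries them to the logging server. The sketch below is not the syslog-ng configuration from the guide. It uses Python's standard `SysLogHandler` as a stand-in for the sending side, and it assumes the tunnel exposes the collector on `127.0.0.1:5140` (a port number picked purely for illustration). The function and logger names are hypothetical too.

```python
import logging
import logging.handlers
import socket


def build_forwarding_logger(host="127.0.0.1", port=5140):
    """Return a logger whose records are sent as syslog messages over TCP.

    host and port are assumed to be the local end of the SSH tunnel.
    """
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_AUTH,
        socktype=socket.SOCK_STREAM,  # SSH tunnels carry TCP, not UDP
    )
    handler.setFormatter(logging.Formatter("honeypot: %(message)s"))
    logger = logging.getLogger("pure-honeypot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Calling `build_forwarding_logger().info("...")` only works while something is listening on the chosen port, because the TCP handler connects as soon as it is created and raises a connection error otherwise.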

Slight problem: syslog-ng just logs standard system messages (such as when users log in, unauthorised user attempts, sudo attempts etc). But what I want is to see all commands entered into the shell by an attacker. Trying to find a solution to this problem took a while.

I thought that perhaps a simple keylogger would do the trick, but even this proved to be more difficult to implement than I anticipated. So, while trying to find a good keylogging implementation, I discovered some useful resources:

  • John Simpson describes a handy solution for logging client side SSH sessions using a pre-installed tool which is on most Linux systems called tee. John's article Recording SSH sessions also explains using a private key implementation to log remote commands. However, this implementation wasn't quite what I was after since I needed to allow attackers to login with just a username and password.
  • A useful tool called logkeys provides a keylogger for most Linux machines. Unfortunately it only works when a keyboard is physically plugged into the machine, so it doesn't work on SSH sessions.
  • Stack Overflow question How to capture all the commands typed in Unix/Linux by any user? provided a few clues along with a link to the article Logging every shell command which provided little reassurance: "logging every shell command that a user makes turns out to be more difficult that initially imagined". Although the article did mention a tool called snoopy.
  • Finally, the Server Fault question How can I log users' bash commands? also pointed in the direction of snoopy so I decided to investigate this tool.

I installed the command-logging application Snoopy, which is "designed to aid a sysadmin by providing a log of commands executed. Snoopy is completely transparent to the user and applications. It is linked into programs to provide a wrapper around calls to execve(). Logging is done via syslog."

Although snoopy isn't strictly meant to be used to monitor attackers on a honeypot, it does the job pretty well.

So, at the end of this journey, I finally have a pure honeypot up and running (powered by Ubuntu Linux) which logs all commands entered by hackers and sends these logs to a remote logging server.
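The idea of "a wrapper around calls to execve()" can be pictured as: write the command line to a log, then run the command as normal. The sketch below is only an analogy in Python, not how Snoopy itself is built (Snoopy is a shared library that gets linked into other programs). The logger name and the `logged_run` function are made up for this example.

```python
import logging
import shlex
import subprocess
import sys

log = logging.getLogger("command-audit")  # hypothetical logger name


def logged_run(argv):
    """Record the command line first, then execute the command unchanged."""
    log.info("exec: %s", shlex.join(argv))
    return subprocess.run(argv, capture_output=True, text=True)
```

For example, `logged_run([sys.executable, "-c", "print('hi')"])` records the full command line before the child process starts, and the caller still gets the command's normal output back.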

The only potential future problem I'm aware of is that, if someone knows what they're doing, it's quite easy to detect snoopy running on a machine. As Ryan Capman's post Bypassing snoopy logging explains, an attacker only has to enter the command:

ldd `which ls`

To see that snoopy is running in the background:

[ryan@buggy ~]# ldd `which ls`
        /usr/local/lib/ (0x00002af2d1210000)
         => /lib64/ (0x00002af2d1412000)
         => /lib64/ (0x00002af2d161b000)
         => /lib64/ (0x00002af2d1822000)
         => /lib64/ (0x00002af2d1a3a000)
         => /lib64/ (0x00002af2d1d91000)
         => /lib64/ (0x00002af2d1f96000)
        /lib64/ (0x00002af2d0ff3000)
         => /lib64/ (0x00002af2d21b1000)
         => /lib64/ (0x00002af2d23b5000)

At this stage I'm not going to worry about snoopy being detected. I just want to see how effective this setup is at monitoring an attacker on the pure honeypot. If attackers start to detect snoopy then I'll look into other keylogging implementations.
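The same check is easy to script, which is worth keeping in mind when judging how stealthy the setup is. A minimal sketch, assuming the snoopy library has "snoopy" somewhere in its file name (the `preload_hits` function and the sample path in the comment are illustrative assumptions, not taken from the output above):

```python
def preload_hits(ldd_output, needle="snoopy"):
    """Return the lines of ldd output that mention the given library name."""
    return [
        line.strip()
        for line in ldd_output.splitlines()
        if needle in line.lower()
    ]


# Hypothetical ldd output; the snoopy path here is an assumed example.
sample = """\
        /usr/local/lib/snoopy.so (0x00002af2d1210000)
        libc.so.6 => /lib64/libc.so.6 (0x00002af2d1a3a000)
"""
print(preload_hits(sample))
```

An empty list would mean nothing matching the name was linked into the binary being inspected.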

Hacking: The Art of Exploitation

This week I finished reading Jon Erickson's book Hacking: The Art of Exploitation. The book has provided me with a fascinating insight into computer security vulnerabilities. In particular, Jon explains a lot of the vulnerabilities in C and, since these relate strongly to my project, I'll be using some of what I've learnt from the book to toughen up my code.

I haven't published any code for this project to GitHub for a long time. This is partly because the code's currently looking a little messy, but mainly because I've become aware of some vulnerabilities in my code implementation. So the plan is that once I've toughened up the code, I'll upload it to GitHub.

I'm aiming to write a small book review on Hacking: The Art of Exploitation and how it relates to this project soon. Watch this space for more.

Since finishing Jon Erickson's book, I've started reading Clifford Stoll's The Cuckoo's Egg: Tracking a Spy Through the Maze of Computer Espionage.

The book provides one of the first well documented investigations into computer hacking, in which Stoll uses a honeypot to lead him to the hacker Markus Hess. The book is set in the late 80s and was published in 1989, so although it's outdated in terms of modern computer security, it does provide an interesting read on early digital forensics, hacking and honeypots.

Image credit: "ADM 3A" by Daniel Sancho,

About the author

Simon Bell is an award-winning Cyber Security Researcher, Software Engineer, and Web Security Specialist. Simon's research papers have been published internationally, and his findings have featured in Ars Technica, The Hacker News, PC World, among others. He founded Secure Honey, an open-source honeypot and threat intelligence project, in 2013. He has a PhD in Cyber Security from Royal Holloway's world-leading Information Security Group.

Follow Simon on Twitter: @SimonByte