Thursday, December 8, 2016

Infinite Login to a Lightspeed-controlled network

If you've never heard of them before, Lightspeed Systems is a company that makes content-filtering appliances for schools and businesses. I happen to frequent a WiFi network that is heavily filtered by one of these devices, and as you can imagine, it is quite bothersome to come across a "this site is blocked" page in the course of doing my work. I understand the necessity of blocking certain pages, but not all blocking rules make sense:


The other big problem that I have on a regular basis with the system is that I have to log in to the internet every day in order to access anything. This becomes a huge problem with a headless device such as a Raspberry Pi. On my laptop, with a GUI web browser and a saved password for the login, it's not a huge issue, but with a Pi-based project that needs to connect to the internet continuously, it becomes a problem.

Last year, with an older version of the content filter, I was able to build a very simple automatic login script with Python and Requests that simply submitted the login form with my credentials and worked very well. This year, the website for the login was changed to dynamically load the login form with a highly-obfuscated and very large Javascript file. While attempting to write a new script that automated the new login system, I came across many of the different pieces of the login page.

The login page loads three Javascript files before it can dynamically render the page. Interestingly, it also loads a font from fonts.gstatic.com, which must be whitelisted to allow access through the filter even before a user is authenticated. What interested me, however, were the Javascript files. The first one, access-default-<long hex string>.js, is over 2.4MB and contains all the internationalization data, as well as the functions that load the page dynamically. This file is minified and obfuscated to make it very difficult to read, and its size causes some pretty bad lag in several text editors that I tried.

The second file is called data.js, which holds a base64-encoded JSON array that contains info about the images and Google authentication data, neither of which would help me log in. The third file, redir.js, was the one that caught my interest, containing this:

$(function(){renderAccessPage('another long base64 string')});

The function renderAccessPage() was somewhere inside the obfuscated JS file, and the base64 string was another encoded JSON object, with more useful data inside it:

{"id":"d661beba-bc95-11e6-85be-00e0ed5686c8","ip":"10.23.25.159","host_id":"00000000-0000-0000-0000-000000000000","ident_id":14742636577554739,"filter_id":null,"tier_id":1,"reason":32769,"url":"portswigger.net/burp/","target":"portswigger.net","policy_id":null,"policy_rule_set_id":6,"cat_id":28,"cat_name":"security.proxy","is_auth":true,"can_auth":true,"auth_expire":600,"user":"redacted","host":"","host_dn":"","host_ou":"","can_override":false,"auth_override":false,"override_list_id":null,"override_time":0,"custom_access_page_id":null}
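The decoding step can be sketched in a few lines of Python 3. This is only an illustration: it assumes the argument is standard base64 wrapped around UTF-8 JSON, and decode_redir, sample_payload, and sample_js are names made up for the example (the sample payload is a shortened stand-in, not the real file).

```python
import base64
import json
import re


def decode_redir(js_source):
    """Pull the quoted argument out of renderAccessPage('...') and decode it."""
    match = re.search(r"renderAccessPage\('([^']+)'\)", js_source)
    if match is None:
        raise ValueError("no renderAccessPage() call found")
    raw = base64.b64decode(match.group(1))
    return json.loads(raw.decode("utf-8"))


# Shortened stand-in for redir.js, built here purely for illustration.
sample_payload = base64.b64encode(
    json.dumps({"auth_expire": 600, "user": "redacted"}).encode("utf-8")
).decode("ascii")
sample_js = "$(function(){renderAccessPage('%s')});" % sample_payload

print(decode_redir(sample_js))
```

Saving redir.js from the browser and passing its text to decode_redir() should yield a dictionary like the one shown above, provided the encoding assumption holds.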

Clearly, the renderAccessPage() function takes this data and integrates it with a hidden field called redir in the form that is submitted. All my attempts to submit without this field resulted in an error, but without deciphering the Javascript, I had no idea how to generate the right data.

However, you may have already noticed the field called auth_expire, which is set to 600. I believe this is the login lease in minutes: 600 minutes is ten hours, and in my experience the lease lasts almost exactly ten hours. Because this value is used by the function to generate the redir info, I thought it might play a part in the server deciding how long to grant a lease. Using Burp Suite and Python to edit redir.js, I was able to change the time of my lease to 60000 minutes, so I shouldn't have to log in again for a very long time.

It has so far worked, as I haven't been prompted for a login page after several days...

In conclusion, I find it strange that certain parts of the login process are left vulnerable like this. If the developers wanted the lease to be set to ten hours for everyone, it would probably be easier to do on the server rather than allow editable data like this to pass through a client device. I don't think that it's a huge vulnerability, but it may allow users access to the internet well after their account has expired, as long as the login persists in the system.

On another note, I believe this opens up an avenue for me to make my Raspberry Pi project work. Because the authentication is linked to a MAC address, I simply need to spoof my MAC address to that of the Pi, log in with an arbitrarily long lease period, and then change back. Then I won't have to worry about it any more...



Monday, February 15, 2016

Exploring the OS of a Seagate GoFlex Home

I recently happened upon a Seagate GoFlex Home, which is a consumer-grade desktop NAS with a single removable drive. I really had no use for it in its original state, since it required proprietary software to manage, so I disassembled the disk module and stuck the disk into one of my servers. The base, which you can pick up for around $35, sat unused until I decided to take off the case and investigate.

Hardware

The inside is fairly simple. The CPU is a Marvell 88F6-281, an ARMv5 running at 1.5GHz. It has a Nanya NT5TU64M16HG-AC 1GB DDR2 RAM chip (I am unsure of the capacity because other sources claim it's only 256 MB, and the official Nanya website is down at the time of writing). For storage, it has a Toshiba TC58NVG1S3ETA00 256MB NAND flash chip. There is also a Marvell 88E1116R gigabit Ethernet controller, as well as various voltage regulators, Ethernet and USB ports, and a SATA port on the bottom of the board. The power supply is 12V at 2A, likely to support the overhead of a mechanical hard drive.





(I added the heatsink after I disassembled it.)

Software

The process of rooting the device is quite simple and documented in several places. First you need to connect to the webserver on the device to create an account. Simply find the IP of the device, navigate there in a web browser, and follow the prompts. When finished, you should have a username and password. Now look on the bottom of the case and find the product key; it looks something like PK: XXXX-XXXX-XXXX-XXXX.

Now, get an SSH client and connect to username_hipserv2_seagateplug_product-key@ip.address, where username is the username you previously created (in lowercase), product-key is the key in the form of XXXX-XXXX-XXXX-XXXX (all uppercase), and ip.address is the IP of the device. You should be able to log in with the password you created earlier. If all went well, a bash prompt should pop up.
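The login string is easy to get wrong by hand, so here is a small Python 3 sketch that assembles it from the three pieces described above. The function name goflex_ssh_target and the sample values are made up for illustration.

```python
def goflex_ssh_target(username, product_key, ip_address):
    """Build the user@host string: lowercase username, uppercase product key."""
    login = "{}_hipserv2_seagateplug_{}".format(username.lower(), product_key.upper())
    return "{}@{}".format(login, ip_address)


# Placeholder values; substitute your own account, key, and address.
print(goflex_ssh_target("Alice", "abcd-efgh-ijkl-mnop", "192.168.1.50"))
```

The printed value can be passed straight to an SSH client as the destination.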

Then, in order to gain root access, all you need to do is sudo -E -s

I went exploring into some of the files:

-bash-3.2$ cat /proc/cpuinfo
Processor : ARM926EJ-S rev 1 (v5l)
BogoMIPS : 1192.75
Features : swp half thumb fastmult edsp 
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant : 0x2
CPU part : 0x131
CPU revision : 1
Cache type : write-back
Cache clean : cp15 c7 ops
Cache lockdown : format C
Cache format : Harvard
I size : 16384
I assoc : 4
I line length : 32
I sets : 128
D size : 16384
D assoc : 4
D line length : 32
D sets : 128

Hardware : Feroceon-KW
Revision : 0000

Serial : 0000000000000000

-bash-3.2$ cat /proc/cmdline 
console=ttyS0,115200 ubi.mtd=2,2048 root=ubi0:rootfs rootfstype=ubifs init=/linuxrc

-bash-3.2$ cat /proc/version 
Linux version 2.6.22.18 (ramang@es5x86.axentra.com) (gcc version 4.3.2 (sdk3.2rc1-ct-ng-1.4.1) ) #16 Thu Jun 17 01:37:53 EDT 2010

bash-3.2# uname -a
Linux axentraserver.jimmy.seagateshare.com 2.6.22.18 #16 Thu Jun 17 01:37:53 EDT 2010 armv5tejl armv5tejl armv5tejl GNU/Linux

Linux seems a bit out of date, don't you think? They didn't seem to try to slim it down much:

-bash-3.2$ file /bin/cat
/bin/cat: ELF 32-bit LSB executable, ARM, version 1 (SYSV), for GNU/Linux 2.6.16, dynamically linked (uses shared libs), for GNU/Linux 2.6.16, not stripped

However, it does fit, barely:

-bash-3.2$ df -h
Filesystem            Size  Used Avail Use% Mounted on
rootfs                212M  160M   52M  76% /

bash-3.2# ls /usr
X11R6  bin  etc  games include  kerberos  lib libexec  local sbin  share  src  tmp

What? X11? Games?? There is nothing in those folders, but still...

I'm working on a way to mount the filesystem and dump the whole thing... Might be shared soon.

It appears to be running several services, as indicated by an Nmap scan:

PORT      STATE    SERVICE     VERSION
21/tcp    open     ftp         vsftpd 2.0.7
22/tcp    open     ssh         Seagate GoFlex NAS device sshd 4.3 (protocol 2.0)
80/tcp    open     http        Apache httpd 2.2.3 ((Red Hat))
139/tcp   open     netbios-ssn Samba smbd 3.X (workgroup: SEAGATEGROUP)
443/tcp   open     ssl/http    Apache httpd 2.2.3 ((Red Hat))
445/tcp   open     netbios-ssn Samba smbd 3.X (workgroup: SEAGATEGROUP)
515/tcp   filtered printer
548/tcp   open     afp         Netatalk 2.2.0 (name: GoFlexHome; protocol 3.3)
631/tcp   open     ipp         CUPS 1.2
6689/tcp  open     daap        mt-daapd DAAP svn-1696
8200/tcp  open     upnp        MiniDLNA 1.0 (DLNADOC 1.50; UPnP 1.0)
49152/tcp open     upnp        Portable SDK for UPnP devices 1.4.6 (Linux 2.6.22.18; UPnP 1.0)

I'm confused as to why there are printer daemons, unless it shares a printer plugged into the USB port. As for the various media servers, I assume it indexes your media and serves it up (UPnP worked, but I don't have a disk so I wasn't able to test its full capabilities). As for FTP and Samba, they're probably just fileservers. It's a pretty neat system that most people probably wouldn't ever use to its full potential...

I'm still confused about most of the filesystem though. There appears to be a huge amount of random empty folders for apps like OpenVPN, Transmission Bittorrent, Gnome, and X11. All the files have been removed, but the folders have been left, cluttering the drive.

Controlling the LEDs

If you're interested in running custom code on this, LED indicators are useful. There are 3 LEDs on the front of the board. In order to set these, call the set-led-status executable like so:

set-led-status <led> <mode>

Where <led> is green_led, orange_led, or hdd_led
and <mode> is off, on, or blink
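If your custom code happens to be Python, a thin wrapper can validate the arguments before calling the executable. This is only a sketch: it assumes a Python 3 interpreter is available on the device (which I haven't confirmed) and that set-led-status is on the PATH. The names led_command and set_led are made up.

```python
import subprocess

LEDS = ("green_led", "orange_led", "hdd_led")
MODES = ("off", "on", "blink")


def led_command(led, mode):
    """Return the argument list for set-led-status after checking both values."""
    if led not in LEDS:
        raise ValueError("unknown LED: %r" % led)
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % mode)
    return ["set-led-status", led, mode]


def set_led(led, mode):
    """Run the command; only meaningful on the device itself."""
    subprocess.check_call(led_command(led, mode))
```

For example, set_led("orange_led", "blink") could signal that a long-running job is in progress.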

To be continued...

Sunday, January 3, 2016

Easy passive host discovery, featuring Scapy

If you've ever Wiresharked a large, high traffic LAN, you may have been overwhelmed by the volume of data. Packet captures can easily exceed 500 packets per second, and although there is a lot of useful information, you as a human can't possibly process it fast enough.

Enter the world of Python and Scapy. If you haven't ever heard of Scapy, you're missing out. Among other things, it can send and receive custom packets with whatever layers you desire. It's also capable of dissecting packets, including pcap streaming and live sniffing. The latter is what I'll be talking about today. Most packet sniffing tools, like Wireshark and tcpdump, can dissect packets and do lots of powerful analysis on them. The main problem is that both these tools, and others like them, can only do per-packet analysis. What's really useful is when you can do per-host inspection. That's where Python comes in.

If you understand all this stuff, take a quick look at the GitHub for this project. If not, read on.

Packet Dissection

To understand the following scripts, you'll have to know a bit about what is going on inside a network packet. Each packet is transmitted as simply a string of binary, but depending on the type of information, this binary packet can have many different layers of information.

The lowest layer (usually) is called the Ethernet layer. It usually contains information like the source MAC address, destination MAC address, and a few other parameters, along with an encapsulated higher-level packet. A MAC address is a unique identifier of an interface card. It's not necessarily unique to each computer though, since your Wi-Fi card, Ethernet card, Bluetooth module, etc. all have different MAC addresses. In the following script, I identify each computer by its MAC address, assuming it only has one interface card connected to the network at one time. A useful feature of knowing a MAC address is that you can take a good guess as to the manufacturer of the network card or device, because manufacturers prefix their MAC addresses with an identifying code.
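To make the manufacturer prefix concrete, here is a small Python 3 sketch that pulls the first three octets out of a MAC address and looks them up. The VENDORS table is a tiny illustrative stand-in (its two entries are taken from the sample listings later in this post); a real lookup would use a full vendor database. The names oui_prefix and vendor_for are made up.

```python
def oui_prefix(mac):
    """Return the first three octets of a MAC address as 'aa:bb:cc'."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError("not a 48-bit MAC address: %r" % mac)
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


# Illustrative table only; not a complete vendor registry.
VENDORS = {"a8:bb:cf": "Apple", "68:64:4b": "Apple"}


def vendor_for(mac):
    """Guess the manufacturer from the prefix, or 'unknown' if it isn't listed."""
    return VENDORS.get(oui_prefix(mac), "unknown")


print(vendor_for("A8-BB-CF-07-92-50"))
```

Normalizing the separators first means the same lookup works for addresses written with colons, dashes, or no separators at all.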

Often, the next layer up is Internet Protocol, either IPv4 or IPv6. This layer is also used for routing of raw packets, but instead of identifying everything by its MAC address, it identifies by an IP address. This is the primary mode of communication for mainstream internet protocols, like HTTP.

Many different messages can be contained within these two layers of packets. For example, DHCP requests are sent every time a host connects to a network. These help give a host its IP address dynamically when it joins. For our purposes, it associates a hostname with an IP address. This can be extremely useful, since most people by default will leave their name in the hostname of their computer (e.g. JoeSmith-Macbook-Pro).
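As a rough illustration of the hostname association, the Python 3 sketch below picks the hostname out of a DHCP option list. It assumes the layout that, as far as I know, Scapy uses for DHCP options: a list mixing (name, value) tuples with bare strings such as 'end'. The function works on a plain list so it can be tried without Scapy installed; hostname_from_options is a made-up name.

```python
def hostname_from_options(options):
    """Return the 'hostname' option as text, or None if the list has none."""
    for option in options:
        # Skip bare markers such as 'end' or 'pad'.
        if isinstance(option, tuple) and len(option) >= 2 and option[0] == "hostname":
            value = option[1]
            if isinstance(value, bytes):
                value = value.decode("utf-8", errors="replace")
            return value
    return None


# Hand-written option list shaped like a dissected DHCP request.
sample_options = [("message-type", 3), ("hostname", b"JoeSmith-Macbook-Pro"), "end"]
print(hostname_from_options(sample_options))
```

Pairing the returned hostname with the source addresses of the same packet gives the association described above.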

Passive Versus Active

The benefits of passive scanning over active are immense. Although active scanning can reveal much more information, it is much noisier on the network. Mass portscans are the biggest source of noise, since most IDSes or sysadmins will become suspicious of thousands of packets being sent at random to different hosts. Try Wiresharking while you perform an Nmap scan:


If it's done right, you can passively scan without sending a single packet onto the network. Then, once you have enough info, targeting single hosts and specific ports will likely remain unnoticed.

On To The Script

I have created a Python program to attempt to simplify per-host analysis. It uses Scapy to sniff and dissect packets, and then uses various plugins to process the data. The idea is that each plugin has a specific list of required layers, and if a packet has all those layers, it is sent to the plugin for processing.
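The dispatch idea can be sketched without Scapy by modelling a packet as a dictionary with a list of layer names. This is not the project's actual code, just a minimal Python 3 illustration of "send the packet to every plugin whose required layers are all present"; the Plugin class and dispatch function are made up for the example. In the real script the check would presumably be done against the dissected Scapy packet instead.

```python
class Plugin:
    """A processor that only accepts packets containing all of its layers."""

    def __init__(self, name, required_layers):
        self.name = name
        self.required_layers = frozenset(required_layers)
        self.seen = []

    def wants(self, packet_layers):
        # True when every required layer appears in the packet.
        return self.required_layers <= set(packet_layers)

    def process(self, packet):
        self.seen.append(packet)


def dispatch(packet, plugins):
    """Hand the packet to each matching plugin; return the names that took it."""
    handled = []
    for plugin in plugins:
        if plugin.wants(packet["layers"]):
            plugin.process(packet)
            handled.append(plugin.name)
    return handled


plugins = [Plugin("ip", ["Ether", "IP"]), Plugin("dhcp", ["Ether", "IP", "UDP", "DHCP"])]
print(dispatch({"layers": ["Ether", "IP", "TCP"]}, plugins))
```

A plain TCP packet reaches only the "ip" plugin here, while a DHCP packet would reach both.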

For example, I have a plugin called "ip" which associates IP addresses with MAC addresses in a table of the database. I also have a DHCP plugin that associates IP addresses with hostnames. There are a few more, and custom add-ons can be made easily.

Running it

Running it is pretty simple. You need to do some prep work the first time, like installing a 2.x version of p0f. You also need to build the database of MAC address vendors by going to analyzers/data and running gendb.py. Finally, you need to make the data directory inside the main dir. Once that is done, you can run scanner.py <interface> to begin sniffing on the selected interface. You can also specify a pcap file, though I haven't tested that yet.

The program creates no output until you end it with ctrl-c or similar. It will then spit out a line saying how many packets it captured and then exit. Depending on how long it was running, it will have amassed varying amounts of data. I recommend leaving it for at least an hour or two, and ideally more than 24 hours so that it can grab all the DHCP requests.

Viewing the data

There is also a viewer script, viewer.py, which displays all the database information in an easy-to-browse format. The database works on association, so there is a table associating each MAC address to an IP address, a table associating IP to hostname, etc. Along with simply pulling this kind of thing directly from dissected packets, I wrote a plugin to fingerprint operating systems using Scapy's p0f plugin (note that Scapy's p0f support is only for the p0f 2.x databases, so you can't simply install the latest p0f). It doesn't always work, but sometimes it is useful.

Upon running viewer.py, you get a nice listing of all the hosts that were identified. Unfortunately there is often a lot of traffic that falls below the IP layer, so there are a bunch of listings that only show a MAC address. Here's a sample of some more interesting listings:

mac: a8:bb:cf:07:92:50:
manuf: Apple
ip: 10.0.1.19
hostname: None
os: Linux:2.4.2x
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
mac: 68:64:4b:55:93:8e:
manuf: Apple
ip: 10.0.1.10
hostname: LivingRmAppleTV
os: None
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
mac: 00:16:cb:c1:d9:7f:
manuf: Apple
ip: 10.0.1.70
hostname: None
os: None
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
mac: f8:1e:df:df:8c:41:
manuf: Apple
ip: 10.0.1.16
hostname: <redacted>-MBP
os: None
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
mac: ec:35:86:4e:50:d2:
manuf: Apple
ip: 10.0.1.4
hostname: <redacted>-iMac
os: None

Soon, I hope to sort hosts by the amount of information known about them to make looking through this data a bit easier.

As you can see, the first entry's MAC address identified as Apple, but the OS fingerprint identified it as Linux. I don't usually trust these OS fingerprints because they've never really been accurate in my experience. The hostname is really what I deem as important. As you can see, the default hostname was created for each one, so I can see that there are some Mac computers and an Apple TV. This is somewhat of a vulnerability when it comes to default installations, as the default hostnames for most personal computers are usually based on the computer model and name of the owner. If you know someone's name, you now know their IP.

Tuesday, December 22, 2015

Big data is sometimes a big pain...

EDIT: scroll to bottom for a full script.

To start off, I want to explain the background of this project. I run an ADS-B receiver for FlightRadar24.com. If you don't know what ADS-B is, you should definitely check it out. It is the protocol that airplane radar beacons use to transmit location and other data. Using an RTL-SDR module, available for around $30, you too can see local aircraft using a program such as dump1090.

Anyway, the equipment I host is a slightly more expensive, professional-grade receiver, with much better decoding capabilities. Combined with the supplied antenna, it is much more accurate than an RTL-SDR receiver. There is a little box that sits on my desk (call it "the module"?) containing the receiver circuitry and a small ARM-based Linux system that sends data back to the main website for everyone to view. After installing it and confirming it was working, I pretty much ignored it.

That is, until recently, when I was debugging some port scanning code I'm writing. I was doing service scans on several hosts on my LAN, and decided to scan all 65535 ports of the module. Something then caught my eye:

$ python main.py 10.0.1.75 1-65535
Scanning 1 hosts (65535 ports per host)
Hosts up: 1
10.0.1.75 22 : SSH-2.0-OpenSSH_6.0

10.0.1.75 10685 : 
10.0.1.75 30003 : AIR,,333,1,AB14D9,101,2015/12/22,23:23:43.125,2015/12/22,23:23:43.125

10.0.1.75 30334 : 2RDa{]?/???

Ports 30003 and 30334 didn't show up in any port databases. Netcatting 30334 resulted in a bunch of unprintables (sample hexdump here), but 30003 ended up more promising:

MSG,3,333,8,780214,108,2015/12/22,23:33:14.4294967089,2015/12/22,23:33:14.4294967089,,35000,,,47.94058,-116.75212,,,0,0,0,0
MSG,4,333,8,780214,108,2015/12/22,23:33:14.4294967119,2015/12/22,23:33:14.4294967119,,,550.0,94.6,,,128,,,,,
MSG,3,333,2,A49441,102,2015/12/22,23:33:14.4294967122,2015/12/22,23:33:14.4294967122,,34975,,,47.56073,-113.82550,,,0,0,0,0
MSG,4,333,2,A49441,102,2015/12/22,23:33:14.4294967122,2015/12/22,23:33:14.4294967122,,,523.0,97.5,,,0,,,,,
MSG,4,333,3,A968DF,103,2015/12/22,23:33:14.4294967163,2015/12/22,23:33:14.4294967163,,,410.0,259.9,,,-64,,,,,
MSG,3,333,7,A94B48,107,2015/12/22,23:33:14.001,2015/12/22,23:33:14.001,,32975,,,46.94609,-115.34965,,,0,0,0,0
MSG,3,333,8,780214,108,2015/12/22,23:33:14.222,2015/12/22,23:33:14.222,,35000,,,47.94049,-116.75061,,,0,0,0,0
MSG,3,333,2,A49441,102,2015/12/22,23:33:14.230,2015/12/22,23:33:14.230,,34975,,,47.56056,-113.82371,,,0,0,0,0


It looks to me like a real-time output of the ADS-B decoder, in CSV format. The only problem is that I have no idea what the columns mean. After capturing several thousand lines of output, I tried to analyze the columns by finding the most common values in each column, using a long shell command:

$ for i in {1..30}; do echo "=============== column $i ==============="; cat flightradar_dump.csv | cut -d ',' -f $i  | sort | uniq -c | sort -rn | head -n20; done

=============== column 1 ===============
6667 MSG
   5 AIR
   4 STA
   3 ID
=============== column 2 ===============
2607 8
1662 4
1650 3
 541 5
 203 1
  12 
   4 6
=============== column 3 ===============
6679 333
=============== column 4 ===============
1595 18
1485 8
1284 20
1127 17
 808 21
 220 6
 125 7
  22 22
  13 19
=============== column 5 ===============
1595 AD4C2A
1485 780214
1284 A48272
1127 AE07FF
 808 AB013C
 220 A81077
 125 A94B48
  22 A53317
  13 AA1F97
=============== column 6 ===============
1595 118
1485 108
1284 120
1127 117
 808 121
 220 106
 125 107
  22 122
  13 119
=============== column 7 ===============
6679 2015/12/22
=============== column 8 ===============
   4 23:45:55.349
   3 23:46:14.257
   3 23:46:10.4294966829
   3 23:46:10.4294966821
   3 23:46:05.4294967104
   3 23:46:05.4294967081
   3 23:46:05.4294967073
   3 23:46:00.4294967272
   3 23:46:00.090
   3 23:45:55.373
   3 23:45:41.221
   3 23:45:41.214
   3 23:45:37.4294966801
   3 23:45:37.4294966793
   3 23:45:32.4294967084
   3 23:45:32.4294967076
   3 23:45:32.252
   3 23:45:28.4294966934
   3 23:45:27.070
   3 23:45:18.4294966925
=============== column 9 ===============
6679 2015/12/22
=============== column 10 ===============
   4 23:45:55.349
   3 23:46:14.257
   3 23:46:10.4294966829
   3 23:46:10.4294966821
   3 23:46:05.4294967104
   3 23:46:05.4294967081
   3 23:46:05.4294967073
   3 23:46:00.4294967272
   3 23:46:00.090
   3 23:45:55.373
   3 23:45:41.221
   3 23:45:41.214
   3 23:45:37.4294966801
   3 23:45:37.4294966793
   3 23:45:32.4294967084
   3 23:45:32.4294967076
   3 23:45:32.252
   3 23:45:28.4294966934
   3 23:45:27.070
   3 23:45:18.4294966925
=============== column 11 ===============
6469 
  48 CPA846
  48 AAL1519
  46 HIRE71
  35 N39WP
  22 SCX285
   4 UAL670
   2 SL
   2 RM
   1 SCX285
   1 N39WP
   1 AAL1519
=============== column 12 ===============
4484 
 630 38000
 452 35000
 308 43000
 280 35975
 215 36000
  98 37975
  73 43025
  57 35025
  23 32975
  21 42975
  13 33000
  11 24000
   5 37025
   5 37000
   4 3675
=============== column 13 ===============
5017 
 163 394.0
 124 393.0
 111 541.0
 103 400.0
 102 543.0
  94 410.0
  92 411.0
  80 396.0
  69 545.0
  67 542.0
  58 395.0
  57 399.0
  55 547.0
  49 407.0
  46 546.0
  42 540.0
  37 406.0
  37 401.0
  35 392.0
=============== column 14 ===============
5017 
 175 273.2
 132 270.3
 125 270.1
 111 272.8
 110 96.1
  96 270.4
  92 272.5
  87 273.3
  84 96.4
  82 96.5
  75 96.6
  51 273.1
  51 272.6
  46 96.2
  42 273.5
  40 272.4
  39 96.3
  27 96.7
  25 272.9
=============== column 15 ===============
5029 
  21 47.47307
  15 47.47311
  15 47.47302
  15 47.47298
  14 47.47293
  14 47.47284
  12 47.47275
  11 47.47243
  10 47.47289
  10 47.47285
  10 47.47270
  10 47.47266
   9 47.47279
   9 47.47234
   9 47.47220
   8 47.47304
   8 47.47183
   7 47.47309
   7 47.47261
=============== column 16 ===============
5029 
   2 -114.02531
   2 -114.01941
   2 -113.90083
   2 -113.83525
   2 -113.78294
   2 -113.61738
   2 -113.53587
   2 -113.51678
   2 -113.48582
   2 -113.46893
   2 -113.46729
   2 -113.46295
   2 -113.45102
   2 -113.42651
   2 -113.41482
   2 -113.35567
   2 -113.34924
   2 -113.29115
   2 -113.28186
=============== column 17 ===============
5017 
 880 0
 388 64
 308 -64
  74 -128
  12 128
=============== column 18 ===============
6675 
   2 1756
   2 1366
=============== column 19 ===============
4484 
2195 0
=============== column 20 ===============
5025 
1654 0
=============== column 21 ===============
4484 
2195 0
=============== column 22 ===============
4582 0
1865 
 220 1
  12 

The first column in the output is the number of occurrences of the value in the second column.
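The same tally can be done in Python 3 with collections.Counter, which may be easier to extend than the shell loop. This is a sketch, not what I actually ran; column_counts is a made-up name, and the two sample rows are copied from the capture above. One difference from the cut-based version: rows with fewer fields simply contribute nothing to the missing columns instead of counting as empty values.

```python
import csv
from collections import Counter


def column_counts(lines, top=20):
    """Return, for each column, the most common values as (value, count) pairs."""
    counters = []
    for row in csv.reader(lines):
        # Grow the list of counters to match the widest row seen so far.
        while len(counters) < len(row):
            counters.append(Counter())
        for index, value in enumerate(row):
            counters[index][value] += 1
    return [counter.most_common(top) for counter in counters]


sample = [
    "MSG,3,333,7,A94B48,107,2015/12/22,23:33:14.001,2015/12/22,23:33:14.001,,32975,,,46.94609,-115.34965,,,0,0,0,0",
    "MSG,3,333,8,780214,108,2015/12/22,23:33:14.222,2015/12/22,23:33:14.222,,35000,,,47.94049,-116.75061,,,0,0,0,0",
]
for number, common in enumerate(column_counts(sample), start=1):
    print("column", number, common)
```

To run it over a whole capture, pass an open file object (for example open('flightradar_dump.csv')) in place of the sample list.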

Now allow me to guess using this example message, based on what I know:

MSG,3,333,7,A94B48,107,2015/12/22,23:33:14.001,2015/12/22,23:33:14.001,,32975,,,46.94609,-115.34965,,,0,0,0,0

Column 1 looks like the message type. MSG is the most common, but AIR, STA, and ID were also seen occasionally. No idea what the difference is.

Column 5 is almost certainly the airplane identification number.

Column 6 might be the message length.

Columns 7 and 9 are both a date field. I don't understand why there are two dates, as they are exactly the same.

Columns 8 and 10 are a time field. Yet again they are the same, so the presence of two is strange. None seemed to be different at all:

>>> for i in open('flightradar_dump.csv').readlines():
...     a = i.strip().split(',')
...     if a[7] != a[9]:
...             print 'different:',a[7],a[9]
(no output)

Column 11 appears to be the flight number, including the airline identifier.

Column 12, I would guess to be the plane's altitude. Whether this is meters or feet is unknown.

I'm relatively sure column 13 is speed in mph.

Column 14 could be the plane's direction in degrees.

Columns 15 and 16 are almost certainly latitude and longitude of the plane.

As for the rest of the columns, I have no idea. The internet was no help in this endeavour, though I might be able to find something out from FlightRadar support. I doubt they'd add this feature if they didn't intend people to use it.
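Putting the guesses above into code makes them easier to test against more data. The Python 3 sketch below maps the column numbers I guessed to names. The mapping is only my interpretation, not a documented format, and the FIELDS table and parse_message function are made up for this example.

```python
# Column numbers (1-based) follow the guesses in the text above.
FIELDS = {
    "type": 1, "aircraft_id": 5, "date": 7, "time": 8, "flight": 11,
    "altitude": 12, "speed": 13, "heading": 14, "latitude": 15, "longitude": 16,
}
NUMERIC = {"altitude", "speed", "heading", "latitude", "longitude"}


def parse_message(line):
    """Split one CSV line into a dict; empty fields become None."""
    columns = line.strip().split(",")
    record = {}
    for name, number in FIELDS.items():
        value = columns[number - 1] if number <= len(columns) else ""
        if value == "":
            record[name] = None
        elif name in NUMERIC:
            record[name] = float(value)
        else:
            record[name] = value
    return record


example = "MSG,3,333,7,A94B48,107,2015/12/22,23:33:14.001,2015/12/22,23:33:14.001,,32975,,,46.94609,-115.34965,,,0,0,0,0"
print(parse_message(example))
```

Because each message only carries some of the fields, a consumer would need to merge successive records per aircraft ID to build a full picture of each plane.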

EDIT: I've built a simple script to process streamed data. Its interface is similar to dump1090's, if somewhat messy, but it uses local feed data. Run it with:

python fr24_feed.py <ip of receiver>

Output looks something like this now:


ID:         Flight          Alt         Speed              Lat              Lon       Heading     Last seen   PPS
A7D3FE:                 11175ft           mph                E                N           deg          0sec     3
A58672:                 34950ft           mph                E                N           deg          1sec     0
A4CA6A:        752      35000ft      484.0mph      -115.41179E        46.45871N      107.2deg          0sec     7
A500CA:                 20450ft           mph                E                N           deg          0sec     2
89911D:                 38000ft           mph                E                N           deg          0sec     3
A5C4E2:         36      35000ft      466.0mph      -113.48465E        47.69261N       88.9deg          0sec     6
86DCE6:       JAL4      40975ft      507.0mph      -113.85406E        48.29695N       96.7deg          0sec     8
AB3FC8:                  8475ft           mph                E                N           deg          0sec     1
AB4508:                      ft           mph                E                N           deg         14sec     0

Bytes/sec: 3075  Packets/sec: 30  Avg bytes/pkt: 102

Planes visible: 9  Total seen: 9

In conclusion, the FlightRadar receiver is much more accurate than others I have seen. At a roughly calculated rate of 30-40 messages per second, heavy processing may be infeasible. Recording it may also be hard, as mine used about 150 kilobytes per minute of disk space.

Also here's the script:





Sunday, December 20, 2015

Bro IDS + Python == success!

I recently learned about the Bro IDS project, and I think it's really cool! The only problem is that I didn't want to have to learn their special language to process network data. I'm more used to Python's powerful tools to do processing like this. So I wrote BroScanner, which essentially streams from Bro logfiles and reads the CSV into an easily parseable Python dict. It can also pipe out the data as JSON for other programs to analyze.

Today, I want to show you how to use BroScanner by creating some small passive IDS scripts with it. First, you need to download and configure Bro. Then you'll need BroScanner and its dependencies; see the aforementioned GitHub repo. Once Bro is up and running, you can run main.py to extract JSON from one of the logfiles. Assuming Mac or Linux, you can list the logfiles with ls /usr/local/bro/spool/bro.


$ ls /usr/local/bro/spool/bro
app_stats.log dns.log       files.log     notice.log    ssl.log       stdout.log    x509.log    conn.log      dpd.log       http.log      software.log  stderr.log    weird.log

(This isn't a complete list of the files; they are created as needed by Bro.)

$ python main.py software.log
{"host_p": null, "name": "Unspecified WebKit", "unparsed_version": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Spotify/1.0.20.94 Safari/537.36", "version.minor": 36, "version.minor3": null, "ts": 1450646720.519703, "host": "10.0.1.19", "version.minor2": null, "version.addl": null, "software_type": "HTTP::BROWSER", "version.major": 537}

(Don't include the path in any calls to BroScanner; it is added automatically.)

$ cat /usr/local/bro/spool/bro/software.log 
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path software
#open 2015-12-20-14-25-21
#fields ts host host_p software_type name version.major version.minor version.minor2 version.minor3 version.addl unparsed_version
#types time addr port enum string count count count count string string
1450646720.519703 10.0.1.19 - HTTP::BROWSER Unspecified WebKit 537 36 - Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Spotify/1.0.20.94 Safari/537.36

As you can see, this JSON is a lot more readable than the default CSV-type logfiles. It can be piped to any program that can decode JSON, but it can also be read directly inside your Python file as a dict. This makes it really easy to create your own intrusion detection programs.
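For the curious, the conversion from logfile to dict is not complicated. The Python 3 sketch below is not BroScanner's actual implementation, just a minimal illustration of the idea: read the #fields header, split each data row on tabs, and map the unset marker to None. It leaves every value as a string, whereas the JSON above has typed numbers. The name parse_bro_log and the shortened sample log are made up.

```python
def parse_bro_log(lines):
    """Yield one dict per data row of a tab-separated Bro log."""
    separator = "\t"
    unset_field = "-"
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Header directives: only the ones needed here are interpreted.
            if line.startswith("#fields"):
                fields = line.split(separator)[1:]
            elif line.startswith("#unset_field"):
                unset_field = line.split(separator)[1]
            continue
        if not line:
            continue
        values = line.split(separator)
        yield {
            name: (None if value == unset_field else value)
            for name, value in zip(fields, values)
        }


# Shortened, hand-written log in the same layout as software.log.
sample_log = [
    "#separator \\x09",
    "#unset_field\t-",
    "#fields\tts\thost\thost_p\tsoftware_type",
    "1450646720.519703\t10.0.1.19\t-\tHTTP::BROWSER",
]
for entry in parse_bro_log(sample_log):
    print(entry)
```

Because it is a generator, the same function can be fed an open file and will produce entries as the lines are read.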


Virus scanning


For example, I wrote this simple script to check for prohibited mime types and report their hash digests:

## simple file type detection.
## if i had unlimited api access, i would send every single md5 to something like VirusTotal

forbidden_types = ['text/x-bash', 'text/scriptlet', 'application/x-opc+zip', 'application/com', 'application/x-msmetafile', 'application/x-shellscript', 'text/x-sh', 'text/x-csh', 'application/x-com', 'application/x-helpfile', 'application/hta', 'application/x-bat', 'application/x-php', 'application/x-winexe', 'application/x-msdownload', 'text/x-javascript', 'application/x-msdos-program', 'application/bat', 'application/x-winhelp', 'application/vnd.ms-powerpoint', 'text/x-perl', 'application/x-javascript', 'application/x-ms-shortcut', 'application/vnd.msexcel', 'application/x-msdos-windows', 'text/x-python', 'application/x-download', 'text/javascript', 'text/x-php', 'application/exe', 'application/x-exe', 'application/x-winhlp', 'application/msword', 'application/zip']

import bparser

for i in bparser.parseentries('files.log'):
    if i['mime_type'] in forbidden_types: ## scan all of these types
        print("")
        print("{} downloaded a file from {} via {}".format(i['rx_hosts'],i['tx_hosts'],i['source']))
        print("Filename: {}  length: {}  mime type: {}".format(i['filename'],i['total_bytes'],i['mime_type']))
        print("MD5: {}  SHA1: {}".format(i['md5'],i['sha1']))

This script could be implemented in a workplace, to monitor employee downloads and possible viruses. If a virus is detected, one could implement the appropriate security measures on the affected machine (as we know its IP address). As stated in the comment, one could implement a check for each hash to be sent to a virus database. Rate-limiting would be a problem though, so in a workplace setting, local virus databases would be more suitable.


Authorized devices only


Another example of a use of this is scanning for unauthorized devices on the network. Since each device has its own unique MAC address, only specific addresses could be allowed.

$ python main.py dhcp.log
{"lease_time": 86400.0, "uid": "CO9XE21x6UMR8CZvZ2", "id.orig_p": 68, "id.resp_h": "10.0.1.1", "ts": 1450648870.748178, "id.orig_h": "10.0.1.19", "id.resp_p": 67, "mac": "a8:bb:cf:07:92:50", "trans_id": 3937219451, "assigned_ip": "10.0.1.19"}

MAC addresses start with a code that identifies the manufacturer of the network interface card. One could write a script that whitelists certain MAC addresses or manufacturers. Every day, each host has to register with DHCP services, so there is little chance of escaping this detection. Additionally, once an unauthorized host is detected, a managed router could be used to sinkhole that host from any network communication.

One small problem with using only the manufacturer ID to whitelist hosts is the fact that MAC addresses can be spoofed. However, if there is a small list of full MAC addresses whitelisted, this problem virtually disappears.

If a company decided to disallow all mobile phones from using the protected corporate network, they could scan for MAC addresses that contained the manufacturer ID of popular smartphone brands. Since MAC address spoofing is practically nonexistent on phones, this would be highly effective.
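A full-address whitelist check over dhcp.log entries could look roughly like the Python 3 sketch below. It works on plain dicts shaped like the JSON output above, so it doesn't depend on BroScanner itself. ALLOWED_MACS and unauthorized are hypothetical names, and the second sample entry is invented for the example.

```python
# Hypothetical whitelist of full MAC addresses.
ALLOWED_MACS = {"a8:bb:cf:07:92:50"}


def unauthorized(entries, allowed=ALLOWED_MACS):
    """Return the DHCP entries whose MAC address is not on the whitelist."""
    allowed = {mac.lower() for mac in allowed}
    flagged = []
    for entry in entries:
        mac = entry.get("mac")
        if mac is None:
            continue  # nothing to check without an address
        if mac.lower() not in allowed:
            flagged.append(entry)
    return flagged


sample_entries = [
    {"mac": "a8:bb:cf:07:92:50", "assigned_ip": "10.0.1.19"},
    {"mac": "00:11:22:33:44:55", "assigned_ip": "10.0.1.99"},  # invented entry
]
for entry in unauthorized(sample_entries):
    print("unauthorized device:", entry["mac"], entry["assigned_ip"])
```

Matching on manufacturer instead would only require comparing the first three octets of each address rather than the whole thing.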


Notifying IT


A final example of a use for BroScanner is notifying admins of portscans and other possible intrusions that Bro detects. I discovered a logfile that contains this information, and all I needed to do was implement some kind of admin notification.

$ python main.py notice.log
{"uid": null, "id.resp_p": null, "actions": "Notice::ACTION_LOG", "fuid": null, "dropped": false, "remote_location.region": null, "remote_location.country_code": null, "sub": "local", "proto": null, "dst": "10.0.1.1", "ts": 1450650532.338251, "note": "Scan::Port_Scan", "peer_descr": "bro", "msg": "10.0.1.19 scanned at least 15 unique ports of host 10.0.1.1 in 0m0s", "remote_location.city": null, "remote_location.longitude": null, "remote_location.latitude": null, "src": "10.0.1.19", "id.orig_p": null, "id.resp_h": null, "n": null, "id.orig_h": null, "p": null, "file_mime_type": null, "file_desc": null, "suppress_for": 3600.0}

So a simple Python file can notify the appropriate people:

import bparser
import notifiers

for i in bparser.parseentries('notice.log'):
    print(i)
    notifiers.notify(i['msg'])

As you can see, there was a portscan detected, even though I ran it with nmap -v -PR -Pn -sV 10.0.1.1 to try to minimize traffic. Bro's built-in threat detection is no slouch.

The notifiers module contains whatever types of notifications you would like to send. I implemented the OS X notification center for a local Bro instance, so OS X users will see a popup notification informing them of the intrusion. Email, SMS, and many other forms of notification could be implemented, giving IT the edge in eliminating threats. As previously stated, network routers could sinkhole these hosts, effectively removing the threat.
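As an illustration of what such a notifier might look like, here is a Python 3 sketch that raises an OS X notification by running AppleScript's display notification command through osascript. It is a guess at one possible approach rather than the code in the notifiers module; osx_notification_command and the default title are made up for the example.

```python
import subprocess


def _quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes for AppleScript."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def osx_notification_command(message, title="Bro notice"):
    """Build the osascript argument list for a notification popup."""
    script = "display notification {} with title {}".format(_quote(message), _quote(title))
    return ["osascript", "-e", script]


def notify(message):
    """Show the popup; only works on a Mac with osascript available."""
    subprocess.check_call(osx_notification_command(message))
```

Building the command separately from running it keeps the quoting logic testable on any machine, and other back ends such as email could expose the same notify(message) function.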

With Python, the possibilities are endless! There are many more uses for this program, and I'm interested to see what people come up with.