I have written a CLI utility for Ubuntu to import ModSecurity's audit log file into an SQLite database, which should be a great help to people building whitelists to reduce false positives. This supersedes my previous efforts with BASH scripts. Packages are available for Ubuntu Trusty and Utopic (14.04 & 14.10) in my Personal Package Archive on Launchpad.
To create my app I had to learn about:
- C++ development on Ubuntu including two third party libraries (Boost Regex and SQLite)
- Version control using Git
- The GNU build system "Autotools"
- How to build .deb packages for Ubuntu and Debian
- How to upload packages to a Personal Package Archive (PPA) on Launchpad
I plan on writing detailed tutorials for most of this, but there's quite a lot to get through so it could take a while!
What is ModSecurity?
If you have read my previous posts on Apache2's security module "ModSecurity" then you may already know what it is. For those of you who haven't used it, ModSecurity is a Web Application Firewall that can be used with a set of rules to "enumerate badness" and decide when to block requests sent to the server. It sits between Apache and the web applications running on the server, and can therefore intercept malicious requests before they are processed by the app. Probably the most common set of rules is the Open Web Application Security Project's Core Rule Set (OWASP CRS), which is available in the Ubuntu repos.
Here's a typical example: Mr Naughty is trying to hack example.com, a website running a vulnerable installation of WordPress on a LAMP server. Mr Naughty is trying to use an SQL injection attack to create a new admin user in the database so that he can deface the site, steal data, etc. However, ModSecurity identifies the SQL injection attack contained in the POST variable sent by Mr Naughty and blocks it before it is executed by WordPress. The attack fails :)
Sounds great, right?
Why isn't it more widely deployed?
I started learning about ModSecurity after a friend recommended it to me, about a year and a half ago. As an enthusiastic but inexperienced amateur, I really struggled to configure it properly - each rule uses pattern matching to decide what to block, and there are inevitable false positives.
This means you can't just install it and expect it to work. Typically you run ModSecurity in "detection only" mode for a time (rules are evaluated but ModSecurity doesn't actually block anything), and then inspect the audit logs to identify where you need to amend the rules to remove those false positives.
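For reference, the mode is controlled by a single directive in the ModSecurity configuration:

```apache
# Evaluate rules and log matches, but never block anything:
SecRuleEngine DetectionOnly

# Once the false positives are ironed out, switch to blocking:
# SecRuleEngine On
```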
The audit log is a text file with sections for each part of the transaction: the data sent to the server, the response sent back, and any rules that were matched. Since the data for each transaction is split over multiple lines, it does not lend itself to being sorted with simple utilities like grep. Identifying all of the requests from a certain IP address that triggered a given rule is a non-trivial exercise.
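To see why, here is a minimal Python sketch of the parsing problem (the real utility is written in C++, and the sample data is made up): audit log parts are delimited by marker lines like --c7036611-A--, where the hex ID ties the sections of one transaction together.

```python
import re
from collections import defaultdict

# Section boundaries look like "--c7036611-A--": a transaction ID
# plus a single-letter part code (A = header, B = request, Z = end).
BOUNDARY = re.compile(r"^--([0-9a-f]+)-([A-Z])--$")

def split_transactions(log_text):
    """Group audit log lines into {tx_id: {part: [lines]}}."""
    transactions = defaultdict(dict)
    current = None
    for line in log_text.splitlines():
        match = BOUNDARY.match(line)
        if match:
            tx_id, part = match.groups()
            current = transactions[tx_id].setdefault(part, [])
        elif current is not None:
            current.append(line)
    return transactions

sample = """--c7036611-A--
[01/Jan/2015] c7036611 203.0.113.5
--c7036611-B--
POST /wp-login.php HTTP/1.1
--c7036611-Z--
"""
txs = split_transactions(sample)
print(sorted(txs["c7036611"]))  # parts seen: ['A', 'B', 'Z']
```

A plain grep over such a file can match a line, but it can't tell you which transaction (or which part of it) the line belongs to - hence the need for something smarter.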
My first attempt at tackling the problem was to remove the rules that were being triggered at certain locations. To do this I wrote a BASH script. The script doesn't look at the audit log file, it just uses the error messages ModSecurity writes to the apache log, and spits out a virtualhost configuration file listing locations (URLs) where certain rules are disabled.
This would work OK if you were running ModSecurity in "traditional" mode, where any rule that is matched results in the request being blocked, but it isn't good for the new anomaly scoring mode (the one that enumerates badness). In anomaly scoring mode, each rule has a point score and the request is blocked if the total exceeds a threshold... I soon realised that my script was actually just removing the rule that adds up the scores and blocks the request, when it should have been removing the individual rules!
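What the script should have produced is per-location exclusions of the individual rules, using ModSecurity's SecRuleRemoveById directive - something like this sketch, where the path and rule ID are invented for illustration:

```apache
<LocationMatch "/wp-admin/post.php">
    # Disable only the rule that false-positives at this URL,
    # not the rule that totals the anomaly score and blocks.
    SecRuleRemoveById 912345
</LocationMatch>
```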
This wasn't good enough. I realised I needed a more fine-grained approach, so I learned some Perl. Perl can do multiline regex (slowly!), which enabled me to look at the audit log instead of the error log. The Perl script I wrote splits the audit log into bits and puts them into a spreadsheet. This is the same fundamental approach as my C++ app, but the spreadsheet quickly becomes extremely sluggish, and the script takes ages to run. It does work, though!
The Solution: auditlog2db commandline utility
So, after my partial success with Perl I decided I needed something serious to tackle the problem. I had read that C++ apps are generally faster than scripting languages like Perl, and wanted to learn the language that most of the apps in the Plasma Desktop environment (KDE) are written in. I had an idea that an SQLite database would be a good way to store the information from the audit logs so that it could be sorted quickly, but I didn't know any C++ or anything about SQLite. So, I had to learn:
- Some basic C++ (hair-tearingly frustrating at times but ultimately rewarding)
- How to use the C/C++ sqlite API (reasonably well documented but very confusing to someone writing their first C++ app)
- How to do regular expression matching in C++ using the Boost Regex library (much more difficult than Perl!)
- How to use a Makefile to make compilation less tedious.
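The payoff of the database approach is that multi-line log wrangling turns into one-line queries. Here's a sketch using Python's built-in sqlite3 module - to be clear, the real tool uses the C/C++ SQLite API, and this table layout is invented for illustration rather than being auditlog2db's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real tool writes a .db file
conn.execute("""CREATE TABLE events (
    tx_id   TEXT,     -- transaction ID from the audit log boundary
    ip      TEXT,     -- client address
    uri     TEXT,     -- requested URL
    rule_id INTEGER   -- ModSecurity rule that matched
)""")

rows = [
    ("c7036611", "203.0.113.5",  "/wp-login.php", 950901),
    ("d81a22f0", "203.0.113.5",  "/wp-login.php", 950901),
    ("e99b104c", "198.51.100.7", "/index.php",    960015),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)

# "All requests from one IP that triggered one rule" is now trivial:
hits = conn.execute(
    "SELECT tx_id FROM events WHERE ip = ? AND rule_id = ? ORDER BY tx_id",
    ("203.0.113.5", 950901),
).fetchall()
print(hits)  # [('c7036611',), ('d81a22f0',)]
```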
The result is a C++ commandline utility called auditlog2db that will import the logfile into an SQLite3 database. It can process about 2,000 transactions per second, which is about a bazillion times faster than the Perl script :D
As my code got more complicated, I realised I needed to use a proper version control system instead of just saving copies of files as foo.BAK, foo.BAK2 ... so I learned Git. Git is actually quite accessible and definitely worth learning.
So, at this point my code was on Github and it worked, but I doubted very much whether anyone would find it and use it. Seriously... in 2015, you shouldn't have to compile a program yourself unless you're actually developing it.
Packaging my code for Ubuntu/Debian turned out to be almost as difficult as writing the damn program!
I started by learning the GNU build system, Autotools, to replace my handwritten Makefile with a more flexible one. Autotools is the group of programs that are used in the classic "configure, make, make install" procedure to check dependencies and create a makefile that installs everything to the correct place on your system and removes them cleanly again afterwards.
Autotools turned out to be a nightmare. It is not at all easy to learn - something as simple as testing for C++11 support in the compiler and setting the appropriate flag should be easy, but it's not, and requires the use of some pretty archaic m4 macros. The documentation is sparse, and non-trivial example tutorials are hard to come by.
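For the curious, a minimal configure.ac with that C++11 check looks something like this. AX_CXX_COMPILE_STDCXX_11 comes from the autoconf-archive package; the version number here is a placeholder:

```m4
AC_INIT([auditlog2db], [0.1])
AM_INIT_AUTOMAKE([foreign])
AC_PROG_CXX
# From autoconf-archive: aborts configure if the compiler has no
# C++11 mode, otherwise appends the right flag (e.g. -std=c++11).
AX_CXX_COMPILE_STDCXX_11([noext], [mandatory])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
```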
In defence of Autotools, once it is set up, the "configure, make, make install" procedure is easy to do - I can see why it was good in the days when end users were required to compile software. It also provides some nice features like "make dist", which creates a .tar.gz source archive for distribution - useful for starting a .deb! If I could start over, I think I would learn CMake, which is supposed to be easier.
Once this was all sorted, I set to work building a .deb package. The Ubuntu documentation can pretty much be summed up in one sentence:
"Build a package as you would for debian, but use the ubuntu release codename (utopic) instead of the Debian codename (unstable) in the changelog file."
After battling my way through the Debian new maintainer's guide, I finally produced a package. However, the package checker lintian kicked up a load of errors. Some were trivial, like line lengths in the package description, but others were more serious: a man page was missing.
Yet another unpleasant surprise: man pages are written using nroff, a markup language even more difficult to learn than TeX, and a lot less useful. Luckily, it was possible to borrow a lot of the markup from other man files, which are stored in /usr/share/man/man1/foo.1.gz. Take a look at a few using the zless command, and you'll see why I wasn't enthused at the prospect of writing one from scratch.
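For anyone facing the same job, a skeleton covers most of what lintian wants. This one is a hand-written sketch (option names taken from the usage example further down), not the package's actual man page:

```nroff
.TH AUDITLOG2DB 1 "January 2015" "auditlog2db" "User Commands"
.SH NAME
auditlog2db \- import a ModSecurity audit log into an SQLite database
.SH SYNOPSIS
.B auditlog2db
.RB [ \-i
.IR logfile ]
.RB [ \-o
.IR database ]
.SH DESCRIPTION
Reads a ModSecurity audit log and writes each transaction
to an SQLite database so it can be queried easily.
```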
With the package completed and error-free, it was time to upload to a PPA.
The next surprise was that you can't just create a .deb, sign it, and upload it to a PPA. Launchpad builds the binaries itself from a source archive!
This is great for quality control, since the packages are built in a sanitised chroot. In fact, this caught a few of my errors, like missing libboost-regex-dev from the Build-Depends field in the control file. These libraries were (obviously) installed on my laptop already, but they weren't present in the chroot, so the compiler failed during linking.
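The fix was a one-line addition to debian/control: the Build-Depends field tells the build chroot which development packages to install before compiling. This fragment is illustrative - libsqlite3-dev is included on the assumption that the SQLite headers are needed too, and the debhelper version is a guess:

```
Source: ams-whitelisting-tools
Build-Depends: debhelper (>= 9),
               libboost-regex-dev,
               libsqlite3-dev
```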
After a bit of trial and error, I got Launchpad to build my app successfully :)
My PPA is ppa:sam-hobbs/ams-whitelisting-tools on Launchpad. Packages are available for Trusty (14.04) and Utopic (14.10).
If you fancy helping me out (I'd really appreciate it!), you can add the PPA, install the package, test it and remove it.
The package is called ams-whitelisting-tools because I plan on adding other utilities to the package later.
Obligatory warning: in general, you shouldn't add random PPAs to your system. Only do this if you trust me!
The following code will add the PPA, install my utility, do some basic tests, and then remove the package and the PPA.
sudo add-apt-repository ppa:sam-hobbs/ams-whitelisting-tools
sudo apt-get update
sudo apt-get install ams-whitelisting-tools
man auditlog2db
auditlog2db --version
sudo apt-get remove --purge ams-whitelisting-tools
sudo add-apt-repository --remove ppa:sam-hobbs/ams-whitelisting-tools
If you have ModSecurity installed, you can also run this command to generate a database from your audit log file:
auditlog2db -i /var/log/apache2/modsec_audit.log -o ~/modsecurity.db
The utility is still very much in development, but it is at the stage now where it could be useful to people and I'm very pleased to be able to release something!
If you do any serious testing, I'd love to hear some feedback. I'm aware that I need to tighten up the --quiet options, which were added recently.