This script has been superseded by a command-line utility. Please visit this page for more information.
ModSecurity is a Web Application Firewall for Apache. It can monitor all of the traffic that is seen by your web server, including request headers and GET and POST data, and block dodgy requests. ModSecurity itself is actually just a rule engine; the clever part is in the rules you pass to it. Many people use the Open Web Application Security Project's (OWASP) Core Rule Set (CRS), an open source set of rules that ModSecurity can use to sift the wheat from the chaff and foil some common types of attack. The CRS was written by studying known vulnerabilities and writing rules that would not only have prevented those attacks, but would have prevented other similar attacks too. Thus, ModSecurity provides good all-round protection for your web server. Some types of attack that ModSecurity and the OWASP CRS can help to protect against are:
- SQL injection
- Denial of Service
- Cross-Site Scripting
- HTTP anomalies (violations of HTTP protocol)
- Automation detection (stops bots and scanners)
- Comment spam
And yes, you can run this on your Raspberry Pi, although you might find it slows the server down noticeably. This post assumes you have ModSecurity up and running in DetectionOnly mode. If you haven't installed ModSecurity and enabled the OWASP CRS yet, I'd highly recommend this guide on LinuxQuestions.org, which will work for Debian-like systems (I've tested it on both Raspbian and Ubuntu).
The Script
Every web application firewall like this will have false positives. To help prevent disruption to your site, ModSecurity comes with a "DetectionOnly" mode:
SecRuleEngine DetectionOnly
In this mode it processes each request as if it were switched on and writes the results to your log files without actually blocking anything. This provides you with useful information about the false positives you need to deal with before turning ModSecurity on for real. Trust me on this one: don't install ModSecurity and set SecRuleEngine On straight away, because you will almost certainly break stuff. False positives are common, and they won't go away until you write a whitelist to change ModSecurity's behaviour. Writing a whitelist file manually is torturous drudgery. Don't put yourself through the pain: I wasted a few hours trying to write one for WordPress before I decided it was futile and looked for a better option. So, here is the result of my frustration: a Bash script that will automatically generate a whitelist file for you. The script should work for any web app (content management systems like WordPress and Drupal, webmail apps like SquirrelMail and Roundcube, and other apps like ownCloud). It works by assuming that any requests that come from a trusted IP address are legitimate: any rules they triggered were false positives, and will go in our whitelist for that location. The script reads through your Apache error log files for entries like this:
[Wed May 07 19:13:54.925435 2014] [:error] [pid 2423] [client 192.168.1.1] ModSecurity: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/etc/modsecurity/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."] [hostname "www.samhobbs.co.uk"] [uri "/comment/2140/approve"] [unique_id "U2p34n8AAQEAAAl3cyIAAAAM"]
The script then removes the false positive by outputting a whitelist file, which consists of a series of LocationMatch statements. Each LocationMatch statement contains rules that we don't want to be processed at that URL, like so:
<LocationMatch "^/comment/[0-9]+/approve$">
    SecRuleRemoveById 981143
</LocationMatch>
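To make the log-entry-to-stanza step concrete, here is a minimal sketch of it for a single log entry. It is not the script itself: the function name logline_to_stanza is made up for this illustration, and it assumes the [id "..."] and [uri "..."] fields look the way they do in the sample entry above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script below): turn one ModSecurity
# error log entry into a LocationMatch stanza for the exact URI it mentions.
# Assumes the [id "..."] and [uri "..."] fields appear as in the sample entry.
logline_to_stanza() {
    local entry="$1" id uri
    # Pull out the numeric rule ID and the request URI
    id=$(printf '%s\n' "$entry" | sed -n 's/.*\[id "\([0-9]*\)"].*/\1/p')
    uri=$(printf '%s\n' "$entry" | sed -n 's/.*\[uri "\([^"]*\)"].*/\1/p')
    # Give up on entries that lack either field
    [ -n "$id" ] && [ -n "$uri" ] || return 1
    printf '<LocationMatch "^%s$">\n' "$uri"
    printf '    SecRuleRemoveById %s\n' "$id"
    printf '</LocationMatch>\n'
}

# Try it on the sample entry from this post
logline_to_stanza '[Wed May 07 19:13:54.925435 2014] [:error] [pid 2423] [client 192.168.1.1] ModSecurity: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/etc/modsecurity/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."] [hostname "www.samhobbs.co.uk"] [uri "/comment/2140/approve"] [unique_id "U2p34n8AAQEAAAl3cyIAAAAM"]'
```

Note that this sketch whitelists the literal URI (/comment/2140/approve) rather than generalising the comment number to [0-9]+. Grouping similar URLs under one regex is what the SPECIAL_LOCATIONS setting in the real script is for.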
This whitelist file will then be included in the relevant VirtualHost file with a statement like this:
# Include personalised whitelist file for ModSecurity
Include /etc/modsecurity/whitelists/samhobbs.co.uk.conf
Neat, right? Here's the script:
#!/bin/bash
#
# This is the ModSecurity CRS Whitelist Generator Script version 16/03/2014
# Not affiliated with the Modsecurity project or the CRS
# Script by Sam Hobbs https://samhobbs.co.uk
# Licence: public domain
# Please let me know how you get on so that I can make improvements: leave a
# comment on my blog or email me at: sam at samhobbs dot co dot uk

# Every installation of Apache with Modsecurity will probably have false positives.
# This script aims to make writing a whitelist file quicker than doing it manually.
# Required input is as many error logs as you have available from the relevant virtualhost,
# with modsecurity set to detect only and all of the CRS enabled

#====================================== CHANGELOG =========================================#

# Script version
VERSION="27/05/2014"

# 16/03/2014 changed default <LocationMatch> regex in whitelist file to .* from *
# 12/05/2014 changed defaults for drupal
#            egrep used instead of grep to provide better regex
#            now checks to see if locationmatch would be empty before writing it

#===================================== USER INPUT =========================================#

# This script works by assuming that all traffic from a friendly IP address is legitimate
# and that the resulting errors are false positives.

# Define one or more friendly IP address, separating each IP with a space. If you are
# hosting at home, a good choice is your router's LAN IP address, since your server sees
# all traffic from your LAN as originating here.
# You might also like to add IP addresses of users you know weren't abusing the site, for
# example people who left legitimate comments. Wordpress tells you the IP address used to
# post each comment on the comment moderation GUI.
FRIENDLY_IP="192.168.1.1"

# Define a list of special locations. The matching process uses regex. A LocationMatch
# statement will be created for each one of these locations, which will be populated with
# rule IDs for all the locations that match that regex. Leave a space between each location
# that you enter.
# There is no need to start each location with "^"; the script adds the character for you.
# If you don't end the location with a "$" then the script will automatically add
# ".*" to the end of the location in the LocationMatch statement so that it will match
# all files beginning with that path, i.e.
# <LocationMatch "^/wordpress/wp-admin/.*"> matches /wordpress/wp-admin/wp-login.php, but
# <LocationMatch "^/wordpress/wp-admin/$"> does not.
SPECIAL_LOCATIONS="\
/authorize.php \
/admin/config$ \
/admin/config/content/mollom \
/admin/config/content/syntaxhighlighter \
/admin/config/people \
/admin/config/search \
/admin/config/system/actions \
/admin/content \
/admin/reports \
/admin/structure/menu \
/admin/structure/types \
/admin/modules \
/admin/appearance/settings \
/admin/people/permissions \
/comment/reply \
/comment/[0-9]+$ \
/comment/[0-9]+/edit \
/comment/[0-9]+/approve \
/file/ajax/field_image/und/0/ \
/index.php$ \
/node/[0-9]+/delete$ \
/node/[0-9]+/edit$ \
/node/add/article \
/sites/default/files/css/ \
/sites/default/files/js \
/token/tree \
/user/[0-9]+/edit$ \
.*.(png|jpg|JPG|gif|ico)$"

# Define directory holding the Apache error log files to be processed
LOG_DIR=~/errors-new

# Define directory for output files:
OUTPUT_DIR=~/modsec-whitelist-samhobbs-new-1

#===================================== KNOWN BUGS =========================================#

# SecRuleRemoveById 981143 doesn't work because the rule has its own whitelist built in
# see: /usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf

#==================================== HOUSEKEEPING ========================================#

# Create the output directory if it doesn't exist:
if [ ! -d "$OUTPUT_DIR" ]
then
    mkdir "$OUTPUT_DIR"
    echo "Output directory has been created at $OUTPUT_DIR"
    echo ""
fi

# Delete the previous output files if they exist
if [ -f "$OUTPUT_DIR/whitelist" ]
then
    echo -n "Old output files detected, deleting..."
    rm "$OUTPUT_DIR"/*
    echo "...done"
    echo ""
fi

# Some files:
COMBINED_LOG=$OUTPUT_DIR/combined_log
PROBLEM_LOCATIONS=$OUTPUT_DIR/problem_locations
ROOT_LOG=$OUTPUT_DIR/root_log
PROBLEM_LOCATIONS_REMAINING=$OUTPUT_DIR/problem_locations_remaining
ROOT_IDS=$OUTPUT_DIR/root_ids
GROUPED_LOCATIONS=$OUTPUT_DIR/grouped_locations
WHITELIST_FILE=$OUTPUT_DIR/whitelist

# Rules tripped on the root domain will be added as generic exceptions for the whole virtualhost:
echo "Rules tripped on the root domain will be added as generic exceptions for the whole virtualhost"
echo ""
echo "In addition to this, you have selected the following special locations to be grouped into whitelist statements"
# $SPECIAL_LOCATIONS is deliberately unquoted so that it splits into one word per location
for variable in $SPECIAL_LOCATIONS; do
    echo "> $variable"
done
echo ""

# Perform work in temporary files
TEMPFILE1=$(mktemp)
TEMPFILE2=$(mktemp)
TEMPFILE3=$(mktemp)
TEMPFILE4=$(mktemp)

#==================================== LOG PROCESSING ======================================#

# Read all files in the log file directory and combine them into one long file:
echo "Processing error logs..."
for f in "$LOG_DIR"/*; do
    if [[ "$f" =~ \.gz$ ]]; then
        echo "> reading $f"
        zcat "$f" >> "$TEMPFILE1"
    elif [[ "$f" =~ \.log ]]; then
        echo "> reading $f"
        cat "$f" >> "$TEMPFILE1"
    else
        echo "File $f not recognised as a log file, skipping"
    fi
done
echo ""

echo "Converting logs into a useful format:"

# Remove any log entries from the file that are not generated by ModSecurity
echo -n "> Removing entries that were not generated by ModSecurity..."
grep ModSecurity "$TEMPFILE1" > "$TEMPFILE2"
echo "...done."
#echo ""

# Remove log entries from traffic that is not from friendly IP addresses
echo -n "> Removing errors that were not from friendly IP addresses..."
for ip in $FRIENDLY_IP; do
    grep "client $ip" "$TEMPFILE2" >> "$TEMPFILE4"
done
echo "...done."
#echo ""

# Write combined log file:
echo -n "> Generating combined log file..."
cp "$TEMPFILE4" "$COMBINED_LOG"
echo "...done ($COMBINED_LOG)."
#echo ""

# Filter out rules that match the root regex:
echo -n "> Separating entries that match the root location"
cat "$COMBINED_LOG" | grep 'uri."/"' > "$ROOT_LOG"
echo "...done ($ROOT_LOG)."
echo ""

# Now generate a list of locations with problems:
# (the URI is the third field from the end of each log entry)
echo -n "Generating a list of all problem locations"
awk ' { print $(NF-2) }' "$COMBINED_LOG" | cut -d '"' -f 2 | sort | uniq > "$PROBLEM_LOCATIONS"
echo "...done ($PROBLEM_LOCATIONS)."
echo ""

#==================================== WHITELIST HEADER ====================================#

echo "# This file was created using the ModSecurity CRS Whitelist Generator script, version $VERSION" >> "$WHITELIST_FILE"
echo "# Save this file to /etc/modsecurity/whitelists/domainname.conf and include it in the relevant" >> "$WHITELIST_FILE"
echo "# VirtualHost configuration with \"Include /etc/modsecurity/whitelists/domain.conf\"" >> "$WHITELIST_FILE"
echo "# See https://samhobbs.co.uk for more information" >> "$WHITELIST_FILE"
echo "" >> "$WHITELIST_FILE"
echo "" >> "$WHITELIST_FILE"

#==================================== MATCH EVERYWHERE ====================================#

# Rule matches for the site's root will be whitelisted for the whole site:
echo "Now working on rules to whitelist everywhere"

# Generate a list of IDs that match the root regex
echo -n "> Generating a list of rule IDs for the root location"
echo "##" > "$ROOT_IDS"
cat "$ROOT_LOG" | grep -o 'id "......"' | cut -d '"' -f 2 | sort -u >> "$ROOT_IDS"
echo "...done."

echo -n "> Writing LocationMatch statement to whitelist file..."
# "read" swallows the first line (the "##" placeholder); sed then prefixes every remaining ID
cat "$ROOT_IDS" | while read line; do
    sed 's/^/SecRuleRemoveById /' >> "$WHITELIST_FILE"
done
echo "" >> "$WHITELIST_FILE"
echo "...done."
echo ""

#=================================== SPECIAL LOCATIONS ====================================#

COUNT=1
for LOCATION in $SPECIAL_LOCATIONS; do
    echo "Now working on the following location: $LOCATION"

    # Generate a list of problem locations for $LOCATION
    echo -n "> Generating a list of matching locations"
    TEMPLOCATION="$OUTPUT_DIR/location_${COUNT}"
    cat "$PROBLEM_LOCATIONS" | grep -E "^$LOCATION" >> "$TEMPLOCATION"
    echo "...done."

    # Generate list of rule IDs for $LOCATION
    echo -n "> Generating a list of rule IDs for this location"
    echo "#" > "$TEMPFILE3" # just to clear tempfile3
    cat "$TEMPLOCATION" | while read LINE; do
        grep "$LINE" "$COMBINED_LOG" | grep -o 'id "......"' | cut -d '"' -f 2 >> "$TEMPFILE3"
    done
    TEMPIDFILE="$OUTPUT_DIR/id_${COUNT}"
    cat "$TEMPFILE3" | sort -u | grep -vxF -f "$ROOT_IDS" > "$TEMPIDFILE"
    echo "...done."
    let COUNT=COUNT+1

    # Add $LOCATION to whitelist file
    echo -n "> Writing LocationMatch statement to whitelist file"
    # If defined location ends in $ don't add .*, if not then do add .*
    if [[ "$LOCATION" =~ \$$ ]]; then
        echo "<LocationMatch \"^$LOCATION\">" >> "$WHITELIST_FILE"
    else
        echo "<LocationMatch \"^$LOCATION.*\">" >> "$WHITELIST_FILE"
    fi
    # As above: "read" swallows the placeholder line, sed prefixes the remaining IDs
    cat "$TEMPIDFILE" | while read line; do
        sed 's/^/SecRuleRemoveById /' >> "$WHITELIST_FILE"
    done
    echo "</LocationMatch>" >> "$WHITELIST_FILE"
    echo "" >> "$WHITELIST_FILE"
    echo "...done."
    echo ""
done

#================================ REMAINING LOCATIONS LIST ================================#

# Generate a list of all locations that are covered by the group statements
echo -n "Find locations already covered by the group statements"
for file in "$OUTPUT_DIR"/*; do
    if [[ "$file" =~ location_[0-9] ]]; then
        cat "$file" >> "$GROUPED_LOCATIONS"
    fi
done
echo "...done ($GROUPED_LOCATIONS)."

# Now remove those locations from the master list of problem locations to leave a list of
# remaining problem locations. Also remove root, since this is dealt with separately.
echo -n "Remove these locations from the master list"
grep -vxF -f "$GROUPED_LOCATIONS" "$PROBLEM_LOCATIONS" | grep -v '^/$' > "$PROBLEM_LOCATIONS_REMAINING"
echo "...done ($PROBLEM_LOCATIONS_REMAINING)."

#================================== REMAINING LOCATIONS ===================================#

# Now write the remaining locations to the whitelist file
echo -n "Now writing the remaining problem locations to the whitelist file"
cat "$PROBLEM_LOCATIONS_REMAINING" | while read line; do
    # list all the rule IDs that match this location and don't match the rule IDs we have removed globally
    grep "$line" "$COMBINED_LOG" | grep -o 'id "......"' | cut -d '"' -f 2 | sort -u | grep -vxF -f "$ROOT_IDS" | sed 's/^/SecRuleRemoveById /' > "$TEMPFILE4"
    # count the number of rule IDs - if there are none then we don't want to write an empty locationmatch statement
    LINES=$(wc -l "$TEMPFILE4" | cut -f1 -d ' ')
    if [[ $LINES != 0 ]] ; then
        # since some of these are .php scripts, .php?foo=bar needs to match, so don't add $
        echo "<LocationMatch \"^$line.*\">" >> "$WHITELIST_FILE"
        cat "$TEMPFILE4" >> "$WHITELIST_FILE"
        echo "</LocationMatch>" >> "$WHITELIST_FILE"
        echo "" >> "$WHITELIST_FILE"
    fi
done
echo "...done."
echo ""
echo "Your whitelist file has been created at $WHITELIST_FILE"
echo ""

#====================================== CLEAN UP ==========================================#

# Clean up temporary files
echo -n "Cleaning up..."
rm -f "$TEMPFILE1" "$TEMPFILE2" "$TEMPFILE3" "$TEMPFILE4"
echo "...done"
Instructions
- Copy the script and save it to ~/bin/generate-modsec-whitelist.sh
- Copy your Apache error log files, which are found on Debian systems at /var/log/apache2/error.log, to a folder in your home directory, for example ~/error-logs/2014-05-30
- Make the copied log files owned by your user, e.g.
chown -R user:user ~/error-logs/2014-05-30
- Open the script in a text editor and go to the "USER INPUT" section. Fill in the list of friendly IP addresses (if you are hosting at home, start with just your router's IP address).
- Fill in some SPECIAL_LOCATIONS where you would like the script to group the rules. For example, if you have URLs like yourdomain.com/foo/bar/a and yourdomain.com/foo/bar/b that you think are similar (rules tripped on one will probably be tripped on the other), add /foo/bar/ and all URLs matching yourdomain.com/foo/bar/.* will be grouped together
- Fill in LOG_DIR with the path to the error logs you copied over earlier
- Fill in the OUTPUT_DIR, which is the directory the script will write its output files to. As well as the whitelist file, it writes a few other intermediates to help you understand how it has built the whitelist.
- Save your changes, and then make the script executable:
chmod +x ~/bin/generate-modsec-whitelist.sh
- Run the script (or use the full path if you haven't added ~/bin to your PATH):
generate-modsec-whitelist.sh
- Review the whitelist file that was generated in the output folder, look for sensible ways to group locations, add or modify the special locations as necessary and run the script again. Repeat until you are happy.
- Copy the whitelist file to /etc/modsecurity/whitelists/yourdomain.com.conf
- Use the Include directive to make Apache process the whitelist file as part of the relevant virtualhost's configuration, e.g.
<IfModule mod_security2.c>
    # SecRuleEngine On, Off or DetectionOnly
    SecRuleEngine On
    # Include personalised whitelist file for ModSecurity
    Include /etc/modsecurity/whitelists/samhobbs.co.uk.conf
</IfModule>
Only set SecRuleEngine On if you really want to turn ModSecurity on. If you leave it in DetectionOnly mode, you can still test your rules by running this command to watch your log files as you browse around the site:
tail -f /var/log/apache2/error.log
or, for more information:
tail -f /var/log/apache2/modsec_audit.log
Finally, reload Apache to make your changes take effect:
sudo service apache2 reload
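Once Apache has reloaded and you have browsed around the site for a while, it helps to see which rules are still firing. The following is a minimal sketch for that, assuming the same error log format as the sample entry earlier in this post. The function name tally_rule_ids is hypothetical, and the path in the usage comment is the Debian default mentioned above.

```shell
#!/bin/bash
# Hypothetical helper: count how often each ModSecurity rule ID appears
# in an Apache error log, most frequent first.
tally_rule_ids() {
    local logfile="$1"
    # Keep ModSecurity entries, isolate the [id "NNNNNN"] field,
    # strip everything except the digits, then count duplicates
    grep 'ModSecurity' "$logfile" \
        | grep -o '\[id "[0-9]*"]' \
        | tr -cd '0-9\n' \
        | sort \
        | uniq -c \
        | sort -rn
}

# Usage (reading the log may need sudo on your system):
#   tally_rule_ids /var/log/apache2/error.log
```

Each output line is a count followed by a rule ID. IDs that keep showing up for requests you know were legitimate are candidates for another run of the whitelist script.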
Final points
This script is not perfect, but it will help you to get started. There are cleverer ways to build a whitelist file, but they require more knowledge of ModSecurity and more time - this is a quick and dirty way to get ModSecurity up and running without breaking everything. It's important to recognise that this is a bit of a sledgehammer approach to the problem. The crudest way of dealing with false positives is to remove the rule files entirely (the ones in /etc/modsecurity/). Slightly better than that is removing rules by their IDs, and slightly better than that is removing them by ID for specific locations, which is what the script does.
There are better ways of doing things though: most of the rules do some kind of pattern matching with regular expressions (regex), looking for naughty terms or patterns in the request's arguments. If you're clever, you can update a rule so that it no longer inspects the particular argument that is causing the false positive:
SecRuleUpdateTargetById 958895 !ARGS:email
The above would remove the argument "email" from rule 958895 without requiring direct modification to the CRS, which is ideal - you don't want to have to manually hack the CRS each time a new set comes out. If you place directives like this inside a custom rules file, e.g. /etc/modsecurity/modsecurity_crs_60_customrules.conf, then they will be unaffected by upgrades. Better yet, you can do the above but only for specific locations:
SecRule REQUEST_FILENAME "@streq /path/to/file.php" \
    "phase:1,t:none,nolog,pass,ctl:ruleUpdateTargetById=958895;!ARGS:email"
This would remove the argument "email" from the rule, but only at /path/to/file.php. Cool eh? The more time you have, the better you can make your whitelist. Comparing the whitelist that my script generates to the two methods above, you can see how it's quite "loose" - you'll remove the false positive at that location, but you also stop the rule from catching any other requests that would have tripped the rule at that location. This post gives a very good description of the different whitelisting methods available in ModSecurity. It is written by SpiderLabs, the sponsor of ModSecurity, and I found it very useful (it's also where I lifted those two examples of better whitelisting methods from). If you're after more information on ModSecurity, a great place to start is the ModSecurity Reference Manual.
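If you want to move from ID-based removals towards the target-based exclusions described above, the error log can suggest where to start. The sketch below is only a starting point and rests on an assumption that is not confirmed anywhere in this post: that your rules log their match in the form "found within ARGS:name" inside a [data "..."] field. Not every rule or CRS version does this, so check your own log first. The function name suggest_target_updates is hypothetical, and its output is a list of candidate lines to review by hand, not something to paste in blindly.

```shell
#!/bin/bash
# Hypothetical helper: print candidate SecRuleUpdateTargetById lines from an
# Apache error log. ASSUMPTION: entries contain 'found within ARGS:<name>'
# (typically inside a [data "..."] field) as well as the usual [id "..."].
suggest_target_updates() {
    local logfile="$1" line id target
    local id_re='\[id "([0-9]+)"]'
    local target_re='found within (ARGS:[A-Za-z0-9_.-]+)'
    while IFS= read -r line; do
        # Skip entries that lack a rule ID or a named argument
        [[ $line =~ $id_re ]] || continue
        id=${BASH_REMATCH[1]}
        [[ $line =~ $target_re ]] || continue
        target=${BASH_REMATCH[1]}
        printf 'SecRuleUpdateTargetById %s !%s\n' "$id" "$target"
    done < "$logfile" | sort -u
}

# Usage, e.g. on the combined_log that the whitelist script writes to OUTPUT_DIR:
#   suggest_target_updates ~/modsec-whitelist-samhobbs-new-1/combined_log
```

Running it on the script's combined_log (which only contains entries from your friendly IPs) keeps the suggestions limited to traffic you already trust.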