Your basic ITPro blog... What's going on at work, what I'm interested in.

Tuesday, August 26, 2008

Perl, PoSH, Processing Logs

Or, maybe a better title would be, "Beets, Bears, Battlestar Galactica". In either case...

We had a very interesting item come up today that, among other things, introduced me to Perl, gave me a renewed appreciation for one of my co-workers, and made me think about firewall management.

We are running Arena here. Recently, we got word from them that our system was throwing a bunch of strange exceptions. When Arena has an exception, it reports it back to 'the Mothership'. Anyway, after some digging, we determined that the exceptions were attempted SQL injection attacks.

This is pretty scary, but it seems that our system knows enough about these attacks to throw the exception rather than attempt to process them. That's the good news. So, our next thought was, "How can we identify the attackers and can we cut them off"

Manually looking through the logs, we found an entry in each record that read: REMOTE_ADDR:

We had found our attacker... or one of them. It turns out that, over the past few days, we had over 1800 such attacks! We needed a way to get a list of these attacker IP addresses. I thought this would be a great exercise for me, especially since I haven't done much in the way of text parsing/processing (in PowerShell or anywhere else). So, I got a copy of the offending log entries (a 6MB text file) and started trying to figure out how to find what I needed using Powershell. Mind you, I am new to text processing and regex, so this was a bit of an adventure!

After some research and a few iterations (and input from my aforementioned co-worker), I came up with this:

$content = Get-Content u:\scripts\SQLHackAttempts.txt
[regex]$regex = 'REMOTE_ADDR: (\S+)'
$allHits = @{}
$hits = $regex.Matches($content)
foreach ($hit in $hits) {
    $ip = $hit.Groups[1].Value
    if ($allHits.ContainsKey($ip)) {
        $allHits.Set_Item($ip, $($allHits.get_Item($ip) + 1))
    else {
        $allHits.Add($ip, 1)
$file = 'u:\scripts\ParsingOutput.txt'
$allHits.GetEnumerator( ) | Sort Value -Descending | Tee-Object $file
$footer = @"
Total records in log: $($content.count)
Total Unique IPs: $($allHits.Count)

Out-File -FilePath $file -InputObject $footer -Append
"Total records in log: $($content.count)"
"Total Unique IPs: $($allHits.Count)"

This goes through the log entries, finds the proper IP address, and reports back (to console and to a file) each unique IP address and how many times it showed up. The tail end of the report looks like:


Yes, 704 unique attack IP addresses used in 1869 individual attacks! Wow.

This was a great exercise, and a lot of fun... until, that is, I saw Nick's Perl script that did pretty much the exact same thing. Here's his code:

I run it like this:

perl < input.txt > out.txt

Here is the code:

    $hash{$1}++ if ( /REMOTE_ADDR: (\S+) / );
foreach $ip ( sort {$hash{$b} <=> $hash{$a}} (keys(%hash)) )
    print "$hash{$ip}\t$ip\n";

I have never worked with Perl before, so this took a bit of explanation. But I thought it was pretty cool! I knew Perl was really good at text processing, but this still blew me away.

Now, after using either method, we had our list of attackers. The question now is, what do we do with this? I don't think a 700+ line ACL on my firewall is that great of an idea. But, I want some way to deny these addresses access to our systems.

Thoughts? Suggestions? Improvements on my PoSH code?


/\/\o\/\/ said...

A shorter PoSH alternative :

gc iptest.txt |? {$_ -match 'REMOTE_ADDR: (\S+)'} |% {$matches[1]} | group | sort count


Greetings /\/\o\/\/

/\/\o\/\/ said...

... and this way might be more performant :

Select-String -Path iptest.txt 'REMOTE_ADDR: (\S+)' |% {$_.matches[0].groups[1].value}| group | sort count

Greetings /\/\o\/\/

Additional Info

My photo
email: support (AT) mangrumtech (DOT) com
mobile: 480-270-4332