awk – getting status code by request type

I started out wanting to see which request types I was getting, then grew curious about the status codes for each of those request types. I’m searching my Varnish logs, which are written in Apache log format.

Since I like awk, I started by searching the 6th field for GET requests:
awk '($6 ~ /GET/)' varnishlog | wc -l
While this is nice, it’s very manual and doesn’t tell me much beyond how many requests were GETs.

So let’s take it a step further and count frequencies with an awk associative array:
awk '{freq[$6]++} END {for (x in freq) {print x, freq[x]}}' varnishlog
This produces nicer output and tells me how many requests of each type I got, but it doesn’t tell me the status of those requests.
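
In Apache-style logs the 6th whitespace-separated field still carries the opening double quote of the request string, so the counts come out keyed on "GET rather than GET. Here is a minimal sketch of the same count with that quote stripped. The sample log lines and the names log, method and counts are made up for illustration.

```shell
# Build a tiny, made-up log in Apache combined format (hypothetical data).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"
192.0.2.2 - - [10/Oct/2023:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
192.0.2.3 - - [10/Oct/2023:13:55:38 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.0"
192.0.2.1 - - [10/Oct/2023:13:55:39 +0000] "GET /about HTTP/1.1" 200 1180 "-" "curl/8.0"
EOF

# Copy field 6, strip its leading double quote, then count each method.
counts=$(awk '{method = $6; sub(/^"/, "", method); freq[method]++}
              END {for (m in freq) print m, freq[m]}' "$log" | sort -k 2,2nr)
echo "$counts"
rm -f "$log"
```

On the sample data this prints GET 3 followed by POST 1. Copying $6 into a separate variable before calling sub() leaves the original record untouched.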

Let’s take it another step further. This time we search the 9th field for 200 statuses, count them by request type, and sort numerically on the 2nd output field:
awk '($9 ~ /200/){freq[$6]++} END {for (x in freq) {print x, freq[x]}}' varnishlog | sort -nr -k 2
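
The filter above hardcodes 200 inside a regex. As a hedged alternative, the status code can be handed to awk with the -v option and compared exactly with ==, which avoids editing the awk program when the code changes. The sample log and the names log, code and ok are assumptions for illustration only.

```shell
# Build a tiny, made-up log in Apache combined format (hypothetical data).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"
192.0.2.2 - - [10/Oct/2023:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
192.0.2.3 - - [10/Oct/2023:13:55:38 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.0"
192.0.2.1 - - [10/Oct/2023:13:55:39 +0000] "GET /about HTTP/1.1" 200 1180 "-" "curl/8.0"
EOF

code=200   # hypothetical shell variable holding the status of interest

# -v copies the shell value into the awk variable "code";
# $9 == code is an exact comparison rather than a regex match.
ok=$(awk -v code="$code" '$9 == code {freq[$6]++}
         END {for (x in freq) print x, freq[x]}' "$log" | sort -nr -k 2)
echo "$ok"
rm -f "$log"
```

On the sample data this prints "GET 2, because only the two GET requests returned 200.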

Well, now we can see that not all of our requests are 200s, so let’s get a clearer view of what’s happening. I’ll create a bash array of the status codes I care about and loop over it. Because the status code now comes from a shell variable, I have to take special care with the quoting in the search pattern: close the single-quoted awk program, insert the double-quoted variable, then reopen the single quotes. I also add a newline to awk’s output to make it easier to read:
statuscodes=( 200 301 302 400 403 404 410 500 501 503 ); for a in "${statuscodes[@]}"; do echo "$a statuses"; awk '($9 ~ /'"${a}"'/){freq[$6]++} END {for (x in freq) {print x, freq[x], "\n"}}' varnishlog | sort -nr -k 2; done
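
The loop above reads the log once per status code. Here is a sketch of a single-pass variant, assuming it is acceptable to report every status that appears rather than a fixed list: the array is keyed on the status and request type together. The sample log and the names log and table are made up for illustration.

```shell
# Build a tiny, made-up log in Apache combined format (hypothetical data).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"
192.0.2.2 - - [10/Oct/2023:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
192.0.2.3 - - [10/Oct/2023:13:55:38 +0000] "POST /login HTTP/1.1" 302 0 "-" "curl/8.0"
192.0.2.1 - - [10/Oct/2023:13:55:39 +0000] "GET /about HTTP/1.1" 200 1180 "-" "curl/8.0"
192.0.2.4 - - [10/Oct/2023:13:55:40 +0000] "POST /api HTTP/1.1" 200 64 "-" "curl/8.0"
EOF

# Key the array on "status method" so one read of the log covers every status.
# Sort by status ascending, then by count descending within each status.
table=$(awk '{freq[$9 " " $6]++}
             END {for (k in freq) print k, freq[k]}' "$log" | sort -k 1,1n -k 3,3nr)
echo "$table"
rm -f "$log"
```

Each output line has the form status, request type, count. On the sample data the first line is 200 "GET 2. On a large log this trades the per-status headings of the loop for a single read of the file.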