Here are some scripts that will help us in our daily analysis of our servers.
A few things we care about, and that the business will generally ask for, are how the sites/servers are performing. In the absence of tools like Splunk, AWStats or similar, these scripts can come in handy.
This awk script expects the Apache log format and uses the 9th field to capture the status codes for all hits
awk '{count[$9]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total status codes"}' /logfile |sort -n
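For reference, a line in Apache's combined log format looks roughly like this (the values here are made up):

10.0.0.1 - - [05/Dec/2013:03:00:01 +0000] "GET /index.html HTTP/1.1" 500 1234 "http://example.com/" "Mozilla/5.0 ..."

Split on spaces, $4 is the timestamp, $7 the requested URL, $9 the status code and $11 the referrer, which is where the field numbers in the scripts below come from.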
This awk script uses the 7th field and provides the top requested URLs
awk '{count[$7]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total urls"}' /logfile |sort -n
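Since sort -n sorts ascending, the busiest URLs end up at the bottom of the output. If you only care about the top ten, a small variation is to reverse the sort and take the head instead:

awk '{count[$7]++}END{for (j in count) print count[j], j}' /logfile |sort -rn |head -10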
This awk script uses the 11th field and provides the top referring URLs
awk '{count[$11]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total urls"}' /logfile |sort -n
This awk script uses a new field separator to get the whole user agent and provides a count of the top agents
awk -F \" '{count[$6]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total agents"}' /logfile |sort -n
This awk script uses the 4th field to provide hits per second
awk '{count[$4]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total hits"}' /logfile |sort -n
This awk script uses a new field separator to get the hour field and provides a count of hits by hour
awk -F \: '{count[$2]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total hits"}' /logfile |sort -n
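The same separator trick goes down to finer granularity: once you split on colons, $2 is the hour and $3 the minute (assuming IPv4 client addresses, since IPv6 addresses would add their own colons), so a sketch for hits by minute would be:

awk -F \: '{count[$2":"$3]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total hits"}' /logfile |sort -n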
Now, reporting is nice, but being able to troubleshoot is better.
Let’s say you ran the status codes script, you notice more 500 status codes than you’d like, and you want to see when they’re occurring.
This script combines two awk scripts: the first part looks for any 50x error (500, 501, 503 and so on), and that output is piped to our count-by-hour script to provide a breakdown of when most of our errors are occurring
awk '($9 ~ /50./)' /logfile | awk -F \: '{count[$2]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total 500s"}'|sort -n
Let’s say you’ve now found a high concentration of errors at 3 AM, and you want to see which URLs are causing them.
This awk script uses field 9 to check for any 500-type error AND field 4 for the 3 AM hour, then counts and produces the top erroring URLs during that time
awk '($9 ~ /50./) && ($4 ~ /\[05\/Dec\/2013\:03/){count[$7]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total errors"}' /logfile |sort -n
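Hard-coding the date and hour gets tedious if you run this often; one option is to pass them in with awk’s -v and reuse the same one-liner for any window (the hr value below is just an example):

awk -v hr='05/Dec/2013:03' '($9 ~ /50./) && ($4 ~ hr){count[$7]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total errors"}' /logfile |sort -n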
What if I wanted to see which URLs didn’t fail but were successful, redirected or missing? Well, we simply negate the match in awk; basing it on the previous script makes it easy to see the difference
awk '!($9 ~ /50./) && ($4 ~ /\[05\/Dec\/2013\:03/){count[$7]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total non-errors"}' /logfile |sort -n
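If you also want to see which status codes those hits returned, key the count on both the status and the URL; a small variation on the script above:

awk '!($9 ~ /50./) && ($4 ~ /\[05\/Dec\/2013\:03/){count[$9" "$7]++}END{for (j in count) {print count[j], j; total+=count[j]} print total" Total non-errors"}' /logfile |sort -n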
These scripts should provide you with a good basis for getting the information the business people need, and give you insight into how your sites & systems are performing. All scripts are based on the Apache log format and use the default space delimiter unless otherwise noted.
Happy scripting