4

I would like to parse an access log file and get the number of requests for the last 7 days. I have this command:

cut -d'"' -f3 /var/log/apache/access.log | cut -d' ' -f2 | sort | uniq -c | sort -rg

Unfortunately, this command returns the number of requests since the file was created, grouped by HTTP status code. I would like just a single number, with no categories, covering only the last 7 days.

6 Answers

2

I'd set up daily log rotation (how to do this depends on your OS), then run the same command on the 7 most recent logs. As for your existing log, either use a tool like grep to extract just the days you want, or split that log into one log per day.

If you want something more elegant than that, I'd just look for one of the myriad log parsing tools already out there.

Here's an example to split up your existing log: Split access.log file by dates using command line tools

7
  • My logs are rotated by size - not time. I will take a look at the link. Commented Oct 17, 2013 at 20:29
  • Your link is basically about months. I would like to split into weeks. Commented Oct 17, 2013 at 20:31
  • The answer marked accepted only breaks it down to months. If you scroll down there are plenty more examples. Every single other example on there now breaks it down into individual days. From there it's up to you how you define a week and combine the log(s) together if you require it by the week. Commented Oct 17, 2013 at 20:34
  • How to be exact? Commented Oct 17, 2013 at 20:45
  • Exact about what? Pick out the example that makes the most sense to you, generate the files, then combine 7 days' worth with cat day1 day2 day3 day4 day5 day6 day7 > week1, and run your original command on that. Commented Oct 17, 2013 at 21:02
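
The split-then-combine approach discussed in these comments could be sketched roughly as follows. It assumes GNU date and the default Apache timestamp (for example [17/Oct/2013:20:29:00 +0000] in the fourth whitespace-separated field). The function name count_last_week and the day-*.log file names are made up for illustration, not taken from the linked question.

```shell
#!/bin/bash
# Sketch only: split a log into one file per day, then count the lines in
# the files for today and the previous six days.
# Assumes GNU date and the default Apache timestamp format.
# count_last_week and the day-*.log names are hypothetical.
count_last_week() {
    local log=$1 outdir=$2 total=0 offset day file

    # Field 4 looks like "[17/Oct/2013:20:29:00". Keep the date part and
    # turn the slashes into dashes so it can be used in a file name.
    awk -v dir="$outdir" '{
        split($4, t, ":")
        day = substr(t[1], 2)
        gsub("/", "-", day)
        print > (dir "/day-" day ".log")
    }' "$log"

    # Add up the line counts of the per-day files that exist.
    for offset in 0 1 2 3 4 5 6; do
        day=$(LC_ALL=C date -d "$offset days ago" +%d-%b-%Y)
        file="$outdir/day-$day.log"
        if [ -f "$file" ]; then
            total=$((total + $(wc -l < "$file")))
        fi
    done
    echo "$total"
}
```

It would be called as count_last_week /var/log/apache/access.log /some/empty/directory. Because it works on dates rather than file names, it does not matter that the logs are rotated by size.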
0

It's a Microsoft utility so probably not what you're after but there's a utility called LogParser (link) that will analyse Apache log files and let you use SQL-style syntax to filter, aggregate etc.

You'll want to specify the input format parameter as NCSA.

2
  • Does that work on CentOS? Commented Oct 17, 2013 at 20:32
  • Afraid not, the utility is Windows only. It's a relatively old app so you may be able to get it working with Wine. Have never personally tried though. Commented Oct 17, 2013 at 20:39
0

It should be possible, but I'm getting mired in Bash command nesting that doesn't work, and I don't understand why.

Conceptually, do this:

  1. Find the date 7 days ago, in the format that's in your Apache log
    1. date -d "-7 days" +%d\/%b\/%Y -> 10/Oct/2013
  2. Delete from the first line, up to the first mention of that date
    1. sed '1,/~pattern~/d' access_log
  3. Feed the result into wc to get a count.
    1. | wc -l

So there should be a way to combine the above into one command:

$ sed '1,/10\/Oct\/2013/d' access_log | wc -l
29
$ sed '1,/$(date -d "-7 days" +%d\/%b\/%Y)/d' access_log | wc -l
$

Somewhere in the nesting, my date command and sed aren't playing nicely. And everything I try with various combinations of quotes and escapes doesn't make any difference.

What am I missing?
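
One likely culprit in the last command is the quoting: inside single quotes the shell does not perform command substitution, so sed receives the literal text $(date ...) as its pattern. A minimal sketch of one way around it, assuming GNU date and sed, computes the date first and then uses double quotes. The function name count_since is made up for illustration.

```shell
#!/bin/bash
# Sketch only; count_since is a hypothetical helper name.
# Assumes GNU date and the default Apache timestamp format.
count_since() {
    local log=$1 since

    # Date seven days ago in Apache's format, e.g. 10/Oct/2013.
    since=$(LC_ALL=C date -d "7 days ago" +%d/%b/%Y)

    # Double quotes let the shell expand $since before sed sees it.
    # \|...| selects "|" as the regex delimiter, so the slashes in the
    # date need no escaping.
    sed "1,\|$since|d" "$log" | wc -l
}
```

Like the hard-coded version, this also deletes the first line that matches the date, and it deletes everything (printing 0) if no request was logged on that exact day.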

0

How about looking at tools like Splunk or Loggly? Loggly has a free trial, and Splunk Storm (http://splunkstorm.com) is free to sign up for. Unless your log files exceed their limits, it should be trivial to get your logs indexed and run various stats on requests in the past 7 days (or other time frames).

2
  • It's for a dashboard I am making for my customers, so an External log system is not possible. Commented Oct 19, 2013 at 15:35
  • If you decide to go down this route, you would install Splunk on servers you control, preferably on dedicated instances. I think Loggly only offered hosted solutions. Commented Oct 19, 2013 at 19:24
0

I suggest changing your log file rotation to daily; then the script below works:

    #!/bin/bash
    # I return the sum of hit for gets and posts here. you can remove the final pipe to awk if you want the hits by page
    ZCAT=$(cat access.log.1 | egrep "GET|POST" | awk '{ print $11 }' ; zcat access.log.[2-7].gz | egrep "GET|POST" | awk '{ print $11 }')
    echo "$ZCAT" | sort | uniq -c | sort -rg | awk '{ sum=sum+$1 } END { print sum }'
0

This code parses out the past 7 days of entries from access.log. It can be used in conjunction with the example in the question, or with my other answer, to sum the hits for the 7 days it extracts.

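    #!/usr/bin/bash
    # Placeholder first alternative, so every date can be appended with a leading "|".
    regex="\*\*\*DONOT_NULLIFY_FORREASONS"
    # Build the dates for today and the previous six days in Apache's dd/Mon/yyyy format.
    days=$(now=`date +%s`
    for offset in 6 5 4 3 2 1 0
    do
        newdate=`echo "$now - 86400 * $offset" | bc`
        date +%d/%b/%Y --date="@$newdate"
    done)
    for day in $days
    do
        regex="$regex"'|'"\[$day"
    done
    # Print only the lines that contain "[" followed by one of those dates.
    egrep "$regex" access.log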
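
Since the question asks for a single number, the same date-matching idea could be sketched with grep -c, which prints the number of matching lines instead of the lines themselves. This assumes GNU date and the default Apache timestamp format; count_week is a made-up name.

```shell
#!/bin/bash
# Sketch only; count_week is a hypothetical helper name.
# Assumes GNU date and timestamps such as [17/Oct/2013:20:29:00 +0000].
count_week() {
    local log=$1 pattern="" offset day

    for offset in 6 5 4 3 2 1 0; do
        day=$(LC_ALL=C date -d "$offset days ago" +%d/%b/%Y)
        # Join alternatives such as \[17/Oct/2013: with "|".
        pattern="${pattern:+$pattern|}\\[$day:"
    done

    # -c prints the number of matching lines, -E enables the "|" alternation.
    grep -cE "$pattern" "$log"
}
```

Running count_week /var/log/apache/access.log would then print one number covering today and the six days before it.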
