
I have a standard Apache error log file. I would like to see what URLs are causing 404s, since I have moved this site around and I want to find bad links. Can anyone recommend a bash snippet that will parse this log using awk or something to show me the popular 404s?

I know there are advanced programmes for this sort of thing. I'm just looking for something simple.

2 Answers


This should do it:

grep ' 404 ' /var/log/apache2/access.log | cut -d ' ' -f 7 | sort | uniq -c | sort -n
  • Add a "| sort -n" to get them in order of hits. Commented Jun 30, 2009 at 10:18
  • I agree with this solution, with one important side note: it works with most Apache installations that use the system default logging format. If the admin has changed the logging format, the command above must be altered slightly; see the field-number sketch below. Commented Jun 30, 2009 at 12:35
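For reference, with Apache's default combined LogFormat the request path is the 7th whitespace-separated field and the status code is the 9th. A minimal sketch of the same pipeline with those positions pulled into variables, so only two numbers need changing for a custom format (the variable names are just for illustration):

# Default combined format: request path is field 7, status code is field 9.
# Adjust path_field/status_field if your LogFormat puts them elsewhere.
awk -v path_field=7 -v status_field=9 \
    '$status_field == 404 {print $path_field}' /var/log/apache2/access.log |
    sort | uniq -c | sort -n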

An awk answer:

awk '$9 == 404 {urls[$7]++} END {for (url in urls) print urls[url] "\t" url}' access_log | sort -n

It's just for fun, as it's probably much slower than womble's solution.

  • Got in just as I was writing mine. Commented Jun 30, 2009 at 11:40
  • Do you have a nice way to include the sort -n inside awk? I couldn't find a simple way to do it; see the sketch below. Commented Jun 30, 2009 at 12:01
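On the sort question above: GNU awk (gawk 4.0+) can order the END loop itself through PROCINFO["sorted_in"], which makes the external sort -n unnecessary. A minimal sketch, assuming gawk is available (this is a gawk extension, not POSIX awk):

# gawk-only: traverse the array in ascending numeric order of its values,
# replacing the trailing "| sort -n".
gawk '$9 == 404 {urls[$7]++}
      END {
          PROCINFO["sorted_in"] = "@val_num_asc"
          for (url in urls) print urls[url] "\t" url
      }' access_log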
