Log Parser: The coolest tool Microsoft has ever released!

Introduction

Log Parser 2.x is, in my skewed perspective, the coolest tool Microsoft has ever released.  It easily ranks up there with the tools that Winternals/Sysinternals release (like Filemon, Regmon, Process Explorer, etc.).  The tool is a basic Swiss Army knife for log files: an administrator can tear through file after file and strip out all the interesting data.  In my world, I focus on two major inputs – IIS log files & Event Viewer.

During our Web Administration Summit (10 countries in 4 weeks in Europe), we showed off the coolness that comes from this tool.  Beyond that, we used some simple slides & powerful scripts to show how easily an administrator can tear through Event Viewer logs & W3C logs.  This included a sample 200 MB log file demonstrating the speed and power of Log Parser.

Download Log Parser 2.2

Log Parser Basics

The purpose of this blog isn’t to “teach” anyone Log Parser but instead to make people aware of it (if they somehow are not).  The basics are pretty straightforward and include the following:

Flavors:

Log Parser comes in two flavors: command-line & component-based (a COM object DLL), depending on whether you prefer direct interaction or automation.  The component flavor allows you to use the tool in web-based applications or with ease via script languages such as VBScript or JScript.
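
For example, here is a minimal VBScript sketch of the component flavor (assuming Log Parser 2.2 is installed so its MSUtil.LogQuery COM object is registered; the query itself is just a placeholder):

 ' Create the Log Parser COM object and an IIS W3C input format context
 Dim oLogQuery, oInput, oRecordSet, oRecord
 Set oLogQuery = CreateObject("MSUtil.LogQuery")
 Set oInput = CreateObject("MSUtil.LogQuery.IISW3CInputFormat")
 ' Execute a query against the /w3svc/1 log files and walk the records
 Set oRecordSet = oLogQuery.Execute("SELECT TOP 10 cs-uri-stem, COUNT(*) AS Hits FROM <1> GROUP BY cs-uri-stem ORDER BY Hits DESC", oInput)
 Do While Not oRecordSet.atEnd
   Set oRecord = oRecordSet.getRecord()
   WScript.Echo oRecord.getValue("cs-uri-stem") & vbTab & oRecord.getValue("Hits")
   oRecordSet.moveNext
 Loop
 oRecordSet.close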

Input Formats:

I couldn’t begin to list all the input formats that are supported by Log Parser.  The list scrolls beyond the screen in the help file, so I suggest you visit it when time permits.  However, from an IIS-centric mindset, let me tell you some of the things that I use it for:

    • Conversion:  The ability to convert one log file format to another.  This is useful if you are using, say, Centralized Binary Logging (Windows Server 2003 only) and would like a readable format such as W3C.  This can be done easily using Log Parser.
    • IIS Log Files:  No matter the log file type, you can indicate it using -i:<input format> and Log Parser will pick that up and understand it (including built-in conveniences such as understanding that <1> means the /w3svc/1 log files; see the short example after this list).
    • Event Viewer:  This is very useful to locate error events thrown by the key IIS services and processes, such as W3SVC and w3wp.exe.
    • Network Monitor:  You got it: it can parse those big ol’ NetMon traces to locate the specific traffic you are interested in, and only that traffic.
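
For example, here is a quick sketch pulling recent activity straight from the active log files of the first website (the fields are standard W3C fields):

 logparser -i:IISW3C "SELECT TOP 20 date, time, cs-uri-stem, sc-status FROM <1>"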

Output Formats:

    • NAT:  This is the default, but not necessarily the best.  It simply returns the results to the screen, which is not always useful, as the data might be hard to read or, more often, TOO BIG.
    • Datagrid:  This is nice for presenting data in an easy-to-read format, though it is still slightly limited.  I use it often, as I am not in the world of automation that many of you are, so forgive my passion for it.  However, if you do choose to use it, then you should learn the -rtp switch, which tells Log Parser exactly how many rows to print when extracting the data.  This can be done using -rtp:-1 (all rows, no matter how many) or -rtp:10 (print only 10 rows at a time); see the example after this list.
    • SQL:  Yep, you can easily take data from the log files and insert it into SQL Server.  This is very useful because it allows administrators to hand-pick (via SQL queries) only the data they are interested in rather than the “entire” file.
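
As a quick illustration of the Datagrid output and the -rtp switch (same sketch query as above):

 logparser "SELECT date, time, cs-uri-stem, sc-status FROM <1>" -i:IISW3C -o:DATAGRID -rtp:10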

Sample Scripts

  • Convert Centralized Binary Log (CBL) files to W3C log files

In some high-traffic environments, the use of binary logging can help the performance of the web server.  The pitfalls of binary logging are the missing extended properties (like the referrer) and the fact that it does not work well in every environment.  However, the biggest issue most administrators have with binary logging is that they *can’t* read the files using their favorite text editor.  Enter Log Parser…

 logparser -c -i:BIN -o:W3C <name of binary file> <name of new W3C text file>

  • Top browsers accessing your websites

If you are interested in understanding which browsers are used most against your websites, you can use this script.  Based solely on the cs(User-Agent) log field, it counts hits per browser and then sorts by total hits.

Select top 50
 to_int(mul(100.0,PropCount(*))) as Percent,
 count(*) as TotalHits,
 cs(User-Agent) as Browser
from [LogFileName]
group by Browser
order by TotalHits desc
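
To run it, save the query to a file (say TopBrowsers.sql, a made-up name) and replace [LogFileName] with %source%; Log Parser substitutes ?name=value pairs from the command line into %name% tokens in a query file:

 logparser file:TopBrowsers.sql?source=ex*.log -i:IISW3C -o:DATAGRID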

  • Daily Bandwidth

This is a nice script to determine exactly how much bandwidth your site is using.  If you pay for bandwidth, or do not have an unlimited supply, this is very nice to know.

Select
 To_String(To_timestamp(date, time), 'MM-dd') As Day,
 Div(Sum(cs-bytes),1024) As Incoming(K),
 Div(Sum(sc-bytes),1024) As Outgoing(K)
Into BandwidthByDay.gif
From [LogFileName]
Group By Day
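
Since this query ends with Into BandwidthByDay.gif, run it with the CHART output format; a sketch using the same %source% placeholder trick as above (note that chart output in Log Parser 2.2 requires the Office Web Components):

 logparser file:DailyBandwidth.sql?source=ex*.log -i:IISW3C -o:CHART -chartType:ColumnClustered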

  • Hits by Hour

A per-hour view of traffic might help you better understand how your entire application set is used by customers.  This is useful in many ways, such as capacity planning or preparation for expansion.  There are many uses beyond those I mention, but these are a couple that come to mind.

SELECT QUANTIZE(TO_LOCALTIME(TO_TIMESTAMP(date,time)),3600) As Hour,
       cs-uri-stem As Url,
       COUNT(*) As Hits
FROM [LogFileName]
WHERE cs-uri-stem LIKE '%index.htm%'
GROUP BY Url, Hour
ORDER By Url

  • Request by URI

Page down after page down… is a boring exercise.  This is the life of web administrators if they choose to review their logs manually or, worse, pay for a service to determine what traffic is occurring on their website.  This can be avoided with Log Parser, which is free and can do it very quickly using a familiar SQL-like query language.

SELECT top 80
 QUANTIZE(TO_TIMESTAMP(date, time), 3600) as Hour,
 TO_LOWERCASE(STRCAT('/',EXTRACT_TOKEN(cs-uri-stem,1,'/'))) as URI,
 COUNT(*) AS RequestsPerHour,
 SUM(sc-bytes) AS TotBytesSent,
 AVG(sc-bytes) AS AvgBytesSent,
 Max(sc-bytes) AS MaxBytesSent,
 ADD(1,DIV(Avg(time-taken),1000)) AS AvgTime,
 ADD(1,DIV(MAX(time-taken),1000)) AS MaxTime
FROM [LogFileName]
GROUP BY Hour, URI
HAVING RequestsPerHour > 10
ORDER BY RequestsPerHour DESC

  • Top 10 Images by Size

I love this one because it lets you find those users who have their digital camera set to top resolution and lack the skill (or inclination) to edit, crop, or resize.  Find these bandwidth hogs and size ‘em down…

Select
 Top 10
 StrCat(Extract_Path(TO_Lowercase(cs-uri-stem)),'/') AS RequestedPath,
 Extract_filename(To_Lowercase(cs-uri-stem)) As RequestedFile,
 Count(*) AS Hits,
 Max(time-taken) As MaxTime,
 Avg(time-taken) As AvgTime,
 Max(sc-bytes) As BytesSent
From [LogFileName]
Where
 (Extract_Extension(To_Lowercase(cs-uri-stem)) IN ('gif';'jpg';'png'))
 AND
 (sc-status = 200)
Group By To_Lowercase(cs-uri-stem)
Order By BytesSent, Hits, MaxTime DESC

  • Top 10 URLs for a website, with some interesting supporting data

Select
 TOP 10
 STRCAT(EXTRACT_PATH(cs-uri-stem),'/') AS RequestPath,
 EXTRACT_FILENAME(cs-uri-stem) AS RequestedFile,
 COUNT(*) AS TotalHits,
 Max(time-taken) AS MaxTime,
 AVG(time-taken) AS AvgTime,
 AVG(sc-bytes) AS AvgBytesSent
INTO chartTop10Url.gif
FROM [LogFileName]
GROUP BY cs-uri-stem
ORDER BY TotalHits DESC

  • Top 20 clients using your website

Select
 Top 20 c-ip AS Client,
 Count(*) AS Hits
INTO Chart.gif
FROM [LogFileName]
GROUP BY c-ip
ORDER BY Hits Desc

  • Referrer broken links (i.e., another site links to your content, but the content has moved or no longer exists)

SELECT DISTINCT cs(Referer) as Referer,
       cs-uri-stem as Url
  INTO ReferBrokenLinks.html
  FROM [LogFileName]
 WHERE cs(Referer) IS NOT NULL
   AND sc-status = 404
   AND (sc-substatus IS NULL OR sc-substatus=0)

    • Here is the source for the .tpl (template) file that Log Parser uses to format ReferBrokenLinks.html

<LPHEADER>
<HTML>

<HEAD><TITLE>Pages having broken links to this site</TITLE></HEAD>

<BODY BGCOLOR="#EFEFFF">

<TABLE BORDER="1" CELLPADDING="2" CELLSPACING="2">
<TR>
 <TH COLSPAN="2" ALIGN="CENTER">Pages with Broken Links To This Site</TH>
</TR>
<TR>
 <TH ALIGN="LEFT">Referring Page</TH>
 <TH ALIGN="LEFT">Broken Link</TH>
</TR>

</LPHEADER>
<LPBODY>
<TR>
 <TD><A HREF="%Referer%">%Referer%</A></TD>
 <TD>%Url%</TD>
</TR>
</LPBODY>

<LPFOOTER>
</TABLE>
</BODY>
</HTML>
</LPFOOTER>
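
To tie the query and template together, point the TPL output format at the template file (assuming the query above was saved as BrokenLinks.sql and the template as ReferBrokenLinks.tpl, both made-up names):

 logparser file:BrokenLinks.sql -i:IISW3C -o:TPL -tpl:ReferBrokenLinks.tpl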

  • Status codes (note the <2> shortcut in FROM, meaning the /w3svc/2 log files)

SELECT sc-status As Status,
       COUNT(*) As Number
  INTO StatusCodes.gif
  FROM <2>
GROUP BY Status
ORDER BY Status

  • Time-taken for virtual directories

SELECT CASE EXTRACT_TOKEN(cs-uri-stem, 2, '/')
         WHEN NULL THEN '/'
         ELSE EXTRACT_PREFIX(cs-uri-stem, 1, '/')
       END As VDir,
       QUANTIZE(time-taken, 100) As TimeTaken,
       COUNT(*) As NumHits

  FROM [LogFileName]

GROUP BY TimeTaken, VDir
ORDER BY TimeTaken, VDir DESC

  • Grep the Event Viewer for W3SVC (IIS) entries and color-code them as Error, Warning, or Information

SELECT TimeGenerated,
       EventTypeName,
       Strings,
       Message,
       CASE EventTypeName
         WHEN 'Error event' THEN 'RED'
         WHEN 'Warning event' THEN 'YELLOW'
         WHEN 'Information event' THEN 'WHITE'
         ELSE 'BLUE'
       END As Color
  INTO file.htm
  FROM System
 WHERE SourceName = 'W3SVC'

    • The associated .tpl file used to render file.htm in the above example

<LPHEADER>
<HTML>
<HEAD>
  <STYLE>
    TD { font-family: Arial; }
    TH { font-family: Arial; }
  </STYLE>

</HEAD>

<BODY>

<TABLE BORDERCOLOR="BLACK" BORDER="1" CELLPADDING="2" CELLSPACING="2">
<TR>
  <TH COLSPAN=4 BGCOLOR="BLACK"><FONT COLOR=WHITE>New W3SVC Messages in System Event Log</FONT></TH>
</TR>
<TR>
  <TH ALIGN=LEFT BGCOLOR="#C0C0C0">Time Generated</TH>
  <TH ALIGN=LEFT BGCOLOR="#C0C0C0">Event Type</TH>
  <TH ALIGN=LEFT BGCOLOR="#C0C0C0">Strings</TH>
  <TH ALIGN=LEFT BGCOLOR="#C0C0C0">Message</TH>
</TR>
</LPHEADER>

<LPBODY>
<TR BGCOLOR="%Color%">
  <TD>%TimeGenerated%</TD>
  <TD>%EventTypeName%</TD>
  <TD>%Strings%</TD>
  <TD>%Message%</TD>
</TR>
</LPBODY>

</TABLE>
</BODY>
</HTML>
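
Running it is the same idea as the broken-links example (assuming the query was saved as W3svcEvents.sql and the template as W3svcEvents.tpl, both made-up names).  Note that FROM System reads the live System event log; you can also point FROM at a saved .evt file:

 logparser file:W3svcEvents.sql -i:EVT -o:TPL -tpl:W3svcEvents.tpl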

  • Upload to SQL Server using Log Parser for SQL Reporting Services

logparser "SELECT * INTO LogsTable FROM ex*.log WHERE TO_LOWERCASE(EXTRACT_EXTENSION(cs-uri-stem)) NOT IN ('gif';'jpg';'png') AND sc-status <> 404" -i:IISW3C -o:SQL -database:LogsDatabase
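
If LogsTable does not exist yet, the SQL output format can create it for you; a sketch of the same command with the -server and -createTable switches added (assuming a local default SQL Server instance):

 logparser "SELECT * INTO LogsTable FROM ex*.log WHERE TO_LOWERCASE(EXTRACT_EXTENSION(cs-uri-stem)) NOT IN ('gif';'jpg';'png') AND sc-status <> 404" -i:IISW3C -o:SQL -server:localhost -database:LogsDatabase -createTable:ON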

Summary

Log Parser is probably one of the only tools on the planet where the possibilities are almost “endless.”  The tool is like a promising Kentucky Derby contender: it has all the makings of a champion, and all it needs is a little luck.  That luck is you, the user, as it will take you as far as you want it to.  I hope that these samples (courtesy of myself & Alexis on the Roadshow in Europe) prove useful.

Great Resources:

Official Log Parser Site (non-MS)

Microsoft Log Parser Toolkit by Syngress Publishing

Data Mining with Log Parser 2.x and IIS (Webcast)

Bernard Cheah (IIS MVP) Cool Blog on Log Parser

Gabriele writes in Technet Professor about Log Parser
