Hi, how do I configure Splunk for Squid to parse Squid 3.1 logs? Out of the box, SplunkForSquid can't find any events, although there are thousands of Squid events in my Splunk installation. Can someone please help? I've tried formatting the access.log as follows:
logformat custom %tu %>a %Ss %<Hs %st %rm %>ru %<A %rp %un %sh %<a %mt
The fields map to the format codes as follows:
duration: tu
clientip: >a
action: Ss
http_status: <Hs
bytes: st
method: rm
uri: ru
uri_host: <A
uri_path: rp
username: un
hierarchy: (can't find an appropriate code, sh is not available)
server_ip: <a
content_type: mt
Splunk for Squid assumes that Squid's default logformat is used:
logformat squid %ts.%03tu %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
If you change that, you will also need to change Splunk for Squid's extractions.
Could you post a sample event or two?
EDIT: Your format for the actual Squid event looks fine, and it checks out with the regex used by the app. HOWEVER, it seems you're sending this through a syslogd that adds a header to the events, which breaks the match. The header I'm referring to is the initial "<13>Aug 21 17:32:31 LAMNUBIDS001".
What you could do is to modify the regex to skip matching from the start of the line, or modify it to include this extra header. This is the default regex used by the app (in
^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$
Just remove the leading caret (^) character and your extractions should work fine.
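To see why the anchor matters, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for this pattern, and it uses one of the events posted in this thread as input. The names ANCHORED, UNANCHORED and event are made up for the illustration.

```python
import re

# Default extraction regex from the app, as quoted above.
ANCHORED = (
    r"^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+"
    r"((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$"
)
# The same regex with the leading caret removed.
UNANCHORED = ANCHORED[1:]

# Sample event including the syslog header "<13>Aug 21 17:32:33 LAMNUBIDS001".
event = (
    "<13>Aug 21 17:32:33 LAMNUBIDS001 1345541552.773 514 10.53.2.84 "
    "TCP_MISS/200 411 POST http://prod2.rest-core.msg.yahoo.com/v1/messages? "
    "- DIRECT/18.104.22.168 -"
)

anchored_match = re.search(ANCHORED, event)      # fails: the line starts with "<13>"
unanchored_match = re.search(UNANCHORED, event)  # starts matching at the epoch timestamp

print(anchored_match)  # None
# Groups 2, 3 and 4 hold the client IP, the Squid action and the HTTP status.
print(unanchored_match.group(2), unanchored_match.group(3), unanchored_match.group(4))
```

Without the caret, the pattern is free to start at the first "digits.digits" token, which in these events is the Squid epoch timestamp, so the syslog header is simply skipped.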
Thanks for the quick reply! When I insert your logformat into my squid.conf I get the warning below, so I amended it slightly.
logformat Custom %ts.%03tu %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt
2012/08/21 17:16:34| WARNING: the "Hs" formating code is deprecated use the ">Hs" instead
So I changed Hs to >Hs, but there are still no results in SplunkForSquid.
Below are a few events from Splunk after applying the logformat you gave me.
<13>Aug 21 17:32:31 LAMNUBIDS001 1345541550.601 822 10.53.2.84 TCPMISS/200 785 POST http://us.mg6.mail.yahoo.com/ws/mailPreferences/v1/jsonrpc? - DIRECT/188.8.131.52 application/json
<13>Aug 21 17:32:32 LAMNUBIDS001 1345541551.709 469 10.53.2.84 TCPMISS/200 411 POST http://prod2.rest-core.msg.yahoo.com/v1/message/yahoo/ugluudavaa? - DIRECT/184.108.40.206 -
<13>Aug 21 17:32:32 LAMNUBIDS001 1345541552.087 112615 10.53.2.74 TCPMISS/200 163 GET http://prod2.rest-notify.msg.yahoo.com/v1/pushchannel/jerry0629? - DIRECT/220.127.116.11 -
<13>Aug 21 17:32:33 LAMNUBIDS001 1345541552.773 514 10.53.2.84 TCP_MISS/200 411 POST http://prod2.rest-core.msg.yahoo.com/v1/messages? - DIRECT/18.104.22.168 -
I finally got another chance to hack away at this problem and solved it.
First, I had to add SEDCMD-pristrip=s/^<[0-9]+>// to my Splunk default props.conf to strip out the priority level, because I'm using syslog-ng to send syslog over TCP.
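For anyone who wants to check what that sed expression does before touching props.conf, here is a rough Python equivalent. It is only a sketch: strip_syslog_priority is a hypothetical helper name, and the input is the start of one of the sample events above.

```python
import re


def strip_syslog_priority(line: str) -> str:
    """Remove a leading <NN> syslog priority tag, mirroring s/^<[0-9]+>//."""
    return re.sub(r"^<[0-9]+>", "", line, count=1)


raw = "<13>Aug 21 17:32:31 LAMNUBIDS001 1345541550.601 822 10.53.2.84"
stripped = strip_syslog_priority(raw)
print(stripped)  # Aug 21 17:32:31 LAMNUBIDS001 1345541550.601 822 10.53.2.84

# A line without a priority tag is returned unchanged.
untouched = strip_syslog_priority("Aug 21 17:32:31 LAMNUBIDS001")
print(untouched)
```

Only the "<13>" tag is removed. The syslog timestamp and hostname are still there, which is why the extraction regex also has to account for them.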
Second, I changed the regex in the Splunk for Squid transforms.conf to ^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s[^\s]*\s\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$
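Here is a quick Python check of that regex against one of the sample events with the priority tag already stripped. Treat it as a sketch under a few assumptions: Python's re stands in for Splunk's regex engine, the * quantifiers and the escaped dot are written as they appear in the app's default regex, and the field names in FIELD_NAMES are illustrative labels borrowed from the list in the question, not confirmed names from the app.

```python
import re

# Syslog timestamp and hostname first, then the default Squid extraction.
SYSLOG_SQUID = re.compile(
    r"^[A-Z][a-z]+\s+\d+\s\d+:\d+:\d+\s[^\s]*\s\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+"
    r"([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+"
    r"((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$"
)

# Illustrative mapping of capture group number to field name (assumed names).
FIELD_NAMES = {
    1: "duration",
    2: "clientip",
    3: "action",
    4: "http_status",
    5: "bytes",
    6: "method",
    9: "uri_host",
    11: "uri_path",
}

# Sample event after the SEDCMD has removed the "<13>" priority tag.
event = (
    "Aug 21 17:32:33 LAMNUBIDS001 1345541552.773 514 10.53.2.84 "
    "TCP_MISS/200 411 POST http://prod2.rest-core.msg.yahoo.com/v1/messages? "
    "- DIRECT/18.104.22.168 -"
)

match = SYSLOG_SQUID.match(event)
fields = {name: match.group(i) for i, name in FIELD_NAMES.items()} if match else {}
print(fields)
```

If fields comes back empty for your own events, the header part is the first thing to check, since a different syslog daemon may write the timestamp or hostname in another layout.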