<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Custom Squid log format in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76506#M19358</link>
    <description>&lt;P&gt;Hi all, &lt;/P&gt;

&lt;P&gt;I'm trying to modify the SplunkforSquid app to read my squid custom log file format correctly. As per squid.conf it is-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;logformat test %ts.%03tu %6tr %&amp;gt;a %Ss/%03Hs 0 %03Hs %st %rm %ru %un %&amp;lt;A
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Log format codes (trimmed):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#               &amp;gt;a      Client source IP address
#               &amp;lt;A      Server IP address or peer name
#               ts      Seconds since epoch
#               tu      subsecond time (milliseconds)
#               tr      Response time (milliseconds)
#               un      User name
#               Hs      HTTP status code
#               Ss      Squid request status (TCP_MISS etc)
#               rm      Request method (GET/POST etc)
#               ru      Request URL
#               st      Request+Reply size including HTTP headers
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I've tried a few things here. Creating field extractions in Splunk was working OK until I got to the username field: because the username is often just "-", the regex creator in Splunk would not detect it. My regex knowledge is nowhere near enough to debug this. Some help would be greatly appreciated.&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;UPDATE&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Attempting to use delimExtractions:&lt;/P&gt;

&lt;P&gt;props.conf-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[squid]
REPORT-main=delimExtractions
SHOULD_LINEMERGE=false
TIME_FORMAT=%+                  #log format time is in epoch. not sure if this is right
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[delimExtractions]
DELIMS=" "
FIELDS="timestamp","responsetime","clientip","not_needed","zero","http_status","total_size","method","uri","username","server_ip
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Fields such as 'responsetime' and 'clientip' are not showing in the search tab; however, 'not_needed', 'http_status' and a few others are.&lt;/P&gt;

&lt;P&gt;I removed the other field extractions entry thinking I only needed the delimExtraction.&lt;/P&gt;

&lt;P&gt;Sample squid logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;1302571599.112     32 10.10.10.10 TCP_DENIED/407 0 407 2581 CONNECT armmf.adobe.com:443 - -

1302571599.112    465 10.10.10.10 TCP_MISS/200 0 200 13314 GET http://www.ebay.com.au/ username 203.5.76.11

1302571599.115      0 10.10.10.10 TCP_DENIED/407 0 407 2415 CONNECT armmf.adobe.com:443 - -

1302571599.115     17 10.10.10.10 TCP_IMS_HIT/304 0 304 1302 GET http://vtr.elections.nsw.gov.au/images/eGlooApp.gif username -

1302571599.118    195 10.10.10.10 TCP_MISS/200 0 200 1729 GET http://toolbarqueries.google.com.au/tbr? username 10.10.10.10

1302571599.119     19 10.10.10.10 TCP_NEGATIVE_HIT/404 0 404 2459 GET http://vtr.elections.nsw.gov.au/css/mysource_files/arrow.png username -

1302571599.119    796 10.10.10.10 TCP_MISS/200 0 200 1734 GET http://t.adcloud.net/t.gif? username 10.10.10.10

1302571599.122    148 10.10.10.10 TCP_MISS/200 0 200 5050 GET http://someurl.net username 10.10.10.10

1302571599.122     22 10.10.10.10 TCP_IMS_HIT/304 0 304 1321 GET http://vtr.elections.nsw.gov.au/images/panel-sprite.png username -
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'd really like to just change the squid log format back to default, but we have a few apps using this weird format for some reason... I mean, really, why need the '0' and have the status code twice? &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 07 Apr 2011 06:53:02 GMT</pubDate>
    <dc:creator>anstoitsec</dc:creator>
    <dc:date>2011-04-07T06:53:02Z</dc:date>
    <item>
      <title>Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76506#M19358</link>
      <description>&lt;P&gt;Hi all, &lt;/P&gt;

&lt;P&gt;I'm trying to modify the SplunkforSquid app to read my squid custom log file format correctly. As per squid.conf it is-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;logformat test %ts.%03tu %6tr %&amp;gt;a %Ss/%03Hs 0 %03Hs %st %rm %ru %un %&amp;lt;A
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Log format codes (trimmed):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#               &amp;gt;a      Client source IP address
#               &amp;lt;A      Server IP address or peer name
#               ts      Seconds since epoch
#               tu      subsecond time (milliseconds)
#               tr      Response time (milliseconds)
#               un      User name
#               Hs      HTTP status code
#               Ss      Squid request status (TCP_MISS etc)
#               rm      Request method (GET/POST etc)
#               ru      Request URL
#               st      Request+Reply size including HTTP headers
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I've tried a few things here. Creating field extractions in Splunk was working OK until I got to the username field: because the username is often just "-", the regex creator in Splunk would not detect it. My regex knowledge is nowhere near enough to debug this. Some help would be greatly appreciated.&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;UPDATE&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Attempting to use delimExtractions:&lt;/P&gt;

&lt;P&gt;props.conf-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[squid]
REPORT-main=delimExtractions
SHOULD_LINEMERGE=false
TIME_FORMAT=%+                  #log format time is in epoch. not sure if this is right
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;transforms.conf-&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[delimExtractions]
DELIMS=" "
FIELDS="timestamp","responsetime","clientip","not_needed","zero","http_status","total_size","method","uri","username","server_ip
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Fields such as 'responsetime' and 'clientip' are not showing in the search tab; however, 'not_needed', 'http_status' and a few others are.&lt;/P&gt;

&lt;P&gt;I removed the other field extractions entry thinking I only needed the delimExtraction.&lt;/P&gt;

&lt;P&gt;Sample squid logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;1302571599.112     32 10.10.10.10 TCP_DENIED/407 0 407 2581 CONNECT armmf.adobe.com:443 - -

1302571599.112    465 10.10.10.10 TCP_MISS/200 0 200 13314 GET http://www.ebay.com.au/ username 203.5.76.11

1302571599.115      0 10.10.10.10 TCP_DENIED/407 0 407 2415 CONNECT armmf.adobe.com:443 - -

1302571599.115     17 10.10.10.10 TCP_IMS_HIT/304 0 304 1302 GET http://vtr.elections.nsw.gov.au/images/eGlooApp.gif username -

1302571599.118    195 10.10.10.10 TCP_MISS/200 0 200 1729 GET http://toolbarqueries.google.com.au/tbr? username 10.10.10.10

1302571599.119     19 10.10.10.10 TCP_NEGATIVE_HIT/404 0 404 2459 GET http://vtr.elections.nsw.gov.au/css/mysource_files/arrow.png username -

1302571599.119    796 10.10.10.10 TCP_MISS/200 0 200 1734 GET http://t.adcloud.net/t.gif? username 10.10.10.10

1302571599.122    148 10.10.10.10 TCP_MISS/200 0 200 5050 GET http://someurl.net username 10.10.10.10

1302571599.122     22 10.10.10.10 TCP_IMS_HIT/304 0 304 1321 GET http://vtr.elections.nsw.gov.au/images/panel-sprite.png username -
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'd really like to just change the squid log format back to default, but we have a few apps using this weird format for some reason... I mean, really, why need the '0' and have the status code twice? &lt;span class="lia-unicode-emoji" title=":confused_face:"&gt;😕&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Apr 2011 06:53:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76506#M19358</guid>
      <dc:creator>anstoitsec</dc:creator>
      <dc:date>2011-04-07T06:53:02Z</dc:date>
    </item>
    <item>
      <title>Re: Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76507#M19359</link>
      <description>&lt;P&gt;I don't know that I can help you directly, but a site I use for interactive regex testing is &lt;A href="http://www.regexr.com" target="test_blank"&gt;www.regexr.com&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;You can paste your log into it, and build your regex with trial and error.
That should help you narrow down your problem.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Apr 2011 09:48:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76507#M19359</guid>
      <dc:creator>jgauthier</dc:creator>
      <dc:date>2011-04-07T09:48:42Z</dc:date>
    </item>
    <item>
      <title>Re: Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76508#M19360</link>
      <description>&lt;P&gt;How would one match the second last 'column' of the log file - I can't find any reference on how to use regexes to distinguish using a space delimiter.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Apr 2011 11:29:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76508#M19360</guid>
      <dc:creator>anstoitsec</dc:creator>
      <dc:date>2011-04-07T11:29:27Z</dc:date>
    </item>
    <item>
      <title>Re: Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76509#M19361</link>
      <description>&lt;P&gt;Side note, are you using NTLM authentication?  If your username is '-' then your hit is TCP_DENIED.  There are two TCP_DENIEDs for every auth request because of the way NTLM works.  I would just discard the TCP_DENIEDs, and save yourself significant index room!&lt;/P&gt;
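
&lt;P&gt;A rough sketch of that filtering idea, written in Python purely for illustration (it is not Splunk configuration). It drops lines whose status token starts with TCP_DENIED. The helper name is_denied is hypothetical, and the two sample lines are copied from the logs posted in the question.&lt;/P&gt;

```python
# Illustrative only: both lines come from the sample logs in this thread.
SAMPLE_LINES = [
    "1302571599.112     32 10.10.10.10 TCP_DENIED/407 0 407 2581 CONNECT armmf.adobe.com:443 - -",
    "1302571599.122    148 10.10.10.10 TCP_MISS/200 0 200 5050 GET http://someurl.net username 10.10.10.10",
]


def is_denied(line):
    """Return True when the fourth whitespace-separated token starts with TCP_DENIED."""
    tokens = line.split()
    # Slicing avoids an IndexError on short or empty lines.
    return any(token.startswith("TCP_DENIED") for token in tokens[3:4])


# Keep everything except the denied requests.
kept = [line for line in SAMPLE_LINES if not is_denied(line)]
print(len(kept), "of", len(SAMPLE_LINES), "lines kept")
```

&lt;P&gt;Whether dropping these events is acceptable depends on whether anyone needs to report on failed or unauthenticated requests.&lt;/P&gt;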

&lt;P&gt;Will you paste an actual excerpt of the log?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:27:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76509#M19361</guid>
      <dc:creator>jgauthier</dc:creator>
      <dc:date>2020-09-28T09:27:16Z</dc:date>
    </item>
    <item>
      <title>Re: Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76510#M19362</link>
      <description>&lt;P&gt;You won't need any regex-fu. Your logs will be space delimited and quote validated (since values like the user agent string contain spaces), which is supported using the DELIMS=" " directive.&lt;/P&gt;

&lt;P&gt;Anything "null" will be reported as "-", since a null value would break the format.&lt;/P&gt;

&lt;P&gt;In .../-appdir-/local/props.conf&lt;/P&gt;

&lt;P&gt;Assuming your squid logs are sourcetyped as "squid":&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[squid]
REPORT-main=delimExtractions
SHOULD_LINEMERGE=false
# You need to verify TIME_FORMAT and match it up with what you have
TIME_FORMAT=%Y-%m-%d %T
MAX_TIMESTAMP_LOOKAHEAD=19
KV_MODE = none
&lt;/CODE&gt;&lt;/PRE&gt;
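
&lt;P&gt;For reference when matching up the timestamp format: the sample lines in the question start with a token such as 1302571599.112, which per the posted logformat (%ts.%03tu) is seconds since the epoch followed by three digits of milliseconds. A minimal Python sketch of how such a token decodes is below. The function name parse_squid_time is hypothetical, and the three-digit millisecond assumption comes from the logformat line, not from Splunk.&lt;/P&gt;

```python
from datetime import datetime, timedelta, timezone


def parse_squid_time(token):
    """Convert an epoch token such as '1302571599.112' to an aware UTC datetime."""
    # Split the string rather than using float() to avoid rounding the milliseconds.
    seconds, _, millis = token.partition(".")
    base = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    # Assumes the fractional part is exactly milliseconds (three digits).
    return base + timedelta(milliseconds=int(millis or 0))


stamp = parse_squid_time("1302571599.112")
print(stamp.isoformat())
```

&lt;P&gt;If the decoded date looks right for your traffic, the timestamp really is epoch based and the time format in props.conf has to describe an epoch value rather than a calendar date.&lt;/P&gt;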

&lt;P&gt;And in .../-appdir-/local/transforms.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[delimExtractions]
DELIMS=" "
# etc etc: continue with one name per remaining column
FIELDS="date","time","field0","field1","field2"
&lt;/CODE&gt;&lt;/PRE&gt;
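
&lt;P&gt;To sanity-check a field list against real lines before putting it into transforms.conf, a small Python sketch like the following can help. It splits on runs of whitespace and pairs each token with a name. The field names are taken from the FIELDS list in the question; parse_line and the mapping of "-" to None are assumptions made for this example. Note that Python's split() collapses the padding produced by %6tr; whether DELIMS treats repeated spaces the same way is worth verifying in Splunk itself.&lt;/P&gt;

```python
# Field names as posted in the question's transforms.conf.
FIELDS = ["timestamp", "responsetime", "clientip", "not_needed", "zero",
          "http_status", "total_size", "method", "uri", "username", "server_ip"]


def parse_line(line):
    """Split on runs of whitespace and pair each token with a field name."""
    tokens = line.split()
    if len(tokens) != len(FIELDS):
        return None  # unexpected shape; the caller decides what to do
    record = dict(zip(FIELDS, tokens))
    # A lone "-" stands for an empty value; map it to None in this sketch.
    return {key: (None if value == "-" else value) for key, value in record.items()}


event = parse_line(
    "1302571599.115     17 10.10.10.10 TCP_IMS_HIT/304 0 304 1302 GET "
    "http://vtr.elections.nsw.gov.au/images/eGlooApp.gif username -"
)
print(event["responsetime"], event["username"], event["server_ip"])
```

&lt;P&gt;If every sample line yields exactly eleven tokens, the column order in FIELDS matches the log format.&lt;/P&gt;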

&lt;P&gt;I used to know all those squid fields by heart, but not any longer since I'm heavy into ELFF now &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt; and I'm too lazy to dig up the doc and map the fields for you &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;I can give you a definitive parser if you can post a snippet of your logs (obfuscated is fine).&lt;/P&gt;</description>
      <pubDate>Fri, 08 Apr 2011 00:16:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76510#M19362</guid>
      <dc:creator>rshoward</dc:creator>
      <dc:date>2011-04-08T00:16:18Z</dc:date>
    </item>
    <item>
      <title>Re: Custom Squid log format</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76511#M19363</link>
      <description>&lt;P&gt;Thanks for the help! I've done some fooling around but haven't managed to get the fields right. For some reason some of my fields are not showing up in the 'search' field in the SplunkforSquid app. I'll update the post with a sample log and transforms/props file.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Apr 2011 08:20:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Custom-Squid-log-format/m-p/76511#M19363</guid>
      <dc:creator>anstoitsec</dc:creator>
      <dc:date>2011-04-12T08:20:11Z</dc:date>
    </item>
  </channel>
</rss>

