<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Squid Log files in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68361#M13833</link>
    <description>&lt;P&gt;Hey All,&lt;/P&gt;

&lt;P&gt;I enabled the Squid app for Splunk and threw a log file into it. It was pretty quick and easy, and I whipped up an additional dashboard. (Thanks to whoever put this together.)&lt;/P&gt;

&lt;P&gt;I noticed an issue and, in my noobness, I'm looking for some direction. When I loaded the log file, Splunk recorded 80,000 records as loaded at 8:00pm. It's true that I loaded them then, but I think it should have parsed the timestamp so I can do historical reporting. (Correct me if I'm wrong.)&lt;/P&gt;

&lt;P&gt;So, I looked at the transform and the regex is:&lt;BR /&gt;
&lt;CODE&gt;^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$&lt;/CODE&gt;&lt;BR /&gt;
format is:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;duration::$1 clientip::$2 action::$3 http_status::$4 bytes::$5 method::$6 uri::$7 proto::$8 uri_host::$9 uri_port::$10 uri_path::$11 username::$12 hierarchy::$13 server_ip::$14 content_type::$15&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;The first field should be the timestamp. When I look at Squid data in search, the "fields" list includes "timestamp", but it is determined to have "none".&lt;/P&gt;

&lt;P&gt;As a refresher, the log file entries look like this:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;1301087053.193    182 10.2.40.179 TCP_MISS/400 1083 GET http://api.twitter.com/1/statuses/user_timeline.json? username DIRECT/199.59.148.87 application/json&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;My regex-fu is weak, and I'm definitely below average. However, shouldn't this include the timestamp in order for Splunk to index it by time properly?&lt;/P&gt;

&lt;P&gt;I want to load last month's data, but I will not be able to report on February 2011, because it all appears to be new data as of the load date.&lt;/P&gt;

&lt;P&gt;Thanks for the advice. Moving forward, the records are correct. Obviously, Splunk is doing its own timestamp.&lt;/P&gt;</description>
    <pubDate>Sat, 26 Mar 2011 20:21:20 GMT</pubDate>
    <dc:creator>jgauthier</dc:creator>
    <dc:date>2011-03-26T20:21:20Z</dc:date>
    <item>
      <title>Squid Log files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68361#M13833</link>
      <description>&lt;P&gt;Hey All,&lt;/P&gt;

&lt;P&gt;I enabled the Squid app for Splunk and threw a log file into it. It was pretty quick and easy, and I whipped up an additional dashboard. (Thanks to whoever put this together.)&lt;/P&gt;

&lt;P&gt;I noticed an issue and, in my noobness, I'm looking for some direction. When I loaded the log file, Splunk recorded 80,000 records as loaded at 8:00pm. It's true that I loaded them then, but I think it should have parsed the timestamp so I can do historical reporting. (Correct me if I'm wrong.)&lt;/P&gt;

&lt;P&gt;So, I looked at the transform and the regex is:&lt;BR /&gt;
&lt;CODE&gt;^\d+\.\d+\s+(\d+)\s+([0-9\.]*)\s+([^/]+)/(\d+)\s+(\d+)\s+(\w+)\s+((?:([^:]*)://)?([^/:]+):?(\d+)?(/?[^ ]*))\s+(\S+)\s+([^/]+)/([^ ]+)\s+(.*)$&lt;/CODE&gt;&lt;BR /&gt;
format is:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;duration::$1 clientip::$2 action::$3 http_status::$4 bytes::$5 method::$6 uri::$7 proto::$8 uri_host::$9 uri_port::$10 uri_path::$11 username::$12 hierarchy::$13 server_ip::$14 content_type::$15&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;The first field should be the timestamp. When I look at Squid data in search, the "fields" list includes "timestamp", but it is determined to have "none".&lt;/P&gt;

&lt;P&gt;As a refresher, the log file entries look like this:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;1301087053.193    182 10.2.40.179 TCP_MISS/400 1083 GET http://api.twitter.com/1/statuses/user_timeline.json? username DIRECT/199.59.148.87 application/json&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;My regex-fu is weak, and I'm definitely below average. However, shouldn't this include the timestamp in order for Splunk to index it by time properly?&lt;/P&gt;

&lt;P&gt;I want to load last month's data, but I will not be able to report on February 2011, because it all appears to be new data as of the load date.&lt;/P&gt;

&lt;P&gt;Thanks for the advice. Moving forward, the records are correct. Obviously, Splunk is doing its own timestamp.&lt;/P&gt;</description>
      <pubDate>Sat, 26 Mar 2011 20:21:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68361#M13833</guid>
      <dc:creator>jgauthier</dc:creator>
      <dc:date>2011-03-26T20:21:20Z</dc:date>
    </item>
    <item>
      <title>Re: Squid Log files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68362#M13834</link>
      <description>&lt;P&gt;The transforms.conf file is fine. However, the TIME_FORMAT specified in the props.conf file in the Squid app is wrong. I'm not sure why, or whether it's a typo, but the file says:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIME_FORMAT = %3N
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;It should be:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIME_FORMAT = %s.%3N
TIME_PREFIX = ^
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Changing the TIME_FORMAT line and adding the TIME_PREFIX line should solve your problem.&lt;/P&gt;
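
&lt;P&gt;For context, here is a minimal sketch of how the corrected stanza might look in props.conf. The [squid] stanza name is an assumption based on the sourcetype used in searches, and MAX_TIMESTAMP_LOOKAHEAD is an optional addition that is not part of the fix itself:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch; stanza name assumed to match the sourcetype)
[squid]
# The timestamp starts at the very beginning of each event
TIME_PREFIX = ^
# Epoch seconds, a literal dot, then three digits of milliseconds
TIME_FORMAT = %s.%3N
# Optional assumption: read only the first 14 characters, e.g. 1301087053.193
MAX_TIMESTAMP_LOOKAHEAD = 14
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Timestamps are assigned at index time, so data that was already indexed with the wrong time has to be loaded again after the change.&lt;/P&gt;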

&lt;P&gt;The app probably worked in the past because, when the defined TIME_FORMAT failed, Splunk fell back to its default time formats. Because of &lt;A href="http://www.splunk.com/base/Documentation/latest/ReleaseNotes/Knownissues#Epoch_timestamps_not_parsed_correctly_after_March_12.2C_2011" rel="nofollow"&gt;this&lt;/A&gt; issue, the default formats stopped working for epoch timestamps after March 12, 2011, so Splunk just used the current time, which isn't ideal.&lt;/P&gt;</description>
      <pubDate>Sat, 26 Mar 2011 22:55:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68362#M13834</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2011-03-26T22:55:27Z</dc:date>
    </item>
    <item>
      <title>Re: Squid Log files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68363#M13835</link>
      <description>&lt;P&gt;Thank you! I reloaded a small subset of data and ran some tests. It was perfect.  I am going to reload the file now.  Thanks so much!&lt;/P&gt;</description>
      <pubDate>Sun, 27 Mar 2011 01:40:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68363#M13835</guid>
      <dc:creator>jgauthier</dc:creator>
      <dc:date>2011-03-27T01:40:18Z</dc:date>
    </item>
    <item>
      <title>Re: Squid Log files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68364#M13836</link>
      <description>&lt;P&gt;Hi! I'm the author of the Squid app. Thanks, gkanapathy, for discovering this issue. As you say, it's a typo that has apparently gone undiscovered until now. I'll put out an updated version that fixes this.&lt;/P&gt;

&lt;P&gt;jgauthier, I'm interested to hear what additional dashboard you created - maybe it's something that could be useful to Squid app users in general? In that case I could include that in the updated version as well.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Mar 2011 04:32:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68364#M13836</guid>
      <dc:creator>Ayn</dc:creator>
      <dc:date>2011-03-27T04:32:42Z</dc:date>
    </item>
    <item>
      <title>Re: Squid Log files</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68365#M13837</link>
      <description>&lt;P&gt;Sure!  I modified some of yours to fit my needs better. So for instance I removed the client IP charts and replaced them with usernames.  I also added a "heaviest bandwidth user" search on the main dashboard with this query: 'sourcetype="squid" action="*" | stats sum(bytes) as tb by username | sort -tb | head 10'&lt;/P&gt;
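
&lt;P&gt;The same search can be narrowed to a single site by filtering on uri_host, one of the fields the app's transform extracts. This is only a sketch, and the host pattern is a hypothetical example:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="squid" uri_host="*facebook.com" | stats sum(bytes) as tb by username | sort -tb | head 10
&lt;/CODE&gt;&lt;/PRE&gt;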

&lt;P&gt;I then created a dashboard I called "Sites", where I pull the top ten users of certain "hot" sites at my company, like Facebook, Pandora, YouTube, etc. These are all done by hits, but I would like to reproduce them all by bandwidth as well, since they are different metrics!&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2011 08:56:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Squid-Log-files/m-p/68365#M13837</guid>
      <dc:creator>jgauthier</dc:creator>
      <dc:date>2011-03-28T08:56:40Z</dc:date>
    </item>
  </channel>
</rss>

