<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to remove duplicate "scans" from field comms packets logs? autoregress maybe...? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325968#M60614</link>
    <description>&lt;P&gt;autoregress maybe...?&lt;/P&gt;</description>
    <pubDate>Fri, 02 Jun 2017 16:52:46 GMT</pubDate>
    <dc:creator>alaorath</dc:creator>
    <dc:date>2017-06-02T16:52:46Z</dc:date>
    <item>
      <title>How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325967#M60613</link>
      <description>&lt;P&gt;We have a bug in our software that is spamming out identical log messages (different timestamps) - when it's only supposed to log changes in value. The application is reading a field device (Modbus protocol) and parsing the message into registers - each register has 16 discrete "bits".  The application takes the "bits" and writes them out as separate log messages (one entry per bit). The problem is, if only a single bit changes, it is logging the entire register (16 bits' worth) - what it's supposed to do is log only the bit that changed from its previous state.  I'm trying to design a Splunk filter that our support guys can use until the bug-fix is deployed.&lt;/P&gt;

&lt;P&gt;For example:&lt;BR /&gt;
timestamp 40010/1 = 1 (message)&lt;BR /&gt;
timestamp 40010/2 = 0 (message)&lt;BR /&gt;
timestamp 40010/3 = 1 (message)&lt;BR /&gt;
timestamp 40010/4 = 1 (message)&lt;BR /&gt;
timestamp 40010/5 = 0 (message)&lt;BR /&gt;
timestamp 40010/6 = 0 (message)&lt;BR /&gt;
timestamp 40010/7 = 0 (message)&lt;BR /&gt;
timestamp 40010/8 = 0 (message)&lt;BR /&gt;
...&lt;BR /&gt;
timestamp 40010/16 = 0 (message)&lt;/P&gt;

&lt;P&gt;on the next scan, let's say bit #3 changes to '1':&lt;BR /&gt;
timestamp+1 40010/1 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/2 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/3 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/4 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/5 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/6 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/7 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/8 = 0 (message)&lt;BR /&gt;
...&lt;BR /&gt;
timestamp+1 40010/16 = 0 (message)&lt;/P&gt;

&lt;P&gt;As you can see, this makes the logs &lt;EM&gt;super&lt;/EM&gt; verbose, and terrible to troubleshoot.  I'm trying to find a way to strip out all the "same values" - while still retaining the timestamps (and if possible, the full original "_raw" message).  I can't use "dedup" as it would strip out future "toggles" (I need all "changes", not just the first one... if bit #3 goes back to '0', I need the timestamp when that occurs).&lt;/P&gt;

&lt;P&gt;So in my previous example I would get only:&lt;BR /&gt;
timestamp+1 40010/1 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/2 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/3 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/4 = 1 (message)&lt;BR /&gt;
timestamp+1 40010/5 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/6 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/7 = 0 (message)&lt;BR /&gt;
timestamp+1 40010/8 = 0 (message)&lt;BR /&gt;
...&lt;BR /&gt;
timestamp+1 40010/16 = 0 (message)&lt;/P&gt;

&lt;P&gt;next scan:&lt;BR /&gt;
timestamp+1 40010/3 = 1 (message)&lt;/P&gt;

&lt;P&gt;future scan (where bit #3 changes back to '0'):&lt;BR /&gt;
timestamp+10 40010/3 = 0 (message)&lt;/P&gt;

&lt;P&gt;future scan (where bit #5 changes to '1'):&lt;BR /&gt;
timestamp+20 40010/5 = 1 (message)&lt;/P&gt;

&lt;P&gt;etc.&lt;/P&gt;

&lt;P&gt;Starting with an initial "baseline", I just want to see the changes in that register - with the newer timestamp of when that occurred. &lt;BR /&gt;
 Compounding the problem, there are 50+ different registers, all logging every scan (every 5 seconds) - regardless of whether the "value" of the register changed.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2017 15:36:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325967#M60613</guid>
      <dc:creator>alaorath</dc:creator>
      <dc:date>2017-06-02T15:36:58Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325968#M60614</link>
      <description>&lt;P&gt;autoregress maybe...?&lt;/P&gt;

&lt;P&gt;I found this answer - which is looking for a count of sequential values:&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/396654/is-there-a-way-to-count-the-series-of-consecutive.html"&gt;https://answers.splunk.com/answers/396654/is-there-a-way-to-count-the-series-of-consecutive.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;I tried following it directly, but my issue is my values are never going to be exactly sequential (the "next" value will never match)... unless I &lt;EM&gt;sort&lt;/EM&gt; them first... duh!&lt;/P&gt;

&lt;P&gt;(edit) - I have to sort by both the "host" and the "Register &amp;amp; bit", before sorting by time... otherwise it's no different than a dedup...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;...  | regex to strip timestamp and "junk" from beginning of log into new MESSAGE field | rex "ID: (?&amp;lt;REGISTER_w_BIT&amp;gt;\d{5}\/\d+)" | sort host, REGISTER_w_BIT, _time |autoregress MESSAGE  | eval sameAsNext=if(MESSAGE=MESSAGE_p1,1,0) | search sameAsNext = 0 | sort _time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is "kind of" working... if I pre-filter by a single register (40010), I can see all the transitions of it... but as soon as I try a generic search, those ones disappear... which makes no sense, as they are valid right up to the step before "search sameAsNext=0" (and the value of sameAsNext is zero).&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2017 16:52:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325968#M60614</guid>
      <dc:creator>alaorath</dc:creator>
      <dc:date>2017-06-02T16:52:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325969#M60615</link>
      <description>&lt;P&gt;Not posting this as an answer because it's not quite one yet; it needs more testing.&lt;/P&gt;

&lt;P&gt;If you have the field "register" with values like 40010/3 and "value" for the 0/1 values, then how far does something like the below get you?&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;myBaseSearch ...
| stats count list(value) as ValueList, list(_time) as ReportedTime by register
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I have a suspicion this, or something like it, will be the base of the answer.&lt;/P&gt;
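
&lt;P&gt;As a rough sketch of where that could go (untested against this data, and assuming the same &lt;CODE&gt;register&lt;/CODE&gt; and &lt;CODE&gt;value&lt;/CODE&gt; fields plus the default &lt;CODE&gt;host&lt;/CODE&gt; field): &lt;CODE&gt;streamstats&lt;/CODE&gt; can carry the previous value forward per host and register, so that only events whose value differs from the one before are kept. The field name &lt;CODE&gt;prev_value&lt;/CODE&gt; is made up for illustration.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;myBaseSearch ...
| sort 0 _time
| streamstats current=f window=1 last(value) as prev_value by host register
| where isnull(prev_value) OR value!=prev_value
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The &lt;CODE&gt;isnull(prev_value)&lt;/CODE&gt; test keeps the first event seen for each host and register as the baseline. The &lt;CODE&gt;0&lt;/CODE&gt; in &lt;CODE&gt;sort 0&lt;/CODE&gt; matters on larger result sets, because &lt;CODE&gt;sort&lt;/CODE&gt; without a count returns only the first 10,000 results.&lt;/P&gt;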

&lt;P&gt;Also try&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;myBaseSearch ...
| stats count last(value) as LatestValue by register
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Though it might be &lt;CODE&gt;first(value)&lt;/CODE&gt; because I can never remember which way around that is.&lt;/P&gt;
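
&lt;P&gt;For what it's worth, &lt;CODE&gt;first()&lt;/CODE&gt; and &lt;CODE&gt;last()&lt;/CODE&gt; go by the order in which events reach &lt;CODE&gt;stats&lt;/CODE&gt;, and a plain search normally returns the newest events first, so &lt;CODE&gt;first(value)&lt;/CODE&gt; is usually the most recent one. A sketch that sidesteps the question by using the time-based functions instead, assuming the same &lt;CODE&gt;register&lt;/CODE&gt; and &lt;CODE&gt;value&lt;/CODE&gt; fields (the output names are only illustrative):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;myBaseSearch ...
| stats count latest(value) as LatestValue earliest(value) as EarliestValue by register
&lt;/CODE&gt;&lt;/PRE&gt;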

&lt;P&gt;Last, if you need to rex out those, add this line in (assuming your registers are exactly 5 digits followed by a forward slash and the bit number, with the value after an equals sign as in your example; I don't know your timestamp format, so I can't make this easier):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;myBaseSearch ...
| rex "(?&amp;lt;register&amp;gt;\d{5}/\d+)\s*=\s*(?&amp;lt;value&amp;gt;\d+)"
| stats ...
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 02 Jun 2017 16:53:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325969#M60615</guid>
      <dc:creator>Richfez</dc:creator>
      <dc:date>2017-06-02T16:53:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325970#M60616</link>
      <description>&lt;P&gt;Filtering to a single known register (40015):&lt;BR /&gt;
First one:&lt;BR /&gt;
count = 3625&lt;BR /&gt;
ValueList is 99.9% zero&lt;BR /&gt;
ReportedTime - looks like epoch timestamps...?&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;base query... | rex "ID: (?&amp;lt;REG_w_BIT&amp;gt;\d{5}\/\d+)" | rex "Summary: \w+ (?&amp;lt;BIT_VALUE&amp;gt;\d+)" | stats count list(BIT_VALUE) as ValueList, list(_time) as ReportedTime by REG_w_BIT&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;Second one:&lt;BR /&gt;
count=3625 (identical if "first(value)" is used)&lt;BR /&gt;
LatestValue=0&lt;/P&gt;

&lt;P&gt;Third:&lt;BR /&gt;
Yup, I've got regex for&lt;BR /&gt;
* the full message (sans timestamp)&lt;BR /&gt;
* the "register &amp;amp; bit pair" (the part I'm trying to filter by)&lt;BR /&gt;
* and most recently, the value of the register &amp;amp; bit pair&lt;/P&gt;

&lt;P&gt;Sample "full messages":&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;P&gt;40015/13    Summary: CLEAR 0 : Words, text, description of what bit #13 is&lt;BR /&gt;
40015/13    Summary: SET 1 : Words, text, description of what bit #13 is&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Fri, 02 Jun 2017 18:29:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325970#M60616</guid>
      <dc:creator>alaorath</dc:creator>
      <dc:date>2017-06-02T18:29:41Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325971#M60617</link>
      <description>&lt;P&gt;I think there's a memory limitation to 'autoregress'... if I limit the number of results (either by time range or a more complex "base" filter), the results exactly match my expectations.&lt;/P&gt;

&lt;P&gt;But if I try to run a broad query (say 7 days... ~ 150k events) it starts "stripping out" known good values... entire registers' worth, in fact.  Very odd.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2017 19:35:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-remove-duplicate-quot-scans-quot-from-field-comms-packets/m-p/325971#M60617</guid>
      <dc:creator>alaorath</dc:creator>
      <dc:date>2017-06-02T19:35:17Z</dc:date>
    </item>
  </channel>
</rss>

