Getting Data In

How to remove duplicate "scans" from field comms packets logs? autoregress maybe...?

Path Finder

We have a bug in our software that is spamming out identical log messages (different timestamps) - when it's only supposed to log changes in value. The application is reading a field device (Modbus protocol) and parsing the message into registers - each register has 16 discrete "bits". The application takes the "bits" and separates them into a single log message (one entry per bit). The problem is, if only a single bit changes, it is logging the entire register (16 bits worth) - what it's suppose to do is log only the bit that changed from it's previous state. I'm trying to design a Splunk filter so our support guys can use that until the bug-fix is deployed.

For example:
timestamp 40010/1 = 1 (message)
timestamp 40010/2 = 0 (message)
timestamp 40010/3 = 1 (message)
timestamp 40010/4 = 1 (message)
timestamp 40010/5 = 0 (message)
timestamp 40010/6 = 0 (message)
timestamp 40010/7 = 0 (message)
timestamp 40010/8 = 0 (message)
...
timestamp 40010/16 = 0 (message)

on the next scan, let's say bit #3 changes to '1':
timestamp+1 40010/1 = 1 (message)
timestamp+1 40010/2 = 0 (message)
timestamp+1 40010/3 = 1 (message)
timestamp+1 40010/4 = 1 (message)
timestamp+1 40010/5 = 0 (message)
timestamp+1 40010/6 = 0 (message)
timestamp+1 40010/7 = 0 (message)
timestamp+1 40010/8 = 0 (message)
...
timestamp+1 40010/16 = 0 (message)

As you can see, this makes the logs super verbose, and terrible to troubleshoot. I'm trying to find a way to strip out all the "same values" - which still retaining the timestamps (and if possible, full original "_raw" message). I can't use "dedup" as it would strip out future "toggles" (I need all "changes", not just the first one... if bit #3 goes back to '0', I need the timestamp when that occurs).

So in my previous example I would get only:
timestamp+1 40010/1 = 1 (message)
timestamp+1 40010/2 = 0 (message)
timestamp+1 40010/3 = 1 (message)
timestamp+1 40010/4 = 1 (message)
timestamp+1 40010/5 = 0 (message)
timestamp+1 40010/6 = 0 (message)
timestamp+1 40010/7 = 0 (message)
timestamp+1 40010/8 = 0 (message)
...
timestamp+1 40010/16 = 0 (message)

next scan:
timestamp+1 40010/3 = 1 (message)

future scan (where bit #3 changes back to '0'):
timestamp+10 40010/3 = 0 (message)

future scan (where bit #5 changes to '1'):
timestamp+20 40010/5 = 1 (message)

etc.

Starting with an initial "baseline", I just want to see the changes in that register - with the newer timestamp of when that occurred.
Compounding the problem is there are 50+ different registers, all logging every scan (5 seconds) - regardless of if the "value" of the register changed.

0 Karma

SplunkTrust
SplunkTrust

Not an answer because it's not quite an answer yet, more testing.

If you have the field "register" with values like 40010/3 and "value" for the 0/1 values, then how far does something like the below get you?

myBaseSearch ... 
| stats count list(value) as ValueList, list(_time) as ReportedTime by register. 

I have a suspicion this, or something like it, will be the base of the answer.

Also try

 myBaseSearch ... 
| stats count last(value) as LatestValue by register. 

Though it might be first(value) because I can never remember which way around that is.

Last, if you need to rex out those, add this line in (assuming your registers are 5 digits exactly followed by a forward slash and one more digit, and assuming I don't know your timestamp format so I can't make this easier):

myBaseSearch ...
| rex "(?<register>\d{5}/\d)\s*=\s*(?<value>\d+)"
| stats ...
0 Karma

Path Finder

Filtering to a single (known register... 40015)
First one:
count = 3625
ValueList is 99.9% zero
ReportedTime - looks like Epoc timestamps...?

base query... | rex "ID: (?<REG_w_BIT>\d{5}\/\d+)" | rex "Summary: \w+ (?<BIT_VALUE>\d+)" | stats count list(BIT_VALUE) as ValueList, list(_time) as ReportedTime by REG_w_BIT

Second one:
count=3625 (identical id "first(value)" is used)
LatestValue=0

Third:
Yup, I'be got regex for
* the full message (sans timestamp)
* the "register & bit pair" (the part I'm trying to filter by)
* and most recently, the value of the register & pair

Sample "full messages":

40015/13 Summary: CLEAR 0 : Words, text, description of what bit #13 is
40015/13 Summary: SET 1 : Words, text, description of what bit #13 is

0 Karma

Path Finder

autoregress maybe...?

I found this answer - which is looking for a count of seqential values:
https://answers.splunk.com/answers/396654/is-there-a-way-to-count-the-series-of-consecutive.html

I tried following it directly, but my issue is my values are never going to be exactly sequential (the "next" value will never match)... unless I sort them first... duh!

(edit) - I have to sort by both the "host" and the "Register & bit", before sorting by time... otherwise it's no different than a dedup...

...  | regex to strip timestamp and "junk" from beginning of log into new MESSAGE field | rex "ID: (?<REGISTER_w_BIT>\d{5}\/\d+)" | sort host, REGISTER_w_BIT, _time |autoregress MESSAGE  | eval sameAsNext=if(MESSAGE=MESSAGE_p1,1,0) | search sameAsNext = 0 | sort _time

This is "kind of" working... if I pre-filter by a single register (40010), I can see all the transitions if it... but as soon as I try a generic search, those ones disappear... which makes no sense, as they are valid right up to before "search sameAsNext=0" (and the value of sameAsNext is zero).

0 Karma

Path Finder

I think there's a memory limitation to 'autoregress'... if I limit the amount of results (either by time range, or more complex "base" filter - the results exactly match my expectations.

But if I try to run a broad query (say 7 days... ~ 150k events) it starts "stripping out" known good values... entire registers worth in fact. Very odd.

0 Karma