Getting Data In

How do you handle data that changes over time within the source systems?

Path Finder

Hello Splunkers!!

Apologies for the wall of text below, but I felt the situation needed a full explanation.
Scenario:

SQL Server stores factory floor data.

A new row is appended for each new Shift, so we have a unique row for each DAY and each Shift that runs at the factory.

Hence, each combo of DAY-Shift will be unique in the DB.

Any change to the data for a DAY-Shift is made by updating the existing row in place.

To allow for corrections, the data for any shift can be updated within a time window of 72hrs from the shift start, after which it is frozen and no further updates can be done.

A timestamp column [col_TS, factory local time] holds the time of the last change/update. This can be any time within 72 hrs of the shift start time; hence, the updates for one shift can be spread across a span of 3 days.

DB Connect monitors a rising column [col_TS], and the index time is also the col_TS timestamp, with properly configured time zones.


The problem is with this point: the updates for one shift can be spread across a span of 3 days.

By this logic, data for production day 2018-10-01 [DAY] can come in anywhere within the span of 2018-10-01 to 2018-10-04 [col_TS & index _time].

Thus I can have multiple entries for 2018-10-01 - Shift 1 spread over 2018-10-01, 2018-10-02, 2018-10-03 & 2018-10-04, and I just pick the latest one.

However, if a user uses the Splunk Time Picker to report on a span of time where 2018-10-01 is the last day, the values for the last three days of that span will not be the latest. The Time Picker translates the selection into earliest and latest bounds that are applied against col_TS (i.e., _time), so any updates that land after the selected span - within the 72 hr correction window - are filtered out.

How do I get around this and effectively search with something like latest + 72 hrs?


A dirty workaround that I have thought of is to provide two different time picker inputs that return epoch times to the SPL query. The Start Time would be a straight conversion; the End Time would be the user-selected time + 72 hrs, converted to epoch. 😛
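For reference, a rough Simple XML sketch of that workaround using a single time input (the token names are made up, and it assumes the picker hands back epoch values rather than relative strings like "-7d"):

```
<input type="time" token="report_time">
  <label>Report window</label>
  <change>
    <!-- hypothetical tokens: start passes through, end is pushed out by 72 hrs -->
    <eval token="tok_start">$report_time.earliest$</eval>
    <eval token="tok_end">relative_time($report_time.latest$, "+72h")</eval>
  </change>
</input>
```

The search would then use earliest=$tok_start$ latest=$tok_end$ instead of the picker's own bounds.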

Regards & Thanks
Anirban.

EDIT: https://answers.splunk.com/answers/322027/splunk-db-connect-is-there-a-way-to-configure-a-da.html
@rich7177 's answer there is promising. Putting it here for my reference.


_time               | col_TS                | Day        | ShiftCode | Quantity
2018-10-02 06:10:02 | 2018-10-02 06:10:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 06:10:02 | 2018-10-02 06:10:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 05:52:03 | 2018-10-02 05:52:03.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 05:37:05 | 2018-10-02 05:37:05.0 | 2018-10-01 | 2 | 3965.2490000000003
2018-10-02 05:37:05 | 2018-10-02 05:37:05.0 | 2018-10-01 | 2 | 3965.2490000000003
2018-10-02 05:24:03 | 2018-10-02 05:24:03.0 | 2018-10-01 | 2 | 3965.2490000000003
2018-10-02 05:24:03 | 2018-10-02 05:24:03.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 04:52:02 | 2018-10-02 04:52:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 04:52:02 | 2018-10-02 04:52:02.0 | 2018-10-01 | 2 | 3607.0090000000005
2018-10-02 04:24:02 | 2018-10-02 04:24:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 04:24:02 | 2018-10-02 04:24:02.0 | 2018-10-01 | 2 | 3245.649
2018-10-02 03:52:03 | 2018-10-02 03:52:03.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 03:52:03 | 2018-10-02 03:52:03.0 | 2018-10-01 | 2 | 2883.799
2018-10-02 03:24:02 | 2018-10-02 03:24:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 03:24:02 | 2018-10-02 03:24:02.0 | 2018-10-01 | 2 | 2883.799
2018-10-02 02:52:02 | 2018-10-02 02:52:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 02:52:02 | 2018-10-02 02:52:02.0 | 2018-10-01 | 2 | 2883.799
2018-10-02 02:24:02 | 2018-10-02 02:24:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 02:24:02 | 2018-10-02 02:24:02.0 | 2018-10-01 | 2 | 2521.799
2018-10-02 01:52:02 | 2018-10-02 01:52:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 01:52:02 | 2018-10-02 01:52:02.0 | 2018-10-01 | 2 | 2521.799
2018-10-02 01:24:02 | 2018-10-02 01:24:02.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 01:24:02 | 2018-10-02 01:24:02.0 | 2018-10-01 | 2 | 2521.799
2018-10-02 00:52:03 | 2018-10-02 00:52:03.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 00:52:03 | 2018-10-02 00:52:03.0 | 2018-10-01 | 2 | 2521.799
2018-10-02 00:24:03 | 2018-10-02 00:24:03.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-02 00:24:03 | 2018-10-02 00:24:03.0 | 2018-10-01 | 2 | 2163.3289999999993
2018-10-01 23:52:00 | 2018-10-01 23:52:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 23:52:00 | 2018-10-01 23:52:00.0 | 2018-10-01 | 2 | 1804.4389999999999
2018-10-01 23:24:00 | 2018-10-01 23:24:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 23:24:00 | 2018-10-01 23:24:00.0 | 2018-10-01 | 2 | 1442.249
2018-10-01 22:52:00 | 2018-10-01 22:52:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 22:52:00 | 2018-10-01 22:52:00.0 | 2018-10-01 | 2 | 1080.5890000000002
2018-10-01 22:24:00 | 2018-10-01 22:24:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 22:24:00 | 2018-10-01 22:24:00.0 | 2018-10-01 | 2 | 722.209
2018-10-01 21:52:01 | 2018-10-01 21:52:01.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 21:52:01 | 2018-10-01 21:52:01.0 | 2018-10-01 | 2 | 722.209
2018-10-01 21:24:00 | 2018-10-01 21:24:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 21:24:00 | 2018-10-01 21:24:00.0 | 2018-10-01 | 2 | 360.869
2018-10-01 20:52:00 | 2018-10-01 20:52:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 20:52:00 | 2018-10-01 20:52:00.0 | 2018-10-01 | 2 | 0.0
2018-10-01 20:24:00 | 2018-10-01 20:24:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 20:24:00 | 2018-10-01 20:24:00.0 | 2018-10-01 | 2 | 0.0
2018-10-01 19:52:00 | 2018-10-01 19:52:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 19:52:00 | 2018-10-01 19:52:00.0 | 2018-10-01 | 2 | 0.0
2018-10-01 19:24:00 | 2018-10-01 19:24:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 19:24:00 | 2018-10-01 19:24:00.0 | 2018-10-01 | 2 | 0.0
2018-10-01 18:52:00 | 2018-10-01 18:52:00.0 | 2018-10-01 | 1 | 689.7239999999999
2018-10-01 18:52:00 | 2018-10-01 18:52:00.0 | 2018-10-01 | 2 | 0.0
2018-10-01 18:24:00 | 2018-10-01 18:24:00.0 | 2018-10-01 | 1 | 689.7239999999999


Esteemed Legend

You can do something like this to always add 72 hours to your search window from the time picker (without modifying the time picker at all):

Your Base Search String Here
    [| makeresults 
    | rename COMMENT AS "DO NOT CHANGE ANYTHING IN THIS SUBSEARCH SECTION!"
    | addinfo 
    | rename info_min_time AS earliest, info_max_time AS latest 
    | eval old_latest = latest 
    | eval latest=relative_time(latest, "+72h") 
    | table earliest latest 
    | eval search="earliest=" . earliest . " latest=" . latest]
| the | rest | of | your | search | with | pipes | here

Splunk Employee

hi @anirbandasdeb

Did the answer below solve your problem? If so, please resolve this post by approving it!
If your problem is still not solved, keep us updated so that someone else can help ya. Thanks for posting!


Path Finder

will do @mstjohn 🙂


SplunkTrust

Okay, the first thing is, the time picker should not be operating on the ingestion date ( _indextime ); it should be working on the date/time of the event ( _time ), which should be created from a translation of "2018-10-01 - Shift 1". Assuming that shift 1 starts at 8:00 AM, an event's _time should be set to "2018-10-01 08:00:00". This should be relatively easy to calculate in the process that extracts the data from the SQL database.

Second thing, if you can't fix that ingestion, then you can use a single time_picker, and set the <change> parameter of the control to calculate another pair of fields... the start and end of the potential time period. Then, in your SPL, restrict the results to those whose shift falls within the actual time range desired.

And remember, if Splunk is keeping copies of the same data over time, you are going to need to | dedup the search by its key fields in order to keep only the most recent copy, or run a periodic search that finds all duplicate events after the first one and routes them to |delete - a capability that is restricted for very good reasons.
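As a sketch of the first suggestion above - with shift start times that are pure assumptions for illustration (say Shift 1 starts at 06:00 and Shift 2 at 14:00) - the event time could be derived from Day and ShiftCode at search time:

```
... | eval shift_start=case(ShiftCode==1, "06:00:00", ShiftCode==2, "14:00:00")
    | eval _time=strptime(Day." ".shift_start, "%Y-%m-%d %H:%M:%S")
```

Doing the same translation at ingestion (in the DB Connect SQL query itself) would be cleaner, since then the time picker operates on the shift time natively.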


Path Finder

> Okay, the first thing is, the time picker should not be operating on the ingestion date ( _indextime ); it should be working on the date/time of the event ( _time ), which should be created from a translation of "2018-10-01 - Shift 1". Assuming that shift 1 starts at 8:00 AM, an event's _time should be set to "2018-10-01 08:00:00". This should be relatively easy to calculate in the process that extracts the data from the SQL database.

  1. The time picker [rather, _time] operates on col_TS. col_TS stores the timestamp of the last update of the row; thus _time == col_TS.
  2. There are no tables within the DB that define shift timings. We would have to pull that from a separate source, Shoplogix, and correlate it, but that would be a lot of rework at this point and I want to avoid it.

> Second thing, if you can't fix that ingestion, then you can use a single time picker, and set the <change> parameter of the control to calculate another pair of fields... the start and end of the potential time period. Then, in your SPL, restrict the results to those whose shift falls within the actual time range desired.

This is rather interesting.
I am already using the time picker. I can give this a try.

> And remember, if Splunk is keeping copies of the same data over time, you are going to need to | dedup the search by its key fields in order to keep only the most recent copy, or run a periodic search that finds all duplicate events after the first one and routes them to | delete - a capability that is restricted for very good reasons.

Yup, I am already taking care of this - just not through | delete; I am particularly wary of using delete.
I am simply using stats latest(<blah1>) by Day, Shift to pick the latest value for each Day-Shift combo.
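Against the sample data above, that stats pattern would look roughly like this (index and field names other than those in the sample are stand-ins; latest() takes the value from the most recent event per group):

```
index=factory_floor
| stats latest(Quantity) AS Quantity, latest(col_TS) AS last_update BY Day, ShiftCode
```

Since stats keys on Day and ShiftCode rather than _time, the duplicate rows collapse to one per shift regardless of how many update copies were indexed.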


SplunkTrust

Can you provide sample data from the DB table?


Path Finder

Added above.

You might need to put it into a spreadsheet to get the layout/columns correct.
