Splunk Search

How do I find the time difference for each change in the location of the city?

time2200
Explorer

I have a query like this:

| dbxquery connection=xxxxx  query="select xxx FROM xxx WHERE xxx and to_char(LOG_DATE_TIME,'YYYY-MM-DD')='2022-10-13'"
| iplocation SRC_IP
| stats values(LOG_DATE_TIME) as TIME dc(City) as countCity list(City) as city values(SRC_IP) as sourceIp by CIF USER_CD
| eval time = strptime(TIME,"%Y-%m-%d %H:%M:%S.%3N")
| eval differenceMinutes=(max(time)-min(time))/60
| fields - time
| search countCity>1 AND differenceHours>1

 

The query displays the result like this:

time2200_0-1666766907354.png

 

I want it to have a result like this:

time2200_0-1666768794202.png

time2200_3-1666769836158.png

Example:
Jakarta - Bogor

(LOG_DATE_TIME  - string type data)                  2022-10-13 09:03:33.539    -    2022-10-13 09:00:55.885

(already converted in timestamps version)      1665626613.539000      -       1665626455.885000

                                                                                             =  158 (in seconds)

                                                                                             then 158/60 = 2.633 minutes

Bogor - Jakarta =  9.22 minutes

Jakarta - Bogor = 360 minutes

Bogor - Jakarta = 240 minutes

 

How should my query be, in order to achieve that result?

Labels (3)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

The requirements is very unclear because there is no explanation of the data.  If I have to speculate, fields  CIF and USER_CD uniquely identifies a certain user activity (let's call it a transaction); each event has a SRC_IP field, and a LOG_DATE_TIME field that is a timestamp in "%Y-%m-%d %H:%M:%S.%3N" format.  You want to reveal some characteristics of this transaction using geolocation and lapsed time between mapped geolocation.

If this is your requirement, the mockup display is perhaps not the best format.  But set that aside, the stats you illustrated will only give you the total lapsed time during this "transaction", not the lapse between cities you wanted.   To do that, you must calculate the characteristics between each event pair.

| dbxquery connection=xxxxx  query="select xxx FROM xxx WHERE xxx and to_char(LOG_DATE_TIME,'YYYY-MM-DD')='2022-10-13'"
| iplocation SRC_IP
| stats count values(LOG_DATE_TIME) as TIME dc(City) as countCity list(City) as city values(SRC_IP) as sourceIp by CIF USER_CD
| eval time = strptime(TIME,"%Y-%m-%d %H:%M:%S.%3N")
| eval differenceHours = (max(time) - min(time)) / 3600
| search countCity>1 AND differenceHours>1
| eval iter = mvrange(1, count)
| eval differenceMinutes = mvmap(iter, (mvindex(time, iter) - mvindex(time, iter - 1))/60)
| fields - time

Note you didn't specify the definition of differenceHours, so I have to invent one, and that leads me to move the search command up.  Most importantly, the above is not exactly what you expect, to calculate lapsed time only when city changes.  I can think of a relatively simple way to achieve that if each city only encompass a maximum of two events.  But if there can be more, aggregation will need more refinements.  The above should point you to the right direction, however.

Tags (1)

time2200
Explorer

CIF and USER_CD uniquely identifies a certain user activity      v  TRUE

 

value of LOG_DATE_TIME correlated with City
are successive real-time events.

 

the differenceHours field will be deprecated because it only takes the earliest and last times.

my goal is to calculate the time difference obtained by differentiating each location change, with units of minutes, which will be entered into the differenceMinutes field.

resultsMinutes.png

0 Karma

yuanliu
SplunkTrust
SplunkTrust

One more ambiguity: differenceMinutes can be defined as the time lapse between the first event in two consecutive geolocations, or it can be the time lapse between the last event in the last geolocation and the first event of the subsequent geolocation.  I will take the latter assumption, the calculation of which is considerably more complex. (If the former, some form of dedup will suffice.)

| dbxquery connection=xxxxx  query="select xxx FROM xxx WHERE xxx and to_char(LOG_DATE_TIME,'YYYY-MM-DD')='2022-10-13'"
| iplocation SRC_IP
| stats count values(LOG_DATE_TIME) as TIME dc(City) as countCity list(City) as city values(SRC_IP) as sourceIp by CIF USER_CD ``` calculate sequence ```
| eval time = strptime(TIME,"%Y-%m-%d %H:%M:%S.%3N")
| eval differenceHours = (max(time) - min(time)) / 3600
| search countCity>1 AND differenceHours>1
| eval iter = mvrange(1, count) ``` iter is the iterator of each event ```
| eval seq = 1 ``` seq is the sequence number of each new location ```
| eval seq = mvmap(iter, if(mvindex(city, iter) == mvindex(city, iter - 1), seq, seq + 1))
| stats min(time) as time max(time) as time_exit values(sourceIp) as sourceIp by CIF USER_CD seq city ``` extract entrance and exit times ```
| eval sourceIp = mvjoin(sourceIp, ", ") ``` to help with display ```
| stats values(time) as time list(time_exit) as time_exit list(seq) as seq list(city) as city list(sourceIp) as sourceIp by CIF USER_CD
``` use seq as the new iterator ```
| eval differenceMinutes = mvmap(seq, (mvindex(time, seq) - mvindex(time, seq - 1) + mvindex(time_exit, seq - 1))/60)
``` difference is caculated between exit time and next entrance time ```
| fields - time* iter seq

 Hope the comments will give you some idea.

time2200
Explorer

| eval iter = mvrange(1, count) ``` iter is the iterator of each event ```

bro, what is the count function inside mvrange(1, count)?

i need to understand this query

0 Karma

yuanliu
SplunkTrust
SplunkTrust

After the previous stats, count in this context is just the result from that stats function named count.  This is a roundabout way to say in mvrange(1, count), count is merely a field name.

Get Updates on the Splunk Community!

How to Get Started with Splunk Data Management Pipeline Builders (Edge Processor & ...

If you want to gain full control over your growing data volumes, check out Splunk’s Data Management pipeline ...

Out of the Box to Up And Running - Streamlined Observability for Your Cloud ...

  Tech Talk Streamlined Observability for Your Cloud Environment Register    Out of the Box to Up And Running ...

Splunk Smartness with Brandon Sternfield | Episode 3

Hello and welcome to another episode of "Splunk Smartness," the interview series where we explore the power of ...