Solved: Re: Creating End_Loading_Time

kakarsu · ‎11-19-2018

Hi Splunkers,

I have run into another problem. My logs contain only three fields: Start_Loading_Time, _Event_Reference, and Event_Name.
An example of such a log, using dummy data, is shown below:

11:00:31:800,3200,ABCDeposit;11:00:33:940,3201,ABCSelectAmount;11:00:35:320,3202,ABCSelectAccount;11:00:42:670,3203,ABCConfirm;11:00:50:350,3204,ACBSuccessfulEnd
.......
.......
.......

I have used the split function to split the above record on ";", which gives me the following:

11:00:31:800,3200,ABCDeposit
11:00:33:940,3201,ABCSelectAmount
11:00:35:320,3202,ABCSelectAccount
11:00:42:670,3203,ABCConfirm
11:00:50:350,3204,ACBSuccessfulEnd

I have then used the below regex to capture the two fields I'm after:

(?<Start_Loading_Time>[^\,]+)\,\d*\,(?<Event_Name>\w+[^\n]+)

What I want is to use "11:00:33:940" minus 1 millisecond as End_Loading_Time for ABCDeposit, and "11:00:33:940" as Start_Loading_Time for ABCSelectAmount. Similarly, I want "11:00:35:320" minus 1 millisecond as End_Loading_Time for ABCSelectAmount and "11:00:35:320" as Start_Loading_Time for ABCSelectAccount, and so on.

Any suggestion or help would be much appreciated.

Many Thanks in advance!

FrankVl · ‎11-20-2018

So basically, you want to take the start time of the next step (minus 1 ms) as the end time of the current step? One thing you could do is duplicate the timestamp before splitting. For example (the first 2 lines just generate a sample event):

| makeresults 
| eval event = "11:00:31:800,3200,ABCDeposit;11:00:33:940,3201,ABCSelectAmount;11:00:35:320,3202,ABCSelectAccount;11:00:42:670,3203,ABCConfirm;11:00:50:350,3204,ACBSuccessfulEnd"
| rex field=event mode=sed "s/;([^,]+)/,\1;\1/g"
| eval event = split(event,";")
| mvexpand event
| rex field=event "(?<Start_Loading_Time>[^,]+),\d*,(?<Event_Name>[^,]+),?(?<End_Loading_Time>.+)?"
| eval End_Loading_Time = strftime(strptime(End_Loading_Time,"%H:%M:%S:%3N")-0.001,"%H:%M:%S:%3N")

The rex sed command on line 3 changes your data into:

11:00:31:800,3200,ABCDeposit,11:00:33:940;11:00:33:940,3201,ABCSelectAmount,11:00:35:320;11:00:35:320,3202,ABCSelectAccount,11:00:42:670;11:00:42:670,3203,ABCConfirm,11:00:50:350;11:00:50:350,3204,ACBSuccessfulEnd

effectively duplicating the timestamp from the next step as an extra field on the previous step.
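
For anyone who wants to check the logic outside Splunk, here is a minimal Python sketch of the same steps: duplicate the timestamp, split on ";", extract the fields, and subtract 1 ms. It only mirrors the search above and is not how Splunk runs it. The names raw, duplicated, minus_one_ms and rows are made up for illustration, and Python writes named groups as (?P<name>...) where Splunk uses (?<name>...).

```python
import re
from datetime import datetime, timedelta

# Sample event from the thread (dummy data).
raw = ("11:00:31:800,3200,ABCDeposit;11:00:33:940,3201,ABCSelectAmount;"
       "11:00:35:320,3202,ABCSelectAccount;11:00:42:670,3203,ABCConfirm;"
       "11:00:50:350,3204,ACBSuccessfulEnd")

# Python equivalent of the sed expression s/;([^,]+)/,\1;\1/g:
# copy the timestamp that follows each ";" to the left side of that ";".
duplicated = re.sub(r";([^,]+)", r",\1;\1", raw)

# Same extraction as the rex command, in Python named-group syntax.
pattern = re.compile(
    r"(?P<Start_Loading_Time>[^,]+),\d*,(?P<Event_Name>[^,]+)"
    r",?(?P<End_Loading_Time>.+)?"
)

def minus_one_ms(stamp):
    """Subtract one millisecond from an HH:MM:SS:mmm string (hypothetical helper)."""
    parsed = datetime.strptime(stamp, "%H:%M:%S:%f")
    shifted = parsed - timedelta(milliseconds=1)
    # %f prints six digits (microseconds); drop the last three to get milliseconds.
    return shifted.strftime("%H:%M:%S:%f")[:-3]

rows = []
for item in duplicated.split(";"):
    match = pattern.match(item)
    end = match.group("End_Loading_Time")
    # The last step has no following timestamp, so its end time stays empty.
    rows.append((match.group("Event_Name"),
                 match.group("Start_Loading_Time"),
                 minus_one_ms(end) if end else None))

for row in rows:
    print(row)
```

The last step (ACBSuccessfulEnd) gets no End_Loading_Time, because no step follows it.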

kakarsu · ‎11-20-2018

Thank you very much for the quick response. I should have mentioned that this log also contains two other kinds of records, shown below (please bear in mind that the data below is dummy data; the times and action names vary):
"11:00:31:800,3200,ABCDeposit, Selected_Action;11:00:33:940,3201,ABCSelectAmount,Selected_Amount;11:00:35:320,3202,ABCSelectAccount,Selected_Account,;11:00:42:670,3203,ABCConfirm,Selected_Button;11:00:50:350,3204,ACBSuccessfulEnd,Confirmed"

And another one:

"11:00:31:800,3200,ABCDeposit, Selected_Action;11:00:33:940,3201,ABCSelectAmount,0;11:00:35:320,3202,ABCSelectAccount,0;11:00:42:670,3203,ABCConfirm,0;11:00:50:350,3204,ACBSuccessfulEnd,0"

How do I write the | rex field=event mode=sed command for the above logs?

I tried to analyse your code but failed. 😞

Thanks a million in advance!

FrankVl · ‎11-21-2018

The code I gave should apply just fine to those other logs as well, right? All it does is find each ; and capture any text that follows, up to the first , (i.e. it captures the timestamp). It then replaces that match with a ,, followed by a copy of the timestamp, followed by the ;, followed by the captured timestamp again. So it just duplicates the timestamp to the left side of the ;.

As an example, it replaces ;11:00:33:940 with ,11:00:33:940;11:00:33:940. That way, when you then split the data on ;, you have the timestamp from the next item as an extra field at the end of the previous item.

It basically (after splitting) changes this:

11:00:31:800,3200,ABCDeposit
11:00:33:940,3201,ABCSelectAmount
11:00:35:320,3202,ABCSelectAccount
11:00:42:670,3203,ABCConfirm
11:00:50:350,3204,ACBSuccessfulEnd

Into this:

11:00:31:800,3200,ABCDeposit,11:00:33:940
11:00:33:940,3201,ABCSelectAmount,11:00:35:320
11:00:35:320,3202,ABCSelectAccount,11:00:42:670
11:00:42:670,3203,ABCConfirm,11:00:50:350
11:00:50:350,3204,ACBSuccessfulEnd

Did you try it and run into issues?
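
As a rough check on the four-field logs, the Python sketch below applies the same substitution to the second dummy sample from the question. The substitution itself needs no change, but the duplicated timestamp now lands after the extra field, so the extraction pattern has to account for that field. The pattern shown is only one possible adjustment. The group name Detail and the variable names are assumptions for illustration, and Python's (?P<name>...) stands in for Splunk's (?<name>...).

```python
import re

# Second dummy sample from the question: each step carries a fourth field.
raw = ("11:00:31:800,3200,ABCDeposit, Selected_Action;"
       "11:00:33:940,3201,ABCSelectAmount,0;"
       "11:00:35:320,3202,ABCSelectAccount,0;"
       "11:00:42:670,3203,ABCConfirm,0;"
       "11:00:50:350,3204,ACBSuccessfulEnd,0")

# Unchanged substitution, equivalent to s/;([^,]+)/,\1;\1/g.
duplicated = re.sub(r";([^,]+)", r",\1;\1", raw)
items = duplicated.split(";")

# Adjusted extraction: a fourth field (hypothetical name "Detail") sits
# between Event_Name and the optional duplicated timestamp.
four_field = re.compile(
    r"^(?P<Start_Loading_Time>[^,]+),\d*,(?P<Event_Name>[^,]+),"
    r"(?P<Detail>[^,]*)(?:,(?P<End_Loading_Time>[^,]+))?$"
)

parsed = [four_field.match(item).groupdict() for item in items]

for fields in parsed:
    print(fields)
```

Anchoring the pattern with ^ and $ keeps the fourth field from being mistaken for the end time on the last step, which has no duplicated timestamp.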

kakarsu · ‎11-25-2018

Thank you very much @FrankVl, much appreciated mate. I had to update the next regex command to match those logs, and it is now working as I expected.

While I will accept your solution as correct, I was wondering whether you could point me to some good sites where I can learn more about regex, specifically ones that cover " | rex field=event mode=sed".
I know regex101 and www.udemy.com, but I never thought regex would have this functionality.

Once again thank you and Regards,

FrankVl · ‎11-26-2018

It is not so much a feature of regular expressions. It is using the sed utility to perform string manipulations. Generic info on the sed utility: https://linux.die.net/man/1/sed

Note: Splunk only supports a very limited set of sed functionalities, namely replace (s) and character substitution (y). See also props.conf spec:

SEDCMD-<class> = <sed script>
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit
  card or social security numbers. For more information, search the online
  documentation for "anonymize data."
* Used to specify a sed script which Splunk software applies to the _raw 
  field.
* A sed script is a space-separated list of sed commands. Currently the
  following subset of sed commands is supported:
    * replace (s) and character substitution (y).
* Syntax:
    * replace - s/regex/replacement/flags
      * regex is a perl regular expression (optionally containing capturing
        groups).
      * replacement is a string to replace the regex match. Use \n for back
        references, where "n" is a single digit.
      * flags can be either: g to replace all matches, or a number to
        replace a specified match.
    * substitute - y/string1/string2/
      * substitutes the string1[i] with string2[i]
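
To make the two supported commands concrete, here is a small Python sketch that mimics what s (with the g flag and with a numeric flag) and y do to a string. It only imitates the behaviour described in the spec above and is not Splunk's implementation. The sample strings and the helper replace_nth are made up for illustration, and the numeric flag is assumed to mean "replace only the Nth match".

```python
import re

# s/regex/replacement/g : replace every match, with \1 as a back reference.
text = "card=1234-5678-9012-3456 user=bob"
masked = re.sub(r"\d{4}-\d{4}-\d{4}-(\d{4})", r"XXXX-XXXX-XXXX-\1", text)

# s/regex/replacement/N : replace only the Nth match (assumed meaning of the
# numeric flag). replace_nth is a hypothetical helper, not a Splunk function.
def replace_nth(pattern, replacement, string, n):
    matches = list(re.finditer(pattern, string))
    if len(matches) < n:
        return string  # fewer than n matches: leave the string untouched
    hit = matches[n - 1]
    return string[:hit.start()] + hit.expand(replacement) + string[hit.end():]

second_only = replace_nth(r";", "|", "a;b;c;d", 2)

# y/string1/string2/ : map string1[i] to string2[i], character by character.
translated = "aabbcc".translate(str.maketrans("abc", "xyz"))

print(masked)
print(second_only)
print(translated)
```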

kakarsu · ‎11-26-2018

You are a legend! Thank you for the info, mate.
