
How to populate nonexistent values in a table?

adnankhan5133
Communicator

My query below generates a table that appears as follows. The issue I'm trying to resolve is populating non-existent values with "No Data", as with the blank cells in the 2020-08-11 column. Can someone provide some assistance on how to do this? I have tried fillnull and filldown, but have not been successful.

| Service ID | Resource Name | Transaction Name | Priority | Service Area | Consumer | 2020-08-12 | 2020-08-11 | 2020-08-10 | 2020-08-09 |
|---|---|---|---|---|---|---|---|---|---|
| ID1 | GET | Transaction1 | 1 | Area1 | App1 | 3 | | 4 | 0 |
| ID2 | PUT | Transaction2 | 2 | Area2 | App2 | 8 | | 2 | 5 |

 

index=test_index_1 sourcetype=test_sourcetype_2
| eval epoch_Timestamp=strptime(Timestamp, "%Y-%m-%dT%H:%M:%S.%3QZ")-14400
| rename "Transaction Name" as trans_name, "Application Name" as application_name, "Status Code" as status_code
| eval service_id=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "ID1", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "ID2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "ID3", 1=1, "Unqualified")
| where service_id!="Unqualified"
| eval Priority=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "2", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "1", 1=1, "Unqualified")
| where Priority!="Unqualified"
| eval service_area=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "Area1", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "Area2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "Member", 1=1, "Unqualified")
| where service_area!="Unqualified"
| eval date_reference=strftime(epoch_Timestamp, "%Y-%m-%d")
| stats count(eval(status_code)) as count by service_id, Verb, trans_name, Priority, service_area, application_name, date_reference
| eval combined=service_id."@".Verb."@".trans_name."@".Priority."@".service_area."@".application_name."@"
| xyseries combined date_reference count
| rex field=combined "^(?<service_id>[^\@]+)\@(?<Verb>[^\@]+)\@(?<trans_name>[^\@]+)\@(?<Priority>[^\@]+)\@(?<service_area>[^\@]+)\@(?<application_name>[^\@]+)\@$"
| fillnull value="0"
| table service_id, Verb, trans_name, Priority, service_area, application_name
    [ makeresults
    | addinfo
    | eval time = mvappend(relative_time(info_min_time,"@d"),relative_time(info_max_time,"@d"))
    | fields time
    | mvexpand time
    | makecontinuous time span=1d
    | eval time=strftime(time,"%F")
    | reverse
    | stats list(time) as time
    | return $time
    ]
| rename service_id as "Service ID", Verb as "Resource Name", trans_name as "Transaction Name", Priority as "Priority", service_area as "Service Area", application_name as "Consumer"


niketn
Legend

@adnankhan5133 you are performing too much preprocessing and filtering of the data after pulling it from the index and before your stats command, which will hurt the performance of your query.

You are also adjusting a string time field by four hours at search time. If your data is being indexed with the wrong timestamp, you should handle timestamp extraction and timezone adjustment while indexing the data, not in every piece of search-time SPL, which impacts performance. Please try the following search and confirm whether it matches your expectation.
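For reference, a minimal props.conf sketch of that index-time fix (untested; the stanza name, TIME_PREFIX, and TZ are assumptions, and you should only set TZ if the events really are US Eastern local time despite the trailing Z, since genuine UTC timestamps need no fixed offset at all):

[test_sourcetype_2]
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NZ
MAX_TIMESTAMP_LOOKAHEAD = 30
# Assumption: events are Eastern local time; omit TZ if they are true UTC
TZ = US/Eastern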

1. I have _time calculated from the Timestamp string field; ideally, though, you should handle this while indexing the data.

2. I have filtered for only the three combinations of Verb, Transaction Name, and Application Name, so there is no need to pull unwanted data from the index and filter it out later.

3. I have used timechart to create empty time buckets, keyed by a calculated field built by combining Verb, Transaction Name, and Application Name.

4. Finally, I used transpose to turn the row data into columns and vice versa. (The epoch time is converted to a string of format YYYY-mm-dd before the transpose so that it shows up properly as a column header.)

Keep adding one pipe at a time to understand what each step is doing.
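For illustration, here is a tiny run-anywhere sketch of points 3 and 4 (the two makeresults events are placeholder data): timechart returns zero-count buckets for the days with no events, and transpose flips those dates into column headers.

| makeresults
| eval _time=relative_time(now(), "-3d@d")
| append [| makeresults ]
| timechart span=1d count
| eval Time=strftime(_time, "%Y-%m-%d")
| fields - _*
| transpose 0 header_field="Time"

The transposed output has one row per former column (here just count) and one column per date, including the zero-count days in between.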

 

index=test_index_1 sourcetype=test_sourcetype_2 (Verb="GET" AND "Transaction Name"="Transaction1" AND "Application Name"="APP1") OR (Verb="GET" AND "Transaction Name"="Transaction2" AND "Application Name"="App2") OR (Verb="PUT" AND "Transaction Name"="Transaction2" AND "Application Name"="App2")
| eval _time=strptime(Timestamp, "%Y-%m-%dT%H:%M:%S.%3QZ")-14400
| eval key=Verb."|".'Transaction Name'."|".'Application Name'
| timechart span=1d count('Status Code') as count by key 
| eval service_id=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "ID1", 
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "ID2", 
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "ID3") 
| eval Priority=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "2",
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "2", 
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "1") 
| eval service_area=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "Area1",
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "Area2",
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "Member")
| eval Time=strftime(_time,"%Y-%m-%d")
| fields - _*
| transpose 0 header_field="Time" column_name=key
| eval key=split(key,"|"), Verb=mvindex(key,0), "Transaction Name"=mvindex(key,1), "Application Name"=mvindex(key,2)
| fields - key
| fields Verb "Transaction Name" "Application Name" *

 

The following is a run-anywhere example based on the sample data provided:

 

| makeresults 
| eval Timestamp=strftime(_time,"%Y-%m-%dT%H:%M:%S.%3QZ") 
| eval data="Verb=\"GET\",\"Transaction Name\"=\"Transaction1\",\"Application Name\"=\"APP1\";Verb=\"GET\",\"Transaction Name\"=\"Transaction2\",\"Application Name\"=\"App2\";Verb=\"PUT\",\"Transaction Name\"=\"Transaction2\",\"Application Name\"=\"App2\"" 
| makemv data delim=";" 
| mvexpand data 
| rename data as _raw 
| extract pairdelim="," kvdelim="=" 
| eval "Status Code"=random() 
| rename "Transaction_Name" as "Transaction Name","Application_Name" as "Application Name" 
| eval _time=strptime(Timestamp, "%Y-%m-%dT%H:%M:%S.%3QZ")-14400
| eval key=Verb."|".'Transaction Name'."|".'Application Name'
| timechart span=1d count('Status Code') as count by key 
| eval service_id=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "ID1", 
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "ID2", 
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "ID3") 
| eval Priority=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "2",
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "2", 
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "1") 
| eval service_area=case(Verb="GET" AND 'Transaction Name'="Transaction1" AND 'Application Name'="APP1", "Area1",
    Verb="GET" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "Area2",
    Verb="PUT" AND 'Transaction Name'="Transaction2" AND 'Application Name'="App2", "Member")
| eval Time=strftime(_time,"%Y-%m-%d")
| fields - _*
| transpose 0 header_field="Time" column_name=key
| eval key=split(key,"|"), Verb=mvindex(key,0), "Transaction Name"=mvindex(key,1), "Application Name"=mvindex(key,2)
| fields - key
| fields Verb "Transaction Name" "Application Name" *

 

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

adnankhan5133
Communicator

Thanks @niketn - this is definitely helpful. We ended up getting a lot of assistance from @to4kawa, @xpac, and @Richfez, who successfully resolved the issue. It did boil down to the timestamp logic.

Here is the final and working SPL:

index=hcnc_mulesoft_test_new sourcetype=api_events_legacy_new
| rename "API Name" as api_name, "Application Name" as application_name, "Status Code" as status_code 
| eval service_id=case(Verb="GET" AND api_name="Provider Data Management API" AND application_name="PHP-AMHC", "PRO-I-001-0001", Verb="GET" AND api_name="Member Data API" AND application_name="Enrollment Broker", "MEM-I-001-0001", Verb="PUT" AND api_name="Member Data API" AND application_name="Enrollment Broker", "MEM-I-002-0001", 1=1, "Unqualified") 
| where service_id!="Unqualified" 
| eval Priority=case(Verb="GET" AND api_name="Provider Data Management API" AND application_name="PHP-AMHC", "2", Verb="GET" AND api_name="Member Data API" AND application_name="Enrollment Broker", "2", Verb="PUT" AND api_name="Member Data API" AND application_name="Enrollment Broker", "1", 1=1, "Unqualified") 
| where Priority!="Unqualified" 
| eval service_area=case(Verb="GET" AND api_name="Provider Data Management API" AND application_name="PHP-AMHC", "Provider", Verb="GET" AND api_name="Member Data API" AND application_name="Enrollment Broker", "Member", Verb="PUT" AND api_name="Member Data API" AND application_name="Enrollment Broker", "Member", 1=1, "Unqualified") 
| where service_area!="Unqualified" 
| eval combined=service_id."@".Verb."@".api_name."@".Priority."@".service_area."@".application_name."@" 
| timechart span=1d count(eval(status_code)) as count by combined 
| untable _time combined count 
| eval date=strftime(_time, "%Y-%m-%d") 
| xyseries combined date count 
| rex field=combined "^(?<service_id>[^@]+)@(?<Verb>[^@]+)@(?<api_name>[^@]+)@(?<Priority>[^@]+)@(?<service_area>[^@]+)@(?<application_name>[^@]+)@$" 
| fields - combined 
| table service_id Verb api_name Priority service_area application_name 
    [ makeresults 
    | addinfo 
    | eval time = mvappend(relative_time(info_min_time,"@d"),relative_time(info_max_time,"@d")) 
    | fields time 
    | mvexpand time 
    | makecontinuous time span=1d 
    | eval time=strftime(time,"%F") 
    | reverse 
    | stats list(time) as time 
    | return $time
        ] 
| rename service_id as "Service ID", Verb as "Resource Name", api_name as "API Name", Priority as "Priority", service_area as "Service Area", application_name as "Consumer"
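For illustration, here is the timestamp trick from this SPL in isolation, as a run-anywhere sketch (the combined value is a placeholder): timechart fills in empty daily buckets, untable flattens them back into rows, and xyseries pivots them up again with human-readable dates as column headers.

| makeresults
| eval _time=relative_time(now(), "-2d@d")
| append [| makeresults ]
| eval combined="ID1@GET@Transaction1@"
| timechart span=1d count by combined
| untable _time combined count
| eval date=strftime(_time, "%Y-%m-%d")
| xyseries combined date count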

niketn
Legend

All my points about performance improvement are still valid. Do run my SPL as well and compare the performance of your SPL against mine using the Job Inspector. I have tried to apply some basic Splunk search best practices (obviously, SPL will still work without them):

https://docs.splunk.com/Documentation/Splunk/latest/Search/Quicktipsforoptimization

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

adnankhan5133
Communicator

Anyone? Hoping someone can help out here...


to4kawa
Ultra Champion

Add:

| foreach 2020* [ eval <<FIELD>>=if(isnull('<<FIELD>>') OR !match('<<FIELD>>',"\S+"),"No Data",'<<FIELD>>') ]
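For illustration, a run-anywhere sketch of that foreach pattern (the field names and values are made up, and quoting the left-hand <<FIELD>> is an assumption to cope with the dashes in the date column names):

| makeresults count=2
| streamstats count as row
| eval "2020-08-12"=if(row=1, 3, 8), "2020-08-11"=if(row=1, null(), 5)
| fields - _time row
| foreach 2020* [ eval "<<FIELD>>"=if(isnull('<<FIELD>>') OR !match('<<FIELD>>',"\S+"), "No Data", '<<FIELD>>') ]

Row 1 should come back with 2020-08-11 shown as "No Data", while row 2 keeps its values.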


adnankhan5133
Communicator

Hi @to4kawa - thanks for the reply. However, when I include the following statement after the makeresults subsearch (which returns the time columns), the query ends up creating a new column literally called "<<FIELD>>" (or whatever value I substitute in, such as '<<count>>' or 'count').

 

| foreach 2020* [ eval <<FIELD>> =if(isnull('<<FIELD>>') OR !match('<<FIELD>>',"\S+"),"No Data",'<<FIELD>>')]

 

Any suggestions on how to remedy this?


to4kawa
Ultra Champion
| makeresults 
| fillnull value=10 2020-08-01 2020-08-02 count
| foreach 2020* [ eval <<FIELD>>=random() % 10 + '<<FIELD>>']

This query works fine, so something extra must be going on in your query.

My query doesn't produce a literal "<<FIELD>>" column.


adnankhan5133
Communicator

Hey @to4kawa - I tried your updated query. It ends up populating the 2020-08-01 and 2020-08-02 columns with random numbers.

Is there a way to get the date columns associated with non-existent values to populate with "0"? For example, when the column with today's date (2020-08-14) is generated at 12 AM EST (US/Eastern), the column values are null/blank. They only populate later in the morning once certain transactions begin occurring. The requirement is to ensure that those null/blank values display as "0". The current query isn't meeting that requirement. I tried including both of your recommendations, but those aren't getting the null/blank values to populate with "0" either.


adnankhan5133
Communicator
index=test_index_1 sourcetype=test_sourcetype_2
| eval epoch_Timestamp=strptime(Timestamp, "%Y-%m-%dT%H:%M:%S.%3QZ")-14400
| rename "Transaction Name" as trans_name, "Application Name" as application_name, "Status Code" as status_code
| eval service_id=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "ID1", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "ID2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "ID3", 1=1, "Unqualified")
| where service_id!="Unqualified"
| eval Priority=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "2", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "1", 1=1, "Unqualified")
| where Priority!="Unqualified"
| eval service_area=case(Verb="GET" AND trans_name="Transaction1" AND application_name="APP1", "Area1", Verb="GET" AND trans_name="Transaction2" AND application_name="App2", "Area2", Verb="PUT" AND trans_name="Transaction2" AND application_name="App2", "Member", 1=1, "Unqualified")
| where service_area!="Unqualified"
| eval date_reference=strftime(epoch_Timestamp, "%Y-%m-%d")
| stats count(eval(status_code)) as count by service_id, Verb, trans_name, Priority, service_area, application_name, date_reference
| eval combined=service_id."@".Verb."@".trans_name."@".Priority."@".service_area."@".application_name."@"
| xyseries combined date_reference count
| rex field=combined "^(?<service_id>[^\@]+)\@(?<Verb>[^\@]+)\@(?<trans_name>[^\@]+)\@(?<Priority>[^\@]+)\@(?<service_area>[^\@]+)\@(?<application_name>[^\@]+)\@$"
| fillnull value="0"
| table service_id, Verb, trans_name, Priority, service_area, application_name
    [ makeresults
    | addinfo
    | eval time = mvappend(relative_time(info_min_time,"@d"),relative_time(info_max_time,"@d"))
    | fields time
    | mvexpand time
    | makecontinuous time span=1d
    | eval time=strftime(time,"%F")
    | reverse
    | stats list(time) as time
    | return $time
    ]
| rename service_id as "Service ID", Verb as "Resource Name", trans_name as "Transaction Name", Priority as "Priority", service_area as "Service Area", application_name as "Consumer"

adnankhan5133
Communicator

I also tried the following, along with filldown, fillnull, and a couple of eval statements, but unfortunately I'm not making any progress.

| foreach notNULL
    [ eval time = if(isnull(time) OR len(time)==0, "0", time) ]