Splunk Search

How do I remove rows with null values on single/multivalue fields?

maria2691
Path Finder

alt textHello Everyone

I have a below search query that results me 4 column table. Process, RunID, StartTime and EndTime.

sourcetype=xxxx
| eval Process=substr('source',1,5) 
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
| table RunID, StartTime, EndTime, Process

alt text
I am unable to paste the exact results that I am getting as outputs, as they are scattering and would not be completely understandable.

I have some of the columns where either RunID is missing or StartTime is missing and some of the StartTime fields have multiple values. I do not nee those rows/events where these fields are either empty or have multiple values.

I need only the rows where all the 4 fields have values and all of them are single values, how do I retrieve that?

Have attached 2 image files which shows what fields I need and what all the ones to be removed. Please help!

Thanks
Maria Arokiaraj

0 Karma
1 Solution

tiagofbmm
Influencer

Well so that is the reason. Maybe the results are "" and therefore not null. see if using the length is acceptable for your context and use case:

sourcetype=xxxx RUNID= StartTime= EndTime= TaskStart= Process=*
| eval Process=substr('source',1,5) 
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| where len(RunID)>0 AND len(StartTime)>0

View solution in original post

kmaron
Motivator
  sourcetype=xxxx
  | eval Process=substr('source',1,5) 
  | stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
  | eval StartTimeCount=mvcount(StartTime)
  | search StartTimeCount=1 AND RunID=* AND StartTime=* AND EndTime=* AND Process=*
  | table RunID, StartTime, EndTime, Process
0 Karma

niketn
Legend

@kmaron, unwanted events should be filtered upfront rather than later down the "pipe"line 🙂
So the following filter is better to be put in the main search as stated in the comment to the question:

  sourcetype=xxxxx RunID=* AND StartTime=* AND EndTime=* AND Process=*
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

kmaron
Motivator

I agree Process and RunID should be moved up however the StartTime and EndTime fields weren't created until after the stats command. So you are only half right.

niketn
Legend

Sorry my bad if you get the idea it should be TaskStart and TaskEnd in the base search.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

tiagofbmm
Influencer

Well so that is the reason. Maybe the results are "" and therefore not null. see if using the length is acceptable for your context and use case:

sourcetype=xxxx RUNID= StartTime= EndTime= TaskStart= Process=*
| eval Process=substr('source',1,5) 
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| where len(RunID)>0 AND len(StartTime)>0

tiagofbmm
Influencer

If the answer/comments contributed to you, please don't forget to support by upvoting and accepting answer

0 Karma

niketn
Legend

@maria2691 to remove events with null values in the field you can add the following in your base search

sourcetype=xxxx RUNID=* StartTime=*

Can you please tell which row in the data provided in the question (screenshot) has multivalued StartTime?

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

niketn
Legend

@maria2691, you should inspect the data in Event Viewer and pull some sample in the Raw mode for us to assist you better.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

tiagofbmm
Influencer

Hey

Consider first eliminate the null values of the RunID and StartTime and then remove the rows with mv

 sourcetype=xxxx
 | eval Process=substr('source',1,5) 
 | stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
 | table RunID, StartTime, EndTime, Process
 | where isnotnull(RunID) AND isnotnull(StartTime)
 | eval mvc_RunID=mvcount(RunID), mvc_StartTime=mvcount(StartTime)
 | where mvc_RunID==1 AND mvc_StartTime==1
0 Karma

maria2691
Path Finder

Hello @tiagofbmm, It doesn't work 😞

First the null fields are not removed and the mvcount is assigned to these MultiValue Columns also and they are also getting displayed in the final output.
Earlier also I tried removing the nulls, but for some reason it is not removing.

0 Karma

tiagofbmm
Influencer

Can you give me the mvcount results ?

0 Karma

maria2691
Path Finder

1 is assigned to all the fields, even though they have multiple values (like the ones in attached screenshot).

0 Karma

tiagofbmm
Influencer

According to your image "the-fields-i-do-not-need.jpg", those mvcount can't be 1 as it is clearly seen that you have multivalued fields there.

Are we talking about some other fields? Because StartTime and EndTime have definitely mvcount greater than 0 in your snapshot

0 Karma

maria2691
Path Finder

Hello @tiagofbmm

I have executed your search query and the result I have extracted to a CSV and it's screenshot "result.jpeg" has been attached to the question now. Please refer to that. The count is coming as 1 only even though the EndTime field has multiple values.

Thanks

0 Karma

tiagofbmm
Influencer

I think there may be some confusion between a field with multiple values with multivalued field.

What your image result.jpg is showing are fields with multiple values, one value per row.

What your the-fields-i-do-not-need.jpg is showing is multivalued fields. Many values for the same row!

For instance, in the-fields-i-do-not-need.jpg, your row of process CS02 has dozens of values in StartTime and EndTime.

If you don't want that row, I guarantee you that excluding mvcount>1 will eliminate them.

In the question you asked "I need only the rows where all the 4 fields have values and all of them are single values, how do I retrieve that?"

Using mvcount>1 will most definitely exclude rows with multivalued fields.

More, after removing the rows with multivalued fields, the remaining ones (with single valued fields only), will now be ready to be excluded with the isnotnull, so maybe it should be done backwards and to all the fields

  sourcetype=xxxx
  | eval Process=substr('source',1,5) 
  | stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
  | table RunID, StartTime, EndTime, Process
  | eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
  | where mvc_EndTime==1 AND mvc_StartTime==1
  | where isnotnull(RunID) AND isnotnull(StartTime)

maria2691
Path Finder

Thanks a lot @tiagofbmm
It worked 🙂
However the last line does not seem to work. I receive one or 2 lines with RunID and StartTime as NULL. What could be the reason.

0 Karma

tiagofbmm
Influencer

@niketnilay solution for the first part is better:

   sourcetype=xxxx RUNID=* StartTime=* EndTime=* TaskStart=* Process=*
   | eval Process=substr('source',1,5) 
   | stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process 
   | table RunID, StartTime, EndTime, Process
   | eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
   | where mvc_EndTime==1 AND mvc_StartTime==1
0 Karma

maria2691
Path Finder

Weird!! Still the same 😞

0 Karma

tiagofbmm
Influencer

Do an eval to and feed us back if the result of the isnull is true or false

sourcetype=xxxx RUNID=* StartTime=* EndTime=* TaskStart=* Process=*
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| eval isnullRunID=if(isnull(RunID),1,0), isnullStartTime=if(isnull(StartTime),1,0)

0 Karma

maria2691
Path Finder

The Result is false even though the fields are empty.
Looks like Splunk does not consider these fields as NULL 😮

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Paul Stout

Welcome to our very first developer spotlight release series where we'll feature some awesome Splunk ...

State of Splunk Careers 2024: Maximizing Career Outcomes and the Continued Value of ...

For the past four years, Splunk has partnered with Enterprise Strategy Group to conduct a survey that gauges ...

Data-Driven Success: Splunk & Financial Services

Splunk streamlines the process of extracting insights from large volumes of data. In this fast-paced world, ...