Hello Everyone
I have a below search query that results me 4 column table. Process, RunID, StartTime and EndTime.
sourcetype=xxxx
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
I am unable to paste the exact results that I am getting as outputs, as they are scattering and would not be completely understandable.
I have some of the columns where either RunID is missing or StartTime is missing and some of the StartTime fields have multiple values. I do not nee those rows/events where these fields are either empty or have multiple values.
I need only the rows where all the 4 fields have values and all of them are single values, how do I retrieve that?
Have attached 2 image files which shows what fields I need and what all the ones to be removed. Please help!
Thanks
Maria Arokiaraj
Well so that is the reason. Maybe the results are "" and therefore not null. see if using the length is acceptable for your context and use case:
sourcetype=xxxx RUNID= StartTime= EndTime= TaskStart= Process=*
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| where len(RunID)>0 AND len(StartTime)>0
sourcetype=xxxx
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| eval StartTimeCount=mvcount(StartTime)
| search StartTimeCount=1 AND RunID=* AND StartTime=* AND EndTime=* AND Process=*
| table RunID, StartTime, EndTime, Process
@kmaron, unwanted events should be filtered upfront rather than later down the "pipe"line 🙂
So the following filter is better to be put in the main search as stated in the comment to the question:
sourcetype=xxxxx RunID=* AND StartTime=* AND EndTime=* AND Process=*
I agree Process and RunID should be moved up however the StartTime and EndTime fields weren't created until after the stats command. So you are only half right.
Sorry my bad if you get the idea it should be TaskStart and TaskEnd in the base search.
Well so that is the reason. Maybe the results are "" and therefore not null. see if using the length is acceptable for your context and use case:
sourcetype=xxxx RUNID= StartTime= EndTime= TaskStart= Process=*
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| where len(RunID)>0 AND len(StartTime)>0
If the answer/comments contributed to you, please don't forget to support by upvoting and accepting answer
@maria2691 to remove events with null values in the field you can add the following in your base search
sourcetype=xxxx RUNID=* StartTime=*
Can you please tell which row in the data provided in the question (screenshot) has multivalued StartTime?
@maria2691, you should inspect the data in Event Viewer and pull some sample in the Raw mode for us to assist you better.
Hey
Consider first eliminate the null values of the RunID and StartTime and then remove the rows with mv
sourcetype=xxxx
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| where isnotnull(RunID) AND isnotnull(StartTime)
| eval mvc_RunID=mvcount(RunID), mvc_StartTime=mvcount(StartTime)
| where mvc_RunID==1 AND mvc_StartTime==1
Hello @tiagofbmm, It doesn't work 😞
First the null fields are not removed and the mvcount is assigned to these MultiValue Columns also and they are also getting displayed in the final output.
Earlier also I tried removing the nulls, but for some reason it is not removing.
Can you give me the mvcount results ?
1 is assigned to all the fields, even though they have multiple values (like the ones in attached screenshot).
According to your image "the-fields-i-do-not-need.jpg", those mvcount can't be 1 as it is clearly seen that you have multivalued fields there.
Are we talking about some other fields? Because StartTime and EndTime have definitely mvcount greater than 0 in your snapshot
Hello @tiagofbmm
I have executed your search query and the result I have extracted to a CSV and it's screenshot "result.jpeg" has been attached to the question now. Please refer to that. The count is coming as 1 only even though the EndTime field has multiple values.
Thanks
I think there may be some confusion between a field with multiple values with multivalued field.
What your image result.jpg is showing are fields with multiple values, one value per row.
What your the-fields-i-do-not-need.jpg is showing is multivalued fields. Many values for the same row!
For instance, in the-fields-i-do-not-need.jpg, your row of process CS02 has dozens of values in StartTime and EndTime.
If you don't want that row, I guarantee you that excluding mvcount>1 will eliminate them.
In the question you asked "I need only the rows where all the 4 fields have values and all of them are single values, how do I retrieve that?"
Using mvcount>1 will most definitely exclude rows with multivalued fields.
More, after removing the rows with multivalued fields, the remaining ones (with single valued fields only), will now be ready to be excluded with the isnotnull, so maybe it should be done backwards and to all the fields
sourcetype=xxxx
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| where isnotnull(RunID) AND isnotnull(StartTime)
Thanks a lot @tiagofbmm
It worked 🙂
However the last line does not seem to work. I receive one or 2 lines with RunID and StartTime as NULL. What could be the reason.
@niketnilay solution for the first part is better:
sourcetype=xxxx RUNID=* StartTime=* EndTime=* TaskStart=* Process=*
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
Weird!! Still the same 😞
Do an eval to and feed us back if the result of the isnull is true or false
sourcetype=xxxx RUNID=* StartTime=* EndTime=* TaskStart=* Process=*
| eval Process=substr('source',1,5)
| stats values(TaskStart) as StartTime, values(TaskEnd) as EndTime by RunID, Process
| table RunID, StartTime, EndTime, Process
| eval mvc_EndTime=mvcount(EndTime), mvc_StartTime=mvcount(StartTime)
| where mvc_EndTime==1 AND mvc_StartTime==1
| eval isnullRunID=if(isnull(RunID),1,0), isnullStartTime=if(isnull(StartTime),1,0)
The Result is false even though the fields are empty.
Looks like Splunk does not consider these fields as NULL 😮