Knowledge Management

Summary Index getting populated with incorrect data

Roopaul
Explorer

Hi, I am getting logs from two servers that are exactly the same unless there is a failure. We have to group the events by Id and treat each group as a single event for reporting, so I used the 'transaction' command. When I run the query stand-alone, it gives the correct count, as expected. But when the results are written to the summary index (SI), the counts are wrong. The SI is populated every hour.

index=test | fields content
| rex field=content "\n*Id:(?P<Id>\d[^~]+)"  
| rex field=content "\n*Path\:(?<path>[^~|?]+)"
| transaction Id keepevicted=true
| fillnull value=NA path
| replace  "" with "NA" in path
| bucket _time span=1h
| stats count by _time,path

content from hostA
time1 Id:A Path:AB1
time1 Id:A Path:AB2
time2 Id:B Path:AC1
time2 Id:C Path:AC1

content from hostB
time1 Id:A Path:AB1
time1 Id:A Path:AB2
time2 Id:B Path:AC1
time2 Id:C Path:AC1

Output when run stand-alone (this is what the summary index is expected to contain):
time1 AB1 1
time1 AB2 1
time2 AC1 2

Output when written to the summary index (events from both servers are counted):
time1 AB1 2
time1 AB2 2
time2 AC1 4


somesoni2
Revered Legend

Give this a try

 index=test | fields content
 | rex field=content "\n*Id:(?P<Id>\d[^~]+)"  
 | rex field=content "\n*Path\:(?<path>[^~|?]+)"
 | fillnull value=NA path
 | replace  "" with "NA" in path
 | dedup Id path
 | bucket _time span=1h
 | stats count by _time,path

Update
Try this

<your base search returning results from both hosts with all 5 fields>
| table _time Id path otherfield1 otherfield2 otherfield3...
| fillnull value=NA path
| replace "" with "NA" in path
| stats values(*) as * by _time Id
| bucket _time span=1h
| stats count by _time,path

Roopaul
Explorer

Thanks for the answer. There are several other conditions in this data that I have not explained, so I can't use dedup: even when two events share the same Id and path, other fields can differ. Based on certain conditions, we extract the required fields from these files after running the transaction command, so removing duplicates based on just these two fields might remove data that is required.


somesoni2
Revered Legend

I've seen Splunk behave differently when using the transaction command. It's a resource-intensive command, and since scheduled searches have lower priority than ad-hoc searches, it has to work with fewer available resources. Consider replacing it with stats or something similar. If you add your full search to the question, the community can help you with a solution.
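As a rough sketch of the stats-based replacement suggested above, the search below collapses all events that share an Id (from either host) into one row before counting. It reuses the two rex extractions from the question and assumes that the earliest event time is an acceptable timestamp for each group; that choice is an assumption, not something stated in the thread. Events with no extracted Id are dropped by `stats ... by Id`.

```spl
index=test
| fields content
| rex field=content "\n*Id:(?P<Id>\d[^~]+)"
| rex field=content "\n*Path\:(?<path>[^~|?]+)"
| stats min(_time) as _time values(path) as path by Id
| fillnull value=NA path
| bucket _time span=1h
| stats count by _time, path
```

Because `values(path)` keeps only distinct values, an Id reported identically by both hosts contributes once per path, which should match the stand-alone counts in the question.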


Roopaul
Explorer

This is my requirement 🙂 Hope this helps.

hostA:
Field1 | Field2 | Field3 | Field4 | Field5
time1 | Id1 | path1 | dog1 |
time1 | Id1 | path1 | ____ | cat1
time1 | Id2 | path1 | dog1 |
time1 | Id2 | path1 | ____ | cat1
time2 | Id3 | path2 | dog2 |
time2 | Id3 | path2 | ____ | cat2
time2 | Id4 | path2 | dog2 |

hostB:
Field1 | Field2 | Field3 | Field4 | Field5
time1 | Id1 | path1 | dog1 |
time1 | Id1 | path1 | ____ | cat1
time1 | Id2 | path1 | dog1 |
time1 | Id2 | path1 | ____ | cat1
time2 | Id3 | path2 | dog2 |
time2 | Id3 | path2 | ____ | cat2
time2 | Id5 | path2 | dog2 |

I want the output to look like this, and I want it stored in a summary index.

Field1 | Field3 | Field4 | Field5 | Count
time1 | path1 | dog1 | cat1 | 1
time1 | path1 | dog1 | cat1 | 1
time2 | path2 | dog2 | cat2 | 1
time2 | path2 | dog2 | NA | 2
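One possible way to get close to this output without transaction is sketched below. It assumes the fields are already extracted under the names used in the tables (Field2 as the Id, Field3 as the path, Field4, Field5), that Field1 corresponds to `_time`, and that `my_summary` is a hypothetical summary index name. None of these names come from the actual search, so adjust them as needed.

```spl
index=test
| stats min(_time) as _time values(Field3) as Field3 values(Field4) as Field4 values(Field5) as Field5 by Field2
| fillnull value=NA Field3 Field4 Field5
| bucket _time span=1h
| stats count by _time, Field3, Field4, Field5
| collect index=my_summary
```

The first stats merges the rows for each Id across both hosts, so Id4 and Id5 each end up as path2 | dog2 | NA and are counted together as 2. Note that Id1 and Id2 also end up with identical field values, so this sketch would report them as a single row with a count of 2 rather than as two rows with a count of 1 each.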
