Knowledge Management

Summary Index getting populated with incorrect data

Roopaul
Explorer

Hi, I am getting logs from two servers that are exactly the same unless there is a failure. We have to group the events based on an Id and treat each group as a single event for reporting, so I used the 'transaction' command. When I run the query stand-alone it gives the correct count as expected, but when the results are written to the summary index (SI) the counts are wrong. The SI is populated every hour.

index=test | fields content
| rex field=content "\n*Id:(?P<Id>\d[^~]+)"  
| rex field=content "\n*Path\:(?<path>[^~|?]+)"
| transaction Id keepevicted=true
| fillnull value=NA path
| replace  "" with "NA" in path
| bucket _time span=1h
| stats count by _time,path
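
The hourly SI fill is assumed here to be a scheduled search that appends a collect command to the query above (summary_test is only a placeholder index name, not the real one); roughly:

 your full search above
 | collect index=summary_test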

content from hostA
time1 Id:A Path:AB1
time1 Id:A Path:AB2
time2 Id:B Path:AC1
time2 Id:C Path:AC1

content from hostB
time1 Id:A Path:AB1
time1 Id:A Path:AB2
time2 Id:B Path:AC1
time2 Id:C Path:AC1

Output when running the search stand-alone - this is what is expected to fill the summary:
time1 AB1 1
time1 AB2 1
time2 AC1 2

Output when written to the summary index - this is counting events from both servers:
time1 AB1 2
time1 AB2 2
time2 AC1 4


somesoni2
SplunkTrust

Give this a try

 index=test | fields content
 | rex field=content "\n*Id:(?P<Id>\d[^~]+)"  
 | rex field=content "\n*Path\:(?<path>[^~|?]+)"
 | fillnull value=NA path
 | replace  "" with "NA" in path
 | dedup Id path
 | bucket _time span=1h
 | stats count by _time,path

Update
Try this

your base search giving results from both hosts and all 5 fields
| table _time Id path otherfield1 otherfield2 otherfield3...
| fillnull value=NA path
| replace "" with "NA" in path
| stats values(*) as * by _time Id
| bucket _time span=1h
| stats count by _time,path

Roopaul
Explorer

Thanks for the answer. There are multiple other conditions in this data which I have not explained, so I can't use dedup: even if the Id+path combination is unique, there are other fields which can differ. Based on certain conditions, we extract the required fields from these events after using the transaction command, so removing duplicates based on just these two fields might drop data that is required.


somesoni2
SplunkTrust

I've seen Splunk behave differently when using the transaction command (it's a resource-intensive command, and since scheduled searches have lower priority than ad-hoc ones, it has to work with fewer available resources). Consider replacing it with stats or something similar. If you can add your full search to the question, the answer community can help you with a solution.
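
Not a full answer, but a rough sketch of what that replacement could look like for the query in the question, assuming the duplicate events from both hosts land in the same hour with the same Id so that stats collapses them the same way transaction did:

 index=test | fields content
 | rex field=content "\n*Id:(?P<Id>\d[^~]+)"
 | rex field=content "\n*Path\:(?<path>[^~|?]+)"
 | fillnull value=NA path
 | replace "" with "NA" in path
 | bucket _time span=1h
 | stats values(path) as path by _time Id
 | stats count by _time,path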


Roopaul
Explorer

This is my requirement 🙂 Hope this helps.

hostA:
Field1 | Field2 | Field3 | Field4 | Field5
time1 | Id1 | path1 | dog1 |
time1 | Id1 | path1 | _____ | cat1
time1 | Id2 | path1 | dog1 |
time1 | Id2 | path1 | _____ | cat1
time2 | Id3 | path2 | dog2 |
time2 | Id3 | path2 | _____ | cat2
time2 | Id4 | path2 | dog2 |

hostB:
Field1 | Field2 | Field3 | Field4 | Field5
time1 | Id1 | path1 | dog1 |
time1 | Id1 | path1 | _____ | cat1
time1 | Id2 | path1 | dog1 |
time1 | Id2 | path1 | _____ | cat1
time2 | Id3 | path2 | dog2 |
time2 | Id3 | path2 | _____ | cat2
time2 | Id5 | path2 | dog2 |

I want the output to be like this, and I want it to be stored in a summary index.

Field1 | Field3 | Field4 | Field5 | Count
time1 | path1 | dog1 | cat1 | 1
time1 | path1 | dog1 | cat1 | 1
time2 | path2 | dog2 | cat2 | 1
time2 | path2 | dog2 | NA | 2
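
Following the stats pattern suggested above, a rough sketch of one way this might be approached (the first line stands in for the real base search, Field1 is assumed to be the event time, and identical Field3/Field4/Field5 combinations would merge into a single row with a higher count rather than stay as separate rows):

 your base search from both hosts with Field1-Field5 extracted
 | bucket _time span=1h
 | stats values(Field3) as Field3 values(Field4) as Field4 values(Field5) as Field5 by _time Field2
 | fillnull value=NA Field4 Field5
 | stats count by _time Field3 Field4 Field5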
