Splunk Search

How to create a search that compares two log files and displays results in a table format?

thaara
Explorer

I have below 2 log files with 4 identical columns and in that, status is different:

Status1.log
host1,PROD,1666680,mobile1,Staging_Successful
host1,PROD,1666680,mobile2,Staging_Successful
host1,PROD,1666680,mobile3,Staging_Successful

Status2.log
host1,PROD,1666680,mobile1,Deployment_Successful
host1,PROD,1666680,mobile2,Deployment_Successful
host1,PROD,1666680,mobile3,Deployment_Successful

Currently, I am able to extract both files individually by using rex command. But my desire is to compare both the files and output as follows. I want to merge the status like below in table format...Please suggest how to do this.

host1,PROD,1666680,mobile1,Staging_Successful,Deployment_Successful
host1,PROD,1666680,mobile2,Staging_Successful,Deployment_Successful
host1,PROD,1666680,mobile3,Staging_Successful,Deployment_Successful
Labels (3)
1 Solution

richgalloway
SplunkTrust
SplunkTrust

There is more than one way to do that. Try this one.

(index=foo source=Status1.log) OR (index=bar source=Status2.log)
| stats values(*) by field1, field2, field3, field4
---
If this reply helps you, an upvote would be appreciated.

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

There is more than one way to do that. Try this one.

(index=foo source=Status1.log) OR (index=bar source=Status2.log)
| stats values(*) by field1, field2, field3, field4
---
If this reply helps you, an upvote would be appreciated.

View solution in original post

thaara
Explorer

@richgalloway this is not working as expected, am getting data in different way altogether. Please suggest another way to do this.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

What did you get? Please share your results.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

thaara
Explorer

@richgalloway in Status1.log, i just noticed that, i have one extra field before 'Staging_Successful'.
So currently i have 6 fields in Status1.log and 5 fields in Status2.log..The query i used is,
(sourcetype=abc source=Status1.log) OR (sourcetype=abc source=Status2.log) | stats values(*) field1, field2, field3, field4, field5
Since there is a difference in fields, am getting results in unexpected format.

and how to remove one unwanted field from Status1.log to achieve the desired result ?

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Note the use of the by keyword in the stats command in my answer. That should change your results. Also, "field1, field2, field3, field4" are intended to be placeholders. Replace them with the actual field names from your data. You don't need the additional field from Status1.log. Ignore it.

---
If this reply helps you, an upvote would be appreciated.
0 Karma

thaara
Explorer

@richgalloway
Indeed i used 'by' command but forgot to mention it..It didn't give me the expected output even after using..

Current scenario is,

In Status1.log
Server,Env,Req,Package,INC_Num,Stage_Status - Extracted the event and named the fields like this..
host1,PROD,1666680,mobile1,INC,Staging_Successful
host1,PROD,1666680,mobile2,INC,Staging_Successful
host1,PROD,1666680,mobile3,INC,Staging_Successful

Status2.log
Server,Env,Req,Package,Deploy_Status - Extracted the event and named the fields like this..
host1,PROD,1666680,mobile1,Deployment_Successful
host1,PROD,1666680,mobile2,Deployment_Successful
host1,PROD,1666680,mobile3,Deployment_Successful

Query used: (sourcetype=abc source=Status1.log) OR (sourcetype=abc source=Status2.log) | stats values(*) by Server,Env,Req,Package,Stage_Status,Deploy_Status

There are 244 events present in each log. When i run this query, i got output with only 3 rows...But i need all 244 rows..Below o/p i got with improper data aligned in different columns..

Server Env Req Package Stage_Status Deploy_Status

host1 PROD 1666680 mobile1 PROD Deployment_Successful

host

host1 PROD 1666680 mobile2 PROD Deployment_Successful

host

host1 PROD NIL mobile3 Staging_Successful INC

in addition to this, am getting extra columns for all the unselected fields like values(commands), values(host), values(linecount) etc..I don't need this..

Please suggest what's the problem in this..else suggest the other idea..

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Thanks for providing actual field names. That additional bit of information will get us closer to the answer. Since the two log files don't have identical field names, we have to adapt.

(index=foo source=Status1.log) OR (index=bar source=Status2.log)
| eval Status = coalesce(Stage_Status, Deploy_status)
| stats values(*) as * by Server, Env, Req, Package
| table Server, Env, Req, Package, Status
---
If this reply helps you, an upvote would be appreciated.
0 Karma

thaara
Explorer

@richgalloway You're great..this worked, thank you..but by any chance i can get status in 2 different columns ? that's my ultimate aim..

0 Karma

richgalloway
SplunkTrust
SplunkTrust

This should do it. Since not all events contain Stage_Status and Deploy_Status, those fields will have empty values.

(index=foo source=Status1.log) OR (index=bar source=Status2.log)
| stats values(*) as * by Server, Env, Req, Package
| table Server, Env, Req, Package, Stage_Status, Deploy_Status
---
If this reply helps you, an upvote would be appreciated.
0 Karma

thaara
Explorer

@richgalloway This worked, thanks a bunch for ur help!

0 Karma

richgalloway
SplunkTrust
SplunkTrust

You're welcome. Please accept the answer to help future readers.

---
If this reply helps you, an upvote would be appreciated.
0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!