Splunk Search

two search queries with join not working

lovelyshrm421
Explorer

I have two separate search queries which are working separately but when i am trying to get data by joining them its not giving me any result from second query.

first query- 

index=ads sourcetype="sequel"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db tb cnts dates jobname
| where cnts>0
| table dates jobname db tb cnts

second query-

index=ads sourcetype="isosequel"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| eval jobname1="Job for p2"
| stats sum(cnt) as tb_cnt by jobname1 db tb
| fields jobname1 db tb tb_cnt
|table jobname1 db tb tb_cnt

joined query(not working as expected)-

index=ads sourcetype="sequel"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=inner db tb
[ search(index=ads sourcetype="isosequel")
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

requirement- I want to compare each db ,table with the second query db, table and get the difference, but i am not getting any result out of second query. any help would be appreciated !!!

 

Thankyou in Advance !!

 

Labels (3)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

Force them both to lower case

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | eval tb=lower(tb)
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

View solution in original post

ITWhisperer
SplunkTrust
SplunkTrust

You have changed the way you determine jobname - is it possible that

| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"

is returning no values for jb (which would mean the subsequent stats by jb would have zero results)?

0 Karma

lovelyshrm421
Explorer

Even i tried

 | eval jobname1="Job for p2"

as well .

its same actually. no results 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Can you share some sample events from the two searches?

0 Karma

lovelyshrm421
Explorer

first query event- in the below event we are getting count directly in field- s_total_count:2.

{ [-]
   jb_nameJob for p1
   messagedatabase:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:2 s_total_count:324101 origin_cnt_date:Tue Nov 23 06:57:49 GMT 2021
   severityINFO
   time1637650742

}

Second query-  this below query getting the count in the field count:20000 and having multiple events for one table per batch wise, so we need to sum up all the event for the table and compare the count with the first query count and see the difference.

event1:

[-]
   jb_nameJob for p2
   messagedatabase:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
   severityINFO
   time1637650935
}

event2:

[-]
   jb_nameJob for p2
   messagedatabase:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
   severityINFO
   time1637650875
}

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

In the first event, your table name is liabilitytype, and in the second set of events, it is Claim. This is why the join is returning zero events

0 Karma

lovelyshrm421
Explorer

We have all the table event available in first query. now check the event info i edited.

I am thinking something wrong with the join i am using its not able to calculate the count from second query.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

I am confused as to what you think is wrong - the search is summing the counts in the second set of events - 20000 + 20000 = 40000 and you are comparing this to the count from the field in the first event - 2

Instead of summing the counts in the second set of events, should you be counting the events - 2?

0 Karma

lovelyshrm421
Explorer

I need to sum-up the count from second query(both event for claim table field count: 20000 + 20000) and compare with the field s_total_count:324101 from first query and post the difference.

Which is not happening .

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Try changing

| stats sum(cnt) as tb_cnt by jb db tb

to

| stats sum(cnt) as tb_cnt values(jb) as jb by db tb
0 Karma

lovelyshrm421
Explorer

I changed to 

| stats sum(cnt) as tb_cnt values(jb) as jb by db tb

 

but still no results from second query.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The events look like they might be in JSON format - can you share the raw events (rather than the formatted events)? Also, have you considered using spath to retrieve the fields?

0 Karma

lovelyshrm421
Explorer

Event for first query- table -claim

{"severity": "INFO", "message": "database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021", "jb_name": "Job for p1", "time": 1637751926}

Events for second query- table- claim

{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752834}

{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752776}

have you considered using spath to retrieve the fields? No i haven't tried yet.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Never mind about the spath - I didn't realise that most of the information is taken from the message field

Your search appears to work with the data you shared (apart from I "corrected" claim to Claim in the table name as the join is case-sensitive)

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

So perhaps there is something different about your real data?

0 Karma

lovelyshrm421
Explorer

Yes, Exactly I have observed that from first query events are like table name -all small case, and from second query  table name is in mix case(upper +lower), so is there a way we can change the case(to_lower) before comparing the table and db?

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Force them both to lower case

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | eval tb=lower(tb)
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

View solution in original post

lovelyshrm421
Explorer

This seems to be working. Perfect. Thanks alot. Great help.

But let me test out few more scenario's for this and if i face any issue will post here.

Thanks Again for helping 🙂 

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!