Solved: two search queries with join not working

lovelyshrm421 · ‎11-22-2021

I have two separate search queries which are working separately but when i am trying to get data by joining them its not giving me any result from second query.

first query-

second query-

joined query(not working as expected)-

index=ads sourcetype="sequel"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=inner db tb
[ search(index=ads sourcetype="isosequel")
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

requirement- I want to compare each db ,table with the second query db, table and get the difference, but i am not getting any result out of second query. any help would be appreciated !!!

Thankyou in Advance !!

ITWhisperer · ‎11-24-2021

Force them both to lower case

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | eval tb=lower(tb)
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

View solution in original post

ITWhisperer · ‎11-22-2021

You have changed the way you determine jobname - is it possible that

| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"

is returning no values for jb (which would mean the subsequent stats by jb would have zero results)?

lovelyshrm421 · ‎11-22-2021

Even i tried

| eval jobname1="Job for p2"

as well .

its same actually. no results

ITWhisperer · ‎11-22-2021

Can you share some sample events from the two searches?

lovelyshrm421 · ‎11-22-2021

first query event- in the below event we are getting count directly in field- s_total_count:2.

{ [-]
   jb_name: Job for p1
   message: database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:2 s_total_count:324101 origin_cnt_date:Tue Nov 23 06:57:49 GMT 2021
   severity: INFO
   time: 1637650742
}

Second query- this below query getting the count in the field count:20000 and having multiple events for one table per batch wise, so we need to sum up all the event for the table and compare the count with the first query count and see the difference.

event1:

{ [-]
   jb_name: Job for p2
   message: database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
   severity: INFO
   time: 1637650935
}

event2:

{ [-]
   jb_name: Job for p2
   message: database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
   severity: INFO
   time: 1637650875
}

ITWhisperer · ‎11-23-2021

In the first event, your table name is liabilitytype, and in the second set of events, it is Claim. This is why the join is returning zero events

lovelyshrm421 · ‎11-23-2021

We have all the table event available in first query. now check the event info i edited.

I am thinking something wrong with the join i am using its not able to calculate the count from second query.

ITWhisperer · ‎11-23-2021

I am confused as to what you think is wrong - the search is summing the counts in the second set of events - 20000 + 20000 = 40000 and you are comparing this to the count from the field in the first event - 2

Instead of summing the counts in the second set of events, should you be counting the events - 2?

lovelyshrm421 · ‎11-24-2021

I need to sum-up the count from second query(both event for claim table field count: 20000 + 20000) and compare with the field s_total_count:324101 from first query and post the difference.

Which is not happening .

ITWhisperer · ‎11-24-2021

Try changing

| stats sum(cnt) as tb_cnt by jb db tb

to

| stats sum(cnt) as tb_cnt values(jb) as jb by db tb

lovelyshrm421 · ‎11-24-2021

I changed to

| stats sum(cnt) as tb_cnt values(jb) as jb by db tb

but still no results from second query.

ITWhisperer · ‎11-24-2021

The events look like they might be in JSON format - can you share the raw events (rather than the formatted events)? Also, have you considered using spath to retrieve the fields?

lovelyshrm421 · ‎11-24-2021

Event for first query- table -claim

{"severity": "INFO", "message": "database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021", "jb_name": "Job for p1", "time": 1637751926}

Events for second query- table- claim

{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752834}

{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752776}

have you considered using spath to retrieve the fields? No i haven't tried yet.

ITWhisperer · ‎11-24-2021

Never mind about the spath - I didn't realise that most of the information is taken from the message field

Your search appears to work with the data you shared (apart from I "corrected" claim to Claim in the table name as the join is case-sensitive)

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

So perhaps there is something different about your real data?

lovelyshrm421 · ‎11-24-2021

Yes, Exactly I have observed that from first query events are like table name -all small case, and from second query table name is in mix case(upper +lower), so is there a way we can change the case(to_lower) before comparing the table and db?

ITWhisperer · ‎11-24-2021

Force them both to lower case

| makeresults 
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
    [| makeresults
    | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
    | append
        [| makeresults
        | eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
        ]
    | rex field=_raw "schema:(?P<db>[^ ]+)"
    | rex field=_raw "table:(?P<tb>[^ ]+)"
    | eval tb=lower(tb)
    | rex field=_raw "count:(?P<cnt>[^ ]+)"
    | rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
    | stats sum(cnt) as tb_cnt by jb db tb
    | fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff

lovelyshrm421 · ‎11-24-2021

This seems to be working. Perfect. Thanks alot. Great help.

But let me test out few more scenario's for this and if i face any issue will post here.

Thanks Again for helping 🙂

two search queries with join not working

count

stats

table

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Index This | What travels the world but is also stuck in place?

Discover New Use Cases: Unlock Greater Value from Your Existing Splunk Data

Continue Your Journey: Join Session 2 of the Data Management and Federation Bootcamp ...

Join the Conversation