I have two separate search queries which are working separately but when i am trying to get data by joining them its not giving me any result from second query.
first query-
index=ads sourcetype="sequel"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db tb cnts dates jobname
| where cnts>0
| table dates jobname db tb cnts
second query-
index=ads sourcetype="isosequel"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| eval jobname1="Job for p2"
| stats sum(cnt) as tb_cnt by jobname1 db tb
| fields jobname1 db tb tb_cnt
|table jobname1 db tb tb_cnt
joined query(not working as expected)-
index=ads sourcetype="sequel"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=inner db tb
[ search(index=ads sourcetype="isosequel")
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff
requirement- I want to compare each db ,table with the second query db, table and get the difference, but i am not getting any result out of second query. any help would be appreciated !!!
Thankyou in Advance !!
Force them both to lower case
| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
| append
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
]
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff
You have changed the way you determine jobname - is it possible that
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
is returning no values for jb (which would mean the subsequent stats by jb would have zero results)?
Even i tried
| eval jobname1="Job for p2"
as well .
its same actually. no results
Can you share some sample events from the two searches?
first query event- in the below event we are getting count directly in field- s_total_count:2.
{ [-]
jb_name: Job for p1
message: database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:2 s_total_count:324101 origin_cnt_date:Tue Nov 23 06:57:49 GMT 2021
severity: INFO
time: 1637650742
}
Second query- this below query getting the count in the field count:20000 and having multiple events for one table per batch wise, so we need to sum up all the event for the table and compare the count with the first query count and see the difference.
event1:
{ [-]
jb_name: Job for p2
message: database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
severity: INFO
time: 1637650935
}
event2:
{ [-]
jb_name: Job for p2
message: database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2
severity: INFO
time: 1637650875
}
In the first event, your table name is liabilitytype, and in the second set of events, it is Claim. This is why the join is returning zero events
We have all the table event available in first query. now check the event info i edited.
I am thinking something wrong with the join i am using its not able to calculate the count from second query.
I am confused as to what you think is wrong - the search is summing the counts in the second set of events - 20000 + 20000 = 40000 and you are comparing this to the count from the field in the first event - 2
Instead of summing the counts in the second set of events, should you be counting the events - 2?
I need to sum-up the count from second query(both event for claim table field count: 20000 + 20000) and compare with the field s_total_count:324101 from first query and post the difference.
Which is not happening .
Try changing
| stats sum(cnt) as tb_cnt by jb db tb
to
| stats sum(cnt) as tb_cnt values(jb) as jb by db tb
I changed to
| stats sum(cnt) as tb_cnt values(jb) as jb by db tb
but still no results from second query.
The events look like they might be in JSON format - can you share the raw events (rather than the formatted events)? Also, have you considered using spath to retrieve the fields?
Event for first query- table -claim
{"severity": "INFO", "message": "database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021", "jb_name": "Job for p1", "time": 1637751926}
Events for second query- table- claim
{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752834}
{"severity": "INFO", "message": "database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2", "jb_name": "Job for p2", "time": 1637752776}
have you considered using spath to retrieve the fields? No i haven't tried yet.
Never mind about the spath - I didn't realise that most of the information is taken from the message field
Your search appears to work with the data you shared (apart from I "corrected" claim to Claim in the table name as the join is case-sensitive)
| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
| append
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
]
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff
So perhaps there is something different about your real data?
Yes, Exactly I have observed that from first query events are like table name -all small case, and from second query table name is in mix case(upper +lower), so is there a way we can change the case(to_lower) before comparing the table and db?
Force them both to lower case
| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:claim count:1 queue:-1 pipelineuser:l@gmail.com jobname:Job for p1 pk_field:id s_pk_count:324203 \ts_total_count:324203 origin_cnt_date:Wed Nov 24 11:04:31 GMT 2021\", \"jb_name\": \"Job for p1\", \"time\": 1637751926}"
| eval jobname="Job for p1"
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "s_total_count:(?P<cnts>[^ ]+)"
| rex field=_raw "origin_cnt_date:(?P<dte>[\D]+[\d]+[ ][\d]+[:]+[\d]+[:]+[\d]+[ ][\D]+[\d]+)"
| eval date= strptime(dte, "%a %B %d %H:%M:%S")
| eval dates=strftime(date, "%Y-%m-%d")
| fields db, tb, cnts, dates, jobname
| join type=left db tb
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752834}"
| append
[| makeresults
| eval _raw="{\"severity\": \"INFO\", \"message\": \"database:local_db schema:dbo table:Claim count:20000 queue:-1 pipelineuser:l@gmail.com jobname:Job for p2\", \"jb_name\": \"Job for p2\", \"time\": 1637752776}"
]
| rex field=_raw "schema:(?P<db>[^ ]+)"
| rex field=_raw "table:(?P<tb>[^ ]+)"
| eval tb=lower(tb)
| rex field=_raw "count:(?P<cnt>[^ ]+)"
| rex field=_raw "jobname:Job for (?P<jb>[a-z_A-Z0-9]+)"
| stats sum(cnt) as tb_cnt by jb db tb
| fields db, tb, tb_cnt, jb]
| eval diff = cnts-tb_cnt
| table dates, jobname, jb, db, tb, cnts, tb_cnt, diff
This seems to be working. Perfect. Thanks alot. Great help.
But let me test out few more scenario's for this and if i face any issue will post here.
Thanks Again for helping 🙂