Hi All,
I have transaction data from a database and want to compare it with an index in Splunk, filtering for the transaction records that do not exist in the index.
My query looks like this:
| dbxquery connection=monsplunk query="select userid, acctno, trxamt, trxstatus from \"appdb\".\"apppymt\" where accttyp is null"
| join type=outer userid
    [search index=trxpayment_idx | fields userid]
| eval mark = if(isnull(userid), "blank", userid)
| search mark=blank
| table userid, acctno, trxamt, mark
When I run the query above, the results still show all of the transaction data, without any filtering against the index.
I get the opposite result with a lookup. Using the same query and only changing the index subsearch to an inputlookup:
| dbxquery connection=monsplunk query="select userid, acctno, trxamt, trxstatus from \"appdb\".\"appymt\" where accttyp is null"
| join type=outer userid
    [| inputlookup trxpayment.csv]
| eval mark = if(isnull(userid), "blank", userid)
| search mark=blank
| table userid, acctno, trxamt, mark
This shows correctly filtered data from the lookup file.
I would prefer to use the index rather than the lookup file, because of the size of the data.
Can anyone help with the index approach? If you have an alternative, that would be welcome too.
It is not clear to me what is going on here.
Your join doesn't appear to be adding anything - your join field is the only field returned by the subsearch, so it is potentially redundant?
You say you want to use an index subsearch because the data is large - how large? Are you getting any warnings about the subsearch being truncated?
Hi @ITWhisperer,
Thanks for your response. The data in the index is > 1.5M events; I don't think a CSV can handle that amount.
As I said, when I outer join the main search with the index as a subsearch, I can't get a filtered result,
but when I use inputlookup, the result is filtered correctly.
Or maybe I have a wrong understanding of joining against an index?
Subsearches are limited (usually to about 50K events), which is probably why (with 1.5M) this is failing to work as you expected.
How many events are returned by the dbxquery?
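In the meantime, you could check how large the index side of the comparison actually is with something like this (a quick sketch, using the index and field names from your post):

```
index=trxpayment_idx
| stats dc(userid) as distinct_userids count as total_events
```

If `distinct_userids` is anywhere near (or above) the subsearch limit, a join-based approach is going to silently drop data.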
The result from dbxquery depends on the transaction data, but it is on average 90-100 transactions.
I need to compare it with other data which is stored in an index.
When I use an outer join with inputlookup, the filter works well, but when I use an outer join with the index, it seems not to filter at all.
This is inconsistent with the searches you posted:
You have restricted the fields returned by the index search to just the userid, which is the join field, so there is no other data.
You then effectively search for the userid being null, which can never be true since userid comes from your main search; so even if you had returned more data from the index, you would be filtering those events out.
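For illustration, a variant where the subsearch returns a marker field alongside userid, so the null test has something meaningful to detect (a sketch only - this is still subject to the subsearch event limits, and reuses the query string from the original post):

```
| dbxquery connection=monsplunk query="select userid, acctno, trxamt, trxstatus from \"appdb\".\"apppymt\" where accttyp is null"
| join type=outer userid
    [search index=trxpayment_idx | fields userid | eval in_index=1]
| where isnull(in_index)
| table userid, acctno, trxamt
```

Here `in_index` is only set on rows that matched the join, so `isnull(in_index)` keeps exactly the database rows with no counterpart in the index.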
It is not clear whether all 1.5M events are returned by the index search; if they are, these would have been truncated as it is a subsearch.
You haven't said whether there are any warnings or error messages in the logs.
Further, it is still unclear to me what you are trying to achieve. Can you provide some sample (sanitised) events from the dbxquery and index searches and show which data you expect in your final result so we can better understand what you are trying to achieve?
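In the meantime, if what you want is "database rows whose userid is not in the index", one common join-free pattern is to make the large index search the base search and append the (small) dbxquery result, then use stats to keep only userids seen on the database side. A sketch, assuming the names from your searches and that the dbxquery side stays small (append is also a subsearch, but ~100 rows is well within its limits):

```
index=trxpayment_idx
| fields userid
| eval src="idx"
| append
    [| dbxquery connection=monsplunk query="select userid, acctno, trxamt, trxstatus from \"appdb\".\"apppymt\" where accttyp is null"
    | eval src="db"]
| stats values(src) as srcs values(acctno) as acctno values(trxamt) as trxamt by userid
| where mvcount(srcs)=1 AND srcs="db"
```

Because the 1.5M index events are in the main search rather than a subsearch, nothing gets truncated; the final `where` keeps userids that appeared only on the database side.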
Hi @ITWhisperer,
Yes, you are right: the subsearch was truncated when I used the index (max 50,000 events), which is why the result was not what I expected. The result was different when I used inputlookup as the subsearch, since the data in the CSV is only sample data (<5,000 events).
Thanks again for the enlightenment; I need to figure out another way.