sourcetype=Account contains Id values and the AccountName
sourcetype=Issue contains AccountId values but no AccountName and this is really the sourcetype that contains the stuff I want to report on...
If I do an inner join:
Filter sourcetype Account then
| eval AccountId = Id
| fields Name, AccountId
| join type=inner AccountId [search sourcetype=Issue |fields AccountId Name Issue
This search has completed and has returned 4 results by scanning 481 events in 49.392 seconds.
If I use a sub-search
[sourcetype=Account ...filter-the-AccountId...|fields + AccountId]
table AccountId Issue
This search has completed and has returned 4 results by scanning 686 events in 6.182 seconds.
What I want is
table Name Issue...
But I've tried a few things and I can't figure out how to pass the Account Name, or where I'd do a lookup for it so I could get the fields from the Account and the Issue together without doing the inner join...
How would I structure this so it's more efficient than the inner join? I know there must be a better way...
Right now I'm keeping the timeframe small, but I'd like to expand both the filters on the Accounts and the timeframe.
If I have to use the inner join, I'm afraid I'm really just approaching this wrong.
Try the following (just in case creating lookup is not an option for you):-
sourcetype=Account ...other filters...| stats count by Id, Accountname | rename Id as AccountId | fields - count | join AccountId [search sourcetype=Issue |fields AccountId Name Issue]
if there could be more that one issue for an AccountId, then use "| join max=0" instead of "|join".
thank you... yes. the lookup is indeed an issue... and the join works fine, it's just not efficient. the join is where I started. (I did try different types of joins)
That might just be where I have to be. I was hoping there was some way I was missing... where once I passed the Id/AccountId I would have a way to pass other fields as well even though they were cosmetic and had no match in the 2nd sourcetype.
The "most splunkish" (splunkonic? splunkadelic?) way of doing this is with an actual lookup file. You would do a scheduled search of the form:
sourcetype=Account | fields Id, Accountname | inputlookup append=t accounts.csv | stats first(Accountname) as Accountname by Id | outputlookup accounts.csv
This would (incrementally) maintain the lookup file in the fashion araitz demonstrates in hist blog post.
Then your real search can use a simple
lookup command (or an implicitly defined lookup in the configuration files) to bring in the Accountname data.
Another approach that might work is to use a "disjointed search" along with
stats as a row-compression function. I've not tested this, but a search similar to:
sourcetype=Account OR sourcetype=issue
| stats first(Name) as Name, first(Issue) as Issue by Id
Should get fairly close to what you're trying to do.
correct. these are extracts from relational tables... which is why this is a bit of a mess... as for renaming... yes, it would be. I was following your suggestion exactly which left out the eval. maybe I'm missing the insight you're trying to pass along... it's actually the coalesce I hadn't thought of... which might now turn out to be a help. 🙂
Isn't the naming deal solved with a simple
| eval Id=coalesce(Id,AccountId) or a
| rename AccountId AS ID ?
Also, Splunk is designed to handle extremely large lookups fairly efficiently. If your list of (Id, Accountname) tuples in
sourcetype=Account is < 1GB a lookup is still applicable.
I get a somewhat funny feeling that your
Account sourcetype is coming from a relational source of sorts anyway - so why not make a lookup file directly from there?
I appreciate the blog link. A lot of good info there. the accumulated account list isn't practical in this case, as the list is massive and the search is taking only a small slice (and that slice changes based on daily assignments so the pull needs to be dynamic)
The second suggestion doesn't account for the fact that Id = AccountId which is why I had to join... I believe the 2nd suggestion is actually where I started... and the inner join is where I ended up. - hoping for more insight.