I am on Splunk 7.0.2 and trying to join two search strings with a common field but for reason this is not working.
index="someindex" user="ABCD" | rename user as username | join type=left username [ search index="windows" | fields username, firstname, lastname] | table username, firstname, lastname
This is a sample of my search string which is very basic and its still not working.
The main search returns 5 results when searched like
And the secondary search return 15 results when searched like
index="windows" username="ABCD" | fields username, firstname, lastname
give this a try:
index="someindex" user="ABCD" OR index="windows" | eval username=case(isnotnull(user), user, isnotnull(username), username, 1=1, "unknown") | fields username, firstname, lastname | stats values(*) AS * by username
table to sort it as needed.
Hope this helps ...
The reason I am using Join is to pull list of users in main search and then look for those users in secondary search and the way you are proposing is pulling both main & secondary users list and then pick the ones that has both the values, this will put lot of stress to the server. Also in realty the fields are not just username, firstname and lastname. I just did put a sample, I want to join the searches based on the common field username so that I can grab all the fields from both the searches. Your answer won't help. Sorry.
Okay, lets take a different approach here ...
join is not the join one knows in the DB world.
join is one of the most inefficient SPL commands, and has a lot of limitations, by using sub searches, that you can hit even without knowing you hit them http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches#Subsearch_performance_con...
On the other side
stats is one of the most efficient SPL commands you can use, more here https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... or here http://sideviewapps.com/slides/2017_05_02_sideview_let_stats_sort_them_out.pptx
That said, if you combine two searches as base filter, and make them as specific as possible by using all metadata fields
host as well as all needed fields like
bar=* and add the
fields command http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Fields after the first
| you will tell Splunk just to look for and return these fields. In addition using
stats you will make Splunk to use map reduce and parts of the search results are pre-processed on the indexers.
Any search using sub searches will be most likely run twice as long (there are exception) and put more load on the servers. So by using a
join you actually do exactly what you mentioned as reason you don't want to use a
stats search 😉
Hope this makes more sense now, and give it a try and compare the run times of the join search and of the stats search ... I'm pretty confident you will no longer use
join after this comparison 😉
Aside from the inefficiencies of using 'join', do you know why the original join did not work? Is there anything wrong with its syntax? I am going through the same kind of problem with a different search and using a savedsearch in the subsearch (with only about 800 results in this savedsearch).
Specifically, main search produces a table of results with the following fields: field1, field2, field3, field4, (uses stats + other streaming commands) while the savedsearch produces another set of fields (field1, field5, field6, field7, etc.). My combined search statement is as follows: my-main-search | join field1 [savedsearch "mysavedsearch"]. Unfortunately the addition of the join does not change at all the result of the original query (without the join). Do you have any suggestions here?
Without sample events, both complete searches commands it is nearly impossible to tell.