Solved: About distributed search.

yutaka1005 · ‎07-17-2017

In my environment, I have two indexers for one Search head.

I think that commands such as "search", "dedup", and "transaction" are processed by the indexers in a distributed search.

But are commands that involve a subsearch, such as "map" and "join", also processed by the indexers?
Could anyone tell me?

mattymo · ‎07-17-2017

Hi yutaka1005!

I recommend checking out this doc on "Types of Commands"

http://docs.splunk.com/Documentation/Splunk/latest/Search/Typesofcommands

and

"command types"

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Commandsbytype

These will give you an in-depth tour of the various types of search commands available to you and how they function.

Technically the indexers will be involved in all the commands you mentioned, as they will return events to the search head for further processing.

To your question specifically, join is listed as a centralized streaming command, which means it is run on the search head as events come back from the indexers.

Map is not listed, but I would guess it is in the same category, based on how I've seen it used.

- MattyMo

woodcock · ‎07-18-2017

I am quite certain that dedup occurs in both places, map-reduce style: an initial local-scope dedup runs on each indexer, and the final global-scope dedup over the aggregated results runs on the search head. Because map kicks off new searches, things must start on the search head at that point, but its work is map-reduced as well.

Using join should always be avoided, so let's not even talk about that (use stats, streamstats, etc. instead). Why is this important to you? Get your search working FIRST, then optimize it later. Just be sure to get it working WITHOUT using join or transaction and you should be fine.
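The two-phase dedup described in this post can be sketched in Python. This is only a conceptual illustration of a local dedup per indexer followed by a global dedup on the search head; it is not Splunk's actual implementation. The function names, the `host` field, and the sample events are all made up for the example, and it assumes every event carries the dedup field.

```python
from typing import Dict, Iterable, List

Event = Dict[str, str]


def dedup_events(events: Iterable[Event], field: str) -> List[Event]:
    """Keep the first event seen for each distinct value of `field`."""
    seen = set()
    kept = []
    for event in events:
        key = event[field]  # assumption: every event has this field
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


def two_phase_dedup(per_indexer_events: List[List[Event]], field: str) -> List[Event]:
    # Phase 1 (hypothetical indexer side): each indexer removes its own duplicates.
    partial = [dedup_events(events, field) for events in per_indexer_events]
    # Phase 2 (hypothetical search head side): merge the partial results and
    # dedup again, because the same value can survive on more than one indexer.
    merged = [event for part in partial for event in part]
    return dedup_events(merged, field)


# Made-up sample data standing in for the events held by two indexers.
indexer_a = [{"host": "web1"}, {"host": "web1"}, {"host": "web2"}]
indexer_b = [{"host": "web2"}, {"host": "web3"}]

result = two_phase_dedup([indexer_a, indexer_b], "host")
hosts = [event["host"] for event in result]
```

In this sketch, `web2` survives the local pass on both indexers, which is why the second pass on the merged results is still needed.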

yutaka1005 · ‎07-21-2017

Does that mean each indexer dedups its own data first, and then the search head collects those results and dedups them again?

And are you talking about "dedup" inside the "map" command, like this?
main search | map search="... | dedup"

yutaka1005 · ‎07-17-2017

Hi mmodestino_splunk!
Thank you for your kind answer.

I looked at some of the documentation you pointed me to, but it seems it will take me some time to understand it.

But I understand that the commands I mentioned are processed by the indexers.
I also understand that the search inside the join command is processed by the indexers, the results are returned to the search head, and the join itself is processed there.
And I understand that map falls into a similar category.

But there is one thing I am wondering about.
Dedup is described as a centralized streaming command.
Is this command processed by the search head?

esix_splunk · ‎07-17-2017

Dedup is processed on the Search Head side.

mattymo · ‎07-18-2017

^^^

Dedup requires the peers to return all the results to a central location (the search head) so that we can dedup. It is streaming because we can do it as the results come in.
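As a rough illustration of "streaming", here is a minimal Python sketch of a dedup that runs in one central place and emits each event as soon as it arrives, without waiting for the full result set. It is a conceptual model only, not Splunk internals; `streaming_dedup`, the `user` field, and the sample events are hypothetical.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, str]


def streaming_dedup(events: Iterable[Event], field: str) -> Iterator[Event]:
    """Yield each event whose `field` value has not been seen before."""
    seen = set()  # this state lives in one place, standing in for the search head
    for event in events:
        key = event[field]  # assumption: every event has this field
        if key not in seen:
            seen.add(key)
            yield event  # emitted immediately, before later events arrive


pulled = []  # records which events have been read from the stream so far


def incoming() -> Iterator[Event]:
    # Made-up events standing in for results arriving from the peers.
    for event in [{"user": "alice"}, {"user": "bob"}, {"user": "alice"}]:
        pulled.append(event["user"])
        yield event


stream = streaming_dedup(incoming(), "user")
first = next(stream)             # available after reading only one event
pulled_after_first = list(pulled)
rest = list(stream)              # drains the remaining events
```

The `seen` set is the reason the work has to be centralized in this model: it must observe every event to decide whether a value is a duplicate.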

- MattyMo

yutaka1005 · ‎07-20-2017

Thank you for your comments!

I understand dedup now!
