About distributed search.

Builder

In my environment, I have two indexers and one search head.

I believe that commands such as "search", "dedup", and "transaction" are processed by the indexers in a distributed search.

But are commands that run a subsearch, such as "map" and "join", processed by the indexers too?
Could anyone tell me?


Re: About distributed search.

Influencer

Hi yutaka1005!

I recommend checking out this doc on "Types of Commands"

http://docs.splunk.com/Documentation/Splunk/latest/Search/Typesofcommands

and

"command types"

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Commandsbytype

These will give you an in-depth tour of the various types of search commands available to you and how they function.

Technically, the indexers are involved in all the commands you mentioned, since they return events to the search head for further processing.

To your question specifically: join is listed as a centralized streaming command, which means it runs on the search head as events come back from the indexers.

Map is not listed, but I would guess it is in the same category based on how I've seen it used.


Re: About distributed search.

Builder

Hi mmodestino_splunk!
Thank you for your kind answer.

I read a bit of the documentation you pointed me to, but it seems it will take me some time to understand it.

But I understand that the commands I mentioned are processed by the indexers.
I also understand that the search inside the join command is processed by the indexers, the results are returned to the search head, and the join itself is processed there.
And I understand that map is in a similar category.

There is one point I am still wondering about.
Dedup is described as a centralized streaming command.
Is this command processed by the search head?


Re: About distributed search.

Super Champion

Dedup is processed on the Search Head side.


Re: About distributed search.

Influencer

^^^

Dedup requires the peers to return all of their results to a central location (the search head) so that the duplicates can be removed there. It is streaming because the search head can do this as the results come in, without waiting for the full result set.
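To make the "centralized streaming" idea concrete, here is a rough Python sketch. It is not Splunk's implementation, only an illustration under stated assumptions: `centralized_dedup`, the event dictionaries, and the two indexer lists are hypothetical names, and every event is assumed to carry the dedup field. The point it shows is that one process (the search head) holds the set of values already seen, and it can emit each surviving event immediately as it arrives.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, str]


def centralized_dedup(events: Iterable[Event], field: str) -> Iterator[Event]:
    """Yield the first event seen for each distinct value of `field`.

    The `seen` set is the only state, and it lives in one place
    (standing in for the search head). Events are yielded as they
    arrive, which is the "streaming" part.
    """
    seen = set()
    for event in events:
        key = event[field]  # assumption: every event has this field
        if key in seen:
            continue
        seen.add(key)
        yield event


# Hypothetical results returned by two indexers (search peers).
indexer_a = [{"host": "web1", "msg": "a1"}, {"host": "web2", "msg": "a2"}]
indexer_b = [{"host": "web1", "msg": "b1"}, {"host": "web3", "msg": "b2"}]

# The search head sees one combined stream of events from all peers.
merged = indexer_a + indexer_b
result = list(centralized_dedup(merged, "host"))
```

Neither indexer could make this decision alone: `indexer_b` has no way of knowing that `web1` was already returned by `indexer_a`, which is why the final pass has to happen centrally.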


Re: About distributed search.

Builder

Thank you for your comments!

I understand dedup now!


Re: About distributed search.

Esteemed Legend

I am quite certain that dedup occurs in both places and does map-reduce. An initial, reduced, local-scope dedup occurs on each indexer, and the final, aggregated, global-scope dedup occurs on the search head. Because map kicks off new searches, things must start on the search head at that point, but its work does map-reduce as well. Using join should always be avoided, so let's not even talk about that (use stats, streamstats, etc. instead). Why is this important to you? Get your search working FIRST, then optimize it later. Just be sure to get it working WITHOUT using join or transaction and you should be fine.
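For anyone trying to picture the two-stage behavior described in this post, here is a hedged Python sketch. It only models the claim; it does not confirm how Splunk actually schedules dedup. All names (`dedup`, `two_stage_dedup`, the sample events) are hypothetical. It shows each indexer removing its own duplicates first, then a final pass on the combined, already reduced results.

```python
from itertools import chain
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, str]


def dedup(events: Iterable[Event], field: str) -> Iterator[Event]:
    """Keep the first event for each distinct value of `field`."""
    seen = set()
    for event in events:
        if event[field] not in seen:  # assumption: field is always present
            seen.add(event[field])
            yield event


def two_stage_dedup(per_indexer: List[List[Event]], field: str) -> Tuple[List[Event], int]:
    """Local dedup per indexer, then a global dedup on the merged results.

    Returns the final events and the number of events that had to be
    shipped to the central (search head) stage.
    """
    local = [list(dedup(events, field)) for events in per_indexer]
    shipped = sum(len(events) for events in local)
    final = list(dedup(chain.from_iterable(local), field))
    return final, shipped


# Hypothetical raw events held on two indexers.
indexer_a = [
    {"host": "web1", "msg": "a1"},
    {"host": "web1", "msg": "a2"},
    {"host": "web2", "msg": "a3"},
]
indexer_b = [
    {"host": "web1", "msg": "b1"},
    {"host": "web3", "msg": "b2"},
    {"host": "web3", "msg": "b3"},
]

final, shipped = two_stage_dedup([indexer_a, indexer_b], "host")

# For comparison: a single central pass over every raw event.
single_pass = list(dedup(indexer_a + indexer_b, "host"))
```

In this sketch both approaches return the same events, but the two-stage version sends fewer events to the central stage. The second pass is still required, because a value such as `web1` can survive the local dedup on more than one indexer.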


Re: About distributed search.

Builder

Does that mean the search head collects the data that has already been deduped on each indexer and then runs dedup on it again?

And are you talking about a "dedup" inside the "map" command, like this?
main search | map search="... | dedup"
