Getting Data In

heavy-forwarder configuration

ryastrebov
Communicator

Hello!

I need help configuring a heavy forwarder.
My data contain events of 9 types:

datetime1,type1,val1,val2,val3,...
datetime2,type2,val1,val2,val3,...
datetime3,type4,val1,val2,val3,...
datetime4,type5,val1,val2,val3,...
datetime5,type3,val1,val2,val3,...
datetime6,type1,val1,val2,val3,...
datetime7,type2,val1,val2,val3,...
datetime8,type7,val1,val2,val3,...
datetime9,type6,val1,val2,val3,...
datetime10,type8,val1,val2,val3,...
datetime11,type9,val1,val2,val3,...
datetime12,type4,val1,val2,val3,...
datetime13,type2,val1,val2,val3,...
datetime14,type4,val1,val2,val3,...

I have 3 indexers. Each indexer contains 3 indexes, named after the event types:

indexer1. Indexes: type1, type2, type3
indexer2. Indexes: type4, type5, type6
indexer3. Indexes: type7, type8, type9

I need to route the data to these indexes using the heavy forwarder. Can you please tell me how to do this?

Best regards,
Roman

1 Solution

lguinn2
Legend

Is there some reason that each indexer is only responsible for a subset of the data? Because it would be a lot more common, and generally a better configuration, to let all the indexers have all the data. If you need to separate the data, it would be easier and better in most cases to separate the data into different indexes - not separate indexers.
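
For example, routing by index rather than by indexer would look something like this on the heavy forwarder (a sketch only; the source path and index name are placeholders, and you would add one transform per event type):

props.conf

[source::/fullpathtotheinput]
TRANSFORMS-index=set-index-type1

transforms.conf

# Override the destination index for type1 events; the "type1" index
# must already exist on the indexers
[set-index-type1]
SOURCE_KEY=_raw
REGEX=,type1,
DEST_KEY=_MetaData:Index
FORMAT=type1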

But to give you what you want: On the heavy forwarder -

props.conf

[source::/fullpathtotheinput]
# Apply the three routing transforms to events from this input
TRANSFORMS-route=route-index1,route-index2,route-index3

transforms.conf

[route-index1]
# Events whose type field is type1, type2, or type3 go to group1
SOURCE_KEY=_raw
REGEX=,(?:type1|type2|type3),
DEST_KEY=_TCP_ROUTING
FORMAT=group1

[route-index2]
# Events whose type field is type4, type5, or type6 go to group2
SOURCE_KEY=_raw
REGEX=,(?:type4|type5|type6),
DEST_KEY=_TCP_ROUTING
FORMAT=group2

[route-index3]
# Events whose type field is type7, type8, or type9 go to group3
SOURCE_KEY=_raw
REGEX=,(?:type7|type8|type9),
DEST_KEY=_TCP_ROUTING
FORMAT=group3

outputs.conf

# Each output group sends to a single indexer
[tcpout:group1]
server=indexer1.yourcompany.com:9997

[tcpout:group2]
server=indexer2.yourcompany.com:9997

[tcpout:group3]
server=indexer3.yourcompany.com:9997

This should work, although you should probably test the regular expressions...
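
One caveat worth checking against your data: events that match none of the three regexes are sent to whatever default group outputs.conf defines, so it is worth declaring that fallback explicitly. The choice of group1 below is arbitrary:

outputs.conf

# Fallback for events with no _TCP_ROUTING override
[tcpout]
defaultGroup=group1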

ryastrebov
Communicator

Hello lguinn!

Thank you for the quick and detailed answer!

My data volume is very large, and the data must be stored for a year. Unfortunately, the disk space of a single indexer cannot hold the entire volume of the data...

Best regards,
Roman

0 Karma

lguinn2
Legend

Hi Roman -

If you use "auto load balancing" on the forwarders, each forwarder will send approximately 1/3 of the data to each indexer. It's only one copy of the data, so it won't take any additional space. This is a best practice. Then you add distributed search (required if you "auto load balance"), and Splunk will search across all 3 indexers at once.

This will almost certainly make your searches run faster. It also gives the environment some resilience and ability to grow easily.
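
In outputs.conf on the forwarder, auto load balancing is just a single output group that lists all of the indexers (the hostnames below reuse the earlier example; autoLBFrequency is optional and shown only for illustration):

outputs.conf

[tcpout]
defaultGroup=all_indexers

# One group containing all three indexers; the forwarder switches
# between them automatically (load balancing is on by default)
[tcpout:all_indexers]
server=indexer1.yourcompany.com:9997,indexer2.yourcompany.com:9997,indexer3.yourcompany.com:9997
autoLBFrequency=30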

0 Karma

ryastrebov
Communicator

Hi lguinn!

Thank you for your advice!
I'm new to "auto load balancing" configuration in Splunk...
I have 14 Splunk indexer servers and 1500 indexes. The indexes vary by volume. I had planned to split the indexes manually across the servers, evenly by volume (approximately 100-300 indexes per server). If I understand your approach correctly, I need to create all 1500 indexes on every server, right? And in that case, how do I adjust the size of the indexes?

Best regards,
Roman

0 Karma

lguinn2
Legend

Hi Roman - yes, if you use "auto load balancing", the easiest thing to do is to configure all the servers (indexers) identically. And yes, you should adjust the size of the indexes on each server (indexer) downward.
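
For example, in indexes.conf on each indexer, the per-index size cap can be divided by the number of indexers now sharing that index's data (the index name and the figure below are purely illustrative):

indexes.conf

[type1]
homePath=$SPLUNK_DB/type1/db
coldPath=$SPLUNK_DB/type1/colddb
thawedPath=$SPLUNK_DB/type1/thaweddb
# Formerly sized for a single indexer; divide the old cap across
# all the indexers that now receive this index's data
maxTotalDataSizeMB=50000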

And here is another question: why 1500 indexes? This seems like a huge number.

0 Karma

ryastrebov
Communicator

Hi lguinn!

The reason is a very large number of events per minute (approximately 20,000,000), as well as the need to search quickly over the past year...

0 Karma