Getting Data In

heavy-forwarder configuration

ryastrebov
Communicator

Hello!

I need help configuring a heavy forwarder.
My data contain events of 9 types:

datetime1,type1,val1,val2,val3,...
datetime2,type2,val1,val2,val3,...
datetime3,type4,val1,val2,val3,...
datetime4,type5,val1,val2,val3,...
datetime5,type3,val1,val2,val3,...
datetime6,type1,val1,val2,val3,...
datetime7,type2,val1,val2,val3,...
datetime8,type7,val1,val2,val3,...
datetime9,type6,val1,val2,val3,...
datetime10,type8,val1,val2,val3,...
datetime11,type9,val1,val2,val3,...
datetime12,type4,val1,val2,val3,...
datetime13,type2,val1,val2,val3,...
datetime14,type4,val1,val2,val3,...

I have 3 indexers. Each indexer contains 3 indexes, named after the event types:

indexer1. Indexes: type1, type2, type3
indexer2. Indexes: type4, type5, type6
indexer3. Indexes: type7, type8, type9

I need to route the data to these indexes using the heavy forwarder. Can you please tell me how to do this?

Best regards,
Roman

1 Solution

lguinn2
Legend

Is there some reason that each indexer is only responsible for a subset of the data? Because it would be a lot more common, and generally a better configuration, to let all the indexers have all the data. If you need to separate the data, it would be easier and better in most cases to separate the data into different indexes - not separate indexers.
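
For example, routing by index rather than by indexer would look something like this on the heavy forwarder (a sketch only; the source path and index name are placeholders, and you would add one transform per event type):

props.conf

[source::/fullpathtotheinput]
TRANSFORMS-index=set-index-type1

transforms.conf

# Override the destination index for type1 events; the "type1" index
# must already exist on the indexers
[set-index-type1]
SOURCE_KEY=_raw
REGEX=,type1,
DEST_KEY=_MetaData:Index
FORMAT=type1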

But to give you what you want: On the heavy forwarder -

props.conf

[source::/fullpathtotheinput]
# Apply the three routing transforms to events from this input
TRANSFORMS-route=route-index1,route-index2,route-index3

transforms.conf

[route-index1]
# Events whose type field is type1, type2, or type3 go to group1
SOURCE_KEY=_raw
REGEX=,(?:type1|type2|type3),
DEST_KEY=_TCP_ROUTING
FORMAT=group1

[route-index2]
# Events whose type field is type4, type5, or type6 go to group2
SOURCE_KEY=_raw
REGEX=,(?:type4|type5|type6),
DEST_KEY=_TCP_ROUTING
FORMAT=group2

[route-index3]
# Events whose type field is type7, type8, or type9 go to group3
SOURCE_KEY=_raw
REGEX=,(?:type7|type8|type9),
DEST_KEY=_TCP_ROUTING
FORMAT=group3

outputs.conf

# Each output group sends to a single indexer
[tcpout:group1]
server=indexer1.yourcompany.com:9997

[tcpout:group2]
server=indexer2.yourcompany.com:9997

[tcpout:group3]
server=indexer3.yourcompany.com:9997

This should work, although you should probably test the regular expressions...
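
One caveat worth checking against your data: events that match none of the three regexes are sent to whatever default group outputs.conf defines, so it is worth declaring that fallback explicitly. The choice of group1 below is arbitrary:

outputs.conf

# Fallback for events with no _TCP_ROUTING override
[tcpout]
defaultGroup=group1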

ryastrebov
Communicator

Hello lguinn!

Thank you for the quick and detailed answer!

My data volume is very large, and the data must be stored for a year. Unfortunately, the disk space of a single indexer cannot hold the entire volume of the data...

Best regards,
Roman

0 Karma

lguinn2
Legend

Hi Roman -

If you use "auto load balancing" on the forwarders, each forwarder will send approximately 1/3 of the data to each indexer. It's only one copy of the data, so it won't take any additional space. This is a best practice. Then you add distributed search (required if you "auto load balance"), and Splunk will search across all 3 indexers at once.

This will almost certainly make your searches run faster. It also gives the environment some resilience and ability to grow easily.
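
In outputs.conf on the forwarder, auto load balancing is just a single output group that lists all of the indexers (the hostnames below reuse the earlier example; autoLBFrequency is optional and shown only for illustration):

outputs.conf

[tcpout]
defaultGroup=all_indexers

# One group containing all three indexers; the forwarder switches
# between them automatically (load balancing is on by default)
[tcpout:all_indexers]
server=indexer1.yourcompany.com:9997,indexer2.yourcompany.com:9997,indexer3.yourcompany.com:9997
autoLBFrequency=30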

0 Karma

ryastrebov
Communicator

Hi lguinn!

Thank you for your advice!
I'm new to "auto load balancing" configuration in Splunk...
I have 14 Splunk indexer servers and 1500 indexes. The indexes vary by volume. I had planned to split the indexes manually across the servers, evenly by volume (approximately 100-300 indexes per server). If I understand your approach correctly, I need to create all 1500 indexes on every server, right? And in that case, how do I adjust the size of the indexes?

Best regards,
Roman

0 Karma

lguinn2
Legend

Hi Roman - yes, if you use "auto load balancing", the easiest thing to do is to configure all the servers (indexers) identically. And yes, you should adjust the size of the indexes on each server (indexer) downward.
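
For example, in indexes.conf on each indexer, the per-index size cap can be divided by the number of indexers now sharing that index's data (the index name and the figure below are purely illustrative):

indexes.conf

[type1]
homePath=$SPLUNK_DB/type1/db
coldPath=$SPLUNK_DB/type1/colddb
thawedPath=$SPLUNK_DB/type1/thaweddb
# Formerly sized for a single indexer; divide the old cap across
# all the indexers that now receive this index's data
maxTotalDataSizeMB=50000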

And here is another question: why 1500 indexes? This seems like a huge number.

0 Karma

ryastrebov
Communicator

Hi lguinn!

The reason is a very large number of events per minute (approximately 20,000,000), as well as the need to search quickly over the past year...

0 Karma