After spending a lot of time googling this issue, I have found only mixed comments, so I'm asking the question here to get some clarification.
Starting with the environment: I have an indexer cluster of 3 indexers, two independent search heads, and one universal forwarder.
My question is: where does the Bro IDS app go, and how does it work?
What I have done is install the app on both of my search heads (following the general convention for apps), and my universal forwarder is monitoring the Bro log directory (yes, I have installed a UF on my Bro sensor machine).
The monitored Bro logs are arriving at my indexers and I am able to search them via the search heads, but the app just seems to be sitting there doing nothing.
The documentation I have read so far says that you need to install the app on the heavy forwarder that is monitoring your log directory, and to set the input paths in the app rather than in the heavy forwarder's own inputs. (That seems like a poor fit for people who just want a forwarder on their Bro sensor to forward Bro logs: for that you'd need to install a heavy forwarder plus the app, the app would do all the forwarding and parsing, and the heavy forwarder would just sit there providing Python support so the app can do its stuff.)
So my question is: is my configuration above even workable with the Bro IDS add-on, or should I just drop the idea of using the add-on, because I don't want to run a heavy forwarder on my Bro machines?
Any comments would be greatly appreciated, as I have already wasted a lot of time dealing with this issue.
There is no need to have a heavy forwarder if you're simply monitoring the Bro log files, and provided you don't use a heavy forwarder, you have no need for the technology add-on (TA) on the forwarder. For the Bro source, the only thing you need on the universal forwarder is the inputs.conf file that specifies what you're ingesting.

The Bro TA does need to be on the indexer, because that's where the basic parsing happens, such as line breaking and date/time extraction (note that if you use a heavy forwarder, that basic parsing happens at the forwarder instead). The TA also needs to be on the search head for things such as search-time field extraction.

All that being said, Splunk ignores any directives that do not pertain to the level of processing it's doing as part of the data indexing pipeline. So there is little impact if you want to put the TA at every level of your Splunk infrastructure: forwarder, indexer, and search head.
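To make that concrete, a minimal inputs.conf on the universal forwarder might look like the following. This is only a sketch: the monitor paths, sourcetype names, and index name are assumptions to illustrate the shape of the stanzas, not values taken from the Bro TA.

```
# inputs.conf on the universal forwarder
# (paths, sourcetypes, and index are hypothetical -- adjust to your setup)
[monitor:///opt/bro/logs/current/conn.log]
sourcetype = bro_conn
index = bro

[monitor:///opt/bro/logs/current/http.log]
sourcetype = bro_http
index = bro
```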
Here are the gory details if you'd like to dig in further. http://wiki.splunk.com/Community:HowIndexingWorks
Thanks for clarifying on the installation part for the App. Appreciate it.
I looked into the props.conf file that the app uses and noted that it uses the INDEXED_EXTRACTIONS attribute to extract fields from the structured input (Bro log files are TSV-formatted).
I was reading about structured-data parsing in Splunk and came across this in the Splunk documentation:
Splunk Enterprise does not parse structured data that has been forwarded to an indexer
When you forward structured data to an indexer, Splunk Enterprise does not parse this data once it arrives at the indexer, even if you have configured props.conf on that indexer with INDEXED_EXTRACTIONS. Forwarded data skips the following queues on the indexer, which precludes any parsing of that data on the indexer:
The forwarded data must arrive at the indexer already parsed. To achieve this, you must also set up props.conf on the forwarder that sends the data. This includes configuration of INDEXED_EXTRACTIONS and any other parsing, filtering, anonymizing, and routing rules. Universal forwarders are capable of performing these tasks solely for structured data. See "Forward data extracted from header files" earlier in this topic.
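Per that documentation, if the structured parsing is to happen at all, the relevant props.conf stanza has to live on the universal forwarder itself. A sketch of what such a stanza might look like for a TSV Bro log follows; the sourcetype name and the timestamp field are assumptions, not the TA's actual defaults.

```
# props.conf on the universal forwarder
# (sourcetype name "bro_conn" is hypothetical)
[bro_conn]
INDEXED_EXTRACTIONS = tsv
FIELD_DELIMITER = \t
# Bro writes the event timestamp in the "ts" column
TIMESTAMP_FIELDS = ts
```

With a stanza like this on the forwarder, the fields are extracted at index time on the forwarder, which is consistent with the documentation's statement that universal forwarders can perform these tasks solely for structured data.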
Hence, I haven't yet tried the configuration you mentioned, but I was wondering whether the TA on the indexers would work as intended, since, per the Splunk docs, no parsing of forwarded structured data is done on the indexer.
Also, I am running the current version of Bro, v2.4.1. One of the major changes to Bro logging in this version is that some field names now include a '.' (for example, idorigh changed to id.orig_h), so the current version of the TA might not be able to extract all the fields (especially those containing '.'), as Splunk doesn't allow '.' as a valid character in extracted field names. Some changes (i.e., renaming of certain extracted fields) might therefore need to be incorporated into the props.conf file, rather than relying on fields dynamically extracted from the log header.
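One hedged way to handle the dotted names, assuming you know the column order of the log in question, would be to override the file's own header with FIELD_NAMES in props.conf so Splunk never sees the '.' characters. The field list below is illustrative and partial, not the full Bro 2.4.1 conn.log schema, and the sourcetype name is hypothetical.

```
# props.conf on the forwarder (illustrative, partial field list)
[bro_conn]
INDEXED_EXTRACTIONS = tsv
TIMESTAMP_FIELDS = ts
# Override the header: rename dotted Bro fields (id.orig_h, etc.)
# to underscore equivalents that are valid Splunk field names
FIELD_NAMES = ts, uid, id_orig_h, id_orig_p, id_resp_h, id_resp_p, proto
```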
I haven't had time to play around with tweaking the TA's defaults, so I don't know whether that renaming of the index-extracted fields would work with the current version of the TA.