Hi
I would like to find out how I can "strip out" events from an input file before they reach the Splunk indexer. I do not wish to send the entire file to the indexer, just certain events from it.
I want to drop/ignore/filter events before they get sent to the Splunk indexer.
For example, I have a text file "/tmp/gerry.txt".
I want to send only the lines containing "example1" from this file to the indexer, i.e. I do not want to send the entire file, as the file in real production will be quite large.
I am using the universal forwarder on 4.2.1.
The aim is to extract just the "example1" records from the file and send those on to the indexer.
I read a document detailing how to do this with props.conf and transforms.conf.
[root@server1 dbs]# more /tmp/gerry.txt
example1 text
example1 text2
example1 text3
example1 text4
example1 text5
example2 text
example2 text2
example2 text3
example2 text3
example3 text1
example3 text2
example3 text3
example4 text4
example4 text1
example4 test2
example5 test3
example4 test3
example4 test443
example4 test3444
new line
another new line
Here is my inputs.conf file, in /opt/splunk/etc/system/local/inputs.conf:
[monitor:///tmp/gerry.txt]
sourcetype = testger
Here is my props.conf file, in /opt/splunk/etc/system/local/props.conf:
[source::/tmp/gerry.txt]
TRANSFORMS-set = setnull, setparsing
And here is my transforms.conf file, in /opt/splunk/etc/system/local/transforms.conf:
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = example1
DEST_KEY = queue
FORMAT = indexQueue
So this should strip out everything except the lines with "example1" from the /tmp/gerry.txt file and forward only those entries to the indexer.
However, it doesn't work.
[root@server1 dbs]# netstat -tulp | grep splunk
tcp        0      0 *:8089    *:*    LISTEN    15623/splunkd
getnameinfo failed
[root@ff-osrv-03 dbs]#
I need a documented method of doing this, both for the full client and for the Splunk forwarder.
Itsomana, you'd need a full Splunk instance installed as a heavy forwarder on your server to perform parsing and nullQueue filtering as per your requirements. A universal forwarder won't do any parsing and leaves this job to the indexer.
So, you have two options: install a heavy forwarder on that server and filter there, or keep the universal forwarder and apply the same nullQueue filtering on the indexer.
- please upvote if you find this answer useful
With the universal forwarder this is not possible. The way that Splunk works, events can only be filtered through nullQueue
after they have been parsed. The parsing of events from a file is what breaks character sequences into lines, and lines into events, and locates timestamps, etc.
The universal forwarder does not do event parsing - it defers that work until the data gets to the indexer. As far as the universal forwarder is concerned, a file is a sequence of bytes and not a sequence of events. Once the data gets to the indexer, the sequence of bytes is then parsed into events and at that point you can filter out specific events via the nullQueue
technique you are using above.
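As an aside, the routing logic the two transforms above perform at parse time is just ordered regex matching, where the last matching transform wins. This is a minimal Python sketch of that behavior (not Splunk code, just an illustration of which events would reach the index queue under the config in the question):

```python
import re

# The two transforms from the question, applied in order. The last
# matching transform sets the destination, so "setnull" sends every
# event to the null queue and "setparsing" then reclaims "example1" events.
transforms = [
    (re.compile(r"."), "nullQueue"),          # setnull: matches any event
    (re.compile(r"example1"), "indexQueue"),  # setparsing: keep these
]

def route(event):
    """Return the queue an event ends up in after all transforms run."""
    queue = "indexQueue"  # default destination when nothing matches
    for regex, dest in transforms:
        if regex.search(event):
            queue = dest
    return queue

events = ["example1 text", "example2 text", "new line"]
kept = [e for e in events if route(e) == "indexQueue"]
print(kept)  # only the "example1" event survives
```

The ordering matters: if setparsing ran before setnull, the catch-all `.` regex would override it and drop everything.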
A lightweight forwarder does not parse events either. The two Splunk configurations that do parse events are an indexer and a heavy (full) forwarder. If you must parse events at the forwarder, then you must deploy the heavy forwarder to do so.
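Concretely, that means moving the same stanzas from the question into $SPLUNK_HOME/etc/system/local on the indexer (or heavy forwarder). A sketch, assuming the same file path, sourcetype, and regexes as above:

```
# props.conf on the indexer (or heavy forwarder), not the universal forwarder
[source::/tmp/gerry.txt]
TRANSFORMS-set = setnull, setparsing

# transforms.conf on the same instance
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = example1
DEST_KEY = queue
FORMAT = indexQueue
```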
Does data that reaches the indexer and gets parsed, only to be filtered out, count against your license? I am trying to exclude redundant, verbose noise from the events being processed, because it is useless and repetitive and needlessly eats up license bandwidth.
No. You are filtering out the events before indexing by routing them to the nullQueue. The data is hitting your indexer, but it is not being indexed, and data that reaches your indexer for processing has not necessarily been "indexed". This allows you to trim your ingest. In most cases you would want the full fidelity of your logs, but in some cases it is necessary or prudent to trim some waste. Just be careful that you don't exclude something that you might later find interesting. I did this once with some security-relevant data that did not seem relevant when I filtered it out, and it made a problem for me down the road. Good luck!
Yes, filtering can be done on the indexer.
But if you forward the logs using a light forwarder and do the filtering using nullQueue on the Splunk indexer side, that should be possible, right?