Hi All,
I have a lot of compressed files in a local directory that I want Splunk to ingest.
I set up a directory as an input via the Web UI, but I only want events that contain a keyword like "usasite.com".
The raw data is in JSON format, and most of it follows a pattern like this:
.................,"requestBody":"{\"siteId\":\"usasite.com\",\"data\":{\............
I want to filter and drop events that don't have usasite.com in the raw data.
I created props.conf and transforms.conf in system/local using a test source.
I placed a couple of files in the directory /data/test_files... one file has usasite.com and the other does not.
props.conf
[source::/data/test_files]
TRANSFORMS-set = setnull, setparsing
transforms.conf
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = usasite.com
DEST_KEY = queue
FORMAT = indexQueue
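For context, the way I understand this is supposed to work: the transforms in TRANSFORMS-set run in order, so setnull first routes every event to the nullQueue, and setparsing then sends events matching the regex back to the indexQueue, which is why setnull is listed first.

One way to confirm the stanzas are actually being picked up after a restart is btool; the stanza names below are the ones from my files, and the path assumes a default $SPLUNK_HOME:

$SPLUNK_HOME/bin/splunk btool props list source::/data/test_files --debug
$SPLUNK_HOME/bin/splunk btool transforms list setnull --debug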
But I cannot get the filter to work... Splunk grabs both files.
I feel I must not be setting up the regex correctly.
Any advice appreciated.
Thank you
What you have done looks correct. However, make sure you put these files on the indexers, or on the heavy forwarder if the data is going through an HF.
This null-queue routing does not happen on a universal forwarder (UF).
Good luck
You need to deploy this to the UF if you are using INDEXED_EXTRACTIONS, or to the HFs or indexers otherwise. You need to restart all Splunk instances there, and you must only check events that were forwarded AFTER the restart. If you have done a sourcetype value override, you must use the ORIGINAL sourcetype value in props.conf.
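To make that last point concrete with a rough example: if the events originally arrive as, say, sourcetype _json and are renamed at index time, the transforms have to be attached to the original name, something like this (the _json name is only a placeholder):

props.conf
[_json]
TRANSFORMS-set = setnull, setparsing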
Sorry, I did not mention this earlier: this is a standalone Splunk 7.1 EC2 instance that I am using for an emergency ingestion situation. I have not set up the production data input yet, only been testing with the test_files directory so far... and no luck.
So to recap: I have a lot of .gz files in /data and I want to ingest them but drop any event that does not have usasite.com in it. Not sure if that is possible.
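For the real data, what I was planning is roughly this, assuming the .gz files sit directly under /data and the monitor input handles the compressed files (the sourcetype name is just a placeholder):

inputs.conf
[monitor:///data]
sourcetype = my_json_data

props.conf
[source::/data/*.gz]
TRANSFORMS-set = setnull, setparsing

with the same setnull and setparsing stanzas in transforms.conf as above.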
Thanks for the feedback. It's actually on a standalone 7.1 EC2 instance.
So it is working now?