We are ingesting an SFTP log. The log file rotates once every 24 hours, and on every rotation "header" lines are written to the new file, which then get indexed.
Unlike every other indexed event, the header event has a "linecount" of 2 instead of 1, so it is pretty easy to spot:
#Date: Mon Jan 10 00:00:00 CEST 2020
#Fields: date time ip port .........
I've seen examples of skipping header lines in CSV files, but this is a plain text file. It is not a huge issue, but it is still a bit irritating.
Is it possible to skip these lines so they are not forwarded/indexed? How would I go about accomplishing this?
Thank you in advance
Hi and thanks
Hm, so basically I could do something like:
[source::/my/source/here]
TRANSFORMS-null = setnull
[setnull]
REGEX = ^#[a-zA-Z]+:
DEST_KEY = queue
FORMAT = nullQueue
In the same files where I define field extraction? Currently this TA lives on the search heads and the universal forwarder collecting the log. Do I need this TA anywhere else or would that be enough?
Thank you again
As per the documentation: "Although similar to forwarder-based routing, queue routing can be performed by an indexer, as well as a heavy forwarder." This means that you need to create a TA and deploy it to the indexer(s) or the heavy forwarder (if you are using one).
The transforms file should be OK, as long as you are sure that the events you want to keep will not match the provided REGEX.
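If you want to sanity-check that caveat before deploying, a quick offline test of the pattern can help. The following is only a rough sketch in Python: it assumes the two header lines arrive as a single two-line event (as the linecount of 2 suggests), and the "data" lines are made-up placeholders, not real lines from your log. Python's regex engine is not the same as the one Splunk uses, but for a pattern this simple the behaviour should be equivalent.

```python
import re

# Same pattern as the REGEX in the [setnull] stanza.
header_re = re.compile(r"^#[a-zA-Z]+:")

# The header event as it is assumed to be indexed: two lines, one event.
header_event = (
    "#Date: Mon Jan 10 00:00:00 CEST 2020\n"
    "#Fields: date time ip port ........."
)

# Hypothetical data events, for illustration only.
data_events = [
    "2020-01-10 00:00:01 192.0.2.10 22 ...",
    "2020-01-10 00:00:02 192.0.2.11 22 user#1 ...",  # '#' not at the start
]

events = [header_event] + data_events

# Events matching the pattern would be routed to nullQueue (dropped).
dropped = [e for e in events if header_re.search(e)]
kept = [e for e in events if not header_re.search(e)]

print("dropped:", len(dropped))
print("kept:", len(kept))
```

Because the pattern is anchored with ^ it only looks at the start of the event, so a "#" further into a line does not cause a drop. Swap in a handful of real lines from your own log for the placeholders to confirm nothing you want to keep is matched.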
Ah, the "caveat" at the end...
So yeah, I need to deploy the TA to the indexers to "skip" these header events once per 24h. Not sure I understand the manual here 100% though. Is it enough if this config is present on the indexers and heavy forwarders, or should I push it to the universal forwarder and search heads as well?
Regarding the regex, no events should ever start with # for this source, so that should be OK.
Thank you again! Fantastic feedback