I’m using Splunk Enterprise 9 with Universal Forwarder 9 on Windows. I'd like to monitor several structured log files but only ingest specific lines from these files (basically, each line begins with a well-defined string, so it is easy to create a regular expression or simple match against it). I’m wondering where this can be achieved.
Q: Can the UF do this natively or do I need to monitor the file as a whole then drop certain lines at the indexer?
Hi @shocko,
The typical approach discards lines at an intermediate heavy forwarder or indexer by sending them to nullQueue:
# props.conf
[my_sourcetype]
# add line and event-breaking and timestamp extraction here
TRANSFORMS-my_sourcetype_send_to_nullqueue = my_sourcetype_send_to_nullqueue
# transforms.conf
[my_sourcetype_send_to_nullqueue]
# replace foo with a string or expression matching "keep" events
REGEX = ^(?!foo).
DEST_KEY = queue
FORMAT = nullQueue
As with @PickleRick, I've not seen a common use case for force_local_processing. I often say I don't want my application servers turning into Splunk servers, so I prioritize a lightweight forwarder configuration over data transfer. If CPU cores (fast growing files) and memory (large numbers of files) cost you less than network I/O, you may prefer the force_local_processing option; you won't save on disk I/O either way.
If you need a refresher on the functions performed by the utf8, linebreaker, aggregator, and regexreplacement processors, see https://community.splunk.com/t5/Getting-Data-In/Diagrams-of-how-indexing-works-in-the-Splunk-platfor....
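For the props.conf and transforms.conf stanzas above to apply, the monitored file has to arrive with the matching sourcetype. A minimal sketch of the forwarder-side input follows; the log path is hypothetical and `my_sourcetype` is the placeholder name from the example above, neither is given in the thread.

```ini
# inputs.conf on the universal forwarder (sketch; the path is a hypothetical example)
[monitor://C:\logs\myapp\app.log]
# Must match the stanza name used in props.conf on the parsing tier
sourcetype = my_sourcetype
disabled = false
```

In this approach the props.conf and transforms.conf stanzas go on the first full Splunk Enterprise instance the data reaches (the heavy forwarder or indexer), since that is where the nullQueue routing is applied.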
@tscroggins thanks for the steer. I'm close to getting this working, but when I implement the transform it drops my event. The event line looks as follows:
SOMEDATA NO_CLIENT_SITE: MYSYSTEM 10.15.37.48
My props.conf is as follows:
[netlogon]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Custom
pulldown_type = 1
TRANSFORMS-netlogon_send_to_nullqueue = netlogon_send_to_nullqueue
My transforms.conf
[netlogon_send_to_nullqueue]
REGEX = ^(?!NO_CLIENT_SITE).
DEST_KEY = queue
FORMAT = nullQueue
Is the regex at fault here? I have been testing it on regex101, but I cannot see the issue.
As configured, the transform will match and discard all events that do not start with NO_CLIENT_SITE. An event starting with SOMEDATA (any string that isn't NO_CLIENT_SITE) would be discarded. Was that your intent?
My intent is that any event message without the string NO_CLIENT_SITE anywhere in it is discarded.
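It doesn't work that way. The negative lookahead is anchored at the start of the event, so it only checks whether the event begins with NO_CLIENT_SITE.
In props.conf you should do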
TRANSFORMS-netlogon_send_to_nullqueue = netlogon_send_all_to_nullqueue, netlogon_keep_some
And have the netlogon_send_all_to_nullqueue transform send completely _everything_ to nullQueue
[netlogon_send_all_to_nullqueue]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
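And then keep only some of them, the ones matching the string you want. Transforms in the same class are applied in the order listed, so this one overrides the queue for matching events.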
[netlogon_keep_some]
REGEX = NO_CLIENT_SITE
DEST_KEY = queue
FORMAT = indexQueue
OK got it so basically:
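Putting the pieces from this thread together, the combined configuration might look like the sketch below. It reuses the `netlogon` sourcetype and the transform names suggested above, and assumes both files live on the indexer or heavy forwarder that parses the data; it has not been tested here.

```ini
# props.conf (on the indexer or heavy forwarder)
[netlogon]
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
# Order matters: discard everything first, then rescue the events to keep
TRANSFORMS-netlogon_send_to_nullqueue = netlogon_send_all_to_nullqueue, netlogon_keep_some

# transforms.conf (same instance)
[netlogon_send_all_to_nullqueue]
# Matches every non-empty event and routes it to nullQueue
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[netlogon_keep_some]
# Unanchored, so it matches NO_CLIENT_SITE anywhere in the event
REGEX = NO_CLIENT_SITE
DEST_KEY = queue
FORMAT = indexQueue
```

With this in place, a line such as `SOMEDATA NO_CLIENT_SITE: MYSYSTEM 10.15.37.48` should first be routed to nullQueue and then back to indexQueue by the second transform, while lines without NO_CLIENT_SITE stay discarded.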
I'll give it a whirl! Thanks @PickleRick and @tscroggins
Firstly, what do you mean by "structured" here? If you mean INDEXED_EXTRACTIONS, the situation gets complicated because the UF does the parsing and the event is not touched after that (except for ingest actions).
If you just mean well-known, well-formed events, you could try enabling force_local_processing on your UF:
force_local_processing = <boolean>
* Forces a universal forwarder to process all data tagged with this sourcetype locally before forwarding it to the indexers.
* Data with this sourcetype is processed by the linebreaker, aggerator, and the regexreplacement processors in addition to the existing utf8 processor.
* Note that switching this property potentially increases the cpu and memory consumption of the forwarder.
* Applicable only on a universal forwarder.
* Default: false
It's worth noting, though, that it's not a recommended setting and it's not widely used, so you may have trouble finding support if anything goes wrong.
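If you do want to try it, a sketch of what the forwarder-side configuration might look like is below. It assumes the `netlogon` sourcetype and the transform names used elsewhere in this thread, and that the matching transforms.conf stanzas are deployed to the forwarder as well; this is an untested assumption, not a verified setup.

```ini
# props.conf on the universal forwarder (sketch, untested)
[netlogon]
# Run the linebreaker, aggregator and regexreplacement processors on the UF
force_local_processing = true
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
# Assumes these transforms are defined in transforms.conf on the same forwarder
TRANSFORMS-netlogon_send_to_nullqueue = netlogon_send_all_to_nullqueue, netlogon_keep_some
```

The trade-off is the one the spec text describes: filtering happens before the data leaves the host, at the cost of extra CPU and memory on the forwarder.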
I mean structured in terms of each line in the log following a defined structure (space delimited fields) that lends itself to easy parsing.