I have a very noisy app log. I want to use Splunk's index-time filtering so that only the relevant data gets indexed. Basically, I need to match the string 'Error' and forward only the matched line and the line preceding it for indexing. In other words, I need the equivalent of grep -B1 for the string Error, and I only want to index those events. How do I do that?
Example: I have this log data
INFO: Task1
INFO: OK
INFO: Task 2
ERROR: exception xyz
Here, I only want to capture and index this:
INFO: Task 2
ERROR: exception xyz
Hi @mahars01,
are you talking about filtering events at index time (before indexing) or at search time (you index everything and display only the events you need)?
Anyway, it's easy to keep only some events and discard the others, at either index time or search time. It's more difficult to keep one event and also the previous one: I'm not sure that's possible at index time, and maybe not at search time either.
The only way to do it at index time is to pre-parse the log with a script.
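As a rough idea of what such a pre-parsing script could look like, here is a minimal Python sketch that mimics grep -B1 on the sample data from the question. The function name error_with_context and the case-sensitive match on "ERROR" are assumptions made for illustration, not anything Splunk provides.

```python
from typing import Iterable, Iterator


def error_with_context(lines: Iterable[str], needle: str = "ERROR") -> Iterator[str]:
    """Yield every line containing `needle`, preceded by the line just before it.

    Hypothetical helper: it behaves like `grep -B1` without the "--" group
    separators, and never emits the same line twice.
    """
    previous = None  # most recent non-matching line that has not been emitted
    for line in lines:
        if needle in line:
            if previous is not None:
                yield previous
            yield line
            previous = None  # a matching line was already emitted, so it is not reused as context
        else:
            previous = line


# Sample data taken from the question.
sample = [
    "INFO: Task1",
    "INFO: OK",
    "INFO: Task 2",
    "ERROR: exception xyz",
]

for kept in error_with_context(sample):
    print(kept)
# prints:
# INFO: Task 2
# ERROR: exception xyz
```

In practice the same function could read the raw application log and write the kept lines to a separate file, and Splunk would then monitor only that filtered file. That wiring depends on how the log is rotated and tailed, so it is left out here.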
Ciao.
Giuseppe
I am talking about filtering the data before it gets indexed. I do not want to index irrelevant data. I know you can use sed in props.conf, and sed does have that kind of feature that gives you the matched event and the one before it. I'm just not sure how to use that to index only the events I need and discard the rest.
Hi @mahars01,
my hint is to find regexes that identify the irrelevant events in your logs and discard them using the props and transforms method.
For more information, see https://docs.splunk.com/Documentation/Splunk/latest/Forwarding/Routeandfilterdatad#Discard_specific_...
Ciao.
Giuseppe
I had already thought of that. Unfortunately, that's not going to work in my scenario.