I want to collect all the data that comes before a specified text, or that ends with it. I have tried the following:
But then you have to specify how many characters you want to collect before the specified ending text, and I just want to collect all of them. (Note that I'm dealing with Python/Perl regexes here, in Splunk.)
The reason for this is to add the rest to the NullQueue while collecting these.
A quick example of what I have been trying to do, but have been failing at:
[ABC_setnull]
REGEX = .*
DEST_KEY = queue
FORMAT = nullQueue

[ABC_setparsing]
REGEX = (.+?)ABC_.*|(.+?)\>
DEST_KEY = queue
FORMAT = indexQueue
Perhaps you could try something like this with a stanza in transforms.conf that will just keep the event text up to the end pattern:
# props.conf
[yoursourcetype]
TRANSFORMS-yourtransform = scooby-dooby-doo

# transforms.conf
[scooby-dooby-doo]
REGEX = (?m)^(.*)yourendpattern$
FORMAT = $1
DEST_KEY = _raw

This uses an anonymization-based approach rather than null queues.
Thanks for answering. I haven't heard of this before. I have read the documentation, but I can't find the "event-stripper" stanza in either the transforms.conf or the props.conf templates here: http://docs.splunk.com/Documentation/Splunk/5.0.2/Admin/Propsconf
So basically, what you're referring to here is close to the section "Filter data and send rest to NullQueue" here: http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Routeandfilterdatad ?
"event-stripper" is an EXAMPLE name I made up. You can call the stanza anything you like and reference it from props.conf; I have adjusted the example above.
Also, if I understand correctly, you are trying to do more than filter and route raw events: you are trying to split up a raw event and index only part of it.
Not really, I just want to filter events from a log. I thought pushing what I don't want to the NullQueue might work, but it doesn't. I have tried and tried; it has now been an hour and it seems that nothing is being indexed at all! I'm willing to try your recommendation. So this will obviously not create a field called "scooby-dooby-doo", right? It will only index the data that matches the regex? (Note: the data is being forwarded to the indexer from a Forwarder.)
Btw, the main reason for all this is to filter out the events that I don't want to be indexed.
scooby-dooby-doo is an example "stanza name".
Replace with whatever you want.
In the above example, the raw event (DEST_KEY = _raw) is transformed to include only the text before the end pattern that you specify.
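To make the effect of FORMAT = $1 concrete, here is a minimal Python sketch of what the capture group in that example keeps. It is only an approximation: Splunk uses PCRE rather than Python's re module, "yourendpattern" is the placeholder from the example, and the sample event and the strip_event helper are made up for illustration. It assumes that an event which does not match the regex is left unchanged.

```python
import re

# Same pattern shape as the example stanza. "yourendpattern" is the
# placeholder from the thread, not a real marker.
PATTERN = re.compile(r"(?m)^(.*)yourendpattern$")


def strip_event(raw):
    """Return what FORMAT = $1 would keep (hypothetical helper).

    Assumption: an event that does not match is passed through unchanged.
    """
    match = PATTERN.search(raw)
    return match.group(1) if match else raw


# Hypothetical event text, invented for this sketch.
event = "user=alice action=login yourendpattern"
kept = strip_event(event)
print(repr(kept))
```

Group 1 stops just before the end pattern, so the end pattern itself is not part of the rewritten event. To keep it, the closing parenthesis would have to move to after the pattern.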
Yep, it all makes sense, but is there a difference between DEST_KEY = _raw and DEST_KEY = queue?
(I have modified my question)
Btw, does the regex also include the end pattern along with it?
I'm getting lost with your intent now.
Do you want to:
1) route raw events to different queues based on a pattern in the raw event
2) index only certain text from each event (from the original question: collect all data before a specified text or that ends with it)
Well, number (2) seems to be the one. The main purpose of all this is to filter the data and get rid of events that I don't want indexed, so that I won't reach my licensing quota. If you look at the update in my question above, you will notice that I'm picking up what I want from the logs and putting it in the index queue, and the rest, which I don't need and have not specified, goes to the NullQueue.
Hope this makes sense?
Actually, 1) is what you are trying to do. 2) is altering the raw events to strip out text. 1) will route the full events you don't want to a null queue, if you so desire.
[ABC_setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[ABC_setparsing]
REGEX = (.+?)ABC_.*|(.+?)\>
DEST_KEY = queue
FORMAT = indexQueue
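For what it's worth, the routing in option 1) can be approximated with a small Python simulation. It is a sketch, not Splunk's actual implementation. It assumes that the transforms listed on one TRANSFORMS- line in props.conf are applied in order and that the last matching one decides the queue, which is how the Splunk routing documentation describes it. The sample events and the route function are invented for illustration; the stanza names and regexes are the ones from the question.

```python
import re

# (stanza name, regex, target queue), in the order they would be listed
# on the TRANSFORMS- line in props.conf. Regexes are from the question.
TRANSFORMS = [
    ("ABC_setnull", re.compile(r".*"), "nullQueue"),
    ("ABC_setparsing", re.compile(r"(.+?)ABC_.*|(.+?)\>"), "indexQueue"),
]


def route(raw, transforms=TRANSFORMS, default="indexQueue"):
    """Return the queue an event ends up in (hypothetical helper).

    Assumption: every matching transform overwrites the queue, so the
    last match in the list wins.
    """
    queue = default
    for _name, regex, target in transforms:
        if regex.search(raw):
            queue = target
    return queue


# Hypothetical events, made up for this sketch.
events = [
    "2013-03-01 12:00:00 login ok ABC_session=42",
    "2013-03-01 12:00:01 heartbeat",
]
routed = {event: route(event) for event in events}

# Listing the null-queue transform last sends everything to the nullQueue.
reversed_order = list(reversed(TRANSFORMS))
routed_reversed = {event: route(event, reversed_order) for event in events}

print(routed)
print(routed_reversed)
```

Under these assumptions, only the event containing ABC_ reaches the indexQueue when ABC_setnull is listed first, and nothing is indexed when the order is reversed. Note also that the second alternative in the regex keeps any event that has a ">" preceded by at least one character.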
Can you show me an example event? Perhaps your regex is wrong.