I'm dealing with a highly customized access log that isn't being processed properly by the access_combined sourcetype during indexing. Fields aren't being extracted.
Is there a way to write a regex as a search-time extraction that simply does something like split(_raw," ")?
I CANNOT do something like the following, because it's way too slow to rex or split the entire data set (it's huge):
index=blah | rex field=_raw "(?<field1>.+)\s(?<field2>.+)\s" | search field1=wowthiswasslow
It needs to be streaming so that I can search like:
index=blah extracted_field1=thatwasfast
Actually I was making this harder than it had to be. Just go to "Extract more fields" then choose "Delimiter" then choose Space as the delimiter.
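For anyone who prefers config files over the UI: as far as I know, the Delimiter option ends up as a delimiter-based search-time extraction, which can also be written by hand roughly like this. The sourcetype name, stanza name, and field names below are placeholders I made up, so swap in your own.

```
# props.conf
# "my_custom_access" is a placeholder sourcetype name (assumption)
[my_custom_access]
REPORT-space_delim = space_delimited_fields

# transforms.conf
# stanza name and field names are placeholders; list one name per column, in order
[space_delimited_fields]
DELIMS = " "
FIELDS = "extracted_field1", "extracted_field2", "extracted_field3"
```

With that in place, a search like index=blah extracted_field1=thatwasfast should work without any rex in the search string.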
Yup, that’s a better way to parse delimited fields long term.
I honestly can’t imagine splitting by spaces is going to work for web logs, though. What about user agents, which contain spaces themselves?
| eval allfields=_raw | makemv allfields
This may not work well for your specific use case, but it will split the event into as many values as it needs. delim
defaults to a single space, so I left it out of the command.
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Makemv
Hmm ok, I'm looking for it now, but how do I search for something like allfields[0]=blah?
Found it: | eval field7=mvindex(allfields,7) | search field7=200
Still left wondering if this will be slow due to the extra | search
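Putting the pieces from this thread together, the whole search would look something like the one below. Note that mvindex counts from 0, so index 7 is the eighth space-separated value. Every event from index=blah still gets split before the filter runs, so treat this as a sketch of the workaround rather than a replacement for a real search-time extraction.

```
index=blah
| eval allfields=_raw
| makemv allfields
| eval field7=mvindex(allfields,7)
| search field7=200
```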