This can be done, but it may be expensive, so keep an eye on indexing performance. How much data per day are you gathering from this source? You will have to work out whether the saving in license volume is worth the decrease in indexing performance...
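To put a rough number on the license side of that trade-off, something like the sketch below could help. The function name and every figure in it are hypothetical placeholders, not measurements from your data, and it assumes license usage is counted on the cropped event; plug in your own event count and average event sizes before and after cropping.

```python
def daily_savings_gb(events_per_day: int,
                     avg_bytes_before: float,
                     avg_bytes_after: float) -> float:
    """Estimate the reduction in indexed volume per day, in GB (10**9 bytes)."""
    saved_bytes = events_per_day * (avg_bytes_before - avg_bytes_after)
    return saved_bytes / 1e9


# Hypothetical example: 10 million events a day, each cropped from 430 to 310 bytes.
print(daily_savings_gb(10_000_000, 430, 310))
```

The indexing cost of the extra regex has to be measured separately, since it depends on your hardware and event volume.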
Using props and transforms, in the same way that you perform field extractions, you can tell Splunk to change what it considers to be the raw data.
Something like this should be what you want:
In props.conf:
[source::]
TRANSFORMS-set= crop
In transforms.conf:
[crop]
REGEX =
DEST_KEY = _raw
FORMAT = $1
In your case I believe the regex should be something like:
^(.*\s<\w+)
This will capture the following:
|œ ââL1289937535.401 32 10.135.73.188 TCP_MISS/304 229 GET http://photos-b.ak.fbcdn.net/photos-ak-snc1/v27562/209/148475945166653/app_2_148475945166653_1896.gif - DIRECT/photos-b.ak.fbcdn.net image/gif ALLOW_CUSTOMCAT_11-Aurora_Base_Policy-DefaultGroup-NONE-NONE-NONE-DefaultGroup <C_All0
and should throw away the rest:
,-,"-","-",-,-,-,"-","-",-,-,-,"-","-",-,"-","-",-,-,-,-,"-","-","-","-","-","-",57.25,0,-,"-","-">
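Before touching Splunk, the capture can be sanity-checked offline. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, so treat it as an approximation only. The pattern used here, ^(.*\s<\w+), is anchored at the start of the event, which is an assumption about the intent, and the sample event is the one above without its leading non-printable characters. The helper name crop is made up for illustration.

```python
import re

# Pattern anchored at the start of the event (the anchor is an assumption).
CROP_RE = re.compile(r'^(.*\s<\w+)')


def crop(event: str) -> str:
    """Return capture group 1, or the event unchanged when the pattern does not match."""
    match = CROP_RE.search(event)
    return match.group(1) if match else event


# The part we expect to keep, and the part we expect to lose.
head = ('1289937535.401 32 10.135.73.188 TCP_MISS/304 229 GET '
        'http://photos-b.ak.fbcdn.net/photos-ak-snc1/v27562/209/148475945166653/'
        'app_2_148475945166653_1896.gif - DIRECT/photos-b.ak.fbcdn.net image/gif '
        'ALLOW_CUSTOMCAT_11-Aurora_Base_Policy-DefaultGroup-NONE-NONE-NONE-DefaultGroup '
        '<C_All0')
tail = (',-,"-","-",-,-,-,"-","-",-,-,-,"-","-",-,"-","-",-,-,-,-,'
        '"-","-","-","-","-","-",57.25,0,-,"-","-">')

event = head + tail
cropped = crop(event)

print(cropped)
print(len(event) - len(cropped), 'characters dropped')
```

Because .* is greedy, the match runs to the last whitespace followed by < and a word, so an event containing more than one such marker would be cut at the last one, not the first.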
I'd suggest testing in a dev environment first, both to measure the performance impact and to confirm that it works.
Hope this helps,
.gz