Hi,
I have a large logfile, but only want certain data. The data is very well structured:
timestamp|itemId|field1Name|field1Value|field2Name|field2Value...
1385832300000|325447|NormalizedCPUInfo|Utilization|3|CPU|IVEblah|CPU 1
1385832300000|358154|NormalizedCPUInfo|Utilization|1|CPU|FILBlah|CPU 5
1385832300000|330336|NormalizedMemoryInfo|Utilization|94|Memory|WCblah|Memory
1385832300000|326223|NormalizedCPUInfo|Utilization|3|CPU|wCblAH1|BlueCoat CPU3
1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|iPS-sdf|BlueCoat CPU3
1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|R7DALblh|BlueCoat CPU3
1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|C29mmkabc|BlueCoat CPU3
I only want to report on certain devices, which are identified by the 7th field.
My props.conf has the following entry:
TRANSFORMS-set = setnull,setparsing
And my transforms.conf has the following:
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = (?:[^|]+|){6}(FIL|[Ww[Cc]|[Ii][Pp][Ss]|[Ii][Vv][Ee])
DEST_KEY = queue
FORMAT = indexQueue
I expect to receive only events where the 7th field starts with FIL/WC/IPS..., yet I am receiving everything. Did I miss something? These entries are on the indexer, in a distributed environment (forwarder -> indexer -> search head).
According to regexpal.com your regex matches this:
1385832300000|325447|NormalizedCPUInfo|Utilization|3|CPU|IVEblah|CPU 1
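The matched parts are bold. The current regex is not using "|" correctly. Outside square brackets an unescaped "|" means "or", so the field delimiter must be written as "\|", and the alternatives for the 7th field need to sit inside a single group in parentheses, with every bracket pair closed ("[Ww[Cc]" is missing a "]").
If you want to match only the events you have specified then your regex will need to look more like this:
[setparsing]
REGEX = ^(?:[^|]+\|){6}(?:FIL|[Ww][Cc]|[Ii][Pp][Ss]|[Ii][Vv][Ee])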
DEST_KEY = queue
FORMAT = indexQueue
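A candidate pattern can be checked locally against the sample events before it is deployed. The following is a minimal Python sketch, not part of the Splunk configuration: it compares a single bracket expression, which is a character class matching one character, with a grouped alternation that matches whole prefixes. The names EVENTS, CHAR_CLASS, GROUPED and kept_devices are made up for illustration, and Python's re module stands in for the PCRE engine Splunk uses, on the assumption that these basic constructs behave the same in both.

```python
import re

# Sample events copied from the question.
EVENTS = [
    "1385832300000|325447|NormalizedCPUInfo|Utilization|3|CPU|IVEblah|CPU 1",
    "1385832300000|358154|NormalizedCPUInfo|Utilization|1|CPU|FILBlah|CPU 5",
    "1385832300000|330336|NormalizedMemoryInfo|Utilization|94|Memory|WCblah|Memory",
    "1385832300000|326223|NormalizedCPUInfo|Utilization|3|CPU|wCblAH1|BlueCoat CPU3",
    "1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|iPS-sdf|BlueCoat CPU3",
    "1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|R7DALblh|BlueCoat CPU3",
    "1385832300000|326223|NormalizedCPUInfo|cpuIdleUtilization|97|CPU|C29mmkabc|BlueCoat CPU3",
]

# Bracket expression: matches ONE character out of F, I, L, |, W, w, C, c, ...
CHAR_CLASS = re.compile(r"^(\w+\|){6}[FIL|Ww|Cc|Ii|Pp|Ss|Ii|Vv|Ee]")

# Grouped alternation: skips six pipe-delimited fields (pipe escaped as \|),
# then requires the 7th field to start with one of the whole prefixes.
GROUPED = re.compile(r"^(?:[^|]+\|){6}(?:FIL|[Ww][Cc]|[Ii][Pp][Ss]|[Ii][Vv][Ee])")


def kept_devices(pattern, events):
    """Return the 7th field of every event that the pattern matches."""
    return [event.split("|")[6] for event in events if pattern.search(event)]


if __name__ == "__main__":
    print("character class:", kept_devices(CHAR_CLASS, EVENTS))
    print("grouped:        ", kept_devices(GROUPED, EVENTS))
```

With these sample events the character class also keeps C29mmkabc, because a single leading "C" is enough for it, while the grouped alternation keeps only the five devices whose names start with IVE, FIL, WC or IPS in the allowed casing.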
Never mind, got it! Thanks for all the help.
I assume you restarted the instance and deployed this to both the forwarder and the indexer?
Thanks, but that didn't work. I want only events where field 7 begins with FIL/WC/IPS and I'm still getting everything.