Getting Data In

event log filter/chunk out at index time

cpuppet
Path Finder

Aug 3 23:35:01 192.168.11.11 Forwarded from 192.168.11.30: ash: [Wed Aug 03 23:35:01 2011] [error] [client 114.24.189.86] File does not exist: /WEB/roomi/public_html/img, referer: http://www.roomi.com.tw/web/album/index.php?PA=album_photo&photo_id=3997548

Aug 3 23:35:00 192.168.11.11 Forwarded from 192.168.11.44: root: [Wed Aug 03 23:35:00 2011] [error] [client 118.169.76.35] File does not exist: /WEB/roomi/public_html/img, referer: http://www.roomi.com.tw/web/album/index.php?PA=album_photo&photo_id=4018842

Aug 3 23:35:12 192.168.11.11 Forwarded from 192.168.11.36: ash: [Wed Aug 03 23:35:12 2011] [error] [client 113.61.193.153] Directory index forbidden by Options directive: /WEB/roomi/public_html/upload/family/, referer: http://www.roomi.com.tw/obj/swf/avatar/avatar_buchafarm.swf?ver=2.0.0

Aug 3 23:35:12 192.168.11.11 Forwarded from 192.168.11.57: root: [Wed Aug 03 23:35:12 2011] [error] [client 118.171.233.64] File does not exist: /WEB/roomi/public_html/img, referer: http://www.roomi.com.tw/web/album/index.php?PA=album_photo&photo_id=4004079

With the example logs above, I want to strip the unnecessary leading part of each event (the syslog prefix before the first [, originally shown in bold) so that it does not show up in my search results. To be clear, the result should look like this:

[Wed Aug 03 23:35:01 2011] [error] [client 114.24.189.76] File does not exist: /WEB/roomi/public_html/img, referer: http://www.roomi.com.tw/web/album/index.php?PA=album_photo&photo_id=3997548

How am I supposed to filter out the unwanted part of these logs?
I would try using nullFilter for this. Are there any other ways to do it?


BobM
Builder

You can use SEDCMD in your props.conf to edit data as it is being indexed.
Add the following to your props.conf

[mysourcetype]
SEDCMD-filter-forwarded = s/^.*Forwarded from[^\[]*\[/[/

This matches from the beginning of the line through the first [ after "Forwarded from" and replaces it with just the [, so events that do not contain "Forwarded from" are left untouched.
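To preview what that substitution does before touching props.conf, here is a minimal Python sketch. It is not Splunk code: `strip_forward_prefix` is a hypothetical helper name, the sample is a shortened copy of the first event from the question, and it assumes Python's `re` treats this particular pattern the same way the sed-style expression does.

```python
import re

# Python counterpart of the sed expression s/^.*Forwarded from[^\[]*\[/[/
PATTERN = re.compile(r"^.*Forwarded from[^\[]*\[")

def strip_forward_prefix(event: str) -> str:
    """Hypothetical helper: drop everything up to the first '[' that
    follows 'Forwarded from', keeping the '[' itself."""
    return PATTERN.sub("[", event, count=1)

# Shortened copy of the first sample event (referer omitted).
sample = (
    "Aug 3 23:35:01 192.168.11.11 Forwarded from 192.168.11.30: ash: "
    "[Wed Aug 03 23:35:01 2011] [error] [client 114.24.189.86] "
    "File does not exist: /WEB/roomi/public_html/img"
)

cleaned = strip_forward_prefix(sample)
print(cleaned)

# An event without "Forwarded from" passes through unchanged.
untouched = strip_forward_prefix("plain event [no prefix]")
print(untouched)
```

Running it on a handful of real events is a cheap way to check the regex before reindexing.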

You will need to reindex data before you see any changes. If you need to do this at search time, it can be done with a transform but is less optimal.

props.conf

[mysourcetype]
REPORT-filter-forward = filter-forward

transforms.conf

[filter-forward]
DEST_KEY = _raw
REGEX = ^.*Forwarded from[^\[]*(\[.*)
FORMAT = $1

Bob

Eqalis
Explorer

Yes, the editor on the forum seems to lose backslashes unless you are very careful.

cpuppet
Path Finder

Dear Bob,

Thank you so much, the SEDCMD does work!
Yes, as twkan said, the regex given had an extra \ before the [ in the replacement.

twkan
Splunk Employee

Looks good, except that the SEDCMD should be

SEDCMD-filter-forwarded = s/^.*Forwarded from[^\[]*\[/[/

to meet cpuppet's requirement.

jamesaarondevli
Path Finder

Hi cpuppet,

I grabbed the events you listed and ingested them with the inputs.conf entry below. Note the sourcetype value; it comes into play in the props.conf entry further down.

[monitor:///opt/splunk/var/spool/answers/example_log_filter]
sourcetype = example_log_filter
index = answers
crcSalt = <SOURCE>
disabled = false

Then under props.conf I made the following entry:

[example_log_filter]
LINE_BREAKER = (\d\s[0-9:]*\s[0-9\.]*\sForwarded\sfrom\s[0-9.]*:\s\w+:\s)

This seems to work: it removes the whole prefix except its first three characters, which for the events you listed is "Aug". The answer has a limitation I wasn't able to iron out in time: the month of the following event is left at the end of the previous event. See below:

[Wed Aug 03 23:35:12 2011] [error] [client 113.61.193.153] Directory index forbidden by Options directive: /WEB/roomi/public_html/upload/family/, referer: http://www.roomi.com.tw/obj/swf/avatar/avatar_buchafarm.swf?ver=2.0.0
Aug 
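As far as I understand LINE_BREAKER, Splunk breaks the stream where the regex matches and throws away only the text of the first capture group; whatever precedes the group stays with the previous event. The rough Python simulation below shows why "Aug" gets left behind. It is a sketch under that assumption, not Splunk itself: `break_events` is a made-up helper and the two events are shortened copies of the samples.

```python
import re

# Same pattern as the LINE_BREAKER entry above.
LINE_BREAKER = re.compile(
    r"(\d\s[0-9:]*\s[0-9\.]*\sForwarded\sfrom\s[0-9.]*:\s\w+:\s)"
)

def break_events(stream: str) -> list:
    """Hypothetical helper: split where the pattern matches and drop the
    text of the first capture group, roughly as LINE_BREAKER does."""
    pieces = LINE_BREAKER.split(stream)
    # re.split returns [text, group, text, group, ...]; keep only the text.
    return pieces[::2]

# Two shortened sample events, newline-terminated as in a log file.
stream = (
    "Aug 3 23:35:00 192.168.11.11 Forwarded from 192.168.11.44: root: "
    "[Wed Aug 03 23:35:00 2011] [error] first event\n"
    "Aug 3 23:35:12 192.168.11.11 Forwarded from 192.168.11.36: ash: "
    "[Wed Aug 03 23:35:12 2011] [error] second event\n"
)

events = break_events(stream)
for event in events:
    print(repr(event))
```

The capture group starts at the day digit, so the "Aug " in front of it is outside the discarded text and ends up at the tail of the preceding event.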

I hope this helps.

Cheers,
James

cpuppet
Path Finder

Thanks, James. Having an "Aug" at the end of the events will still cause me some minor problems, but thank you for the help.
