Field extraction and regex strategy

asarolkar — Tue, 10 Jul 2012 00:16:20 GMT

I have a log that looks like this:

2012-02-23 09:25:21 VShellSSH2 sftp 108660 172.59.56.8 62386 NESTLE - C:\SFTP\NESTLE\file.csv 0 0 350754 350754 - - "108660: FLETCHER\NESTLE has accessed 'C:\SFTP\NESTLE\file.csv 350754 bytes downloaded"

I need a way to use field extractions to pull out the name of the org (which usually appears after the keyword FLETCHER\) and, similarly, the number of bytes downloaded.

Any pointers on how I should go about it?

Here's what I'm thinking:

i) I will create a field extraction to pick out the name of the org, e.g. NESTLE in this case.
This field will be called org.

ii) I will create a regex that extracts the number of bytes downloaded. This field will be called usage.

iii) I will run a search such as: sourcetype=SFTP_records stats bytes by org

Does that make sense? There are multiple events like this for every org and, needless to say, there are multiple orgs.

Re: Field extraction and regex strategy

carasso — Tue, 10 Jul 2012 00:22:17 GMT

1) You might want to look at the Field Extractor app, which generates the regexes for you when you click on the values you want extracted.

2) You can use one regex to pull out both values. Something like "FLETCHER\\(?<org>\w+).*?(?<bytes>\d+) bytes"

3) Your search needs to be something like: sourcetype=SFTP_records | stats sum(bytes) by org
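
If you want to sanity-check a pattern like the one in point 2 outside Splunk before saving it as an extraction, here is a rough Python sketch. It is only an approximation and rests on a few assumptions: the capture group names org and bytes are taken from the search in point 3, Python's re module spells named groups (?P<name>...) where Splunk's PCRE accepts (?<name>...), and the helper names extract and sum_bytes_by_org are made up for illustration. The second helper only imitates what stats sum(bytes) by org would do; it is not how Splunk computes it.

```python
import re
from collections import defaultdict

# Sample event copied from the question (raw strings keep the backslashes literal).
SAMPLE = (
    r'2012-02-23 09:25:21 VShellSSH2 sftp 108660 172.59.56.8 62386 NESTLE - '
    r'C:\SFTP\NESTLE\file.csv 0 0 350754 350754 - - '
    r'''"108660: FLETCHER\NESTLE has accessed 'C:\SFTP\NESTLE\file.csv 350754 bytes downloaded"'''
)

# Same idea as the regex in point 2, written with Python's (?P<name>...) syntax.
# The group names "org" and "bytes" are assumed from the stats search in point 3.
PATTERN = re.compile(r"FLETCHER\\(?P<org>\w+).*?(?P<bytes>\d+) bytes")


def extract(event):
    """Return (org, bytes) for one event, or None if the pattern does not match."""
    match = PATTERN.search(event)
    if match is None:
        return None
    return match.group("org"), int(match.group("bytes"))


def sum_bytes_by_org(events):
    """Rough stand-in for `stats sum(bytes) by org`: total bytes per org."""
    totals = defaultdict(int)
    for event in events:
        fields = extract(event)
        if fields is not None:  # skip events the pattern does not match
            org, byte_count = fields
            totals[org] += byte_count
    return dict(totals)


if __name__ == "__main__":
    print(extract(SAMPLE))  # ('NESTLE', 350754)
    print(sum_bytes_by_org([SAMPLE, SAMPLE]))
```

Events that do not contain FLETCHER\ followed by a byte count are simply skipped here, so it is worth running a handful of real log lines through it to see which ones fall through.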
