
search time field extraction doesn't work


Hello,
We are trying to resolve a problem where data hasn't been assigned the correct source type.
We have attempted to fix this with search-time field extractions, but nothing seems to work.

The sourcetype has been identified as: wwwwebsitecomauaccesslog-2
The source is: /var/log/httpd/wwwwebsitecomauaccesslog

In props.conf I have tried:

[source::/var/log/httpd/wwwwebsitecomauaccess_log]

rename=access-common

I have tried:
[source::/var/log/httpd/wwwwebsitecomauaccess_log]

sourcetype=access-common

I have tried:
[source::/var/log/httpd/wwwwebsitecomauaccesslog]

TRANSFORMS-fixae = fixaccessextractions

With the complementing transforms.conf

[fixaccessextractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, reqtime, method, uri, root, file, uridomain, uriquery, version, status, bytes, refererurl, refererdomain, refererproto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"

REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:reqtime]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bcdomain:referer]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
FORMAT = sourcetype::access_common
DEST_KEY = MetaData:Sourcetype

Yet when I search on source=/var/log/httpd/wwwwebsitecomauaccess_log, no useful fields are returned.

Thanks in advance
Cam

SAMPLE DATA:

192.168.x.x (192.168.x.x) www.website.com - - [23/May/2013:17:05:44 +8000] "GET /images/external/website_logo.png HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2; MSOffice 12)" 21832 TLSv1 AES128-SHA

Please see http://pastebin.com/zYBgyhhn for raw question


Hi cam343

You're mixing things up here: you are setting up an index-time field extraction, not a search-time one. This means only newly indexed events will have those fields, not the older events.

It may help to test your field extraction in the Search app one field at a time, building it up until you get what you want, for example:

 ...  | rex "^(?<clientip>(\d{3}\.){2}x\.x)"

This matches the first IP in your log data and creates a new field called clientip in your search results (a capture group name cannot contain a colon, so the field is plain clientip). This way you can build up the regex and then use it in transforms.conf to have the fields extracted at index time for any new event.
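The same approach extends to more fields. A rough sketch, written only against the single sample event in the question; the field names forwarded_ip and vhost are made up for illustration, and the source path is copied from the question:

```
source="/var/log/httpd/wwwwebsitecomauaccess_log"
| rex "^(?<clientip>\S+)\s+\((?<forwarded_ip>[^)]*)\)\s+(?<vhost>\S+)"
| rex "\"(?<method>\S+)\s+(?<uri>\S+)\s+(?<version>[^\"]+)\"\s+(?<status>\d{3})"
| table clientip forwarded_ip vhost method uri version status
```

Adding one rex at a time makes it easy to see which part of the pattern stops matching.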

As always, the docs are a good place to start:
http://docs.splunk.com/Documentation/Splunk/5.0.2/Knowledge/Addfieldsatsearchtime
http://docs.splunk.com/Documentation/Splunk/5.0.2/Data/Configureindex-timefieldextraction

hope this helps

cheers,
MuS


I'm a bit confused. I don't see any statement telling Splunk to apply search-time field extractions. You've got a TRANSFORMS statement there, but that is index-time, not search-time.
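For reference, search-time extractions are declared in props.conf with EXTRACT-<class> (an inline regex) or REPORT-<class> (pointing at a transforms.conf stanza), not TRANSFORMS-<class>. The sample event also has extra fields (a second IP in parentheses and a host name) ahead of the ident field, so the stock access-common pattern may not match it as-is. A minimal sketch, assuming the sourcetype name from the question; the class name custom_access and the fields forwarded_ip and vhost are hypothetical, and the regex has only been checked against the one sample line:

```
# props.conf on the search head (sketch, not a tested configuration)
# Stanza name assumed from the sourcetype reported in the question
[wwwwebsitecomauaccesslog-2]
# EXTRACT- is applied at search time, so it also works on already-indexed events
EXTRACT-custom_access = ^(?<clientip>\S+)\s+\((?<forwarded_ip>[^)]*)\)\s+(?<vhost>\S+)\s+(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+"(?<method>\S+)\s+(?<uri>\S+)\s+(?<version>[^"]+)"\s+(?<status>\S+)\s+(?<bytes>\S+)\s+"(?<referer>[^"]*)"\s+"(?<useragent>[^"]*)"
```

Because this runs at search time, no re-indexing should be needed to see the fields on existing events.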
