adding a new log format: detect all the fields

New Member

Hello,

I am trying to add a new custom log format so that Splunk can recognize all the fields in this log:

    #proxy_code c_ip "user" "profile" timestamp "url" http_status "user_agent" _time
    - 10.20.11.24 "user1" "profile1" [1/Nov/2010:11:44:51 +0100] "GET http://example.com/ HTTP/1.1" 200 "Mozilla/4.0" 1289818694
    4 10.20.13.19 "user3" "profile2" [1/Nov/2010:11:44:54 +0100] "GET http://server1.com/ HTTP/1.1" 200 "Mozilla/4.0" 1289818697
    - 10.20.12.16 "-" "-" [1/Nov/2010:11:44:54 +0100] "GET http://www.example2.com/ HTTP/1.1" 200 "Mozilla/4.0" 1289818697
    80 19.55.54.22 "user10" "profile5" [1/Nov/2010:11:44:54 +0100] "GET http://abc.server.com/ HTTP/1.1" 200 "MSIE" 1289818697

The goal is to run queries such as: source=proxy profile=profile2 proxy_code=4 etc.

My steps:

1. Create a new etc\apps\search\local\transforms.conf with a new sourcetype:

    [proxy]
    REGEX = ^([0-9\-]*) ([0-9\.]+) "([^"]+)" "([^"]+)" (\[[^\]+\]) ("[^"]+") ([0-9\-]+) ("[^"]+") ([0-9]*)
    FORMAT = proxy_code::$1 c_ip::$2 user::$3 profile::$4 timestamp::$5 url::$6 http_status::$7 user_agent::$9 _time::$14

2. Create etc\apps\search\local\inputs.conf:

    [nullPound]
    REGEX = ^\#
    DEST_KEY = queue
    FORMAT = nullQueue

    [monitor://c:\proxylogs]
    disabled = false
    followTail = 0
    host = proxy
    sourcetype = proxy

3. Create etc\apps\search\local\props.conf:

    [proxy]
    TRANSFORMS-logformat = proxy

4. Restart Splunk.

I can find the events with sourcetype="proxy", but the fields are not recognized; for example, c_ip="10.20.11.24" doesn't work.

The comments are not removed, despite the nullPound rule in transforms.conf.

Am I missing something?

BR

PS:

Super Champion

Following up on what Genti said. You really don't want to do indexed fields; you want search-time field extractions. So move your "proxy" from a TRANSFORMS to a REPORT entry. Also, I'd be extra careful extracting a field called "_time", since this is a built-in and very important field used internally by Splunk.
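
As a rough sketch of that suggestion (not a tested configuration), the search-time version might look like the following. The stanza name proxy_fields and the field name epoch are placeholders introduced here, and the regex is a simplified rewrite of the one in the question, which defines nine capture groups while its FORMAT refers to $14. Unlike the original, this version leaves the surrounding brackets and quotes out of the captured values.

    # props.conf
    [proxy]
    # REPORT-<class> applies the transform at search time instead of index time
    REPORT-logformat = proxy_fields

    # transforms.conf
    [proxy_fields]
    REGEX = ^(\S+) (\S+) "([^"]*)" "([^"]*)" \[([^\]]+)\] "([^"]*)" (\S+) "([^"]*)" (\d+)
    # The trailing epoch value goes into a field named epoch (hypothetical name) rather than _time
    FORMAT = proxy_code::$1 c_ip::$2 user::$3 profile::$4 timestamp::$5 url::$6 http_status::$7 user_agent::$8 epoch::$9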

Splunk Employee

First, you say:

2. create etc\apps\search\local\inputs.conf:

    [nullPound]
    REGEX = ^#
    DEST_KEY = queue
    FORMAT = nullQueue

Either this is a typo in your question, or you have put this stanza in the wrong config file; it belongs in transforms.conf.

Second, it seems you never actually call the [nullPound] transform. Something like this in props.conf should work:

    [proxy]
    TRANSFORMS-logformat = proxy, nullPound
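
Putting the first two points together, the stanzas might be laid out across the three files roughly as follows. This is a sketch assembled from the settings already shown in this thread, not a tested configuration; the [proxy] transform from the question stays in transforms.conf and is omitted here.

    # inputs.conf: only the monitor stanza
    [monitor://c:\proxylogs]
    disabled = false
    followTail = 0
    host = proxy
    sourcetype = proxy

    # transforms.conf: the nullQueue rule lives here
    [nullPound]
    REGEX = ^#
    DEST_KEY = queue
    FORMAT = nullQueue

    # props.conf: the sourcetype calls both transforms
    [proxy]
    TRANSFORMS-logformat = proxy, nullPound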

Third, do you need these field extractions to happen at index time? Have you thought of using search-time field extractions instead? (It would make your indexing faster.) http://www.splunk.com/base/Documentation/4.1.5/Knowledge/Createandmaintainsearch-timefieldextraction...

More specifically, this should be helpful: Extract multiple fields using one regex

This is an example of a field extraction that pulls out five separate fields. You can then use these fields in concert with some event types to help you find port flapping events and report on them.

Here's a sample of the event data that the fields are being extracted from:

    #%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet9/16, changed state to down

The stanza in props.conf for the extraction looks like this:

    [syslog]
    EXTRACT-<port_flapping> = Interface\s(?<interface>(?<media>[^\d]+)(?<slot>\d+)\/(?<port>\d+))\,\schanged\sstate\sto\s(?<port_status>up|down)
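
Applied to the proxy log from the question, the same inline approach might look like the sketch below. It is untested; the class name proxy_fields and the field name epoch (used in place of _time) are placeholders, and the pattern assumes every line follows the sample layout exactly.

    [proxy]
    EXTRACT-proxy_fields = ^(?<proxy_code>\S+)\s(?<c_ip>\S+)\s"(?<user>[^"]*)"\s"(?<profile>[^"]*)"\s\[(?<timestamp>[^\]]+)\]\s"(?<url>[^"]*)"\s(?<http_status>\S+)\s"(?<user_agent>[^"]*)"\s(?<epoch>\d+)

With an EXTRACT entry the regex sits directly in props.conf, so no transforms.conf stanza is needed for the fields.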