What is the best custom log event format for Splunk to eat?

maverick
Splunk Employee

I intend to modify my app/script so that it writes out a completely custom log file format for Splunk to monitor and index in real time.

What is the optimal format for my custom log events, such that Splunk automatically extracts ALL of my fields and the timestamp and I do not have to set up or configure any field extractions myself?

1 Solution

Mick
Splunk Employee

The optimal log format is -

timestamp key=value key=value key=value key=value key=value key=value key=value key=value

You can have other delimiters in there too like , or : but that's pretty much a personal preference. If the keys and values are easily recognizable, Splunk will index and search as fast as you can write it out.
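A minimal sketch of a writer for this layout, in Python. The function name format_event and the field names are hypothetical, and quoting values that contain whitespace is an assumption added here so that each value stays recognizable as a single unit. It does not escape double quotes inside a value.

```python
from datetime import datetime, timezone


def format_event(fields, when=None):
    """Render one log line as: timestamp key=value key=value ..."""
    when = when or datetime.now(timezone.utc)
    # The timestamp goes first, with an explicit UTC offset.
    parts = [when.strftime("%Y-%m-%d %H:%M:%S %z")]
    for key, value in fields.items():
        text = str(value)
        # Assumption: wrap values containing whitespace in double quotes.
        if any(ch.isspace() for ch in text):
            text = '"' + text + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    print(format_event({"action": "login", "user": "alice", "status": "ok"}))
```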

tcperkin
New Member

I use:

key="value" || key="value" || key="value"

My props.conf looks like this:

[my_sourcetype]
KV_MODE = none
REPORT-event = my_sourcetype_event

My transforms.conf looks like this:

[my_sourcetype_event]
MV_ADD = true
KEEP_EMPTY_VALS = true
REGEX = ([^=(\s+\|\|\s+)]*?)\s*\=\s*(.)((?:[^\2]|[^=])*?)\2+?(?:\s+\|\|\s+|$)
FORMAT = $1::$3

My events look like this:

timestamp="2012-02-24 17:39:19 -0800 (PST)" || type="php" || message="my message" || variables_type="Warning" || variables_message="blah" || variables_function="sure" || variables_file="file.php" || variables_line="958" || severity="error" || user_uid="1212" || user_language="fr" || user_ctry_cd="AX" || user_name="nada" || user_init="124124" || user_is_employee="no" || request_uri="http://foo.com/sure" || referer="http://bar.com/foo" || ip="10.10.10.10" || message_id="6"

The regex I made is pretty cool. It'll let you do:

key=[any character]valuebla[any character]hvalue[any character] ||

For example:

dog="spot" ||
alien='zonk' ||
fruit=^apple^ ||
broken=#not#brok#en# ||
horriblekey="imnothorrible="yesyouare" ishouldbemyownfield="wellyouwont" i="give="up""" ||
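If you want to sanity-check events in this layout before they reach Splunk, here is a rough Python sketch. It is not a port of the regex above but a simplified stand-in: it splits on || and treats the first character of each value as its delimiter. The function name parse_event is hypothetical, and it assumes every value is wrapped in a delimiter and that no value contains ||.

```python
def parse_event(line):
    """Split a 'key="value" || key="value"' line into a dict."""
    fields = {}
    for chunk in line.split("||"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ' ||'
        key, sep, value = chunk.partition("=")
        if not sep:
            continue  # no '=' in this chunk, so not a key/value pair
        value = value.strip()
        # The first character of the value is treated as its delimiter;
        # strip it when the value also ends with that character.
        if len(value) >= 2 and value[0] == value[-1]:
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


if __name__ == "__main__":
    print(parse_event("dog=\"spot\" || alien='zonk' || fruit=^apple^ ||"))
```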

joshualarkins
Explorer

Your transforms.conf worked amazing for me. All I had to do was format my source events like yours. Thank you!

RubenOlsen
Path Finder

There are several ways to deal with the Sql_Text=Select * from Table1 where uname="dummy" case.

One way, which works if Sql_Text=something is at the end of the log event, is to use field extractions (i.e. EXTRACT) in the props.conf file:

EXTRACT-Sql_Text = Sql_Text=(?<Sql_Text>.+)$

You could even do this directly in the search app without touching props.conf. The following should give you the count of the 10 most used Sql_Text expressions, grouped by the ClientIP field:

* | rex field=_raw " Sql_Text=(?<SqlText>.+)$" | stats count by ClientIP, SqlText | sort 10 -count
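To see why anchoring on the end of the event helps, here is a small Python sketch of the same pattern run outside Splunk. Python spells named groups as (?P<name>...), and the function name extract_sql_text is hypothetical. The sample event is the one from the question in this thread.

```python
import re

# Same idea as the rex above: everything after " Sql_Text=" up to the end
# of the event is one value, whatever spaces, quotes or '=' it contains.
SQL_TEXT = re.compile(r" Sql_Text=(?P<SqlText>.+)$")


def extract_sql_text(event):
    """Return the Sql_Text value, or None if the event has no such field."""
    match = SQL_TEXT.search(event)
    return match.group("SqlText") if match else None


if __name__ == "__main__":
    sample = (
        "May 26 18:14:15 myhostname DBIP=10.5.10.2 Service=OracleXE "
        "ClientIP=75.149.38.65 SrcPort=80 DestPort=8080 UID=10534 "
        'Sql_Text=Select * from Table1 where uname="dummy"'
    )
    print(extract_sql_text(sample))
```

This only works while Sql_Text stays the last field in the event; anything written after it would be swallowed into the value.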

pero1234
Path Finder

What if you want to log SQL commands like this?

Example:

May 26 18:14:15 myhostname DBIP=10.5.10.2 Service=OracleXE ClientIP=75.149.38.65 SrcPort=80 DestPort=8080 UID=10534 Sql_Text=Select * from Table1 where uname="dummy"

As you can see, timestamp key=value key=value key=value ... does not work well in this example, and , or : are not good delimiters either, because all of these characters can appear inside SQL commands, which breaks the field extraction.

MillerTime
Splunk Employee

Something like this:

Generic Example:

[Timestamp] Hostname HostIP=IPaddress Service=ServiceName ClientIP=IPaddress SrcPort=port# DestPort=port# UID=value Stuff=blah Morestuff=blahblah

Specific Example:

May 26 18:14:15 myhostname HostIP=10.5.10.2 Service=CustomLogger ClientIP=75.149.38.65 SrcPort=80 DestPort=8080 UID=10534 ImportantValue=Be9r87 AnotherImportantValue=310984

juansh2809
New Member

Hello Mick. Could you share a log format example? What is the timestamp format?

RubenOlsen
Path Finder

The timestamp should be in ISO 8601 form, i.e. variants of YYYY-MM-DD HH:MM:SS.mmm TZ DST.

Example: 2011-10-24 14:04:02 +0200 DST

If you do not want (or need) the time zone or Daylight Saving Time designators, these may be omitted.
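A minimal sketch of producing such a timestamp from Python, with milliseconds and a numeric UTC offset. The function name format_timestamp is hypothetical, and the sketch leaves out the DST designator, which is an assumption on my part rather than a requirement.

```python
from datetime import datetime, timedelta, timezone


def format_timestamp(when):
    """Format an aware datetime as YYYY-MM-DD HH:MM:SS.mmm +HHMM."""
    if when.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    millis = when.microsecond // 1000  # truncate microseconds to milliseconds
    date_part = when.strftime("%Y-%m-%d %H:%M:%S")
    offset = when.strftime("%z")  # numeric offset such as +0200
    return f"{date_part}.{millis:03d} {offset}"


if __name__ == "__main__":
    cest = timezone(timedelta(hours=2))
    print(format_timestamp(datetime(2011, 10, 24, 14, 4, 2, tzinfo=cest)))
```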
