- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I intend modify my app/script so that it will write out a completely custom log file format for Splunk to monitor and index in real-time.
What is the best, most optimal format to use for my custom log event such that Splunk automatically extracts ALL of my fields and the timestamp and I do not have to setup or configure any field extractions myself.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

The optimal log format is -
timestamp key=value key=value key=value key=value key=value key=value key=value key=value
You can have other delimiters in there too like , or : but that's pretty much a personal preference. If the keys and values are easily recognizable, Splunk will index and search as fast as you can write it out.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I use:
key="value" || key="value" || key="value"
My props.conf looks like this:
[my_sourcetype]
KV_MODE = none
REPORT-event = my_sourcetype_event
My transforms.conf looks like this:
[my_sourcetype_event]
MV_ADD = true
KEEP_EMPTY_VALS = true
REGEX = ([^=(\s+\|\|\s+)]*?)\s*\=\s*(.)((?:[^\2]|[^=])*?)\2+?(?:\s+\|\|\s+|$)
FORMAT = $1::$3
My events look like this:
timestamp="2012-02-24 17:39:19 -0800 (PST)" || type="php" || message="my message" || variables_type="Warning" || variables_message="blah" || variables_function="sure" || variables_file="file.php" || variables_line="958" || severity="error" || user_uid="1212" || user_language="fr" || user_ctry_cd="AX" || user_name="nada" || user_init="124124" || user_is_employee="no" || request_uri="http://foo.com/sure" || referer="http://bar.com/foo" || ip="10.10.10.10" || message_id="6"
The regex I made is pretty cool. It'll let you do:
key=[any character]valuebla[any character]hvalue[any character] ||
For example:
dog="spot" ||
alien='zonk' ||
fruit=^apple^ ||
broken=#not#brok#en# ||
horriblekey="imnothorrible="yesyouare" ishouldbemyownfield="wellyouwont" i="give="up""" ||
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Your transforms.conf worked amazing for me. All I had to do was format my source events like yours. Thank you!
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
There are several ways to deal with the Sql_Text=Select * from Table1 where uname="dummy"
One way which will work if the Sql_Text=something is at the end of a log event is to use filed extractions (i.e. EXTRACT) in the props.conf file:
EXTRACT-Sql_Text = Sql_Text=(?
.+)$
You could even do this directly in the search app without using the props.conf stuff. The following should give you a list with the count of the 10 most used Sql_Text expression grouped by the ClientIP field:
* | rex field=_raw " Sql_Text=(?<SqlText>.+)$" | stats count ClientIP, SqlText | sort 10 -count
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
What if you want log sql commands like this:
Example:
May 26 18:14:15 myhostname DBIP=10.5.10.2 Service=OracleXE ClientIP=75.149.38.65 SrcPort=80 DestPort=8080 UID=10534 Sql_Text=Select * from Table1 where uname="dummy"
As you can see timestamp key=value key=value key=value ... in this example is not good and , or : is not good delimiters because all of this delimiters can be in sql commands which cause broken extract fields.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Something like this:
Generic Example:
[Timestamp] Hostname HostIP=IPaddress Service=ServiceName ClientIP=IPaddress SrcPor=port# DestPort=port# UID=value Stuff=blah Morestuff=blahblah
Specific Example:
May 26 18:14:15 myhostname HostIP=10.5.10.2 Service=CustomLogger ClientIP=75.149.38.65 SrcPort=80 DestPort=8080 UID=10534 ImportantValue=Be9r87 AnotherImportantValue=310984
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello Mick. Could you share a log format example? What is the timestamp format?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

The optimal log format is -
timestamp key=value key=value key=value key=value key=value key=value key=value key=value
You can have other delimiters in there too like , or : but that's pretty much a personal preference. If the keys and values are easily recognizable, Splunk will index and search as fast as you can write it out.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The time stamp should be in ISO8601 form - i.e. variants of YYYY-MM-DD HH:MM:SS.mmm TZ DST.
Example: 2011-10-24 14:04:02 +0200 DST
If you do not want (or need) the time zone of Daylight Savings Time designators - these may be omitted.
