All Apps and Add-ons

MySQL slow query log parsing

redventures
Explorer

I have a mysql slow.log being piped into splunk and works great. Splunk also seems to do a good job of separating the queries out. The only issue I have is I want to have splunk parse out the fields within the comments containing query time and similar. I can't seem to readily get this to work. See sample below:

/* /usr/share/www/redventures.com/data/reddash/dashboard.html / / DBHost: rv-atl-db06 via TCP/IP / / Server: rv-atl-web209 */;

Time: 110325 2:58:25

User@Host: intranet[intranet] @ rv-atl-web209-db [10.24.24.209]

Query_time: 4 Lock_time: 0 Rows_sent: 101 Rows_examined: 712668

SELECT CC.AgentID, COUNT(DISTINCT O.OrderID) AS OrdersStarted, COUNT(DISTINCT IF(O.CreditApproved = 'notchecked', .....

I want the Query_time, Lock_time and so forth to be added to the search as a name/value pair so I can run additional queries.

Ideas?

Tags (1)

mc0e
New Member

Jordan's answer is not exhaustive either.

SET timestamp=1376020745;
# administrator command: Init DB;
# User@Host: xxxx[xxxx] @ web-xxxx.private [10.x.x.x]
# Thread_id: 8343655  Schema: xxxx  QC_hit: No
# Query_time: 3.179482  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 1

I thought it might be doable by looking for a line that starts with '#' which follows a line which does not start with '#', but there can be comments inside a query.

It's very much an edge case, but a query that is followed by a comment is likely to also fail, as the semi-colon is not last.

I wonder if mysql has any special handling for comments like '# Time: foo' that are passed within queries.

0 Karma

Jordan_Brough
Path Finder

Jason's answer doesn't handle the fact that not every mysqlslow.log event has the "Time: " line -- it appears that mysql only logs that line when the time has actually changed since the last event(..?)

This set of configs seems to be working robustly for me:

inputs.conf

[monitor:///path/to/file]
sourcetype = mysqlslow


splunk/etc/system/local/props.conf

[mysqlslow]
# Use LINE_BREAKER to segment events.  An event may start with either "Time:"" or "User@Host:".
# Look for a semicolon (termination of previous event) followed by one of those items.
# However, the very first entry won't have a ";" before it. Instead it is preceeded by "Argument".
LINE_BREAKER = (?:;|Argument)(\n)(?:\# Time: |\# User@Host: )
SHOULD_LINEMERGE = false
TRUNCATE = 0


splunk/etc/apps/search/local/props.conf

[mysqlslow]
EXTRACT-mysqlslow-user-line = User@Host: (?<user>[^\s@]+) @ (?<query_host>\S+) \[(?<query_ip>[\d.]+)\]
EXTRACT-mysqlslow-query-time-line = Query_time: (?<query_time>\d+)  Lock_time: (?<lock_time>\d+)  Rows_sent: (?<rows_sent>\d+)  Rows_examined: (?<rows_examined>\d+)
EXTRACT-mysqlslow-query-line = \n(?<query>[^#].*)$

Jason
Motivator

I had to figure out MySQL slow query logs today. These configs should help you out. They tell Splunk to only break on a line starting with # Time: 110408 12:34:56, get the time format right, and extract some other fields.

--inputs.conf--
[monitor:///path/to/file]
# input settings, send to the splunk that is collecting
sourcetype = mysql:slow

--props.conf--
[mysql:slow]
# index-time settings, send to the indexer
SHOULD_LINEMERGE = true
TIME_FORMAT = Time: %y%m%d %k:%M:%S
BREAK_ONLY_BEFORE = #\sTime:\s\d{6}\s[\s\d]\d:\d\d:\d\d
MAX_EVENTS = 512
# search-time settings, sent to search head
REPORT-query_for_mysql_slow = query_for_mysql_slow
REPORT-user_src_for_mysql_slow = user_src_for_mysql_slow
REPORT-times_rows_for_mysql_slow = times_rows_for_mysql_slow

--transforms.conf--
# search-time settings, sent to search head
[query_for_mysql_slow]
REGEX = Rows_examined:\s\d+[\r\n\s]+([\s\S]+)
FORMAT = query::$1

[user_src_for_mysql_slow]
REGEX = User@Host:\s([^[]+)\[([^]]+)\]\s?@\s+?\[([^]]+)\]
FORMAT = user::$1 src_user::$2 src::$3 src_bestmatch::$3

[times_rows_for_mysql_slow]
REGEX = Query_time:\s([\d.]+)\s++Lock_time:\s([\d.]+)\s++Rows_sent:\s(\d+)\s++Rows_examined:\s(\d+)
FORMAT = query_time::$1 lock_time::$2 rows_sent::$3 rows_examined::$4

redventures
Explorer

Very helpful, works great.

0 Karma

tpsplunk
Communicator

Jason- i suggest the following update to remove the "SET Timestamp" portion out of the query field extraction (with it in it makes nearly every query unique because the timestamp is always changing):

[query_for_mysql_slow]
REGEX = Rows_examined:\s\d+[\r\n\s]+SET\s+timestamp=\d+;([\s\S]+)
FORMAT = query::$1

0 Karma

tpsplunk
Communicator

Jason, this was really helpful, thanks!

0 Karma
Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!