Splunk Search

How to change stats table format from 1x9 to 3x3?

Communicator

In each log event, I have 3 fields that record the number of rows inserted, updated, and deleted. I can compute statistics on each of these fields (e.g. average, 2 standard deviations above the mean, 2 standard deviations below the mean). However, when I display them using stats, I get 1 row with 9 columns. Since the stats are essentially repeated 3 times, I would like to change the format to a table with 3 rows (one each for inserted, updated, and deleted) and 3 columns (avg, -2sd, +2sd). What's the best way to do this?

Typically, if you have one field (e.g. date) that varies across events, you can use stats avg(field) by date to split the data into multiple rows. However, my logs don't fit that format. Here's my search query so far:

... | stats avg(InsertRecordCount) as Insert, stdev(InsertRecordCount) as Insert_stdev,
        avg(UpdateRecordCount) as Update, stdev(UpdateRecordCount) as Update_stdev,
        avg(DeleteRecordCount) as Delete, stdev(DeleteRecordCount) as Delete_stdev

Here's another way that might work; I just don't know how to pull it off. Since I'm taking the same statistical values (avg, -2sd, +2sd) from each field, suppose I could somehow (in Splunk) split the 1 log containing 3 fields (insert=123, update=234, delete=012) into 3 logs each containing 2 fields (type=(insert|update|delete), number=(123|234|012)). Then I could run the same stats command as above and add 'by type' at the end.
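If the events could be reshaped that way, the search I have in mind would collapse to something like this (type and number are the hypothetical fields described above):

```
... | stats avg(number) as avg, stdev(number) as stdev by type
| eval minus2sd = avg - 2 * stdev, plus2sd = avg + 2 * stdev
| table type, minus2sd, avg, plus2sd
```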

Thanks in advance!


Communicator

You're right; it's not pretty. After thinking about this for a while, I came up with an approach that made sense to me. I'll share it so people have 2 different ways to attack the problem.

In my last comment on the question, I mentioned splitting the 3 fields into 2 fields: one describing the type (insert, update, delete) and the other describing the number of records. Think of this as 'unpivoting' the data. I followed this train of thought, and I'll (try to) explain it below.

Suppose you have 3 fields all in 1 log event:
Insert=343
Update=512
Delete=729

You can use the eval command to combine all 6 pieces of data into one string: 'Insert,343.Update,512.Delete,729'. Then you can use the split() eval function to break that string into 3 parts (using the period as the delimiter). Next, you can use the mvexpand command to expand the 3 parts into 3 different events. Lastly, you can use the rex command to create 2 new fields: one called type, with values Insert, Update, or Delete, and one called number, with values 343, 512, or 729.

In the end, you have turned the 1 log event into 3 events, each with the 2 new fields. Now you can run avg, stdev, and the other stats functions once, adding 'by type' to produce the 3 rows.

Here's the code I used:

| eval newfield="Insert,"+Insert+".Update,"+Update+".Delete,"+Delete
| eval mysplit=split(newfield,".")
| mvexpand mysplit
| rex field=mysplit "(?<type>[^,]+),(?<number>.*)"
| table type, number
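For a quick end-to-end check, here's a self-contained version that uses makeresults to fake one event with the three fields (values taken from the example above), then appends the 'by type' stats step from the question:

```
| makeresults
| eval Insert="343", Update="512", Delete="729"
| eval newfield="Insert,"+Insert+".Update,"+Update+".Delete,"+Delete
| eval mysplit=split(newfield,".")
| mvexpand mysplit
| rex field=mysplit "(?<type>[^,]+),(?<number>.*)"
| stats avg(number) as avg, stdev(number) as stdev by type
| eval lower = avg - 2 * stdev, upper = avg + 2 * stdev
| table type, lower, avg, upper
```

With only one event per type, stdev comes out 0 and the three columns are equal, but it shows the reshape working; against real data each type gets its own lower, avg, and upper.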

SplunkTrust

This isn't going to be pretty.

I'm assuming you have data something like this:

| stats count | fields - count
| setfields insert_avg = 10, insert_stdev = 2, update_avg = 5, update_stdev = 1, delete_avg = 4, delete_stdev = 3
| foreach *_avg
  [ eval <<MATCHSTR>>_lower = <<FIELD>> - 2 * <<MATCHSTR>>_stdev
  | eval <<MATCHSTR>>_upper = <<FIELD>> + 2 * <<MATCHSTR>>_stdev
  | fields - <<MATCHSTR>>_stdev
  ]

That'll produce a single line with nine fields for insert, update, and delete and their lower, avg, and upper values.

Using that dummy data, you could do this:

... | eval operation = "insert update delete" | makemv operation | mvexpand operation
| foreach insert_*
  [ eval <<MATCHSTR>> = case(operation=="delete", delete_<<MATCHSTR>>,
    operation=="update", update_<<MATCHSTR>>,
    operation=="insert", insert_<<MATCHSTR>>)
  ] | table operation lower avg upper

That'll first split the row into three rows, one for each operation. Then it'll pick out the values of the appropriate operation_lower, operation_avg, and operation_upper fields for the operation of that row.
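The setfields row is just stand-in data; against the question's actual events, the first stage could instead come from the stats search in the question (a sketch, reusing the same foreach template; the field names are from the question):

```
... | stats avg(InsertRecordCount) as insert_avg, stdev(InsertRecordCount) as insert_stdev,
        avg(UpdateRecordCount) as update_avg, stdev(UpdateRecordCount) as update_stdev,
        avg(DeleteRecordCount) as delete_avg, stdev(DeleteRecordCount) as delete_stdev
| foreach *_avg
  [ eval <<MATCHSTR>>_lower = <<FIELD>> - 2 * <<MATCHSTR>>_stdev
  | eval <<MATCHSTR>>_upper = <<FIELD>> + 2 * <<MATCHSTR>>_stdev
  | fields - <<MATCHSTR>>_stdev ]
```

From there, the mvexpand and second foreach stages apply unchanged.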

There might be a nicer way, but I can't think of one right now 🙂