Hello community,
I have a string .net clearing cache request for user took this many miliseconds:
and .net clearing cache request for listing took this many miliseconds:
. How can I chart the numbers into a line graph for both values after the :
so my output in a graph will show the latency numbers for the user and listing? Here is an example output from the logs:
8/7/15
1:24:04.913 PM
[2015-08-07 13:24:04.913][INFO ] .net clearing cache request for user took this many miliseconds: 4
source = /var/log/jboss/documentService.log
8/7/15
1:24:04.913 PM
[2015-08-07 13:24:04.913][INFO ] .net clearing cache request for user took this many miliseconds: 4
source = /var/log/jboss/documentService.log
8/7/15
1:24:04.908 PM
[2015-08-07 13:24:04.908][INFO ] .net clearing cache request for listing took this many miliseconds: 7
source = /var/log/jboss/documentService.log
8/7/15
1:24:04.908 PM
[2015-08-07 13:24:04.908][INFO ] .net clearing cache request for listing took this many miliseconds: 7
source = /var/log/jboss/documentService.log
8/7/15
1:22:44.708 PM
[2015-08-07 13:22:44.708][INFO ] .net clearing cache request for user took this many miliseconds: 5
So you really have two problems. The first is to parse your lines to get the numbers to mean something, the second is to then plot those.
To parse, you can do this right in the search*. The portion of the lines you want to parse are...
listing took this many miliseconds: 7
user took this many miliseconds: 5
Where "listing" or "user" could be parsed as, let's call it "type". Then the 7 or 5 we'll parse as "value". You can easily change the names in the below.
In your search, you'll have whatever you need that returns the above rows in Splunk at the front (the "...") then after that you'll put a pipe and a rex command, like so.
... | rex "(?<type>\w+) took this many miliseconds: (?<value>\d+)"
Once you do that, you should see in Splunk in the "Fields" section (on the left) two new fields, one called "type" and another called "value." Rex is powerful, but is often confusing at first. You'll see that we took the parts we wanted and told it to pull out two little pieces. \w+ means "a series of 'word' type characters" and the \d+ says "a series of one or more digits". All the "took this many miliseconds:" stuff in the middle will match exactly but won't do anything with it, it's just text it matches on.
NOTE this is not a terribly robust way to do this, but it should work fine for your needs. A better rex can be developed easily, but I thought it more important for it to be easy to understand for this example. As long as it works, we're good. 🙂
Now, to display value by type. After the command above, use the timechart command after it. It'll be something like...
... | rex "(?<type>\w+) took this many miliseconds: (?<value>\d+)" | timechart avg(value) by type
Then change to the "Visualization" tab and take a look. You'll probably want to change a few settings on the left, like change it to a line chart.
The reason you can't just display "value" by "type" is because timechart will adjust the time scale as you change how much time you are displaying. For instance, if you show the past 7 days, it'll default to a 1 day scale and so you want to display the average of the values that day.
Two more little items to note on the timechart: You can adjust how many divisions it uses by the "span=" parameter. For instance, timechart span=1h ...
will use 1 hour long buckets (assuming that doesn't create more than the default limit of 1000 items, IIRC). 1d for one day, and you can even do odd things like "span=15m" for buckets of 15 minutes each. The second thing you should know about timechart is that you can also plot more than one thing for each category. Like change "avg(value)" to "avg(value) max(value)" and it'll display both.
Oh, and one more tip, in the "format" menu in the chart, there's an option in the general section to "connect nulls" which will fill in gaps in the lines.
Note that timechart can do all sorts of different operations, like average, max, min, and so on. See links below for more on timechart.
*I'd recommend to add these to an EXTRACT- line in props.conf, but after you get this all working, I'd mark this answer as answered then create a new separate question on converting it to a props.conf setting.
So you really have two problems. The first is to parse your lines to get the numbers to mean something, the second is to then plot those.
To parse, you can do this right in the search*. The portion of the lines you want to parse are...
listing took this many miliseconds: 7
user took this many miliseconds: 5
Where "listing" or "user" could be parsed as, let's call it "type". Then the 7 or 5 we'll parse as "value". You can easily change the names in the below.
In your search, you'll have whatever you need that returns the above rows in Splunk at the front (the "...") then after that you'll put a pipe and a rex command, like so.
... | rex "(?<type>\w+) took this many miliseconds: (?<value>\d+)"
Once you do that, you should see in Splunk in the "Fields" section (on the left) two new fields, one called "type" and another called "value." Rex is powerful, but is often confusing at first. You'll see that we took the parts we wanted and told it to pull out two little pieces. \w+ means "a series of 'word' type characters" and the \d+ says "a series of one or more digits". All the "took this many miliseconds:" stuff in the middle will match exactly but won't do anything with it, it's just text it matches on.
NOTE this is not a terribly robust way to do this, but it should work fine for your needs. A better rex can be developed easily, but I thought it more important for it to be easy to understand for this example. As long as it works, we're good. 🙂
Now, to display value by type. After the command above, use the timechart command after it. It'll be something like...
... | rex "(?<type>\w+) took this many miliseconds: (?<value>\d+)" | timechart avg(value) by type
Then change to the "Visualization" tab and take a look. You'll probably want to change a few settings on the left, like change it to a line chart.
The reason you can't just display "value" by "type" is because timechart will adjust the time scale as you change how much time you are displaying. For instance, if you show the past 7 days, it'll default to a 1 day scale and so you want to display the average of the values that day.
Two more little items to note on the timechart: You can adjust how many divisions it uses by the "span=" parameter. For instance, timechart span=1h ...
will use 1 hour long buckets (assuming that doesn't create more than the default limit of 1000 items, IIRC). 1d for one day, and you can even do odd things like "span=15m" for buckets of 15 minutes each. The second thing you should know about timechart is that you can also plot more than one thing for each category. Like change "avg(value)" to "avg(value) max(value)" and it'll display both.
Oh, and one more tip, in the "format" menu in the chart, there's an option in the general section to "connect nulls" which will fill in gaps in the lines.
Note that timechart can do all sorts of different operations, like average, max, min, and so on. See links below for more on timechart.
*I'd recommend to add these to an EXTRACT- line in props.conf, but after you get this all working, I'd mark this answer as answered then create a new separate question on converting it to a props.conf setting.
right on the money
You need a regex to extract the field
try : this one that assumes that the field is prefixed with 1 or more spaces.
| regex "miliseconds:\s+(?<cache_ms>\d+)" | table _time cache_ms _raw
then create a timechart
It looks like I had to use rex as opposed to regex to fill in the on-the-fly cache_ms field - thx for throwing me in the right direction, this should work for now!
yes, rex, my bad.
Here is the example output but I didn't get the values in the field cache_ms, I added an additional \s+ as I think the raw output has 2 spaces, still not effect.
2015-08-07 14:05:25 [2015-08-07 14:05:25.890][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:05:25 [2015-08-07 14:05:25.890][INFO ] .net clearing cache request for user took this many miliseconds: 4
Awesome, definitely closer than I got, how can those values be graphed in a line chart, that shows two likes. One for user and another for listing:
2015-08-07 14:05:25 [2015-08-07 14:05:25.890][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:05:25 [2015-08-07 14:05:25.890][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:05:25 [2015-08-07 14:05:25.885][INFO ] .net clearing cache request for listing took this many miliseconds: 5
2015-08-07 14:05:25 [2015-08-07 14:05:25.885][INFO ] .net clearing cache request for listing took this many miliseconds: 5
2015-08-07 14:04:42 [2015-08-07 14:04:42.306][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:04:42 [2015-08-07 14:04:42.306][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:04:42 [2015-08-07 14:04:42.301][INFO ] .net clearing cache request for listing took this many miliseconds: 6
2015-08-07 14:04:42 [2015-08-07 14:04:42.301][INFO ] .net clearing cache request for listing took this many miliseconds: 6
2015-08-07 14:04:31 [2015-08-07 14:04:31.378][INFO ] .net clearing cache request for user took this many miliseconds: 4
2015-08-07 14:04:31 [2015-08-07 14:04:31.378][INFO ] .net clearing cache request for user took this many miliseconds: 4
I did notice that cache_ms did not have any values in the column