Splunk Search

Calculate the sum of multiple fields on different lines, identified using regex

karthik2146
Engager

I want to calculate the sum of multiple fields that occur on different lines in my logs.
My logs look like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4
mercedescar=10
suzukicar=10
tatatruck=5
bmwcar=2
nissantruck=15

I want a timechart with the sum of all cars and the sum of all trucks, so for the sample above my output should be car=36, truck=30.
I could do it with index="xxxx" sourcetype="web_stats" | timechart span=1d eval (sum(bmwcar)+sum(renaultcar)....etc), but the list is not fixed, because a new car can be logged at any time in the future.

So I am using the regexes (.*car) and (.*truck), but I am not able to sum all the cars together and all the trucks together.

index="xxxx" sourcetype="web_stats"  *car OR *truck | rex "(?<vehicle>(.*car=[\d]+) | (.*truck=[\d]+) )"  | table vehicle, _time | mvexpand vehicle | rex field=vehicle ".*=(?<cnt>(\d+))" | search cnt!=0 | timechart span=1d sum(cnt)

With the above query, I can get either the sum of all cars and trucks together, or cars and trucks in separate charts using separate queries. But I want cars and trucks in the same chart from a single query.
Could you suggest a way to do it?
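
For reference, the totals asked for above can be checked outside Splunk. This is a minimal Python sketch, not SPL: it assumes the sample lines from the question are available as plain text, and the names SAMPLE and totals_by_suffix are made up for illustration. It groups every key=value pair by its car or truck suffix and sums the counts.

```python
import re
from collections import defaultdict

# Sample log lines copied from the question.
SAMPLE = """\
bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4
mercedescar=10
suzukicar=10
tatatruck=5
bmwcar=2
nissantruck=15
"""


def totals_by_suffix(text):
    """Sum the counts of all fields ending in 'car' or 'truck'."""
    totals = defaultdict(int)
    # Each hit is a (suffix, count) pair; the make in front is ignored.
    for kind, count in re.findall(r"(car|truck)=(\d+)", text):
        totals[kind] += int(count)
    return dict(totals)


print(totals_by_suffix(SAMPLE))
```

For the sample in the question this gives car=36 and truck=30, which matches the expected output stated above.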

1 Solution

somesoni2
Revered Legend

Give this a try

 index="xxxx" sourcetype="web_stats"  *car OR *truck | rex "car=(?<car>\d+)" | rex "truck=(?<truck>\d+)" | timechart span=1d sum(car) as car sum(truck) as truck

karthik2146
Engager

It worked, but it picked up only a few of the cars and trucks. I now understand why the regex was not working. My log is actually posted once every 15 minutes, and each log entry has multiple lines, like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4

and then, 15 minutes later:

bmwcar=5
bmwtruck=4
nissantruck=8
renaultcar=3

and so on, every 15 minutes.

The regex selects only the first car or truck in each entry, so it does not give me the correct sum. Could you let me know how the regex can be modified to select all trucks and cars in each log entry?


somesoni2
Revered Legend

Just add max_match=0 to both rex commands, like this; that should take care of it.

 index="xxxx" sourcetype="web_stats"  *car OR *truck | rex max_match=0 "car=(?<car>\d+)" | rex max_match=0 "truck=(?<truck>\d+)" | timechart span=1d sum(car) as car sum(truck) as truck
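
The effect of max_match=0 can be illustrated outside Splunk. By default rex keeps only the first match in each event (max_match=1), while max_match=0 keeps every match as a multivalue field. A rough Python analogue, assuming one multi-line event from this thread held in a hypothetical EVENT string, is re.search versus re.findall. Note that Python spells named groups (?P<name>...) rather than (?<name>...), and the function names here are illustrative only.

```python
import re

# One multi-line log entry, as described earlier in the thread.
EVENT = "bmwcar=10\nbmwtruck=5\nnissantruck=5\nrenaultcar=4\n"

CAR = re.compile(r"car=(?P<car>\d+)")


def first_match_sum(event):
    """Model of a single-match extraction: only the first hit is kept."""
    match = CAR.search(event)
    return int(match.group("car")) if match else 0


def all_matches_sum(event):
    """Model of an extract-all extraction: every hit contributes."""
    return sum(int(value) for value in CAR.findall(event))


print(first_match_sum(EVENT))  # 10
print(all_matches_sum(EVENT))  # 14
```

With a single match only bmwcar=10 is counted; with all matches renaultcar=4 is included as well, which is the kind of undercount described in the follow-up above.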

sundareshr
Legend

Try this

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | timechart span=1d sum(cnt) as total by type

OR

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | chart sum(cnt) as total over type by make

karthik2146
Engager

Thanks for the answer, but sadly it did not work. The regex selects only trucks, and only one brand at that.
When I ran
index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | table _time, make, type, cnt

I get only truck records, with make as bmw, and their respective counts. I don't know what the problem is; do you have any idea?


sundareshr
Legend

Try this

index="xxxx" sourcetype="web_stats" *car OR *truck | rex field=_raw max_match=0 "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | eval z=mvzip(make, mvzip(type, cnt)) | mvexpand z | rex field=z "(?<make>\w+),(?<type>\w+),(?<cnt>\d+)" | stats sum(cnt) as total by type
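
What this pipeline does can be sketched in Python, as a model of the idea rather than a Splunk implementation. The sketch assumes the two multi-line events quoted earlier in the thread; EVENTS, expand_event and total_by_type are hypothetical names. Extracting all matches gives parallel lists of make, type and cnt (the max_match=0 step), zipping them keeps each triple together (the mvzip step), and splitting the zipped values back into rows (the mvexpand and second rex steps) lets the counts be summed by type.

```python
import re
from collections import Counter

# Two consecutive log entries, as described earlier in the thread.
EVENTS = [
    "bmwcar=10\nbmwtruck=5\nnissantruck=5\nrenaultcar=4\n",
    "bmwcar=5\nbmwtruck=4\nnissantruck=8\nrenaultcar=3\n",
]

PATTERN = re.compile(r"(?P<make>\w+)(?P<type>car|truck)=(?P<cnt>\d+)")


def expand_event(event):
    """Turn one multi-line event into one (make, type, cnt) row per field."""
    matches = list(PATTERN.finditer(event))
    # Parallel lists, like multivalue fields after an extract-all regex.
    makes = [m.group("make") for m in matches]
    types = [m.group("type") for m in matches]
    cnts = [m.group("cnt") for m in matches]
    # Zip analogue: one comma-joined string per position.
    zipped = [",".join(triple) for triple in zip(makes, types, cnts)]
    # Expand analogue: one row per zipped value, split back into fields.
    return [tuple(value.split(",")) for value in zipped]


def total_by_type(events):
    """Sum cnt per type across all events."""
    totals = Counter()
    for event in events:
        for _make, kind, cnt in expand_event(event):
            totals[kind] += int(cnt)
    return dict(totals)


print(total_by_type(EVENTS))
```

Zipping before expanding matters because expanding the three lists independently would lose the pairing between each count and its type.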

richgalloway
SplunkTrust

Try this.

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<vehicle>(.*car=[\d]+) | (.*truck=[\d]+) )"  | rex field=vehicle ".*(?<type>[^=]+)=(?<cnt>\d+)" | timechart span=1d sum(cnt) by type

karthik2146
Engager

Thanks for the answer, but the regex selected only trucks of a specific make.


karthik2146
Engager

The regex picked up only a few of the cars and trucks. I now understand why the regex was not working. My log is actually posted once every 15 minutes, and each log entry has multiple lines, like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4

and then, 15 minutes later:

bmwcar=5
bmwtruck=4
nissantruck=8
renaultcar=3

and so on, every 15 minutes.

The regex selects only the first car or truck in each entry, so it does not give me the correct sum. Could you let me know how the regex can be modified to select all trucks and cars in each log entry?
