Splunk Search

Calculate the sum of multiple fields on different lines, identified using regex

karthik2146
Engager

I want to calculate the sum of multiple fields that occur on different lines in my logs.
I have logs like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4
mercedescar=10
suzukicar=10
tatatruck=5
bmwcar=2
nissantruck=15

I want a timechart with the sum of all cars and the sum of all trucks, so for the sample above my output should be car=36, truck=30.
I can do it with index="xxxx" sourcetype="web_stats" | timechart span=1d eval (sum(bmwcar)+sum(renaultcar)....etc), but this list is not fixed, because a new car can be logged at any time in the future.

So I am using the regexes (.*car) and (.*truck), but I am not able to sum all cars together and all trucks together.

index="xxxx" sourcetype="web_stats"  *car OR *truck | rex "(?<vehicle>(.*car=[\d]+) | (.*truck=[\d]+) )"  | table vehicle, _time | mvexpand vehicle | rex field=vehicle ".*=(?<cnt>(\d+))" | search cnt!=0 | timechart span=1d sum(cnt)

With the above query, I can get either the sum of all cars and trucks together, or cars and trucks in separate charts using separate queries. But I want cars and trucks in the same chart from a single query.
Could you suggest a way to do it?
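For reference, the expected totals can be checked outside Splunk. The following is only a minimal Python sketch of the grouping being asked for, not SPL; the names `LOG_LINES`, `PATTERN`, and `sum_by_type` are made up for illustration, and the data is the sample from the question.

```python
import re
from collections import defaultdict

# Sample lines copied from the question.
LOG_LINES = [
    "bmwcar=10", "bmwtruck=5", "nissantruck=5", "renaultcar=4",
    "mercedescar=10", "suzukicar=10", "tatatruck=5", "bmwcar=2",
    "nissantruck=15",
]

# A field name ending in "car" or "truck", followed by "=<digits>".
# The lazy \w+? leaves the suffix for the "type" group to capture.
PATTERN = re.compile(r"\w+?(?P<type>car|truck)=(?P<cnt>\d+)")

def sum_by_type(lines):
    """Add up the counts per vehicle type, whatever the make is."""
    totals = defaultdict(int)
    for line in lines:
        for m in PATTERN.finditer(line):
            totals[m.group("type")] += int(m.group("cnt"))
    return dict(totals)

print(sum_by_type(LOG_LINES))  # {'car': 36, 'truck': 30}
```

Because the pattern keys on the suffix rather than on a fixed list of field names, a make that shows up later is counted without changing the code.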


somesoni2
Revered Legend

Give this a try

 index="xxxx" sourcetype="web_stats"  *car OR *truck | rex "car=(?<car>\d+)" | rex "truck=(?<truck>\d+)" | timechart span=1d sum(car) as car sum(truck) as truck

karthik2146
Engager

It worked, but it selected only a few cars and trucks. I now understand why the regex was not working. My log is posted once every 15 minutes, and each log entry has multiple lines, like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4

after 15 minutes

bmwcar=5
bmwtruck=4
nissantruck=8
renaultcar=3

and so on, every 15 minutes.

The regex selects only the first car or truck in each entry, so it does not give me the correct sum. Could you let me know how the regex can be modified to select all trucks and cars in each log entry?


somesoni2
Revered Legend

Just add max_match=0 to both rex commands, like this. By default rex keeps only the first match in each event; max_match=0 makes it keep every match, so that should take care of it.

 index="xxxx" sourcetype="web_stats"  *car OR *truck | rex max_match=0 "car=(?<car>\d+)" | rex max_match=0 "truck=(?<truck>\d+)" | timechart span=1d sum(car) as car sum(truck) as truck
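
The difference between keeping the first match and keeping every match can be reproduced outside Splunk. This is a rough Python model rather than SPL: `re.search` stands in for rex without max_match, `re.findall` stands in for rex with max_match=0, and `EVENT`, `first_match_total`, and `all_matches_total` are hypothetical names. The event is the first multi-line entry from the post above.

```python
import re

# One multi-line log entry, as described in the thread.
EVENT = "bmwcar=10\nbmwtruck=5\nnissantruck=5\nrenaultcar=4"

CAR = re.compile(r"car=(\d+)")
TRUCK = re.compile(r"truck=(\d+)")

def first_match_total(pattern, event):
    """Use only the first match in the event."""
    m = pattern.search(event)
    return int(m.group(1)) if m else 0

def all_matches_total(pattern, event):
    """Use every match in the event and add the values up."""
    return sum(int(value) for value in pattern.findall(event))

print(first_match_total(CAR, EVENT), all_matches_total(CAR, EVENT))      # 10 14
print(first_match_total(TRUCK, EVENT), all_matches_total(TRUCK, EVENT))  # 5 10
```

With only the first match, renaultcar=4 and nissantruck=5 are dropped, which matches the undercount reported above.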

sundareshr
Legend

Try this

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | timechart span=1d sum(cnt) as total by type

OR

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | chart sum(cnt) as total over type by make
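
To see what the three capture groups pick up and what a type-by-make breakdown would contain, here is a small Python sketch under some assumptions: it is not SPL, it uses Python's `(?P<name>...)` spelling for named groups, it scans every match in the sample from the question, and `LOG_LINES`, `PATTERN`, and `pivot_type_by_make` are illustrative names only.

```python
import re
from collections import defaultdict

# Sample lines copied from the question.
LOG_LINES = [
    "bmwcar=10", "bmwtruck=5", "nissantruck=5", "renaultcar=4",
    "mercedescar=10", "suzukicar=10", "tatatruck=5", "bmwcar=2",
    "nissantruck=15",
]

# Greedy \w+ first swallows the whole field name, then backtracks
# until "car" or "truck" can match directly before the "=".
PATTERN = re.compile(r"(?P<make>\w+)(?P<type>car|truck)=(?P<cnt>\d+)")

def pivot_type_by_make(lines):
    """Build {type: {make: total}} from every match in every line."""
    table = defaultdict(lambda: defaultdict(int))
    for line in lines:
        for m in PATTERN.finditer(line):
            table[m.group("type")][m.group("make")] += int(m.group("cnt"))
    return {vtype: dict(makes) for vtype, makes in table.items()}

for vtype, makes in pivot_type_by_make(LOG_LINES).items():
    print(vtype, makes)
```

Summing each inner dictionary gives back the per-type totals (36 for car, 30 for truck).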

karthik2146
Engager

Thanks for the answer, but unfortunately it did not work. The regex selects only trucks, and only one brand.
When I ran
index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | table _time, make, type, cnt

I got only truck records, with make as bmw and their respective counts. I don't know what the problem is. Do you have any idea?


sundareshr
Legend

Try this

index="xxxx" sourcetype="web_stats" *car OR *truck | rex field=_raw max_match=0 "(?<make>\w+)(?<type>car|truck)=(?<cnt>\d+)" | eval z=mvzip(make, mvzip(type, cnt)) | mvexpand z | rex field=z "(?<make>\w+),(?<type>\w+),(?<cnt>\d+)" | stats sum(cnt) as total by type
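
The idea behind this query is: extract every match into parallel multivalue fields, zip them so each make/type/cnt triple stays together, expand to one row per triple, then aggregate. A loose Python analogy (not SPL; `extract_columns`, `expand_rows`, and `total_by_type` are hypothetical helpers, and the event is one of the multi-line entries described earlier in the thread):

```python
import re
from collections import Counter

# One multi-line log entry, as described in the thread.
EVENT = "bmwcar=10\nbmwtruck=5\nnissantruck=5\nrenaultcar=4"

PATTERN = re.compile(r"(?P<make>\w+)(?P<type>car|truck)=(?P<cnt>\d+)")

def extract_columns(event):
    """Collect all matches into three parallel lists (multivalue fields)."""
    makes, types, cnts = [], [], []
    for m in PATTERN.finditer(event):
        makes.append(m.group("make"))
        types.append(m.group("type"))
        cnts.append(m.group("cnt"))
    return makes, types, cnts

def expand_rows(makes, types, cnts):
    """Zip the parallel lists so each row keeps its own make, type, and count."""
    return list(zip(makes, types, cnts))

def total_by_type(rows):
    """Sum the counts per type across the expanded rows."""
    totals = Counter()
    for _make, vtype, cnt in rows:
        totals[vtype] += int(cnt)
    return dict(totals)

rows = expand_rows(*extract_columns(EVENT))
print(rows)
print(total_by_type(rows))  # {'car': 14, 'truck': 10}
```

Zipping before expanding is what keeps each count attached to its own type; expanding the three lists independently would lose that pairing.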

richgalloway
SplunkTrust

Try this.

index="xxxx" sourcetype="web_stats" *car OR *truck | rex "(?<vehicle>(.*car=[\d]+) | (.*truck=[\d]+) )"  | rex field=vehicle ".*(?<type>[^=]+)=(?<cnt>\d+)" | timechart span=1d sum(cnt) by type
---
If this reply helps you, Karma would be appreciated.

karthik2146
Engager

Thanks for the answer, but the regex selected only trucks of a specific make.


karthik2146
Engager

The regex selected only a few cars and trucks. I now understand why the regex was not working. My log is posted once every 15 minutes, and each log entry has multiple lines, like this:

bmwcar=10
bmwtruck=5
nissantruck=5
renaultcar=4

after 15 minutes

bmwcar=5
bmwtruck=4
nissantruck=8
renaultcar=3

and so on, every 15 minutes.

The regex selects only the first car or truck in each entry, so it does not give me the correct sum. Could you let me know how the regex can be modified to select all trucks and cars in each log entry?
