Reporting

How do you define a custom log in Splunk?

garumaru
Explorer

Hello,

I have a custom log from a monitoring tool, and the output looks like:

Disk_Space Days Path
10G 4days /path/of/data/userA
20G 5days /path/of/data/userA/folderA
10G 4days /path/of/data/userB
20G 5days /path/of/data/userB/folderA
20G 5days /path/of/data/userB/folderB
20G 10days /path/of/data/userA/folderB/subfolder_a
.....

Is it possible to filter the data (for example, entries over 5 days and over 10G) and send an email to userA, userB, and so on?

Or do I have to rewrite this log to some other format like JSON?

Thank you very much!

1 Solution

renjith_nair
Legend

@garumaru,

It's possible to work with this format, but we need to consider a few factors before arriving at the final solution.

Assuming Disk_Space, Days, and Path are already extracted fields in Splunk, we still need to extract the numeric parts to do numerical comparisons.

For example, to sort on Disk_Space, the digits should be extracted, which can be done with a regex: rex field=Disk_Space "(?<DU>\d+)".

But then, what if the disk space usage for some mounts is in TB/MB? In that case, the unit needs to be checked with a condition and converted to GB before doing any comparison.
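A sketch of that check (assuming the unit string is captured alongside the digits and that units appear as TB/GB/MB; adjust the capture if your tool writes T/G/M):

    | rex field=Disk_Space "(?<DU>\d+)(?<UNIT>[A-Za-z]+)"
    | eval DU=case(UNIT=="TB", DU*1024, UNIT=="MB", round(DU/1024,2), true(), DU)

The case() default via true() leaves GB values untouched, so DU always ends up in GB.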

Similarly for Days: the number can be extracted with "(?<DAY>\d+)". But what if it changes to 1month after 30/31 days?
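If that ever happens, a hedged sketch of the conversion (assuming a month is approximated as 30 days, and that the unit suffix is "days"/"months"):

    | rex field=Days "(?<DAY>\d+)(?<DUNIT>[a-z]+)"
    | eval DAY=if(like(DUNIT,"month%"), DAY*30, DAY)

This keeps DAY in days, so the later where-clause comparison stays uniform.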

So with your sample data, the fields have to be extracted and compared, and then an alert can be sent.

Please find a sample solution based on dummy data.

index=_* earliest=-5m|stats count by source| eval Days=1|accum Days|eval Days=Days."Days"
|eval Disk_Space=if(count > 1000,round((count/1024))."TB",count."GB")

Result

source                                                           count      Days    Disk_Space
C:\Program Files\Splunk\var\log\introspection\disk_objects.log      13      1Days   13GB
C:\Program Files\Splunk\var\log\introspection\kvstore.log           16      2Days   16GB
C:\Program Files\Splunk\var\log\introspection\resource_usage.log    191     3Days   191GB
C:\Program Files\Splunk\var\log\splunk\health.log                   80      4Days   80GB
C:\Program Files\Splunk\var\log\splunk\metrics.log                 853      5Days   853GB
C:\Program Files\Splunk\var\log\splunk\splunkd_access.log           9       6Days   9GB
C:\Program Files\Splunk\var\log\splunk\splunkd_ui_access.log        275     7Days   275GB

Now extract the fields and perform the unit (TB->GB) comparison and conversion (if needed):

index=_* earliest=-5m|stats count by source| eval Days=1|accum Days|eval Days=Days."Days"
|eval Disk_Space=if(count > 1000,round((count/1024))."TB",count."GB")
|rex field=source "C:\\\\Program Files\\\\Splunk\\\\var\\\\log\\\\(?<USER>\w+)\\\\"
|rex field=Disk_Space "(?<DU>\d+)(?<UNIT>\w+)"|rex field=Days "(?<DAY>\d+)"
|eval DU=if(UNIT=="TB",DU*1024,DU)

Finally, apply the filter and send mail to the users:

    index=_* earliest=-5m|stats count by source| eval Days=1|accum Days|eval Days=Days."Days"
    |eval Disk_Space=if(count > 1000,round((count/1024))."TB",count."GB")
    |rex field=source "C:\\\\Program Files\\\\Splunk\\\\var\\\\log\\\\(?<USER>\w+)\\\\"
    |rex field=Disk_Space "(?<DU>\d+)(?<UNIT>\w+)"|rex field=Days "(?<DAY>\d+)"
    |eval DU=if(UNIT=="TB",DU*1024,DU)
    | where DAY > 5 AND DU > 10|sendmail to=USER@mydomain.com
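One caveat: `sendmail to=USER@mydomain.com` is taken literally and will not expand the USER field per result. A sketch of per-user delivery (assuming mail is configured on the Splunk server; `maxsearches`, the subject, and the domain are placeholders) replaces the last line with a `map` fan-out, which substitutes `$USER$` from each row:

    | where DAY > 5 AND DU > 10
    | stats values(Path) as Paths by USER
    | map maxsearches=20 search="| makeresults | sendmail to=$USER$@mydomain.com subject=DiskUsageAlert"

Note that each inner search here sends only a trivial makeresults row; passing the offending Paths into the mail body would need them forwarded as additional `$...$` tokens.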

If you have control over the content of the log file, I suggest handling the unit conversion/data format before pushing to Splunk. The most commonly used and recommended format is key=value, which Splunk understands without any extra configuration.
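For example, the monitoring tool could emit each line like this (a hypothetical layout; the field names are up to you):

    disk_space=10 unit=GB days=4 path=/path/of/data/userA
    disk_space=20 unit=GB days=5 path=/path/of/data/userA/folderA

Splunk's automatic key=value extraction would then produce these fields at search time without any rex.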

---
What goes around comes around. If it helps, hit it with Karma 🙂


garumaru
Explorer

@renjith.nair, thanks for your reply. For Disk_Space and Days, they will be in GB and days rather than TB and months.

My sample log might be different from yours; my bad for not explaining it clearly at first.

There is only one monitoring log file, like /log/usage.log, and in this log file it has content like:

Disk_Space  Days   Path
10G        4days /path/of/data/userA
20G        5days /path/of/data/userA/folderA
10G        4days /path/of/data/userB
20G        5days /path/of/data/userB/folderA
20G        5days /path/of/data/userB/folderB
20G       10days /path/of/data/userA/folderB/subfolder_a
30G       40days /path/of/data/userA/folderB/subfolder_a
.....

So using the way you provided, I get the result below:

source            count    Days     Disk_Space
/log/usage.log        2    1Days    2GB

I think there must be something that still needs to be fixed in my log file. Would you please share more ideas?
Thank you!


renjith_nair
Legend

@garumaru,
The first line of the SPL just generates dummy data similar to yours, so you don't need to worry about it. Since you are always using GB and days, you may start from:

     "your existing search to get Disk_Space  Days   Path fields"
     |rex field=source "\/path\/of\/data\/(?<USER>\w+)\/"
     |rex field=Disk_Space "(?<DU>\d+)"|rex field=Days "(?<DAY>\d+)"
     | where DAY > 5 AND DU > 10