Splunk Search

How do you create a summary chart of usage and cost from imported data using its fields?

tmblue
Engager

I'm a bit over my head, so I'm going to dive in and ask. I've searched the forums, and the tubes, and there are some partial examples, but I'm just not grasping them (full honesty!)

I'm importing data with the following fields:

Fields (explanations added; they're not included in the data):
Date (Y-m-d format), Account (account number), Service (what service is being used), Amount (what it cost for that day).

I need to do a few things: a running daily snapshot of the data, so I can pull up a single pretty graph that shows each service's cost on that day per account, as well as a total for all services per account, and also a total across all accounts.

I also need to be able to sum the data, e.g.: for 30 days, show me each service and how much it cost over those 30 days, per account and in total. (I figure if I get this, I can work out how to show 60/90/120 days, 6 months, etc.)

The data is in Splunk, and I forced _time to come from the Date in the file, but Splunk is still insisting on adding a random time of day. That's fine; the date is correct, and that's what I need to report on. There is no real time in my data, and since I was backfilling, it was kind of important that _time reflect the Date field rather than the index date. (It makes it easier to search using the built-in _time.) (Right or wrong, again, I'm out of my zone here 🙂
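In case it helps anyone with the same timestamp problem: for a CSV-style input, props.conf can be pointed at the Date column so _time is taken from it at index time. This is only a sketch - the sourcetype name is a made-up placeholder, and MAX_DAYS_AGO matters only because backfilled dates older than the default cutoff would otherwise be rejected.

[billing_csv]
INDEXED_EXTRACTIONS = csv
TIMESTAMP_FIELDS = Date
TIME_FORMAT = %Y-%m-%d
MAX_DAYS_AGO = 3650

This only affects data indexed after the change; already-indexed events keep their original _time.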

Thanks!

1 Solution

DalJeanis
Legend

Okay, it seems like you forgot to ask a specific question, so let's make some general observations and suggest an approach for each.

First, you say the date is correct but the _time is not. Put any one of these lines near the beginning of your search, and it will set _time for every record to midnight of that record's date.

| eval _time = relative_time(_time,"@d")

OR

| eval _time=86400*floor(_time/86400)

OR

| bin _time span=1d

Any one of the above accomplishes essentially the same thing: throwing away the hours-and-minutes portion of _time and setting it to the start of the day. (One caveat: the floor() version truncates to midnight UTC, while relative_time with "@d" and bin with span=1d snap to midnight in your local timezone, so results can differ if you are not in UTC.)
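If _time is wrong in more than just the time-of-day portion, it can also be rebuilt from the Date field directly (assuming Date really is in Y-m-d form):

| eval _time = strptime(Date, "%Y-%m-%d")

strptime() returns epoch seconds, so the result behaves exactly like a normal _time for binning, stats, and timechart.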


Second: unless you have a LOT of data, you should not think in terms of running daily snapshots. Think in terms of finding the events you want, then adding up those events at the levels at which you want the data summarized. Here's the pseudocode -

your search that finds all the events you want over the time period you want
| your search lines that calculate the values of any required calculated fields (such as extracting the account number)
| bin _time  span=1d
| stats sum(Amount) as TotalAmount by Account Service _time

The above gives you one record for each combination of Account, Service, and date. Once that is done, you can treat each of those records as input to the next section of your code, the part that calculates how to present them.
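As one concrete instance of that pattern for the 30-day question (the index name here is a placeholder), addcoltotals can append an all-accounts total row:

index=billing earliest=-30d@d latest=@d
| stats sum(Amount) as TotalAmount by Account Service
| addcoltotals TotalAmount labelfield=Account label="ALL ACCOUNTS"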


For example, if you want to add up all the daily amounts for each month per account (regardless of service), then you could add two more lines after the above...

| bin _time span=1mon
| stats sum(TotalAmount) as MonthlyTotal by Account _time

Or, if that was all you needed, then the final result for all of the above could be accomplished with a simple time chart...

your search that finds all the events you want over the time period you want
| your search lines that calculate the values of any required calculated fields (such as extracting the account number)
| timechart span=1mon sum(Amount) by Account 

On the other hand, if you wanted to add up all the daily sales amounts for each service, regardless of which account they were on, along with the number of accounts that received that service, then you could do this instead...

your search that finds all the events you want over the time period you want
| your search lines that calculate the values of any required calculated fields (such as extracting the account number)
| bin _time  span=1d
| stats sum(Amount) as TotalAmount dc(Account) as NumberOfAccounts by Service _time

If you want to process the same data multiple ways, then you should probably create a dashboard with a base search that calculates the records as in the original search above, and then post-process those records in multiple different ways - for instance, to calculate the average daily amount for each service, and separately the average daily amount for each account.
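A minimal Simple XML sketch of that base-search pattern (the index name, panel title, and time range are placeholders): define the daily-records search once with an id, then let each panel post-process it.

<form>
  <search id="daily_base">
    <query>index=billing | bin _time span=1d | stats sum(Amount) as TotalAmount by Account Service _time</query>
    <earliest>-30d@d</earliest>
    <latest>now</latest>
  </search>
  <row>
    <panel>
      <title>Average daily amount per service</title>
      <chart>
        <search base="daily_base">
          <query>stats avg(TotalAmount) as AvgDaily by Service</query>
        </search>
      </chart>
    </panel>
  </row>
</form>

The base search runs once, and every panel that references it shares the results, which keeps the dashboard cheap.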

There are literally hundreds of reports that you could get from very similar code. You just need to define exactly what the report needs to look like, and what the underlying data looks like, and then we can help you get the code working.

Feel free to join the Splunk Slack channel as well; it's really great for brief questions.


tmblue
Engager

Thank you so much! My head is spinning, so I'm going to take a break. I did get some basic data: I got it to compute the total cost of all 3 accounts for any time frame specified (week, month, etc.), and the data is right 🙂 yay.

I'm trying to see if there is an easy way to map the account ID to something more easily identified (for others using the report; I know the numbers, but it's best to give them names). Then I'm attempting to see if there is a way to create a graph with the 3 accounts listed and how much each is using of each resource. I can see that broken out in the Statistics tab, but can't figure out how to visualize it. I also need to figure out how to change Amount to a two-decimal US dollar value instead of a sometimes pretty long number 🙂

But thank you, I definitely made some progress. Now time for some Advil 🙂
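For those two follow-ups, a sketch (the account numbers and names below are invented): case() can map IDs to names inline, and fieldformat changes only the display of Amount, so sums still compute on the raw number.

| eval AccountName = case(Account=="1001","Team A", Account=="1002","Team B", Account=="1003","Team C", true(), Account)
| fieldformat Amount = "$" . tostring(round(Amount, 2), "commas")

A CSV lookup scales better than case(): upload an account_names.csv with columns Account and AccountName, and the eval line becomes | lookup account_names.csv Account OUTPUT AccountName. For the per-account-per-service graph, | chart sum(Amount) by AccountName Service renders as a grouped column chart on the Visualization tab.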
