Knowledge Management

How to save timecharts of a lot of fields?

Zero
Engager

I need to get timecharts of more than 100 fields from an index and save them back to Splunk. I also need to update these timecharts regularly (say, weekly), and I can't find a good way to do it.

By the way, ChatGPT suggested using a data model, but I couldn't get a more specific answer out of it.

Please help. Many thanks.


ITWhisperer
SplunkTrust

Please explain your use case (or get ChatGPT to do it 😀) as it is not clear what you are trying to do.

In the meantime, you can save the results of a search back to Splunk in a couple of ways: you could use a CSV store, for example with outputlookup; you could populate a summary index, for example with collect; or, if you simply want access to a set of results, you could use a scheduled report and retrieve the results with loadjob.
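A sketch of the first two options (the index, source, and lookup file names here are placeholders, not anything from your environment):

```
index=examplea source=exampleb
| timechart span=1w count by field1
| outputlookup field1_weekly.csv
```

or, writing to a summary index instead (the index must already exist):

```
index=examplea source=exampleb
| timechart span=1w count by field1
| collect index=my_summary
```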


Zero
Engager

Thanks for your reply.

What I'm trying to do is this: I use SPL like

```
index=examplea source=exampleb
| timechart span=1w count by field1
```

to get a timechart.

Now I need to get such a timechart for more than 100 fields and store them in some way.

I also need to update these timecharts with the latest week's data every week.

So I'm trying to find a way to do all of this in Splunk.

I hope that explains it clearly enough.


ITWhisperer
SplunkTrust

There are a couple of options on the timechart command that you don't appear to be using which might help:

| timechart span=1w useother=f limit=0 count by field1

You could create a report which just gets the previous week's results and schedule this to update a summary index (you will need to create the index first); that way, you can then search the summary index for whichever week you want.
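As a sketch, the scheduled report might look like this (the summary index name is a placeholder; `earliest=-1w@w latest=@w` pins the search to the previous complete week, and the report would be scheduled to run once a week, e.g. early Monday morning):

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w useother=f limit=0 count by field1
| collect index=my_summary
```

Alternatively, instead of an explicit collect, you can enable summary indexing on the saved report itself.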

You can set the retention period for the summary index to be larger than the retention period of your base data index so that you can still retrieve the results from further back in time.

Also, I did a presentation on idempotency of summary indexes back at BSides 22, which you might want to consider. The talk can still be found on the BSides SPL 2022 YouTube channel: Summary Index Idempotency


Zero
Engager

That's very helpful information. Thanks for your reply!

 

There is still a problem in my use case: I don't have just one field that needs to be processed in this way. I have more than 100 fields (field2, field3, ......). So I would need to set up more than 100 summary indices for them.

 

I'm not sure if there's a solution for this.


ITWhisperer
SplunkTrust
SplunkTrust

You can use the same summary index. If you put data into the summary index with a scheduled report, the name of the report is included as a field in the event so you can distinguish which report generated the events.
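For example (index and report names here are placeholders), an explicit collect lets you set the source yourself, and you can later filter the shared summary index back down to one report's events:

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w count by field2
| collect index=my_summary source="weekly_timechart_field2"
```

and then retrieve just those events with:

```
index=my_summary source="weekly_timechart_field2"
```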

OK, I guess your requirement to have 100 fields processed this way means that you have one report which has

| timechart span=1w count by field1

 and another with

| timechart span=1w count by field2

etc., rather than field1 having 100 different values (for which timechart or other chart/xyseries commands would create 100 columns)?


Zero
Engager
Yeah, that's right. Sorry for not describing it clearly enough...

ITWhisperer
SplunkTrust

OK, so set up 100 reports, one for each field, and schedule them to gather the previous week's data, and save the results to the summary index (ensuring that that week's data hasn't already been added to the index, through idempotency).

Having said that, timecharts are not always the best way to store the data, particularly if you want to do further processing on it, because the column names depend on the values in field1, for example. You might be better off storing the results of a stats command, or adding an untable command after the timechart, but this very much depends on what you want to do with the timechart data (which you haven't explained).
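As a sketch of the untable option (using the index/source names from earlier in the thread): untable turns the wide timechart, whose columns are named after the values of field1, into stable three-column rows:

```
index=examplea source=exampleb
| timechart span=1w useother=f limit=0 count by field1
| untable _time field1 count
```

Each output row is then (_time, field1 value, count), so the schema no longer depends on which values field1 happens to contain, and all 100 reports can share one shape.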


Zero
Engager

Thanks for your advice.

I actually need to do some machine learning work with the data.

It would work to generate the timechart data with SPL from the original data each time I need it, but the search job runs too slowly. So I thought maybe I could store these timecharts in Splunk.

Now I guess the better way is to store the data somewhere else.


ITWhisperer
SplunkTrust

Splunk runs a Splunk4Ninjas workshop on MLTK (EMEA Workshops). The way the workshop works is that the data is extracted, cleaned, and then saved to CSV. This gives you an easily loadable data set on which to repeatedly experiment to find the best settings for the models you are creating with MLTK. You can then apply these models to your full (cleaned) data.

Having said that, summary indexes work just as well as CSVs, only they are more "permanent", so it depends on your use case (as I said earlier) whether it makes more sense for you to store your results in 100 CSVs or 100 reports (in a summary index).
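As a sketch of the CSV route (the lookup file name is a placeholder), a weekly scheduled search can append the latest week's rows to a lookup, which you can then load directly for experimentation:

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w count by field1
| untable _time field1 count
| outputlookup append=true field1_weekly.csv
```

Note that append=true only adds rows; without it, outputlookup overwrites the file each run.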

HTH

Zero
Engager

Thanks a lot for your advice! I'll try to find these materials and think more about it.
