Knowledge Management

How to save timecharts of a lot of fields?

Zero
Engager

I need to get timecharts of more than 100 fields from an index and save them back to Splunk. I also need to update these timecharts regularly (say, weekly), and I can't find a good way to do it.

By the way, ChatGPT suggested using a data model, but I couldn't get a more specific answer out of it.

Please help. Many thanks.


ITWhisperer
SplunkTrust

Please explain your use case (or get ChatGPT to do it 😀) as it is not clear what you are trying to do.

In the meantime, you can save the results of a search back to Splunk in a couple of ways: you could use a CSV store, for example with outputlookup; you could populate a summary index, for example with collect; or, if you simply want access to a set of results, you could use a scheduled report and retrieve the results with loadjob.
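A sketch of the first two options (the index, source, and lookup file names here are placeholders, not anything from your environment):

```
index=examplea source=exampleb
| timechart span=1w count by field1
| outputlookup field1_weekly.csv
```

or, writing to a summary index instead (the index must already exist):

```
index=examplea source=exampleb
| timechart span=1w count by field1
| collect index=my_summary
```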


Zero
Engager

Thanks for your reply.

What I'm trying to do is this: I use SPL like

```
index=examplea source=exampleb
| timechart span=1w count by field1
```

to get a timechart.

Now I need to get such a timechart for more than 100 fields and store them in some way.

I also need to update these timecharts with the latest week's data every week.

So I'm trying to find a way to do all of this in Splunk.

I hope that explains it clearly enough.


ITWhisperer
SplunkTrust

There are a couple of options on the timechart command that you don't appear to be using which might help:

| timechart span=1w useother=f limit=0 count by field1

You could create a report which just gets the previous week's results and schedule this to update a summary index (you will need to create the index first); that way, you can then search the summary index for whichever week you want.
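As a sketch, the scheduled report might look like this (the summary index name is a placeholder; `earliest=-1w@w latest=@w` pins the search to the previous complete week, and the report would be scheduled to run once a week, e.g. early Monday morning):

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w useother=f limit=0 count by field1
| collect index=my_summary
```

Alternatively, instead of an explicit collect, you can enable summary indexing on the saved report itself.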

You can set the retention period for the summary index to be larger than the retention period of your base data index so that you can still retrieve the results from further back in time.

Also, I did a presentation on idempotency of summary indexes back at BSides 22, which you might want to consider. The talk can still be found on the BSides SPL 2022 YouTube channel: Summary Index Idempotency


Zero
Engager

That's very helpful information. Thanks for your reply!

 

There is still a problem in my use case: I don't have just one field that needs to be processed in this way. I have more than 100 fields (field2, field3, ......). So I would need to set up more than 100 summary indices for them.

 

I'm not sure if there's a solution for this.


ITWhisperer
SplunkTrust
SplunkTrust

You can use the same summary index. If you put data into the summary index with a scheduled report, the name of the report is included as a field in the event so you can distinguish which report generated the events.
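For example (index and report names here are placeholders), an explicit collect lets you set the source yourself, and you can later filter the shared summary index back down to one report's events:

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w count by field2
| collect index=my_summary source="weekly_timechart_field2"
```

and then retrieve just those events with:

```
index=my_summary source="weekly_timechart_field2"
```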

OK, I guess your requirement to have 100 fields processed this way means that you have one report which has

| timechart span=1w count by field1

 and another with

| timechart span=1w count by field2

etc., rather than field1 having 100 different values (for which timechart or other chart/xyseries commands would create 100 columns)?


Zero
Engager
Yeah, that's right. Sorry for not describing it clearly enough...

ITWhisperer
SplunkTrust

OK, so set up 100 reports, one for each field, and schedule them to gather the previous week's data, and save the results to the summary index (ensuring that that week's data hasn't already been added to the index, through idempotency).

Having said that, timecharts are not always the best way to store the data, particularly if you want to do further processing on it, because the column names depend on the values in field1, for example. You might be better off storing the results of a stats command, or adding an untable command after the timechart, but this very much depends on what you want to do with the timechart data (which you haven't explained).
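As a sketch of the untable option (using the index/source names from earlier in the thread): untable turns the wide timechart, whose columns are named after the values of field1, into stable three-column rows:

```
index=examplea source=exampleb
| timechart span=1w useother=f limit=0 count by field1
| untable _time field1 count
```

Each output row is then (_time, field1 value, count), so the schema no longer depends on which values field1 happens to contain, and all 100 reports can share one shape.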


Zero
Engager

Thanks for your advice.

I actually need to do some machine learning work with the data.

It would work to generate the timechart data with SPL from the original data each time I need it, but the search job runs too slowly. So I thought maybe I could store these timecharts in Splunk.

Now I guess the better way is to store the data somewhere else.


ITWhisperer
SplunkTrust

Splunk runs a Splunk4Ninjas workshop on MLTK (EMEA Workshops). The way the workshop works is that the data is extracted, cleaned, and then saved to CSV. This gives you an easily loadable data set on which to repeatedly experiment to find the best settings for the models you are creating with MLTK. You can then apply these models to your full (cleaned) data.

Having said that, summary indexes work just as well as CSVs, only they are more "permanent", so it depends on your use case (as I said earlier) whether it makes more sense for you to store your results in 100 CSVs or 100 reports (in a summary index).
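As a sketch of the CSV route (the lookup file name is a placeholder), a weekly scheduled search can append the latest week's rows to a lookup, which you can then load directly for experimentation:

```
index=examplea source=exampleb earliest=-1w@w latest=@w
| timechart span=1w count by field1
| untable _time field1 count
| outputlookup append=true field1_weekly.csv
```

Note that append=true only adds rows; without it, outputlookup overwrites the file each run.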

HTH

Zero
Engager

Thanks a lot for your advice! I'll try to find these materials and think more about it.
