How to improve the speed of Splunk search

qazwsxe
New Member

I want to retrieve hundreds of millions of events from billions of events, but each search takes more than an hour.
I am using the simplest possible search, index="test" name=jack, and it is still very slow.

Then I checked memory and CPU usage. Each search uses only 200-300 MB of memory.
So I changed the max_mem_usage_mb, search_process_memory_usage_percentage_threshold and search_process_memory_usage_threshold parameters in $SPLUNK_HOME/etc/apps/search/local/limits.conf, but the changes made no noticeable difference.
Is there an effective way to speed up my search?
Thanks! 🙂

0 Karma
1 Solution

martin_mueller
SplunkTrust

You'll be much faster in finding Jack's company if you also specify how to find a company in your search. What that looks like depends on your data which you didn't share with us - knowing your data would help.

That could look like one of these:

index=foo sourcetype=company_register name=jack
index=foo category=employees name=jack
etc.

If you have an accelerated datamodel, it could look like this:

| tstats summariesonly=t values(company) as companies from datamodel=your_model where your_model.name=jack

To chain that, you could build a dashboard with in-page drilldowns that step through the tree you expect in your data.

0 Karma

woodcock
Esteemed Legend

If you need a generic summary of your millions of events, then try fieldsummary:

index=<YouShouldAlwaysSpecifyAnIndex> AND sourcetype=<AndSourcetypeToo> AND name="jack" | fieldsummary

0 Karma

martin_mueller
SplunkTrust

What do you need raw events for? Put all relevant fields into the data model and go from there.

0 Karma

qazwsxe
New Member

Another problem: when I specify which fields to return, the search on large amounts of data is slower than the plain search.

0 Karma

martin_mueller
SplunkTrust

I misplaced my crystal ball; this would be so much easier if you included your searches and data samples/description.

0 Karma

qazwsxe
New Member

The search I use is | pivot datamodel dataset SPLITROW name as new_name FILTER name is jack. It is slower than index="test" name=jack, and CPU and memory usage rise sharply while it runs.

0 Karma

qazwsxe
New Member

Sorry, there was a mistake. I know what to do now, thanks!

0 Karma

qazwsxe
New Member

I want to use the data model to speed up the search, but I need the search to return the events themselves rather than statistics. I added some of the fields to the data model, but | tstats summariesonly=t values(company) as companies from datamodel=your_model where your_model.name=jack does not return any event results.

0 Karma

qazwsxe
New Member

Thanks, it's very fast!
But I don't need statistics, I want to return the event results. What should I do?

0 Karma

woodcock
Esteemed Legend

A single indexer like you have in your AllInOne configuration cannot efficiently handle billions of events by itself. You need many more indexers so that the main power of Splunk (Parallel Map and Reduce) can be unleashed.

0 Karma

martin_mueller
SplunkTrust

There is no reduce in your search.

You're not going to get good advice without describing your use case.
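
To make the map/reduce point concrete, here is a rough Python sketch (not Splunk code). The shard layout, field names and functions are all made up for illustration. It contrasts a search that only filters, where every matching raw event has to travel back to the search head, with one that aggregates on each indexer and merges small summaries.

```python
from collections import Counter

# Hypothetical event shards, one list per indexer (illustrative data only).
SHARDS = [
    [{"name": "jack", "company": "acme"}, {"name": "jill", "company": "acme"}],
    [{"name": "jack", "company": "globex"}, {"name": "jack", "company": "acme"}],
    [{"name": "bob", "company": "initech"}],
]


def map_filter_only(shard, name):
    """Map step with no reduce: every matching raw event is shipped back."""
    return [event for event in shard if event["name"] == name]


def map_with_partial_stats(shard, name):
    """Map step that pre-aggregates: only a small per-shard summary is shipped back."""
    return Counter(event["company"] for event in shard if event["name"] == name)


def search_raw(shards, name):
    """The search head collects every matching event from every shard."""
    results = []
    for shard in shards:
        results.extend(map_filter_only(shard, name))
    return results


def search_stats(shards, name):
    """The search head merges the per-shard summaries (the reduce step)."""
    total = Counter()
    for shard in shards:
        total.update(map_with_partial_stats(shard, name))
    return dict(total)
```

With the filter-only variant, the amount of data sent back grows with the number of matching events, so extra indexers mostly move the bottleneck. With the aggregating variant, each shard returns one small summary regardless of how many events matched.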

0 Karma

qazwsxe
New Member

I just want to get a lot of data in a short time. I loaded hundreds of millions of events onto one indexer. I am using the simplest possible search, index="test" name=jack, and it is very slow.
Then I tried to build a datamodel, but I can only generate a pivot table from it. I want to generate search results.
What commands do I need to speed up my search through the datamodel?
Thanks! 🙂

0 Karma

martin_mueller
SplunkTrust

"Getting data" is not a use case.

qazwsxe
New Member

Sorry, I don't know what you mean...

0 Karma

martin_mueller
SplunkTrust

There's no value in listing millions of events on screen, which is what your current search does.

Describe what you actually want to achieve instead of just trolling.

0 Karma

qazwsxe
New Member

Sure, I know there is no value in listing millions of events on screen. I want to use a keyword search to pull the required data out of billions of events, and the results may be only hundreds or thousands. But the data set is usually so large that the search always becomes very slow.

0 Karma

martin_mueller
SplunkTrust

If you really just want to list events on screen, append | head 1000 to your search. Nobody's meant to page past 1000 events.
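
As a rough illustration of why a result limit can help, here is a small Python sketch (not Splunk code). It assumes the search can stop scanning once enough matches have been collected. The event list, field names and helper functions are hypothetical.

```python
from itertools import islice


def matching_events(events, name, scanned):
    """Lazily yield matching events, counting how many events were looked at."""
    for event in events:
        scanned[0] += 1
        if event["name"] == name:
            yield event


def first_n(events, name, limit):
    """Return the first `limit` matches and the number of events scanned to find them."""
    scanned = [0]
    hits = list(islice(matching_events(events, name, scanned), limit))
    return hits, scanned[0]


# Illustrative data: every second event belongs to jack.
EVENTS = [{"name": "jack" if i % 2 == 0 else "jill", "id": i} for i in range(100_000)]

hits, scanned = first_n(EVENTS, "jack", 1000)
print(len(hits), "hits after scanning", scanned, "of", len(EVENTS), "events")
```

Only as many events are scanned as it takes to find the first `limit` matches; the rest of the list is never touched.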

0 Karma

qazwsxe
New Member

I tried to build a datamodel. It's very fast, but I can only generate a pivot table from it. I want to generate search results.
What commands do I need to speed up my search through the datamodel?

0 Karma

qazwsxe
New Member

OK, maybe that's an effective way to do it. Thanks!
What I actually want is to call the search interface and search recursively. For example, the first keyword is name=jack. The first search returns related information such as mailbox, company, etc. Then I search again using the company, mailbox and so on as keywords, and repeat until I have found all the information associated with name=jack.
Is there a good way to optimize this search algorithm, or does Splunk have its own recursive search command?
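
For reference, the recursive expansion described here can be sketched on the client side as a breadth-first traversal. This is a minimal Python sketch under stated assumptions: `run_search` is a hypothetical stand-in for whatever call actually runs a search, and the events and field names are invented sample data.

```python
from collections import deque

# Hypothetical in-memory stand-in for indexed events.
EVENTS = [
    {"name": "jack", "company": "acme", "mailbox": "jack@acme.example"},
    {"name": "jill", "company": "acme", "mailbox": "jill@acme.example"},
    {"name": "jack", "company": "globex", "mailbox": "jack@globex.example"},
    {"name": "bob", "company": "initech", "mailbox": "bob@initech.example"},
]


def run_search(field, value):
    """Hypothetical search call: return all events where field == value."""
    return [event for event in EVENTS if event.get(field) == value]


def expand(seed_field, seed_value,
           follow_fields=("name", "company", "mailbox"), max_searches=100):
    """Breadth-first expansion: search each (field, value) pair once and
    queue every new value seen in the fields we choose to follow."""
    seen = {(seed_field, seed_value)}
    queue = deque(seen)
    found = []
    searches = 0
    while queue and searches < max_searches:
        field, value = queue.popleft()
        searches += 1
        for event in run_search(field, value):
            if event not in found:
                found.append(event)
            for follow in follow_fields:
                key = (follow, event.get(follow))
                if key[1] is not None and key not in seen:
                    seen.add(key)
                    queue.append(key)
    return found


related = expand("name", "jack")
```

The `seen` set keeps the same keyword from being searched twice, and `max_searches` caps the work, since following a broad field such as company can pull in a large share of the data.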

0 Karma

qazwsxe
New Member

I added more indexers and switched to distributed search. The test data is spread across three machines, each running five indexers, with 20,000,000 events per indexer. But search speed has not improved. How can I improve it?

0 Karma