
Quickest Way to Search Large Dataset and Export Results

Communicator

I'm trying to find the quickest way to run a large search against a large dataset which will have a large set of results that I want exported to a large .csv file.

So far, running a search via the GUI against the few billion events I have seems to time out at some point after I've left for the day/lunch/seppuku. I've changed my server.conf to include a sessionTimeout value of 10080 (a week's worth of minutes) and I'm testing that now. I don't care if my Web session times out, as long as the search completes.

I've also tried running a CLI search which included a preview value of F, but the terminal still displayed results at about 100 at a time, and the search seems to be going SLOOOWWWWly.

Without dipping into large test runs, can anyone tell me what is the fastest approach to getting a .csv output (without any event limit imposed by Splunk) from my giant search?

Thanks!

Re: Quickest Way to Search Large Dataset and Export Results

Motivator

You can send the search to the background via the GUI. While the search is running, click the arrow icon that says "send to background". That should auto-save it until you come back to view it. Additionally, if it really is taking that long, you could save the search and schedule it to run at midnight when there is no load on the system.

I would be curious to know how many events you are exporting... the exporttool might be better in this scenario.

Re: Quickest Way to Search Large Dataset and Export Results

Communicator

Oh, the big blue arrow? I wish they'd make that more visible. Perhaps by putting a big blue arrow next to it...

I'm exporting about 100K events per day over a 90-day period.

Re: Quickest Way to Search Large Dataset and Export Results

Legend

CLI search will be very slow if you're trying to capture the results by piping. The outputcsv command will be much faster. Run it in the background, wait for it to be done, then fetch the file from $SPLUNK_HOME/var/run/splunk. Are you trying to export raw events or transformed search results, by the way? Also, what are you doing this export for? Would it be possible to do some (or much) of the processing in Splunk to avoid moving around this mass of data?
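To make the outputcsv suggestion concrete, here is a rough Python sketch that only builds the pieces involved: the search string with an outputcsv stage appended, a CLI invocation with previews off and no result cap, and the location where the file should appear. The base search, the file name big_export.csv, and the /opt/splunk install path are made-up placeholders. The -preview and -maxout flags are the standard CLI search parameters as I understand them, so check them against your version. The output directory is the one named in this thread; some releases write to a csv subdirectory beneath it.

```python
import shlex


def build_export_search(base_search: str, csv_name: str) -> str:
    """Append an outputcsv stage so results are written on the search head
    instead of being streamed back through the terminal."""
    return f"{base_search.strip()} | outputcsv {csv_name}"


def build_cli_command(splunk_home: str, search: str) -> list:
    """Build the argv for a CLI search with previews disabled and no cap
    on returned results (-maxout 0). Nothing is executed here."""
    return [
        f"{splunk_home}/bin/splunk", "search", search,
        "-preview", "false",
        "-maxout", "0",
    ]


def expected_csv_path(splunk_home: str, csv_name: str) -> str:
    """Directory as given in this thread; newer versions may use a
    'csv' subdirectory under var/run/splunk."""
    return f"{splunk_home}/var/run/splunk/{csv_name}"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    home = "/opt/splunk"
    spl = build_export_search("index=main earliest=-90d", "big_export.csv")
    print(shlex.join(build_cli_command(home, spl)))
    print(expected_csv_path(home, "big_export.csv"))
```

The printed command can be pasted into a shell (redirecting its stdout to /dev/null, and ideally run under nohup or in the background) so the terminal is not the bottleneck. The CSV is then collected from the printed path once the search finishes.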

Also, how large is the large export, approximately?
