Getting Data In

What are my options to export large amounts of Splunk data from UI, CLI or using REST?

sat94541
Communicator

I have several gigabytes of data that I need to extract from Splunk in raw format. The dump command does not seem to work as expected:

index=perforce earliest=09/22/2013:23:59:00 latest=09/23/2013:23:59:00 | dump format=raw basefilename=perforce_dmp

Exporting this much data from the UI seems unreliable. Please provide clear instructions on how to extract an estimated 35 to 40 GB of data using the CLI, the REST API, and the dump command.

1 Solution

rbal_splunk
Splunk Employee

Below are the options for exporting large amounts of data from Splunk.

Option 1: Export from the UI. For this to work, you may need to increase the web timeout.

Try setting server.socket_timeout in web.conf to 3 minutes:
server.socket_timeout=180
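As a rough sketch of where that setting might live: the fragment below assumes the [settings] stanza of a local web.conf and the file path shown in the comment. Both are assumptions that should be checked against the web.conf documentation for your version, and Splunk Web typically needs a restart to pick up the change.

```ini
# Assumed location: $SPLUNK_HOME/etc/system/local/web.conf
[settings]
# Socket timeout in seconds (180 s = 3 minutes, the value suggested above)
server.socket_timeout = 180
```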

http://docs.splunk.com/Documentation/Splunk/6.0.6/Admin/Webconf

Option 2: The best approach is to use the CLI search command, as documented at this link:

http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/CLIsearchsyntax

This page lists the arguments that the CLI search command accepts.

Note the "maxout" argument: by default, the CLI exports only 100 rows. To export a large number of events, add -maxout to the CLI command to set the number of events to export.

For example, I used the command below and was able to export 200,000 events.

splunk search "index=_internal earliest=09/14/2014:23:59:00 latest=09/16/2014:01:00:00 " -output rawdata -maxout 200000 > c:/test123.dmp
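For a much larger export, one possible approach is to run the same CLI command once per time window so that each output file stays manageable. The sketch below only prints the commands for review; the helper name build_export_cmd, the window boundaries, and the output file names are hypothetical. The CLI documentation describes -maxout 0 as "unlimited", but that is worth verifying on your version before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: print one "splunk search" export command per time window.
# The helper name, window sizes, and file names are illustrative assumptions.

# build_export_cmd INDEX EARLIEST LATEST MAXOUT OUTFILE
build_export_cmd() {
  local index="$1" earliest="$2" latest="$3" maxout="$4" outfile="$5"
  printf 'splunk search "index=%s earliest=%s latest=%s" -output rawdata -maxout %s > %s\n' \
    "$index" "$earliest" "$latest" "$maxout" "$outfile"
}

# Two consecutive one-day windows for the index named in the question.
build_export_cmd perforce 09/22/2013:00:00:00 09/23/2013:00:00:00 200000 perforce_0922.raw
build_export_cmd perforce 09/23/2013:00:00:00 09/24/2013:00:00:00 200000 perforce_0923.raw
```

The printed lines can be inspected first and then run by hand or piped to a shell.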

Option 3: Another option is to use a REST call. This blog post has useful information:

http://blogs.splunk.com/2013/09/15/exporting-large-results-sets-to-csv/

Here is the search that I used to export a few million records:

curl -k -u admin:XXXXXX --data-urlencode search="search google.com OR yahoo.com earliest=-2day latest=-1day" -d "output_mode=raw" https://testbox:8089/servicesNS/admin/search/search/jobs/export > socid12346_export.log

The result set was 3,193,277 records. The file is 3.2 GB, which is far too big for me to open.
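Since the exported file can run to several gigabytes, it may help to compress the stream as it arrives. The following is a minimal sketch of the same export call wrapped in a function; the function name export_raw, the gzip step, and the choice to let curl prompt for the password are assumptions, not part of the original command.

```shell
#!/usr/bin/env bash
# Sketch: stream a raw export over REST and compress it on the way to disk.
# The function name and the gzip step are illustrative assumptions.

# export_raw HOST USER QUERY OUTFILE
export_raw() {
  local host="$1" user="$2" query="$3" outfile="$4"
  # -k skips TLS certificate verification, as in the original command.
  # -u with only a user name makes curl prompt for the password.
  curl -k -u "$user" \
    --data-urlencode "search=search $query" \
    -d "output_mode=raw" \
    "https://$host:8089/servicesNS/$user/search/search/jobs/export" \
    | gzip -c > "$outfile"
}

# Example call (commented out so that sourcing this file has no side effects):
# export_raw testbox admin 'google.com OR yahoo.com earliest=-2day latest=-1day' socid12346_export.log.gz
```

A compressed export can then be sampled without opening the whole file, for example with gzip -dc socid12346_export.log.gz | head.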


outcoldman
Communicator

In my experience, when I needed to export several GB of data, the dump command gave the best performance.
You need to do a few things:

  1. Change the job lifetime to 7 days.
  2. Move the job to the background, so it does not depend on the UI.

With the CLI, my export ran for more than 24 hours; with dump, I got my data out in 3-4 hours.
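One likely reason the job lifetime matters: as far as I know, dump writes its chunk files under the search job's dispatch directory, so they would disappear when the job expires. The sketch below prints where those files would be expected for a given search ID; the helper name dump_dir, the /opt/splunk default, and the directory layout are assumptions to verify against the dump command documentation.

```shell
#!/usr/bin/env bash
# Sketch: print the directory where dump output is expected for a search ID.
# The layout ($SPLUNK_HOME/var/run/splunk/dispatch/<sid>/dump) is an assumption.

# dump_dir SID
dump_dir() {
  local sid="$1"
  # Fall back to a common install location if SPLUNK_HOME is not set.
  printf '%s/var/run/splunk/dispatch/%s/dump\n' "${SPLUNK_HOME:-/opt/splunk}" "$sid"
}

# Example (commented out): list the exported chunk files for one job.
# ls -lh "$(dump_dir 1379999999.123)"
```

Copying the files out of that directory before the job lifetime runs out avoids losing them.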

jrballesteros05
Communicator

Hello, I know this thread is quite old, but I have a question about the REST option. Whenever I check the job activity, I find that the job has expired. Is there a way to configure the expiration time in the REST API?

Best regards.
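One possibility, offered as a sketch rather than a confirmed answer: for a job created through the search/jobs endpoint, the job's control endpoint accepts a setttl action that changes its time to live. Whether this helps with the streaming export endpoint used above is not something this thread confirms. The function name set_job_ttl and the example search ID are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: extend the lifetime (TTL, in seconds) of an existing search job.
# Assumes the job was created via search/jobs and that its search ID is known.

# set_job_ttl HOST USER SID TTL_SECONDS
set_job_ttl() {
  local host="$1" user="$2" sid="$3" ttl="$4"
  # -k skips TLS certificate verification; -u prompts for the password.
  curl -k -u "$user" \
    -d "action=setttl" \
    -d "ttl=$ttl" \
    "https://$host:8089/services/search/jobs/$sid/control"
}

# Example call (commented out): keep a job for 7 days (604800 seconds).
# set_job_ttl testbox admin 1379999999.123 604800
```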

jhedgpeth
Path Finder

My experience with ./splunk export ... seems even better than these options.


rbal_splunk
Splunk Employee

Could you share the command you used and the results you got?


jhedgpeth
Path Finder
./splunk export eventdata -index main -dir /var/tmp/export -sourcetype router_log

More options:

./splunk help export