What are my options for exporting large amounts of Splunk data from the UI, the CLI, or REST?

Communicator

I have several gigabytes of data that I need to extract from Splunk in raw format. The dump command does not seem to work as expected:

index=perforce earliest=09/22/2013:23:59:00 latest=09/23/2013:23:59:00 | dump format=raw basefilename=perforce_dmp

Exporting this amount of data from the UI seems unreliable. Please provide clear instructions on how to obtain an estimated 35 to 40 GB of data using the CLI, the REST API, and the dump command.

1 Solution

Splunk Employee

Below are the options for exporting large amounts of data from Splunk.

Option 1: Export from the UI. For this to work you may need to increase the web timeout, for example by setting server.socket_timeout in web.conf to 3 minutes:

server.socket_timeout = 180

http://docs.splunk.com/Documentation/Splunk/6.0.6/Admin/Webconf

Option 2: The best approach is to use the CLI search command, as documented here:

http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/CLIsearchsyntax

That page lists the arguments the CLI search command accepts.

Note the "maxout" argument: by default the CLI exports only 100 events. To export more, add -maxout to the command and set it to the number of events you want exported.

For example, I used the command below and was able to export 200,000 events:

splunk search "index=_internal earliest=09/14/2014:23:59:00 latest=09/16/2014:01:00:00" -output rawdata -maxout 200000 > c:/test123.dmp
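For tens of GB it may help to split the time range into smaller windows and export each one to its own file, so that a failed run only costs one window. The sketch below is a dry run: it only prints one command per window and does not call Splunk. The function name print_window_exports, the part_NNN.dmp file names, and the example boundaries are my own placeholders, not anything from the Splunk docs. Each window must hold fewer events than the -maxout value you pass, otherwise that window is truncated.

```shell
#!/bin/sh
# Dry run: print one `splunk search` export command per time window.
# Usage: print_window_exports INDEX OUTDIR MAXOUT T0 T1 [T2 ...]
# Consecutive boundaries (T0,T1), (T1,T2), ... each become one window.
# The function name and output file naming are hypothetical.
print_window_exports() {
    index=$1
    outdir=$2
    maxout=$3
    shift 3
    start=$1
    shift
    n=1
    for end in "$@"; do
        # Adjacent windows share a boundary: one ends where the next begins.
        printf 'splunk search "index=%s earliest=%s latest=%s" -output rawdata -maxout %s > %s/part_%03d.dmp\n' \
            "$index" "$start" "$end" "$maxout" "$outdir" "$n"
        start=$end
        n=$((n + 1))
    done
}
```

Once the printed commands look right, the output can be piped to sh to run them, for example: print_window_exports perforce /tmp/export 200000 09/22/2013:00:00:00 09/22/2013:12:00:00 09/23/2013:00:00:00 | sh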

Option 3: Another option is to use a REST call. Here is a blog post with useful information:

http://blogs.splunk.com/2013/09/15/exporting-large-results-sets-to-csv/

Here is the search I used to export a few million records:

curl -k -u admin:XXXXXX --data-urlencode search="search google.com OR yahoo.com earliest=-2day latest=-1day" -d "output_mode=raw" https://testbox:8089/servicesNS/admin/search/search/jobs/export > socid12346export.log

The result set was 3,193,277 records. The file is 3.2 GB, which is far too big for me to open.
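One way to cope with a file that size is to compress the export as it streams and then look only at the first few lines. The following is a minimal sketch, not a tested recipe for every setup: the function names export_raw_gz and peek_gz are made up, port 8089 and the URL layout are copied from the curl command above, and using the user name in the servicesNS path is an assumption. Only the user name is passed to -u, so curl asks for the password instead of leaving it in the shell history.

```shell
#!/bin/sh
# Stream a raw export from the REST export endpoint straight into a gzip file.
# Usage: export_raw_gz HOST USER 'search ...' OUTFILE.gz
# Note: the pipeline's exit status is gzip's, so a curl failure is only
# visible through its error message on stderr.
export_raw_gz() {
    host=$1
    user=$2
    query=$3
    outfile=$4
    curl -k --fail -u "$user" \
        --data-urlencode "search=$query" \
        -d "output_mode=raw" \
        "https://$host:8089/servicesNS/$user/search/search/jobs/export" \
        | gzip > "$outfile"
}

# Print the first N lines (default 20) of a compressed export
# without unpacking the whole file to disk.
peek_gz() {
    gzip -dc "$1" | head -n "${2:-20}"
}
```

For example, export_raw_gz testbox admin 'search google.com OR yahoo.com earliest=-2day latest=-1day' export.raw.gz followed by peek_gz export.raw.gz 5 shows the first five lines of the export.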


Communicator

In my experience, when I needed to export several GB of data, the dump command gave the best performance.
You need to do a few things:

  1. Change the job lifetime to 7 days.
  2. Move the job to the background so it does not depend on the UI.

With the CLI my export ran for more than 24 hours; with dump I got my data out in 3-4 hours.
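The job lifetime matters because, as far as I know, dump writes its chunk files inside the search job's dispatch directory, under $SPLUNK_HOME/var/run/splunk/dispatch/<sid>/dump, and that directory goes away when the job expires. Treat the path as an assumption and check it on your own install. A small sketch for copying the files somewhere safe once the job finishes (copy_dump_files is a made-up helper name):

```shell
#!/bin/sh
# Copy the chunk files written by `dump` out of a job's dispatch directory.
# Usage: copy_dump_files SPLUNK_HOME SID DEST
# Assumption: dump output lives in SPLUNK_HOME/var/run/splunk/dispatch/SID/dump.
copy_dump_files() {
    dump_dir="$1/var/run/splunk/dispatch/$2/dump"
    dest=$3
    if [ ! -d "$dump_dir" ]; then
        echo "no dump directory at $dump_dir (job expired or wrong SID?)" >&2
        return 1
    fi
    # "dir/." copies the directory's contents rather than the directory itself.
    mkdir -p "$dest" && cp -R "$dump_dir"/. "$dest"/
}
```

The SID is the search job ID shown in the job inspector or on the Jobs page.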


Communicator

Hello, I know this thread is quite old, but I have a question about the REST option. Whenever I check the job activity, I find that the job has already expired. Is there a way to configure the expiration time in the REST API?

Best regards.

Path Finder

My experience with ./splunk export ... has been even better than with these options.


Splunk Employee

Could you share the command you used and the results you got?


Path Finder

./splunk export eventdata -index main -dir /var/tmp/export -sourcetype router_log

More options:

./splunk help export