
What are my options to export large amounts of Splunk data from UI, CLI or using REST?

sat94541
Communicator

I have several gigabytes of data that I need to extract from Splunk in raw format. The dump command does not seem to work as expected:

index=perforce earliest=09/22/2013:23:59:00 latest=09/23/2013:23:59:00 | dump format=raw basefilename=perforce_dmp

Trying to export this amount of data from the UI seems unreliable. Please provide clear instructions on how to obtain an estimated 35 to 40GB of data using the CLI, the REST API, and the dump command.


rbal_splunk
Splunk Employee

Below are the options for exporting large amounts of data from Splunk.

Option 1: Export from the UI. For this to work, you may need to increase the web timeout.

Try setting server.socket_timeout in web.conf to 3 minutes (180 seconds):
server.socket_timeout=180

http://docs.splunk.com/Documentation/Splunk/6.0.6/Admin/Webconf
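
As a sketch of where this setting lives: web.conf settings go under the [settings] stanza, and local overrides are normally placed in a local copy of the file rather than the default one. The path shown is the usual location on a standard install and is an assumption about your environment; Splunk Web generally needs a restart before the change takes effect.

```ini
# $SPLUNK_HOME/etc/system/local/web.conf (assumed standard location)
[settings]
# Socket timeout in seconds; 180 = 3 minutes
server.socket_timeout = 180
```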

Option 2: The best option is to use the CLI search command, as documented here:

http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/CLIsearchsyntax

That page lists the arguments the CLI search command accepts.

Note the "maxout" argument: by default the CLI exports only 100 events. To export more, add -maxout to the CLI command and set it to the number of events you want exported.

For example, I used the command below and was able to export 200,000 events.

splunk search "index=_internal earliest=09/14/2014:23:59:00 latest=09/16/2014:01:00:00 " -output rawdata -maxout 200000 > c:/test123.dmp

Option 3: Another option is to use a REST call. This blog post has useful information:

http://blogs.splunk.com/2013/09/15/exporting-large-results-sets-to-csv/

Here is the search I used to export a few million records:

curl -k -u admin:XXXXXX --data-urlencode search="search google.com OR yahoo.com earliest=-2day latest=-1day" -d "output_mode=raw" https://testbox:8089/servicesNS/admin/search/search/jobs/export > socid12346_export.log

The result set was 3,193,277 records. The file is 3.2GB, which is far too big for me to open.
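
Since a file of that size is too large for most editors, one way to work with it is to count its lines and split it into smaller pieces from the command line. The sketch below assumes a POSIX shell with the standard wc and split utilities; the function name split_export and the default chunk size are illustrative and not part of Splunk.

```shell
#!/bin/sh
# Sketch: inspect a large export file and split it into smaller pieces.
# split_export is a hypothetical helper name; only standard tools are used.

split_export() {
    file="$1"             # path to the export, e.g. socid12346_export.log
    lines="${2:-500000}"  # lines per chunk (arbitrary default)

    # Report the total line count without loading the file into memory.
    total=$(wc -l < "$file" | tr -d ' ')
    echo "total lines: $total"

    # Write chunks named <file>.part_aa, <file>.part_ab, ...
    split -l "$lines" "$file" "$file.part_"
}
```

For example, `split_export socid12346_export.log 500000` followed by `head -n 20 socid12346_export.log.part_aa` shows the first few events. Because the split is line-based, a multi-line event can end up divided across two chunks.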


outcoldman
Communicator

In my experience exporting several GB of data, the dump command gave the best performance.
You need to do a few things:

  1. Change the job lifetime to 7 days.
  2. Move the job to the background so it does not depend on the UI.

With the CLI my export ran for more than 24 hours; with dump I got my data out in 3-4 hours.


jrballesteros05
Communicator

Hello, I know this thread is quite old, but I have a question about the REST option. When I check the job activity, I always find that the job has expired. Is there a way to configure the expiration time in the REST API?

Best regards.

jhedgpeth
Path Finder

In my experience, ./splunk export ... works even better than these options.


rbal_splunk
Splunk Employee

Could you share the command you used and the results you got?


jhedgpeth
Path Finder
./splunk export eventdata -index main -dir /var/tmp/export -sourcetype router_log

More options:

./splunk help export