I am trying to export 10 million rows to a CSV, and I want to explore the options available in 4.0.x and 4.1.x.
I also want to know what enhancements the latest 4.1.3 release brings for both Splunk Web and the API/CLI.
This depends a bit on whether you just need to export raw events or the result of a reporting/summarizing search.
In either case, it's a difficult task with 4.0.x. For a raw events search, one method is to write code that accesses the REST endpoints directly and "pages" through the data. For either a raw or a statistical search, you can use the outputcsv search command to persist the full result set to disk on the server and retrieve the file directly.
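For the outputcsv route, a search along these lines writes the results to a CSV file on the search head (typically under $SPLUNK_HOME/var/run/splunk/csv/). The index, sourcetype, and output filename below are placeholders for illustration:

```
index=main sourcetype=access_combined
| stats count by host
| outputcsv big_export
```

You can then copy big_export.csv off the server with scp or similar, sidestepping the UI/API row limits entirely.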
For 4.1.x, the CLI can emit an unlimited stream of raw events (in reverse-time order) by setting `-maxout 0`, though that output will not be CSV. For a reporting/summarizing search, the limit is 500k rows, which can be paged through using the REST API.
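Paging through a reporting search's results boils down to repeated calls to the job's results endpoint with increasing offsets, stitching the pages into one CSV. A minimal sketch of that loop is below; the `count`/`offset` parameter names follow Splunk's `/services/search/jobs/<sid>/results` interface, but verify them against your version's REST reference. `fetch_page` is a hypothetical stand-in you would implement with your HTTP client, requesting `output_mode=csv` for each page:

```python
def page_offsets(total_rows, page_size):
    """Yield (offset, count) pairs covering total_rows in page_size chunks."""
    offset = 0
    while offset < total_rows:
        yield offset, min(page_size, total_rows - offset)
        offset += page_size

def export_csv(fetch_page, total_rows, page_size=50000):
    """Concatenate CSV pages, keeping the header row from the first page only.

    fetch_page(offset, count) must return that page of results as CSV text
    (e.g. the results endpoint called with output_mode=csv).
    """
    rows = []
    for i, (offset, count) in enumerate(page_offsets(total_rows, page_size)):
        page = fetch_page(offset, count).splitlines()
        rows.extend(page if i == 0 else page[1:])  # drop repeated headers
    return "\n".join(rows)
```

Because the fetcher is injected, you can dry-run the stitching logic with a fake page generator before pointing it at a live search job.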
Yes, I am referring to (hopefully) improved large data export functionality in 4.1.x.
Thanks. Paging is fairly tedious in 4.0.x; I am hoping it's improved in 4.1.x. I am interested only in reporting/summarizing searches.
Those are two very different questions. You may want to open a second question for the 4.1.x part. Or are you asking only about changes in 4.1.x related to large data exports?