Getting Data In

How can I get a snapshot (say, 200 events) of the data for every sourcetype?

koshyk
Super Champion

I want to import all types of data from our prod system to our dev system after sanitising it. If possible, we also want to capture every sourcetype (each distinct type of data), with 200 events per sourcetype, to make a good data seed for my DEV system.

Maybe I'm lazy, and I know I could write a shell script that loops over sourcetypes with the CLI.
But is there a cleverer way, using the map command or a single search, to extract index, sourcetype, and _raw, take the first 200 events of each, and export them all as one file?
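For reference, the "shell script and loop" approach would look something like the sketch below. This is a hypothetical outline, not a tested recipe: it assumes the `splunk` binary is on PATH and you are already authenticated, and the `-output` and `-maxout` flags are from the Splunk CLI `search` command. The live calls are left commented out so the script is safe to read through first.

```shell
#!/bin/sh
# Hypothetical sketch of the shell-loop approach mentioned above.

# Build the per-sourcetype export query (200 events each).
build_query() {
  printf 'search index=* sourcetype="%s" | head 200 | table index sourcetype _raw' "$1"
}

# Uncomment to run against a live instance:
# for st in $(splunk search 'search index=* | stats count BY sourcetype' -output csv | tail -n +2 | cut -d, -f1); do
#   splunk search "$(build_query "$st")" -output csv -maxout 200 > "seed_${st}.csv"
# done

# Demo: print the query that would be run for one (assumed) sourcetype.
build_query access_combined
```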

1 Solution

woodcock
Esteemed Legend

Something like this (you don't need all the serial stuff if you can live with 1,000 events or fewer per sourcetype, because list() tops out at 1,000 values):

index=* 
| stats list(_raw) AS raw first(source) AS source first(index) AS index BY sourcetype 
| eval source = case(source="C:", source, true(), "/x/y/z/" . source)
| rex mode=sed field=source "s%.*[\\\\/]%%"
| mvexpand raw 
| streamstats count AS serial BY sourcetype
| search serial<=200
| fields - serial

Then, to get a single file, append this:

| outputcsv MyFileName

Or, for one file per sourcetype, this:

| map maxsearches=30 search="|outputcsv append=t $source$"
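Since the question also asks about sanitising the export before it seeds DEV, here is a minimal post-export cleanup sketch. The masking patterns and the `MyFileName.csv` file name are assumptions, not part of the accepted answer; the column is `raw` because the stats command above renames `list(_raw)` to `raw`.

```shell
#!/bin/sh
# Hypothetical sanitising pass over the outputcsv export: mask IPv4
# addresses and e-mail addresses with fixed placeholder values.
sanitize_raw() {
  printf '%s' "$1" \
    | sed -E 's/[0-9]{1,3}(\.[0-9]{1,3}){3}/10.0.0.1/g' \
    | sed -E 's/[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+/user@example.com/g'
}

# Usage against the (assumed) export file from the outputcsv step:
# sanitize_raw "$(cat MyFileName.csv)" > MyFileName_sanitised.csv

sanitize_raw 'login failed for 192.168.1.50 by bob@corp.example'
```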

koshyk
Super Champion

Indebted to you, mate. Worked like a charm.
