Hi All,
I have this compressed sample (a reduced version of a large structure), which is a combination of plain text and JSON:
2024-07-10 07:27:28 +02:00 LiveEvent: {"data":{"time_span_seconds":300,
"active":17519,
"total":17519,
"unique":4208,
"total_prepared":16684,
"unique_prepared":3703,
"created":594,
"updated":0,
"deleted":0,"ports":[
{"stock_id":49,
"goods_in":0,
"picks":2,
"inspection_or_adhoc":0,
"waste_time":1,
"wait_bin":214,
"wait_user":66,
"stock_open_seconds":281,
"stock_closed_seconds":19,
"bins_above":0,
"completed":[43757746,43756193],
"content_codes":[],
"category_codes":[{"category_code":4,"count":2}]},
{"stock_id":46,
"goods_in":0,
"picks":1,
"inspection_or_adhoc":0,
"waste_time":0,
"wait_bin":2,
"wait_user":298,
"stock_open_seconds":300,
"stock_closed_seconds":0,
"bins_above":0,
"completed":[43769715],
"content_codes":[],
"category_codes":[{"category_code":4,"count":1}]},
{"stock_id":1,
"goods_in":0,
"picks":3,
"inspection_or_adhoc":0,
"waste_time":0,
"wait_bin":191,
"wait_user":40,
"stock_open_seconds":231,
"stock_closed_seconds":69,
"bins_above":0,
"completed":[43823628,43823659,43823660],
"content_codes":[],
"category_codes":[{"category_code":1,"count":3}]}
]},
"uuid":"8711336c-ddcd-432f-b388-8b3940ce151a",
"session_id":"d14fbee3-0a7a-4026-9fbf-d90eb62d0e73",
"session_sequence_number":5113,
"version":"2.0.0",
"installation_id":"a031v00001Bex7fAAB",
"local_installation_timestamp":"2024-07-10T07:35:00.0000000+02:00",
"date":"2024-07-10",
"app_server_timestamp":"2024-07-10T07:27:28.8839856+02:00",
"event_type":"STOCK_AND_PILE"}
I eventually need each “stock_id” entry to end up as an individual event, keeping the common information with it: timestamp, uuid, session_id, session_sequence_number, and event_type.
Can someone guide me on how to use props and transforms to achieve this?
PS. I have read through several great posts on how to split JSON arrays into events, but none about how to keep common fields in each of them.
Many thanks in advance.
Best Regards,
Bjarne

TL;DR - you can't split events within Splunk itself during ingestion.
Longer explanation - each event is processed as a single entity. You could make a copy of the event using CLONE_SOURCETYPE and then process each instance separately (for example, cut one part from one copy and a different part from the other), but that isn't something that can be reasonably implemented: it's unmaintainable in the long run, and you can't do it dynamically (like splitting a JSON array into however many items it has). And of course, structured data manipulation at ingest time is a relatively big no-no.
So your best bet would be to pre-process your data with a third-party tool (or at least write a scripted input that does the heavy lifting of splitting the data).
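To make that concrete, here is a minimal pre-processing sketch in Python - assuming each LiveEvent arrives as a single "<timestamp> LiveEvent: {json}" line like the sample above, and that the listed common fields are the ones worth keeping; the script name and the choices in it are assumptions, not a definitive implementation:

#!/usr/bin/env python3
# Hypothetical pre-processor: split each LiveEvent into one event per
# "ports" entry, carrying the shared metadata along. Field names come
# from the sample in this thread; everything else is an assumption.
import json
import sys

# Outer fields to copy into every split event
COMMON_FIELDS = ["uuid", "session_id", "session_sequence_number",
                 "event_type", "app_server_timestamp"]

def split_live_event(line):
    # The raw line looks like "<timestamp> LiveEvent: {json}" - keep the JSON
    _, _, raw_json = line.partition("LiveEvent: ")
    if not raw_json.strip():
        return
    event = json.loads(raw_json)
    common = {k: event[k] for k in COMMON_FIELDS if k in event}
    for port in event.get("data", {}).get("ports", []):
        # Merge the shared metadata into each per-stock_id record
        yield json.dumps({**common, **port})

if __name__ == "__main__":
    for line in sys.stdin:
        for split_event in split_live_event(line):
            print(split_event)

Each printed line is then a self-contained JSON event, so piping the raw feed through this filter into a monitored file (or running it as a scripted input) leaves Splunk with nothing more exotic than plain JSON to ingest.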


I'm not sure it can be done reliably using props and transforms. I'd use a scripted input to parse the data and re-format it.
If this reply helps you, Karma would be appreciated.
Hi @richgalloway,
Thanks for your input.
Do you happen to have any scripting ideas for this?


I have nothing specific to offer. In a previous job, I used a Python script to parse data and then restructure it so it was easier for Splunk to ingest. It wasn't JSON (I think it was XML), but the same approach should be pretty straightforward here.
If this reply helps you, Karma would be appreciated.
And, by the way, there's this one: How to split JSON array into Multiple events at Index Time?

That one relies on the fact that it was a simple array that could be cut into pieces with regexes. The splitting mechanism would break apart if the data changed - for example, if another field besides the "local" one were added to the "outer" JSON.
Hi @PickleRick,
The JSON structure is very stable and doesn't change, except that there can be many (1000+) or few (4) “stock_id” entries.
You mentioned scripted inputs as well - do you have any suggestions or examples?

Your case is completely different because you want to keep some of the "outer" information shared between separate events (which actually isn't that good an idea, because your license usage will be multiplied across those events).
As for the scripted input - see these resources for the technicalities on the Splunk side. Of course, the internals - splitting the event - are entirely up to you.
https://docs.splunk.com/Documentation/Splunk/latest/AdvancedDev/ScriptSetup
https://dev.splunk.com/enterprise/docs/developapps/manageknowledge/custominputs
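For orientation, a minimal sketch of what registering such a scripted input might look like - the app name, script path, interval, sourcetype, and index below are all assumptions, not anything from this thread:

# inputs.conf - hypothetical scripted input registration; the script itself
# must fetch the raw data, split it, and print one finished event per line
[script://$SPLUNK_HOME/etc/apps/my_app/bin/split_live_events.py]
interval = 60
sourcetype = live_event:port
index = main
disabled = false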
The thing is, if we don't split them at index time, the indexers will have even more work to do, as the structures can be huge.
PS. I’m aware of the extra license usage here as well.
Hi @PickleRick,
Thanks for your feedback, though I'm surprised by the answer, as I've seen clear indications and solutions elsewhere for splitting JSON arrays into individual events, like: How to parse a JSON array delimited by "," into separate events with their unique timestamps?

1. Please, don't post links butchered by some external "protection" service.
2. You've got this wrong 😉 Those articles don't describe splitting JSON events. They describe breaking the input data stream so that it breaks on the "inner" JSON boundaries instead of the "outer" ones. That doesn't have anything to do with manipulating a single event that has already been broken out of the input stream. It's similar to telling Splunk not to break the stream into lines but rather to ingest whitespace-delimited chunks separately. But your case is completely different because you want to carry over some common part (some common metadata, I assume) from the outer JSON structure to each part extracted from the inner JSON array. That is way above the simple string-based manipulation that Splunk can do in the ingestion pipeline.
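For illustration, here is a hypothetical props.conf sketch of the stream-breaking technique those posts describe, with an assumed sourcetype name:

# props.conf - hypothetical "break on inner boundaries" trick
[live_event:raw]
SHOULD_LINEMERGE = false
# End one event after "}" and start the next at the following {"stock_id"...;
# the captured separator is discarded.
LINE_BREAKER = \}(,\s*)\{"stock_id"

Note that everything outside the array elements (uuid, session_id, and so on) ends up glued onto the first and last elements only - it is not carried into every event, which is exactly the limitation under discussion.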
1. Thanks for the advice.
2. Well, after working with Splunk for 10+ years, I frankly don't agree with the "simple string-based manipulation that Splunk can do in the ingestion pipeline" part - I'd say I've seen amazing (to the extent of crazy) things done with props and transforms.
That said, Splunk might not be able to do exactly what I'm after here, but I'm willing to spend time trying anyway, as this will have a major impact on performance at search time.
Yes, there is some metadata that needs to stay with each event to be able to find them again.
I have some ideas in my head on how to twist this, but right now I'm on vacation and can't test them for the next week or so, so I'm just "warming up" and looking for / listening in on others' crazy ideas about what they have achieved in Splunk 🙂

It's not about "whose is longer". And yes, I've seen many interesting hacks, but the fact remains - Splunk works one event at a time. So you can't "carry over" any info from one event to another using just props and transforms (except for that very, very ugly and unmaintainable trick of actually cloning the event and separately modifying each copy). Also, you cannot split an event (or merge it) after it's been through the line breaking/merging phase.
So you can't turn
{"whatever": ["a","b","c"], "something":"something"}
into
{"whatever": "a", "something":"something"}
{"whatever": "b", "something":"something"}
{"whatever": "c", "something":"something"}
using props and transforms alone. The ingestion pipeline doesn't deal with structured data (with the exception of indexed extractions on the UF, but that's a different story).
Longer than yesterday helps though 🙂
OK - here are some thoughts I've had on getting around this, without having had a chance to play with it yet.
SEDCMD looks like a possibility, though I know it's not going to be a newbie kind of thing. There is support for backreferences, so I thought of copying a core meta field into each stock_id element as an addition, and then splitting the structure into events by each stock_id.

You're thinking in the wrong order. That's why I'm saying it's not possible with Splunk alone.
If you don't know this one, it's one of the mainstays of understanding the Splunk indexing process: https://community.splunk.com/t5/Getting-Data-In/Diagrams-of-how-indexing-works-in-the-Splunk-platfor...
As you can see, line breaking is one of the absolute first things that happens to the input stream. You can't "backtrack" your way through the ingestion pipeline to apply SEDCMD before line breaking.
And, as I wrote already, it's really a very bad idea to tackle structured data with regexes.