I know there are a ton of these questions out here, but I've got one of my own. I've looked at the other questions, and between them and some initial help from Bert I've gotten a good start, but I can't seem to get this working right.
We have a single JSON package being received via HEC - this package contains anywhere from 1 to 500 events. Our users would like those events broken out into individual events within Splunk. Here's what the initial package looks like:
{"batch_id":"0-39-1490386204359","sampling_rate":1,"n":500,"events":[{"earlyAccess":null,"profileId":"d037d7da-bd83-11e6-80d6-0a2c5cb56663","ip":"74.000.000.238","deviceVersion":[{"component":"app","build":"oncue#1.19.37.108208#mallard2_GA.0","otherDetail":null},{"component":"service","build":"qa-a.stb.fios.tv","otherDetail":null},{"component":"location","build":"1000.00,1000.00","otherDetail":null},{"component":"Client","build":"108208#release_signed","otherDetail":null}],"__lane":"prod","__host":"unknown","navigationStack":[{"name":"Playback","index":0,"type":"LINEAR"}],"platformVersion":"108208#release_signed","assetType":"LinearAsset","assetId":"50281370fd523ecbaafd8f4d8145e006-adf51f1b5b403b6daa88b62d9c8567fa-2017-03-24-1","restrictedBy":null,"programStartTime":1490380200000,"attributionId":"8cca77ff-1480-b037-b0b2-015b01e09338","assetSessionId":"1490386121819","__sourceType":"device","serviceTimestamp":null,"appVersion":"1.19.37.108208","playType":"tuneIn","collectionId":"","deviceType":"501","accountId":"1004249","playRate":1000,"maestro":{"vhoId":"","host":"qa-a-aws.stb.fios.tv","userAgent":"Mozilla/5.0 (STB; CPU 501 OS 108208) OnCue/1.19.37","version":"4.4.3851","inHome":true,"ipAddress":"74.000.000.238"},"encoderDelay":124000,"__eventId":"s9a98hR/sMGKvAFbAe90tQ==","__timestamp":1490386121909,"sessionId":"617F6EA8-1490385232","__eventName":"1","programEndTime":1490390100000,"programId":"adf51f1b5b403b6daa88b62d9c8567fa","__source":"unknown","liveTuneType":"live","deviceTimestamp":1490386121819,"deviceId":"617F6EA8","__eventVersion":16,"recordingId":null,"channelId":"50281370fd523ecbaafd8f4d8145e006","timeZone":"America/New_York","eventProgramPoint":1490386121714},{"earlyAccess":null,"profileId":"45737761-ac2b-11e6-80d6-0a2c5cb56663","ip":"68.000.000.133","deviceVersion":[{"component":"app","build":"oncue#1.19.40.108253#mallard2_GA.0","otherDetail":null},{"component":"service","build":"qa-a.stb.fios.tv","otherDetail":null},{"component":"location","build":"1000.00,1000.00","otherDetail":null},{"component":"Client","build":"108253#release","otherDetail":null}],"__lane":"prod","__host":"unknown","navigationStack":[{"name":"","index":0,"type":""}],"platformVersion":"108253#release","assetType":"LinearAsset","assetId":"2732f41bdecc33aca2a23146eabd0954-5e4c3aaa6ef7312b8104c94c842d6a3f-2017-03-24-1","restrictedBy":null,"programStartTime":1490385600000,"attributionId":"ffffffff-ffff-ffff-ffff-fffffffffff","assetSessionId":"1490386010685","__sourceType":"device","serviceTimestamp":null,"appVersion":"1.19.40.108253","playType":"tuneOut","collectionId":"","deviceType":"501","accountId":"1003469","playRate":0,"maestro":{"vhoId":"","host":"qa-a-aws.stb.fios.tv","userAgent":"Mozilla/5.0 (STB; CPU 501 OS 108253) OnCue/1.19.40","version":"4.4.3851","inHome":true,"ipAddress":"68.000.000.133"},"encoderDelay":49000,"__eventId":"uXvQcxR/sMGKvAFbAe6R2w==","__timestamp":1490386063835,"sessionId":"617F7743-1490378565","__eventName":"1","programEndTime":1490387400000,"programId":"5e4c3aaa6ef7312b8104c94c842d6a3f","__source":"unknown","liveTuneType":"live","deviceTimestamp":1490386063730,"deviceId":"617F7743","__eventVersion":16,"recordingId":null,"channelId":"2732f41bdecc33aca2a23146eabd0954","timeZone":"America/New_York","eventProgramPoint":1490387400000}]}
So far, I'm applying the following props.conf to this data:
CHARSET=UTF-8
SHOULD_LINEMERGE=false
disabled=false
SEDCMD-removeheader=s/^(\{[\w\W]+\[{"earlyAccess":)/{"earlyAccess":/g
SEDCMD-removeeventcommas=s/},{"earlyAccess":/}{"earlyAccess":/g
SEDCMD-fixfooter=s/\]\}//g
LINE_BREAKER={"earlyAccess
TRUNCATE=0
TIME_PREFIX="deviceTimestamp":
TIME_FORMAT=%s%3N
KV_MODE=json
That gives me this output but doesn't break between events:
{"earlyAccess":null,"profileId":"d037d7da-bd83-11e6-80d6-0a2c5cb56663","ip":"74.000.000.238","deviceVersion":[{"component":"app","build":"oncue#1.19.37.108208#mallard2_GA.0","otherDetail":null},{"component":"service","build":"qa-a.stb.fios.tv","otherDetail":null},{"component":"location","build":"1000.00,1000.00","otherDetail":null},{"component":"Client","build":"108208#release_signed","otherDetail":null}],"__lane":"prod","__host":"unknown","navigationStack":[{"name":"Playback","index":0,"type":"LINEAR"}],"platformVersion":"108208#release_signed","assetType":"LinearAsset","assetId":"50281370fd523ecbaafd8f4d8145e006-adf51f1b5b403b6daa88b62d9c8567fa-2017-03-24-1","restrictedBy":null,"programStartTime":1490380200000,"attributionId":"8cca77ff-1480-b037-b0b2-015b01e09338","assetSessionId":"1490386121819","__sourceType":"device","serviceTimestamp":null,"appVersion":"1.19.37.108208","playType":"tuneIn","collectionId":"","deviceType":"501","accountId":"1004249","playRate":1000,"maestro":{"vhoId":"","host":"qa-a-aws.stb.fios.tv","userAgent":"Mozilla/5.0 (STB; CPU 501 OS 108208) OnCue/1.19.37","version":"4.4.3851","inHome":true,"ipAddress":"74.000.000.238"},"encoderDelay":124000,"__eventId":"s9a98hR/sMGKvAFbAe90tQ==","__timestamp":1490386121909,"sessionId":"617F6EA8-1490385232","__eventName":"1","programEndTime":1490390100000,"programId":"adf51f1b5b403b6daa88b62d9c8567fa","__source":"unknown","liveTuneType":"live","deviceTimestamp":1490386121819,"deviceId":"617F6EA8","__eventVersion":16,"recordingId":null,"channelId":"50281370fd523ecbaafd8f4d8145e006","timeZone":"America/New_York","eventProgramPoint":1490386121714}{"earlyAccess":null,"profileId":"45737761-ac2b-11e6-80d6-0a2c5cb56663","ip":"68.000.000.133","deviceVersion":[{"component":"app","build":"oncue#1.19.40.108253#mallard2_GA.0","otherDetail":null},{"component":"service","build":"qa-a.stb.fios.tv","otherDetail":null},{"component":"location","build":"1000.00,1000.00","otherDetail":null},{"component":"Client","build":"108253#release","otherDetail":null}],"__lane":"prod","__host":"unknown","navigationStack":[{"name":"","index":0,"type":""}],"platformVersion":"108253#release","assetType":"LinearAsset","assetId":"2732f41bdecc33aca2a23146eabd0954-5e4c3aaa6ef7312b8104c94c842d6a3f-2017-03-24-1","restrictedBy":null,"programStartTime":1490385600000,"attributionId":"ffffffff-ffff-ffff-ffff-fffffffffff","assetSessionId":"1490386010685","__sourceType":"device","serviceTimestamp":null,"appVersion":"1.19.40.108253","playType":"tuneOut","collectionId":"","deviceType":"501","accountId":"1003469","playRate":0,"maestro":{"vhoId":"","host":"qa-a-aws.stb.fios.tv","userAgent":"Mozilla/5.0 (STB; CPU 501 OS 108253) OnCue/1.19.40","version":"4.4.3851","inHome":true,"ipAddress":"68.000.000.133"},"encoderDelay":49000,"__eventId":"uXvQcxR/sMGKvAFbAe6R2w==","__timestamp":1490386063835,"sessionId":"617F7743-1490378565","__eventName":"1","programEndTime":1490387400000,"programId":"5e4c3aaa6ef7312b8104c94c842d6a3f","__source":"unknown","liveTuneType":"live","deviceTimestamp":1490386063730,"deviceId":"617F7743","__eventVersion":16,"recordingId":null,"channelId":"2732f41bdecc33aca2a23146eabd0954","timeZone":"America/New_York","eventProgramPoint":1490387400000}
The actual event break should be taking place at:
{"earlyAccess":
I've tried LINE_BREAKER in various formats, as well as combinations of BREAK_ONLY_BEFORE and MUST_BREAK_AFTER, but haven't had any luck getting the breaks to happen - Splunk still processes it all as a single event. Everything else is working fine with it - it's just not breaking. Any assistance on how to get these darn things to break right would be greatly appreciated...
I'm the tech who worked with burras on this case. In looking at the props, we identified that the issue was due to a character that had not been escaped properly.
LINE_BREAKER=([\r\n,]*(?:{[^[{]+[)?){"earlyAccess
vs
LINE_BREAKER=([\r\n,]*(?:{[^[{]+\[)?){"earlyAccess
It's subtle, but the "[" bracket closest to the end of the line had not been escaped with a "\".
After testing the ingestion of the data again, this worked for the case.
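To see why the unescaped version never produced a break, you can try to compile both patterns outside Splunk. A minimal check (Python used purely for illustration; Splunk itself uses PCRE, which rejects the unterminated character class in the same way):
import re

unescaped = r'([\r\n,]*(?:{[^[{]+[)?){"earlyAccess'   # the "[" before ")?" opens a class that is never closed
escaped   = r'([\r\n,]*(?:{[^[{]+\[)?){"earlyAccess'  # the corrected pattern

for name, pattern in (("unescaped", unescaped), ("escaped", escaped)):
    try:
        re.compile(pattern)                            # Python may warn about the "[" inside [^[{], but it compiles
        print(name, "-> compiles")
    except re.error as err:
        print(name, "-> rejected:", err)               # e.g. "unterminated character set"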
I mentioned this in my other comment above, but I want it attached to the accepted answer here as well - we were only able to get this working after we moved the props.conf from the indexer cluster to the HF running HEC itself. When running on the indexer cluster, it was as if the props.conf didn't even exist.
Thanks to Support I was able to get this working. We found a couple of different things:
1) The props.conf was really close - we only had to make one change to get it working properly (the "[" near the end of LINE_BREAKER was missing its escape). Here's the final props.conf that worked (the timestamp settings are worked through in the short sketch at the end of this answer):
[asset_play]
CHARSET=UTF-8
SHOULD_LINEMERGE=false
disabled=false
SEDCMD-fixfooters=s/]}//g
LINE_BREAKER=([\r\n,]*(?:{[^[{]+\[)?){"earlyAccess
TRUNCATE=0
TIME_PREFIX="deviceTimestamp":
MAX_TIMESTAMP_LOOKAHEAD=30
TIME_FORMAT=%s%3N
KV_MODE=json
2) We also discovered that putting props.conf on the indexer cluster did not work. Anything that was brought in through HEC was essentially untouched by anything in props.conf on an indexer. We had to specify this props.conf on the HF on which HEC was running.
Thanks everyone for your help with this!
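A quick note on the timestamp settings in those props: TIME_PREFIX points Splunk just past "deviceTimestamp": and TIME_FORMAT=%s%3N reads the 13-digit value as epoch seconds plus three digits of milliseconds, well within the 30-character MAX_TIMESTAMP_LOOKAHEAD. A small sanity check against the first sample event (Python used purely for illustration):
from datetime import datetime, timezone

device_ts = 1490386121819                           # deviceTimestamp from the first sample event (epoch milliseconds)
print(datetime.fromtimestamp(device_ts / 1000, tz=timezone.utc))
# 2017-03-24 20:08:41.819000+00:00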
I did a quick test:
inputs.conf 
[http://test_json_batch]
disabled = 0
sourcetype = test_json_batch
token = __removed__
props.conf
[test_json_batch]
LINE_BREAKER = ([\[,\r\n]+)\{"(?:earlyAccess|batch_id)":
SHOULD_LINEMERGE = false
SEDCMD-remove_end = s/]}$//g
And a curl to ingest the event above:
curl -k https://10..1.100:8088/services/collector/raw -H ... -d '{"batch_id....,"eventProgramPoint":1490387400000}]}'
And I'm counting on the default "AUTO_KV_JSON = true" search-time JSON field extraction.
For my test above, each event was broken out starting with {"earlyAccess", just like @beatus's screenshot.
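If you want to sanity-check that LINE_BREAKER outside Splunk, here is a rough emulation of its behaviour - the previous event ends where capture group 1 starts, and the next event starts where capture group 1 ends. The batch string is a hypothetical, heavily truncated stand-in, and Python is used purely for illustration:
import re

BREAKER = re.compile(r'([\[,\r\n]+)\{"(?:earlyAccess|batch_id)":')

def break_events(raw):
    events, start = [], 0
    for m in BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])   # previous event ends where group 1 begins
        start = m.end(1)                       # group 1 (the "[" or ",") is discarded
    events.append(raw[start:])
    return [e for e in events if e]

batch = '{"batch_id":"0-39","n":2,"events":[{"earlyAccess":null,"a":1},{"earlyAccess":null,"a":2}]}'
for ev in break_events(batch):
    print(ev)
# {"batch_id":"0-39","n":2,"events":          <- leftover header fragment
# {"earlyAccess":null,"a":1}
# {"earlyAccess":null,"a":2}]}                <- SEDCMD-remove_end then strips the trailing "]}"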
You should have a capture group in LINE_BREAKER - it tells Splunk what to throw out between events. You also don't need two of the SEDCMDs, as they can be handled by the LINE_BREAKER alone. This worked on my end.
[json:test]
CHARSET=UTF-8
SHOULD_LINEMERGE=false
disabled=false
SEDCMD-fixfooter=s/]}//g
LINE_BREAKER=([\r\n,]*(?:{[^[{]+[)?){"earlyAccess
TRUNCATE=0
TIME_PREFIX="deviceTimestamp":
MAX_TIMESTAMP_LOOKAHEAD = 30
TIME_FORMAT=%s%3N
KV_MODE=json
I tried with this props.conf, but it's still showing as a single event for me. I can see all of the individual event data in the fields prefaced by "event{}.", but my customer really needs it broken into individual events because of the different timestamps in each event. The transactional data they're looking at is very time-sensitive, so just using that first deviceTimestamp field isn't close enough.
Yup, those props give me that exact result. See screenshot.
Doh, I missed the part about HEC. Sorry, apparently my reading comprehension could use some work.
How high-volume is the data source? You could potentially use the old-style HTTP POST inputs, which would allow you to hit this with the typical data pipeline.
Regarding event separation itself, @beatus's solution should basically work for HEC "raw" inputs. It will not work for the JSON endpoint "collector/event".
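To make the distinction concrete: the "raw" endpoint hands Splunk an unparsed blob that index-time props (LINE_BREAKER, SEDCMD, TIME_*) get to work on, while the "event" endpoint delivers events that are already delimited, so LINE_BREAKER is never consulted. A minimal sketch of the two calls (host, token, and file name are placeholders, not from this thread; Python used purely for illustration):
import requests

HOST = "https://splunk-hf.example.com:8088"
HEADERS = {"Authorization": "Splunk 00000000-0000-0000-0000-000000000000"}

# raw endpoint: the whole batch string goes through the index-time pipeline
requests.post(HOST + "/services/collector/raw?sourcetype=asset_play",
              headers=HEADERS, data=open("batch.json", "rb"), verify=False)

# event endpoint: each event arrives pre-delimited, so props-based breaking never happens
requests.post(HOST + "/services/collector/event", headers=HEADERS,
              json={"sourcetype": "asset_play", "event": {"already": "parsed"}},
              verify=False)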
We're definitely using the "raw" input for HEC. Any chance there's a disconnect between how the "Add Data" preview breaks the events and what would actually happen in production, where it might work in one but not the other?
I'd go back to basics if that's the case. Ensure the props are present and that no other props are interfering. Check with splunk btool props list --debug.
If you look at Masa's answer, you can see these props work on HEC + raw so they should be working for you.
I definitely agree - it sounds like it should be working.  Here's what I'm seeing in the btool output:
/opt/splunk/etc/apps/search/local/props.conf                      [asset_play]
/opt/splunk/etc/system/default/props.conf                         ANNOTATE_PUNCT = True
/opt/splunk/etc/system/default/props.conf                         AUTO_KV_JSON = true
/opt/splunk/etc/system/default/props.conf                         BREAK_ONLY_BEFORE =
/opt/splunk/etc/system/default/props.conf                         BREAK_ONLY_BEFORE_DATE = True
/opt/splunk/etc/system/local/props.conf                           CHARSET = UTF-8
/opt/splunk/etc/system/default/props.conf                         DATETIME_CONFIG = /etc/datetime.xml
/opt/splunk/etc/system/default/props.conf                         HEADER_MODE =
/opt/splunk/etc/system/local/props.conf                           KV_MODE = json
/opt/splunk/etc/system/default/props.conf                         LEARN_MODEL = true
/opt/splunk/etc/system/default/props.conf                         LEARN_SOURCETYPE = true
/opt/splunk/etc/system/local/props.conf                           LINE_BREAKER = ([\r\n,]*(?:{[^[{]+[)?){"earlyAccess
/opt/splunk/etc/system/default/props.conf                         LINE_BREAKER_LOOKBEHIND = 100
/opt/splunk/etc/system/default/props.conf                         MATCH_LIMIT = 100000
/opt/splunk/etc/system/default/props.conf                         MAX_DAYS_AGO = 2000
/opt/splunk/etc/system/default/props.conf                         MAX_DAYS_HENCE = 2
/opt/splunk/etc/system/default/props.conf                         MAX_DIFF_SECS_AGO = 3600
/opt/splunk/etc/system/default/props.conf                         MAX_DIFF_SECS_HENCE = 604800
/opt/splunk/etc/system/default/props.conf                         MAX_EVENTS = 256
/opt/splunk/etc/system/local/props.conf                           MAX_TIMESTAMP_LOOKAHEAD = 30
/opt/splunk/etc/system/default/props.conf                         MUST_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf                         MUST_NOT_BREAK_AFTER =
/opt/splunk/etc/system/default/props.conf                         MUST_NOT_BREAK_BEFORE =
/opt/splunk/etc/system/local/props.conf                           SEDCMD-fixfooters = s/]}//g
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION = indexing
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-all = full
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-inner = inner
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-outer = outer
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-raw = none
/opt/splunk/etc/system/default/props.conf                         SEGMENTATION-standard = standard
/opt/splunk/etc/system/local/props.conf                           SHOULD_LINEMERGE = false
/opt/splunk/etc/system/local/props.conf                           TIME_FORMAT = %s%3N
/opt/splunk/etc/system/local/props.conf                           TIME_PREFIX = "deviceTimestamp":
/opt/splunk/etc/system/default/props.conf                         TRANSFORMS =
/opt/splunk/etc/system/local/props.conf                           TRUNCATE = 0
/opt/splunk/etc/apps/search/local/props.conf                      TZ = UTC
/opt/splunk/etc/apps/search/local/props.conf                      category = Custom
/opt/splunk/etc/apps/search/local/props.conf                      description = dena asset_play
/opt/splunk/etc/system/default/props.conf                         detect_trailing_nulls = false
/opt/splunk/etc/system/local/props.conf                           disabled = false
/opt/splunk/etc/system/default/props.conf                         maxDist = 100
/opt/splunk/etc/system/default/props.conf                         priority =
/opt/splunk/etc/apps/search/local/props.conf                      pulldown_type = 1
/opt/splunk/etc/system/default/props.conf                         sourcetype =
I'm not seeing anything obvious that looks like it would be causing a problem. But running with this configuration gives me an output that's still just a single event so there's gotta be something going on somewhere...
@mmodestino_splunk, this is another jq thing, right?
Ha! @woodcock knows I thought about answering this one. The tricky part here is the use of HEC, which doesn't give us the chance to put the JSON to disk so we can pre-parse it. We would need jq integrated into the indexing pipeline... here's hoping my enhancement request gets some eyes one day.
@burras, what is sending the JSON to you? I can tell you about another method that may work for you, but it would involve some changes to your solution. The problem here is that, in order for Splunk to see these as individual events yet keep the JSON format, we need to unwrap the array - something the indexing pipeline doesn't handle all that well today. Check out jq: if you can catch these JSON batches and put them to disk, you can pre-parse them into single events and then ingest them. A rough sketch of the idea is below.
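A minimal sketch of that pre-parsing idea, assuming the batches could be captured to disk (or intercepted in a small relay) before they reach Splunk - the HEC URL, token, file name, and sourcetype below are placeholders, and it posts to the "event" endpoint rather than "raw" since each event is already unwrapped:
import json
import requests

HEC_URL = "https://splunk-hf.example.com:8088/services/collector/event"
HEADERS = {"Authorization": "Splunk 00000000-0000-0000-0000-000000000000"}

with open("batch.json") as f:
    batch = json.load(f)

# re-send each element of the "events" array as its own HEC event,
# carrying its own timestamp (deviceTimestamp is epoch milliseconds)
for ev in batch["events"]:
    payload = {
        "time": ev["deviceTimestamp"] / 1000.0,
        "sourcetype": "asset_play",
        "event": ev,
    }
    requests.post(HEC_URL, headers=HEADERS, json=payload, verify=False)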
We're getting the data from an external DENA forwarder that's just doing an HTTP push to our receiver. I'll investigate jq and take a look, but I'm not confident that rearranging the architecture is an option - we've got some significant limitations in the environment that might make that sort of solution unfeasible (lack of storage, exponential growth expectations for this type of data, etc.).
Totally understood, and that's why I wish it was something available in our indexing pipeline. Fingers crossed.
I've even gotten to the point now where, instead of just removing the commas between events, I actually introduce a newline between events - and it still doesn't want to break. I don't know if it's something that I'm doing wrong or a problem with the data itself...
SEDCMD-removeeventcommas=s/},{"earlyAccess":/}\n{"earlyAccess":/g
LINE_BREAKER=(}\n){"earlyAccess