I have data that looks like this:
2017-01-17 22:18:18.330 Info: [Event:id=API_Metrics] [===== STARTING /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950
2017-01-17 22:18:18.330 Info: [Event:id=API_Metrics] [===== PARAMS FOR /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950
2017-01-17 22:18:18.330 Info: policyNumber=####
2017-01-17 22:18:18.330 Info: dob=#####
2017-01-17 22:18:18.330 Info: fname=FFFFFFF
2017-01-17 22:18:18.330 Info: subscriberId=######
2017-01-17 22:18:18.330 Info: lname=LLLLL
2017-01-17 22:18:18.330 Info: =====]
****events that contain none of the above keys or information, just other text****
2017-01-17 22:18:23.092 Info: [Event:id=API_Metrics] [===== ENDING /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950, TIME=PT4.762855S, CODE=200
The beginning and end of the transaction are clearly defined, but the events that come after PARAMS and before ENDING do not contain the values for usable grouping keys, like RAND or TS.
How can I write a search that will return the whole group from start to end? I tried this:
index="marklogic_datafabric" event_id=* | transaction event_id startswith=position=STARTING endswith=position=ENDING
...but it returns only the lines that have a "position" field, not the lines in between.
2017-01-19 12:14:11.030 Info: [Event:id=API_Metrics] [===== STARTING /individuals/touchpoint/v1.0/search, RAND=2104724838533797466, TS=14848496416750860
2017-01-19 12:14:11.030 Info: [Event:id=API_Metrics] [===== PARAMS FOR /individuals/touchpoint/v1.0/search, RAND=2104724838533797466, TS=14848496416750860
2017-01-19 12:14:11.250 Info: [Event:id=API_Metrics] [===== ENDING /individuals/touchpoint/v1.0/search, RAND=2104724838533797466, TS=14848496416750860, TIME=PT0.220528S, CODE=200
Also, the timestamps within a transaction are not all the same, so unfortunately I cannot group on them.
Suggestions, please!
I TOTALLY agree with @somesoni; you should rework your index-time props.conf to make sure all associated lines are treated as one event. In the meantime, you can do this:
index="marklogic_datafabric" event_id=* | reverse | streamstats count(eval(match(_raw, "STARTING"))) AS eventID BY host | stats list(_raw) AS lines BY host eventID
This approach solves your problem AND eliminates transaction, so it will be much faster and will not silently drop events.
BTW, MarkLogic is a great tool, isn't it!
The only flaw in your query is the event_id filter. That filter alone removes every line without "[Event:id=API_Metrics]" from the results.
OK, so remove that part; I only had it there because you had it there in your original base search. It should work fine without it (that part has nothing to do with my solution).
If something worked, you should click Accept to close the question.
Yeah, would be nice, but that is not a viable option for me. I'm not allowed to have my finger in all the pots like I would like. May be better for me to just talk to our developers about making these kinds of log entries more compatible with current configurations.
Will there be multiple transactions with different RAND and TS values that may overlap?
It would be much easier if you have control over how Splunk breaks your raw data into events. By changing the event-processing settings in props.conf on the indexer/heavy forwarder, you can definitely make the events in Splunk look like this, and the transaction command would then work just fine. In fact, you'd be able to replace the transaction command (which is resource intensive) with stats or similar for a faster query.
Event1:
2017-01-17 22:18:18.330 Info: [Event:id=API_Metrics] [===== STARTING /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950
Event2:
2017-01-17 22:18:18.330 Info: [Event:id=API_Metrics] [===== PARAMS FOR /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950
2017-01-17 22:18:18.330 Info: policyNumber=####
2017-01-17 22:18:18.330 Info: dob=#####
2017-01-17 22:18:18.330 Info: fname=FFFFFFF
2017-01-17 22:18:18.330 Info: subscriberId=######
2017-01-17 22:18:18.330 Info: lname=LLLLL
2017-01-17 22:18:18.330 Info: =====]
Event3:
2017-01-17 22:18:23.092 Info: [Event:id=API_Metrics] [===== ENDING /individual/preferences/v1.5, RAND=9296226956377273381, TS=14847130983159950, TIME=PT4.762855S, CODE=200
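For reference, here is a rough sketch of props.conf settings that might produce the event layout above. The sourcetype name marklogic:errorlog is a made-up placeholder, and the regex assumes every header line has the exact timestamp and "[Event:id=" shape shown in the samples, so treat it as a starting point to test rather than a known-good config.

```
# props.conf on the indexer or heavy forwarder
# (sketch only; the sourcetype name below is hypothetical)
[marklogic:errorlog]
# Do not merge lines after breaking; LINE_BREAKER alone decides event boundaries.
SHOULD_LINEMERGE = false
# Start a new event only where the next line is a timestamped "[Event:id=" header.
# The first capture group (the newlines) is consumed as the boundary.
LINE_BREAKER = ([\r\n]+)(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \w+: \[Event:id=)
# The timestamp sits at the start of the event, e.g. 2017-01-17 22:18:18.330
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
MAX_TIMESTAMP_LOOKAHEAD = 23
```

Two caveats with this sketch: it only affects data indexed after the change, and any unrelated lines logged between PARAMS and ENDING would be appended to the PARAMS event, because only "[Event:id=" lines start a new event. Once every event carries RAND and TS, extracting those two values (for example with rex) and grouping with stats ... BY RAND TS could stand in for transaction.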