Hi everybody,
I have a series of alerts that generate new events. Each event is sent to a specific index and is also emailed to a web application, but there is no way to identify these "correlated events" by a unique ID.
My goal is to be able to relate these indexed events to the event created in the web application using only a number, but this number must be assigned by Splunk.
Do you know of a way to assign an increasing numerical value to each new event sent to the index?
You can create a unique hash value for your event prior to indexing it and sending it to the web application.
Identify all the fields that are necessary to make your event unique, concatenate them into one string, and calculate the hash value using MD5 or SHA256:
#Hash using MD5
| eval event_hash=MD5(_time." | ".field_1." | ".field_2." | ".field_3)
#Hash using SHA256
| eval event_hash=SHA256(_time." | ".field_1." | ".field_2." | ".field_3)
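One caveat with the concatenation above: in eval, joining a null value with the . operator yields null, so an event that is missing any one of the fields would end up with no hash at all. A minimal sketch of a null-safe variant, assuming the same placeholder field names (field_1 to field_3) and an arbitrary "-" marker for missing values:

```spl
| eval event_hash=sha256(_time." | ".coalesce(field_1, "-")." | ".coalesce(field_2, "-")." | ".coalesce(field_3, "-"))
```

The "-" marker is only an illustration; any value that cannot occur in the real data would do.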
The ingestion pipeline doesn't have a "state" that you could capture, modify, and store (to count events, for example).
But.
There are some default internal fields which can be used to uniquely identify an event.
https://docs.splunk.com/Documentation/Splunk/9.0.1/Knowledge/Usedefaultfields#Internal_fields
You are interested in the _cd field.
However, that field should not be used in "business cases". It's meant more as a way of debugging Splunk.
As the docs say:
Because _cd is used for internal reference only, we do not recommend that you set up searches that involve it.
But are you aware that searching for an event by such a hash later will be extremely inefficient?
Also, if you have two identical events with the same timestamp, they will hash to the same value, so the uniqueness requirement is not fulfilled.
I have solved the problem with this command:
| eval event_hash=md5("". random())
It generates a random number as the input for the MD5 hash, which gives me a kind of ID for the new event.
I don't get it. Where did you put this hash calculation?
It's a "Splunk Search" forum and your syntax is SPL, which suggests that you're doing it at search time. That way the value will be different each time the search is run.
As I understood the initial request, you wanted a unique value to be permanently assigned to each event.
I have a search saved as an alert in Splunk. Every time this alert has a match, it generates an event that is added to an index and sends an email with the event to the web app.
In the same search, the MD5 hash is added to the generated event as a new field, so the indexed event stores that "hash" as a unique identifier. I actually call it ID.
If someone in the web app asks me about the ID-X alert, I can search the index for the event with ID-X and find it.
It works 🙂
EDIT: I had to add a field to avoid coincidences:
| eval event_hash=md5("earliest". random())
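For readers following along, the flow described above might look roughly like the sketch below. It assumes the alert writes its result with collect, which this post does not confirm; my_alert_index and <your alert search> are placeholders.

```spl
<your alert search>
| eval event_hash=md5("earliest".random())
| collect index=my_alert_index
```

Looking up the event for a given ID would then be an ordinary search on that field, with <ID-X> standing in for the hash value reported by the web app:

```spl
index=my_alert_index event_hash="<ID-X>"
```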
Ahh. So you're saving the output into an index (by means of collect or something like that). Then you could simply calculate max(your_id)+1.
You weren't very precise about your question 😛
Anyway, with random(), as you noticed, you can get collisions (with a hashing function too, but that's not very likely).
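The max(your_id)+1 idea could be sketched as follows. This is only an illustration under assumptions: the results are written with collect to a placeholder index my_alert_index, the counter lives in a hypothetical field named event_id, and <your alert search> stands for the existing alert search. The subsearch reads the highest ID stored so far (0 if there is none), and streamstats numbers the rows of the current run so that each new event gets the next value.

```spl
<your alert search>
| streamstats count as row_number
| eval event_id=row_number + [ search index=my_alert_index event_id=* earliest=0 | stats max(event_id) as max_id | eval max_id=coalesce(max_id, 0) | return $max_id ]
| fields - row_number
| collect index=my_alert_index
```

Two things to keep in mind with this approach: the subsearch has to look back over every previously written event (hence earliest=0), which gets slower as the index grows, and two alert runs executing at the same moment could read the same maximum and hand out duplicate numbers.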