I have a ticket dump in CSV format, and I'm trying to create some patterns based on the description.
What kind of search commands should I use?
How should I correlate the events based on their descriptions? Please assist.
WebGUI not available
FAILED TO LAUNCH CLICK
URGENT - application not working
Assumptions! That this is data from a web form or email or something, and it's freeform text. You don't quite come out and say that it is, but reading between the lines seems to make it the only sensible option.
In that case, you are fighting the myriad ways people can spell things rong, the various ways they use english [that was on purpose], vocabulary, and everything else that goes into free text/semantics analysis.
So one possible app that may help (search for more, I suppose!) could be https://splunkbase.splunk.com/app/1179/. I have not used this app, I have no affiliation with it in any way, but it looks like it could be cool (might even try it for myself just to see).
My suggestion is to start with keywords. It'll take a while to build a good list of them, and that takes actual effort. Then comes the fun part of what to do with them. One idea might be to use the match() eval function to group them.
... | eval message=lower(message) | eval category=case(match(message, "urgent|fail"), "Important", match(message, "why|question"), "Question", match(message, "love|like|interesting"), "Feedback") | timechart count by category
I believe the syntax is correct in the above, but I haven't really tested it. Might need a bit of tweaking. (Also, I rarely use Match (preferring rex and regex) but I think I got the syntax right...)
The attempt above is to build a field called category. If anything in the message matches the word urgent or one of the variants of fail..., mark it as Important. Otherwise, if any word in it matches why or question, mark it as Question. Lastly, if it goes on about how much they or their sister loves it, likes it or finds it interesting, mark it as Feedback. Obviously, we ignore the nonexistent "I hate you and your little dog, too". No sense worrying about impossible things, right?
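If you want to prototype the keyword list outside Splunk first, the same first-match-wins idea can be sketched in Python. This is only an illustration and I haven't run it against real ticket data: the RULES list and the categorize function are made-up names, the "Other" fallback is my own addition, and the sample strings are just the three descriptions from the question.

```python
import re

# Ordered rules: the first pattern that matches wins, similar to eval's case().
# Patterns are unanchored, so "fail" also hits "failed" and "failure".
RULES = [
    ("Important", re.compile(r"urgent|fail")),
    ("Question", re.compile(r"why|question")),
    ("Feedback", re.compile(r"love|like|interesting")),
]


def categorize(message: str) -> str:
    """Return the first matching category, or "Other" if nothing matches."""
    text = message.lower()
    for category, pattern in RULES:
        if pattern.search(text):
            return category
    return "Other"


# The three sample descriptions from the question.
samples = [
    "WebGUI not available",
    "FAILED TO LAUNCH CLICK",
    "URGENT - application not working",
]
for sample in samples:
    print(f"{categorize(sample):<10} {sample}")
```

With these rules, "WebGUI not available" falls through to the fallback, which is a hint that the keyword list probably wants a term like "not available" as well.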
I guess at this point we would need some additional test data, or maybe something a bit more specific as to the terms you expect or what you want to do with them.
Just FYI, if you haven't figured it out, doing useful things with freeform text can be hard. Possible, but hard.
Anyway, hope this helps, insofar as it goes.
Without more samples, I would start by using newlines as the delimiter for this data:
... [ your original search ] ... | rex "^(?<message1>[^\r\n]+)[\r\n]+(?<message2>[^\r\n]+)[\r\n]+(?<message3>[^\r\n]+)"
You might also be able to assume that the third line starts with an alert-level type of field (here the level is URGENT), followed by a dash and the reason:
... [ your original search ] ...
| rex "^(?<message1>[^\r\n]+)[\r\n]+(?<message2>[^\r\n]+)[\r\n]+(?<alert_level>\S+)\s+-\s+(?<alert_reason>[^\r\n]+)"
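To sanity-check a pattern like this before running it in Splunk, here is a rough Python equivalent. It assumes the three sample descriptions arrive as one newline-separated event, which is a guess on my part. Note that Python spells named groups (?P<name>...) rather than (?<name>...), and the variable names are only for illustration.

```python
import re

# Assumption: the three sample descriptions form a single multi-line event.
event = (
    "WebGUI not available\n"
    "FAILED TO LAUNCH CLICK\n"
    "URGENT - application not working"
)

# Same shape as the second rex: two free-text lines, then "<LEVEL> - <reason>".
pattern = re.compile(
    r"^(?P<message1>[^\r\n]+)[\r\n]+"
    r"(?P<message2>[^\r\n]+)[\r\n]+"
    r"(?P<alert_level>\S+)\s+-\s+(?P<alert_reason>[^\r\n]+)"
)

match = pattern.match(event)
# An empty dict means the event did not have the expected three-line layout.
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

If the third line has no " - " separator, the match fails and fields stays empty, so this only works for events that follow the assumed layout.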
Without more samples, it's hard to provide a more detailed response.