I have a set of data from a friend who is doing some statistical work, and he wants me to use Splunk to give meaning to his numbers.
This is all the data contains. It comes in every second; the values can change, but the format stays the same.
I managed to pick out fields from the 1st line with a regex, but I wasn't as successful when I tried to pick 113601820946 from line 37. Can anyone help with writing a regex for this? I need to pick whatever is in the last pair of square brackets on that line, which is 113601820946 here, and it needs to be dynamic because the value changes.
09:15:29 [2] [16] [123456XXXXXX1234]
09:15:29 [3] [6] [000000]
09:15:29 [4] [12] [000000000594]
09:15:29 [7] [10] [0516011527]
09:15:29 [11] [6] [820946]
09:15:29 [19] [3] [826]
09:15:29 [25] [2] [59]
09:15:29 [32] [6] [454706]
09:15:29 [37] [13] [113601820946]
09:15:29 [38] [6] [001767]
09:15:29 [39] [2] [00]
09:15:29 [41] [8] [04983408]
09:15:29 [42] [15] [17742463 ]
09:15:29 [49] [3] [840]
09:15:29 [63] [10] [8000000002]
a sample of my 1st attempt to pick from the 1st line: (?im)^(?:[^[\n]*[){3}(?P
Field name = modaco
Values returned = 123456XXXXXX1234
I found an alternative way of fixing my problem, sort of.
\[37\]\s+\[12\]\s+\[(?<UID>\d+)\]
This regex picks field 37. I was lost for a while until I thought of anchoring my search to literal text in the event. UID is the name given to the extracted field. To set up the field extraction, go to Field extractions and add a new one. Any name will do for NAME; it does not matter what the extraction itself is called, because the field is already named UID in the regex string.
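The anchoring idea can be tried outside Splunk with a short Python sketch. The sample row shows [13] in the second position where the regex above has [12], so this version assumes the second value can vary and matches it with \d+. The names EVENT, FIELD_37 and extract_uid are made up for illustration, and Python writes named groups as (?P<name>...).

```python
import re

# Sample rows copied from the question; treated here as one merged event.
EVENT = """09:15:29 [32] [6] [454706]
09:15:29 [37] [13] [113601820946]
09:15:29 [38] [6] [001767]"""

# Anchor on the literal "[37]", accept any numeric second value,
# then capture the digits inside the third pair of brackets.
FIELD_37 = re.compile(r"\[37\]\s+\[\d+\]\s+\[(?P<UID>\d+)\]")


def extract_uid(event):
    """Return the value of row 37, or None if the row is absent."""
    match = FIELD_37.search(event)
    return match.group("UID") if match else None


print(extract_uid(EVENT))
```

Because the pattern is anchored on [37] rather than on the start of the event, it should find the row wherever it sits inside a merged multi-line event.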
Hi, couldn't you match the end of the line?
(?<modaco>\w+)\s*]$
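A quick way to see what this end-of-line pattern does is a small Python sketch. It assumes each row is handled as its own string; last_value and LAST_BRACKET are hypothetical names, and the closing bracket is escaped here only for readability.

```python
import re

# Python named-group form of the suggested pattern.
LAST_BRACKET = re.compile(r"(?P<modaco>\w+)\s*\]$")


def last_value(line):
    """Return the word characters just before the final ']' of the string."""
    match = LAST_BRACKET.search(line)
    return match.group("modaco") if match else None


print(last_value("09:15:29 [2] [16] [123456XXXXXX1234]"))
# The \s* allows for the trailing space seen in row 42.
print(last_value("09:15:29 [42] [15] [17742463 ]"))
```

On a merged multi-line event, $ without the (?m) flag matches only at the end of the whole event, so this pattern would return the value from the last row rather than from row 37.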
UPDATE:
Have you looked at multikv? It will make separate events of tabular data, e.g. output from top, netstat etc. From the look of your data, there is no header row, but this can be handled with the noheader=true option to multikv.
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Multikv
The simplest would of course be to have the lines indexed as separate events.
hth,
Kristian
I tried | multikv noheaders = t as per your guidance, and it's not turning out the way I wanted. I can't run stats on the fields because there are square brackets in them.
see update above /k
Your regex is working, but not beyond the 1st row of my events. It only picks up the 3rd square bracket of 09:15:29 [2] [16] [123456XXXXXX1234].
I need 09:15:29 [37] [13] [113601820946].
Thanks for your patience.
I guess each line is a separate event for you. Try this one, which should extract all the values in the row and name them line, second_number and modaco.
(?i)^[^\[]+\s+\[(?<line>[^\]]+)\]\s+\[(?<second_number>[^\]]+)\]\s+\[(?<modaco>[^\]]+)\]
Edit: if all events are grouped in chunks, and multikv does not do what you want (values might end up being enclosed in square brackets...), then you might want to split them into multiple events first:
| rex max_match=100 "(?m)^(?<rows>.+)$" | mvexpand rows | eval _raw=rows | rex field=rows "(?i)^[^\[]+\s+\[(?<line>[^\]]+)\]\s+\[(?<second_number>[^\]]+)\]\s+\[(?<modaco>[^\]]+)\]"
The | eval _raw=rows is just for ease of reading and can be omitted.
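The split-then-extract approach can be mirrored in Python to check the regex against a merged event. This is only a rough analogue of what rex max_match and mvexpand do in the search, not Splunk's implementation; split_rows, extract_rows and ROW are illustrative names, and the event is a shortened copy of the sample data.

```python
import re

# Three rows from the question, merged into one event.
EVENT = """09:15:29 [2] [16] [123456XXXXXX1234]
09:15:29 [37] [13] [113601820946]
09:15:29 [42] [15] [17742463 ]"""

# Same three-field pattern as above, in Python named-group syntax.
ROW = re.compile(
    r"^[^\[]+\s+\[(?P<line>[^\]]+)\]"
    r"\s+\[(?P<second_number>[^\]]+)\]"
    r"\s+\[(?P<modaco>[^\]]+)\]"
)


def split_rows(event):
    """One entry per non-empty line, like expanding a multivalue field."""
    return [row for row in event.splitlines() if row.strip()]


def extract_rows(event):
    """Apply the three-field regex to every row that matches it."""
    matches = (ROW.match(row) for row in split_rows(event))
    return [m.groupdict() for m in matches if m]


rows = extract_rows(EVENT)
by_line = {row["line"]: row["modaco"] for row in rows}
print(by_line["37"])
```

Once every row carries its own line value, picking row 37 becomes a simple filter on that field instead of a more specific regex.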
I edited the answer to match your comment
No, unfortunately. I applied SHOULD_LINEMERGE = TRUE to all the events coming in through the host.
Hi, try using this regex:
\[(?<field1>[^\]]+)\][\s]+\[(?<field2>[^\]]+)\][\s]+\[(?<field3>[^\]]+)\]
It captures the contents of each pair of [ ] into its own named group, the first being field1 and so on.
I would like to, but the bit 2 values carry meanings (which I have not yet fathomed) that need to be looked at together in a single time window, so I can't break it all up into separate events.
Why merge them? It seems from your uses / needs it would be best to index each line as an event?
I have an additional line, SHOULD_LINEMERGE = TRUE, inside props.conf. Therefore I can only pick the 1st line in the event. I need line [37] and it's not coming out...
You will need to add a field name to each one. I will edit the above to show how (if you want to do it directly through props and not a combination of props and transforms).
I added your regex to my props.conf as follows:
[sampleone]
EXTRACT-moda = \[([^\]]+)\][\s]+\[([^\]]+)\][\s]+\[([^\]]+)\]
It doesn't work. I think maybe it's because I did not have a moda value in it.
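An EXTRACT- setting in props.conf creates fields only from named capture groups, which is the likely reason the stanza above produces nothing. The difference can be seen in a small Python sketch, where unnamed groups yield no named fields at all; field1, field2 and field3 are the names used in the regex suggested earlier, and ROW, UNNAMED and NAMED are illustrative.

```python
import re

ROW = "09:15:29 [37] [13] [113601820946]"

# The pattern as pasted into props.conf: the groups have no names.
UNNAMED = re.compile(r"\[([^\]]+)\][\s]+\[([^\]]+)\][\s]+\[([^\]]+)\]")

# The same pattern with a name on each group (Python syntax).
NAMED = re.compile(
    r"\[(?P<field1>[^\]]+)\][\s]+"
    r"\[(?P<field2>[^\]]+)\][\s]+"
    r"\[(?P<field3>[^\]]+)\]"
)

# groupdict() lists only named groups, so the first result is empty
# even though the pattern itself matches the row.
unnamed_fields = UNNAMED.search(ROW).groupdict()
named_fields = NAMED.search(ROW).groupdict()

print(unnamed_fields)
print(named_fields)
```

Both patterns match the row; only the named version gives the values labels that can become field names.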