Splunk Search

giving meaning to numbers

e82than
Communicator

I have a set of data from a friend who is doing some statistical work, and he wants me to use Splunk to give meaning to his numbers.

The data contains only the lines below. It comes in every second; the values can change, but the format stays the same.

I managed to pick out fields from the first line with a regex, but I wasn't as successful when I tried to pick 113601820946 from line 37. Can anyone help with writing a regex for this? I need to pick whatever is in the last pair of square brackets on that line, which here is 113601820946. It needs to be dynamic because the value changes.

09:15:29 [2] [16] [123456XXXXXX1234]
09:15:29 [3] [6] [000000]
09:15:29 [4] [12] [000000000594]
09:15:29 [7] [10] [0516011527]
09:15:29 [11] [6] [820946]
09:15:29 [19] [3] [826]
09:15:29 [25] [2] [59]
09:15:29 [32] [6] [454706]
09:15:29 [37] [13] [113601820946]
09:15:29 [38] [6] [001767]
09:15:29 [39] [2] [00]
09:15:29 [41] [8] [04983408]
09:15:29 [42] [15] [17742463 ]
09:15:29 [49] [3] [840]
09:15:29 [63] [10] [8000000002]

A sample of my first attempt, which picks from the first line: (?im)^(?:[^\[\n]*\[){3}(?P<modaco>[^\]]+)

Field name = modaco
Values returned = 123456XXXXXX1234

e82than
Communicator

I found an alternative way of fixing my problem, sort of.

\[37\]\s+\[12\]\s+\[(?<UID>\d+)\]

This regex picks field 37. I was lost for a while until I thought of anchoring the search to literal text in the event. UID is the name given to the extracted field. To set up the field extraction, go to Field extractions and add a new one. Give it any name; what it is called there does not matter, because the field is already named UID in the regex string.
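
For anyone who wants to try the same anchoring idea outside Splunk, here is a minimal Python sketch. It is only an illustration: the names FIELD_37 and extract_uid are made up for this example, and Python spells the named group (?P<UID>...) instead of (?<UID>...). Because the sample in the question shows [13] rather than [12] in the second bracket of line 37, the sketch accepts any digits there instead of a fixed length. That is an assumption about the data, not something confirmed in this thread.

```python
import re

# Three lines copied from the sample in the question, merged into one event.
SAMPLE_EVENT = """09:15:29 [32] [6] [454706]
09:15:29 [37] [13] [113601820946]
09:15:29 [38] [6] [001767]"""

# Anchor on the literal "[37]", allow any length value in the second
# bracket (assumption), and capture the digits in the third bracket.
FIELD_37 = re.compile(r"\[37\]\s+\[\d+\]\s+\[(?P<UID>\d+)\]")


def extract_uid(event):
    """Return the value of field 37, or None if the event has no such line."""
    match = FIELD_37.search(event)
    return match.group("UID") if match else None


print(extract_uid(SAMPLE_EVENT))
```

Because the pattern is anchored on the field number rather than on the position of the line, it does not matter where line 37 sits inside the merged event.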

kristian_kolb
Ultra Champion

Hi, couldn't you match the end of the line?

(?<modaco>\w+)\s*]$

UPDATE:

Have you looked at multikv? It makes separate events out of tabular data, e.g. output from top, netstat, etc. From the look of your data there is no header row, but the noheader=true option to multikv accounts for that.

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Multikv

The simplest would of course be to have the lines indexed as separate events.

hth,

Kristian

e82than
Communicator

I tried | multikv noheaders = t as per your guidance, but it's not turning out the way I wanted. I can't run stats on the fields because there are square brackets in them.

kristian_kolb
Ultra Champion

see update above /k

e82than
Communicator

Your regex works, but not beyond the first row of my events. It only picks up the third bracketed value of 09:15:29 [2] [16] [123456XXXXXX1234].

I need the value from 09:15:29 [37] [13] [113601820946].

Thanks for your patience

Paolo_Prigione
Builder

I guess each line is a separate event for you. Try this one, which should extract all the values in the row and name them line, second_number, and modaco.

(?i)^[^\[]+\s+\[(?<line>[^\]]+)\]\s+\[(?<second_number>[^\]]+)\]\s+\[(?<modaco>[^\]]+)\]

Edit: if the lines are grouped into multi-line events and multikv does not do what you want (values might end up enclosed in square brackets...), then you might want to split them into multiple events first:

| rex max_match=100 "(?m)^(?<rows>.+)$" | mvexpand rows | eval _raw=rows | rex field=rows "(?i)^[^\[]+\s+\[(?<line>[^\]]+)\]\s+\[(?<second_number>[^\]]+)\]\s+\[(?<modaco>[^\]]+)\]"

The | eval _raw=rows is only there for ease of reading and is not required.
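
The split-then-extract idea can also be tried outside Splunk. The Python sketch below is only an illustration under assumptions: split_and_extract and SAMPLE_EVENT are hypothetical names, the sample is three lines copied from the question, and Python's (?P<name>...) replaces the (?<name>...) syntax. Splitting on line breaks stands in for the rex max_match / mvexpand step.

```python
import re

# Three lines from the question's sample, merged into a single event.
SAMPLE_EVENT = """09:15:29 [2] [16] [123456XXXXXX1234]
09:15:29 [37] [13] [113601820946]
09:15:29 [38] [6] [001767]"""

# Same shape as the regex above: timestamp, then three bracketed values.
ROW = re.compile(
    r"^[^\[]+\s+\[(?P<line>[^\]]+)\]"
    r"\s+\[(?P<second_number>[^\]]+)\]"
    r"\s+\[(?P<modaco>[^\]]+)\]"
)


def split_and_extract(event):
    """Split a merged event into rows and extract the three fields per row."""
    rows = []
    for row in event.splitlines():
        match = ROW.match(row)
        if match:
            rows.append(match.groupdict())
    return rows


rows = split_and_extract(SAMPLE_EVENT)
# Pick the value that belongs to line 37 once every row has its own fields.
value_37 = next(r["modaco"] for r in rows if r["line"] == "37")
print(value_37)
```

Once each row carries its own line field, selecting line 37 becomes a simple filter instead of a position-dependent regex.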

Paolo_Prigione
Builder

I edited the answer to match your comment

e82than
Communicator

No, unfortunately. I applied SHOULD_LINEMERGE = TRUE to all the events coming in through the host.

Drainy
Champion

Hi, try the regex:

\[(?<field1>[^\]]+)\][\s]+\[(?<field2>[^\]]+)\][\s]+\[(?<field3>[^\]]+)\]

It captures the contents of each pair of [ ] into its own group, the first being field1 and so on.
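
A note on merged events: as far as I understand, an unanchored pattern like this stops at its first match, which on a multi-line event is the first line. That would explain why only the value from line [2] comes back. The Python sketch below illustrates the difference between taking the first match and iterating over every match. It is an illustration only: first_triple, values_by_number, and MERGED_EVENT are hypothetical names, and Python uses (?P<name>...) for named groups.

```python
import re

# Three lines from the question's sample, merged into a single event.
MERGED_EVENT = """09:15:29 [2] [16] [123456XXXXXX1234]
09:15:29 [3] [6] [000000]
09:15:29 [37] [13] [113601820946]"""

TRIPLE = re.compile(
    r"\[(?P<field1>[^\]]+)\]\s+\[(?P<field2>[^\]]+)\]\s+\[(?P<field3>[^\]]+)\]"
)


def first_triple(event):
    """Return the fields of the first match only (what a single search gives)."""
    match = TRIPLE.search(event)
    return match.groupdict() if match else None


def values_by_number(event):
    """Map every field1 number in the event to its field3 value."""
    return {m.group("field1"): m.group("field3") for m in TRIPLE.finditer(event)}


print(first_triple(MERGED_EVENT))
print(values_by_number(MERGED_EVENT)["37"])
```

Keeping the whole event together and indexing the values by their first bracket still lets all lines be looked at in one time window.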

e82than
Communicator

I would like to, but bit 2 carries meaning (which I have not yet fathomed), and the lines need to be looked at together in a single time window, so I can't break the event up into bits.

Drainy
Champion

Why merge them? From your use case, it seems it would be best to index each line as a separate event.

e82than
Communicator

I have an additional line, SHOULD_LINEMERGE = TRUE, in props.conf. Because of that, I can only pick from the first line of the event. I need line [37] and it's not coming out...

Drainy
Champion

You will need to add a field name to each one. I will edit the above to show how (if you want to do it directly through props and not a combination of props and transforms).

e82than
Communicator

I added your regex to my props.conf as follows:

[sampleone]
EXTRACT-moda = \[([^\]]+)\][\s]+\[([^\]]+)\][\s]+\[([^\]]+)\]

It doesn't work. I think maybe I did not have a moda value in it.
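
One likely reason the stanza returns nothing: as far as I know, an EXTRACT- setting in props.conf only creates fields from named capture groups, and the regex here uses plain parentheses. The Python sketch below illustrates that difference outside Splunk. It is an illustration only: named_fields, UNNAMED, NAMED, and LINE are hypothetical names, and Python writes named groups as (?P<name>...).

```python
import re

LINE = "09:15:29 [37] [13] [113601820946]"

# The regex as pasted into props.conf: three groups, none of them named.
UNNAMED = r"\[([^\]]+)\]\s+\[([^\]]+)\]\s+\[([^\]]+)\]"

# The same regex with a name attached to each group.
NAMED = (
    r"\[(?P<field1>[^\]]+)\]\s+\[(?P<field2>[^\]]+)\]\s+\[(?P<field3>[^\]]+)\]"
)


def named_fields(pattern, text):
    """Return only the captures that carry a name in the pattern."""
    match = re.search(pattern, text)
    return match.groupdict() if match else {}


print(named_fields(UNNAMED, LINE))  # matches, but has no names to report
print(named_fields(NAMED, LINE))
```

Both patterns match the line; only the second one says what each captured value should be called.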
