Splunk Enterprise

unique search string from json

sfreudiger
Explorer

hello there,

i am trying to analyze json data that contains a lot of fields.
i want to first search for a string where one part is static and one part is variable, and then get a count of how many times each string was found.

example:


field1: some text mystr-555 more tex33t
field2: other text mystr-555 more textg5
field3: foobaar mystr-555 bar bar foo
field4: xyz mystr-222 foo 98432
field5: random numbers and text mystr-222 more text


so i search for "*mystr-*" and get 5 results (since the string was found in 5 different fields).
i'd like to somehow pass only the matched string along so i can do further analysis with it, but i fail miserably 😕

i tried dedup, eval and return, but i most definitely used them the wrong way.

i am very new to working in this field (normally i use grep, awk and sed) and i am aware that i am asking newbie things, but i know it is possible and not that hard. maybe one of you has five minutes to help me out here.

any input would be appreciated,

regards,
sam

DMohn
Motivator

Disregard my previous comment, I missed the first part of your question...

You can use the rex command to do the field extraction, and then count by values of your field.

 base_search | rex max_match=100 "(?<myextraction>mystr-\d+)"  | stats count by myextraction

The max_match option tells the rex command not to stop after the first match, but to create a multivalue extraction. If a single event can contain more than 100 matches, you need to raise the value accordingly.
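
For readers coming from grep/awk/sed, here is a rough Python analogy of what the search above does: pull every mystr-<digits> match out of each event (capped per event, like max_match) and then count by value. This is only a sketch of the idea, not how Splunk implements rex or stats. The sample events are the field values from the question, and the function name count_matches is made up for illustration.

```python
import re
from collections import Counter

# Sample events taken from the field values in the question (illustrative only).
events = [
    "some text mystr-555 more tex33t",
    "other text mystr-555 more textg5",
    "foobaar mystr-555 bar bar foo",
    "xyz mystr-222 foo 98432",
    "random numbers and text mystr-222 more text",
]

# Same pattern as in the rex command: the literal "mystr-" followed by digits.
PATTERN = re.compile(r"mystr-\d+")


def count_matches(lines, max_match=100):
    """Count extracted values across all lines (hypothetical helper).

    max_match caps how many matches are kept per line, loosely
    mirroring the max_match option of rex.
    """
    counts = Counter()
    for line in lines:
        # findall returns every match in the line; the slice applies the cap.
        counts.update(PATTERN.findall(line)[:max_match])
    return counts


counts = count_matches(events)
for value, n in counts.most_common():
    print(f"{value}: count:{n}")
# prints:
# mystr-555: count:3
# mystr-222: count:2
```

With max_match=1 only the first match in each line would be kept, which is why the option matters when one event contains several mystr- strings.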

sfreudiger
Explorer

not allowed to have more posts since i only have a rep of 40, hence here my reply to dmohn:

dear dmohn,

thank you for your fast reply!
it looks good but i get a "No results found."

what does the \d+ do?

sfreudiger
Explorer

ignore my previous post, i was able to get it working.

my example i gave here was not correct, i managed to get what i want with this search:

host=twitter | rex max_match=100 "(?<myextraction>mystr-[0-9][0-9][0-9][0-9]-\d+)" | stats count by myextraction

thank you a lot!!

DMohn
Motivator

@sfreudiger:
Glad you managed it anyway!

Just for your information: the regex \d+ means 'one or more digits [0-9]', and \d{4} means 'exactly four digits'. So you could simplify your extraction to mystr-\d{4}-\d+
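
If you want to convince yourself that the short form behaves the same as the spelled-out one, a small Python check like the following may help. It is only a sketch: the sample strings are invented for illustration (the real tweet data is not shown in this thread), and re.ASCII is used so that Python's \d is limited to 0-9.

```python
import re

# Spelled-out form from the working search, and the shorter equivalent.
# re.ASCII restricts \d to the digits 0-9 in Python.
verbose = re.compile(r"mystr-[0-9][0-9][0-9][0-9]-\d+", re.ASCII)
compact = re.compile(r"mystr-\d{4}-\d+", re.ASCII)

# Hypothetical sample strings, not taken from the real data.
samples = [
    "tweet mystr-1234-5 end",      # four digits, then one digit: match
    "tweet mystr-1234-56789 end",  # four digits, then several digits: match
    "tweet mystr-123-5 end",       # only three digits: no match
    "tweet mystr-1234- end",       # nothing after the second hyphen: no match
]


def same_result(text):
    """Return True when both patterns extract identical matches from text."""
    return verbose.findall(text) == compact.findall(text)


for text in samples:
    print(text, "->", compact.findall(text), "same:", same_result(text))
```

Pasting either pattern and a few sample lines into regex101 gives the same kind of feedback interactively.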

To debug regex extractions, have a look at https://regex101.com - I gained a ton of regex knowledge there!

sfreudiger
Explorer

thank you again!

can't award any points yet, once i have enough i'll come back here and give you some karma 😉

sfreudiger
Explorer

for clarification: i would like my result to be something like this:

mystr-555: count:3
mystr-222: count:2

.. and then i want to go on and add date/time information, but that's next and nothing i worry about now
