Splunk Search

Using join, earliest, table and latest.

xploresplunk
New Member

I am a new splunk user and I want to create a stats table showing different findings of an event using fields. However, I am running into error when I use the earliest command twice. Here's what I have so far:

index= xxx source=xxx sourcetype=xxx
| stats latest(name) as name, latest(call_time) as call_time
| eval call_time=strftime(...)
| table name, call_time

| join name
   [ search index=xxx source=xxx sourcetype=xxx conversation="\*hello\*"
   | stats earliest(_time) as first_hello by name
   | eval first_hello=srtftime(...) ]

| join name
   [ search index=xxx source=xxx  sourcetype=xxx messages="\*how\*"
   | stats earliest(_time) as first_how by name
   | eval first_how=srtftime(...) ]

| table name, call_time, first_hello, first_how

My errors are the following:
1. Both first_hello and first_how, are displaying the same time.
2. As I added the 'join' I could tell that the number of statistics decreased. I want to find a way that it displays all the events and that if a certain time (or word) cannot be found then it will just stay blank. I don't know if this is possible. I can see how that contradicts the purpose of 'join' but I couldn't find another way to do it.

Clarifications:
1. When it comes to messages and conversations, I want to find the first time that each field had a value containing the specific word(hello and how correspondingly). These two fields contain values that look like paragraphs. In other words, I want to find the first time that xxname said hello in conversation and how in messages.

Goal:
Display a table that shows: name,TIME of the last call (corresponding to that name), TIME of the first time the word hello was said in the values of the conversation field, TIME of the first time the word how was said in the values of the messages field.

Let me know if I need to clarify anything else.

0 Karma

woodcock
Esteemed Legend

Never use join; there is always a better, more-splunky way. Try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   match(conversation, "hello"), "hello_call_time",
   match(messages,     "how"),   "how_call_time",
   true(),                       null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name

xploresplunk
New Member

Could you explain a little bit more what you're doing here? I don't think this is doing what I'm trying to obtain. I am trying to create a table that displays name, call_time, first_hello and first_how. I can successfully obtain call_time by using the latest() function. My error happens when trying to obtain the timestamp of first_hello and first_how. I need to use the earliest() function for both, and I don't know why, but they're both storing the same time. Let me know if I need to clarify anything else

0 Karma

woodcock
Esteemed Legend

Did you try it? It does exactly that. The search is not very long and is pretty much self-explanatory except for the multireport part. All that does is fork the data at that point and pass a copy to both stanzas and merges the results of those together.

0 Karma

xploresplunk
New Member

Yeah, I believe I did exactly the same. The table is only displaying the name and call_time.

0 Karma

woodcock
Esteemed Legend

Quite correct; there was a bug, I have modified my answer and tested it. Give it a try now.

0 Karma

xploresplunk
New Member

It's still only displaying name and call_time

0 Karma

woodcock
Esteemed Legend

Check this run-anywhere example to prove the concept works, generally:

index=_* 
 | eval type = case(
    match(sourcetype, "splunkd"), "hello_call_time",
    match(sourcetype, "_|mogo|kv"), "how_call_time",
    true(), null())
| eval call_time = _time, name = date_minute

 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name
0 Karma

xploresplunk
New Member

Yeah, this works for me. However, when I use my own information and fields it doesnt work. This is a shot in the dark, but maybe because match is expecting a sourcetype? I dont know if I misunderstood

0 Karma

xploresplunk
New Member

The words 'hello' and 'how' are under the events of different fields. Are there any other ways to achieve what Im trying to?

0 Karma

woodcock
Esteemed Legend

The updated solution using match() should work. Did you notice that I changed my original answer to fix a bug?

0 Karma

xploresplunk
New Member

Yeah, when I use that it only displays the name and call_time

0 Karma

woodcock
Esteemed Legend

in fact, just post your existing search; something has to be off in the translation.

0 Karma

xploresplunk
New Member
index=xxx source=xxx sourcetype=xxx
 | eval type = case(
    match(conversation, "hello"), "first_hello",
    match(messages,     "how"),   "first_how",
    true(),                       null())
 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name
0 Karma

woodcock
Esteemed Legend

that definitely should do it. I stand by the answer and am baffled that it isn't working.

0 Karma

xploresplunk
New Member

Are there any other ways to approach this? I had tried using match before and it never really worked. It only worked whenever I used: search=xxx ... conversation="*hello*"
| stats earliest(_time) as first_hello by name
| eval first_hello=srtftime(...) ]

I just can't figure out how to both searches at the same time because I run in the error I stated on the question. Thank you so much for your help!

0 Karma

woodcock
Esteemed Legend

You could try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   searchmatch("hello"), "first_hello",
   searchmatch("how"),   "first_how",
   true(), null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name
0 Karma

xploresplunk
New Member

Still only showing the two same columns

0 Karma

woodcock
Esteemed Legend

Make sure that your strings in match() do not have asterisks ( * ).

0 Karma

woodcock
Esteemed Legend

No, sourcetype has nothing to do with it; that's juts a hacky way that I used to fake my data to be somewhat like yours.

0 Karma

jnudell_2
Builder

Hi @xploresplunk ,

The rule of thumb in Splunk is: "When possible, avoid the use of join unless it's absolutely necessary". This is because join is expensive and clunky, although it can achieve the desired results.

For your particular situation, I don't think join is necessary.

Try something like this:

index= xxx source=xxx sourcetype=xxx
| stats earliest(eval(match(conversation, "\*hello\*"))) as first_hello
earliest(eval(match(messages, "\*how\*"))) as first_how
latest(call_time) as call_time by name
| table name, call_time, first_hello, first_how

See if this gets you what you're looking for.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...