Splunk Search

Using join, earliest, table and latest.

New Member

I am a new splunk user and I want to create a stats table showing different findings of an event using fields. However, I am running into error when I use the earliest command twice. Here's what I have so far:

index= xxx source=xxx sourcetype=xxx
| stats latest(name) as name, latest(call_time) as call_time
| eval call_time=strftime(...)
| table name, call_time

| join name
   [ search index=xxx source=xxx sourcetype=xxx conversation="\*hello\*"
   | stats earliest(_time) as first_hello by name
   | eval first_hello=srtftime(...) ]

| join name
   [ search index=xxx source=xxx  sourcetype=xxx messages="\*how\*"
   | stats earliest(_time) as first_how by name
   | eval first_how=srtftime(...) ]

| table name, call_time, first_hello, first_how

My errors are the following:
1. Both first_hello and first_how, are displaying the same time.
2. As I added the 'join' I could tell that the number of statistics decreased. I want to find a way that it displays all the events and that if a certain time (or word) cannot be found then it will just stay blank. I don't know if this is possible. I can see how that contradicts the purpose of 'join' but I couldn't find another way to do it.

Clarifications:
1. When it comes to messages and conversations, I want to find the first time that each field had a value containing the specific word(hello and how correspondingly). These two fields contain values that look like paragraphs. In other words, I want to find the first time that xxname said hello in conversation and how in messages.

Goal:
Display a table that shows: name,TIME of the last call (corresponding to that name), TIME of the first time the word hello was said in the values of the conversation field, TIME of the first time the word how was said in the values of the messages field.

Let me know if I need to clarify anything else.

0 Karma

Esteemed Legend

Never use join; there is always a better, more-splunky way. Try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   match(conversation, "hello"), "hello_call_time",
   match(messages,     "how"),   "how_call_time",
   true(),                       null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name

New Member

Could you explain a little bit more what you're doing here? I don't think this is doing what I'm trying to obtain. I am trying to create a table that displays name, call_time, first_hello and first_how. I can successfully obtain call_time by using the latest() function. My error happens when trying to obtain the timestamp of first_hello and first_how. I need to use the earliest() function for both, and I don't know why, but they're both storing the same time. Let me know if I need to clarify anything else

0 Karma

Esteemed Legend

Did you try it? It does exactly that. The search is not very long and is pretty much self-explanatory except for the multireport part. All that does is fork the data at that point and pass a copy to both stanzas and merges the results of those together.

0 Karma

New Member

Yeah, I believe I did exactly the same. The table is only displaying the name and call_time.

0 Karma

Esteemed Legend

Quite correct; there was a bug, I have modified my answer and tested it. Give it a try now.

0 Karma

New Member

It's still only displaying name and call_time

0 Karma

Esteemed Legend

Check this run-anywhere example to prove the concept works, generally:

index=_* 
 | eval type = case(
    match(sourcetype, "splunkd"), "hello_call_time",
    match(sourcetype, "_|mogo|kv"), "how_call_time",
    true(), null())
| eval call_time = _time, name = date_minute

 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name
0 Karma

New Member

Yeah, this works for me. However, when I use my own information and fields it doesnt work. This is a shot in the dark, but maybe because match is expecting a sourcetype? I dont know if I misunderstood

0 Karma

New Member

The words 'hello' and 'how' are under the events of different fields. Are there any other ways to achieve what Im trying to?

0 Karma

Esteemed Legend

The updated solution using match() should work. Did you notice that I changed my original answer to fix a bug?

0 Karma

New Member

Yeah, when I use that it only displays the name and call_time

0 Karma

Esteemed Legend

in fact, just post your existing search; something has to be off in the translation.

0 Karma

New Member
index=xxx source=xxx sourcetype=xxx
 | eval type = case(
    match(conversation, "hello"), "first_hello",
    match(messages,     "how"),   "first_how",
    true(),                       null())
 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name
0 Karma

Esteemed Legend

that definitely should do it. I stand by the answer and am baffled that it isn't working.

0 Karma

New Member

Are there any other ways to approach this? I had tried using match before and it never really worked. It only worked whenever I used: search=xxx ... conversation="*hello*"
| stats earliest(_time) as first_hello by name
| eval first_hello=srtftime(...) ]

I just can't figure out how to both searches at the same time because I run in the error I stated on the question. Thank you so much for your help!

0 Karma

Esteemed Legend

You could try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   searchmatch("hello"), "first_hello",
   searchmatch("how"),   "first_how",
   true(), null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name
0 Karma

New Member

Still only showing the two same columns

0 Karma

Esteemed Legend

Make sure that your strings in match() do not have asterisks ( * ).

0 Karma

Esteemed Legend

No, sourcetype has nothing to do with it; that's juts a hacky way that I used to fake my data to be somewhat like yours.

0 Karma

Builder

Hi @xploresplunk ,

The rule of thumb in Splunk is: "When possible, avoid the use of join unless it's absolutely necessary". This is because join is expensive and clunky, although it can achieve the desired results.

For your particular situation, I don't think join is necessary.

Try something like this:

index= xxx source=xxx sourcetype=xxx
| stats earliest(eval(match(conversation, "\*hello\*"))) as first_hello
earliest(eval(match(messages, "\*how\*"))) as first_how
latest(call_time) as call_time by name
| table name, call_time, first_hello, first_how

See if this gets you what you're looking for.

0 Karma