Re: Using join, earliest, table and latest.

xploresplunk · ‎06-26-2019

I am a new splunk user and I want to create a stats table showing different findings of an event using fields. However, I am running into error when I use the earliest command twice. Here's what I have so far:

index= xxx source=xxx sourcetype=xxx
| stats latest(name) as name, latest(call_time) as call_time
| eval call_time=strftime(...)
| table name, call_time

| join name
   [ search index=xxx source=xxx sourcetype=xxx conversation="\*hello\*"
   | stats earliest(_time) as first_hello by name
   | eval first_hello=srtftime(...) ]

| join name
   [ search index=xxx source=xxx  sourcetype=xxx messages="\*how\*"
   | stats earliest(_time) as first_how by name
   | eval first_how=srtftime(...) ]

| table name, call_time, first_hello, first_how

My errors are the following:
1. Both first_hello and first_how, are displaying the same time.
2. As I added the 'join' I could tell that the number of statistics decreased. I want to find a way that it displays all the events and that if a certain time (or word) cannot be found then it will just stay blank. I don't know if this is possible. I can see how that contradicts the purpose of 'join' but I couldn't find another way to do it.

Clarifications:
1. When it comes to messages and conversations, I want to find the first time that each field had a value containing the specific word(hello and how correspondingly). These two fields contain values that look like paragraphs. In other words, I want to find the first time that xxname said hello in conversation and how in messages.

Goal:
Display a table that shows: name,TIME of the last call (corresponding to that name), TIME of the first time the word hello was said in the values of the conversation field, TIME of the first time the word how was said in the values of the messages field.

Let me know if I need to clarify anything else.

woodcock · ‎06-29-2019

Never use join; there is always a better, more-splunky way. Try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   match(conversation, "hello"), "hello_call_time",
   match(messages,     "how"),   "how_call_time",
   true(),                       null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name

xploresplunk · ‎07-01-2019

Could you explain a little bit more what you're doing here? I don't think this is doing what I'm trying to obtain. I am trying to create a table that displays name, call_time, first_hello and first_how. I can successfully obtain call_time by using the latest() function. My error happens when trying to obtain the timestamp of first_hello and first_how. I need to use the earliest() function for both, and I don't know why, but they're both storing the same time. Let me know if I need to clarify anything else

woodcock · ‎07-01-2019

Did you try it? It does exactly that. The search is not very long and is pretty much self-explanatory except for the multireport part. All that does is fork the data at that point and pass a copy to both stanzas and merges the results of those together.

xploresplunk · ‎07-01-2019

Yeah, I believe I did exactly the same. The table is only displaying the name and call_time.

woodcock · ‎07-01-2019

Quite correct; there was a bug, I have modified my answer and tested it. Give it a try now.

xploresplunk · ‎07-01-2019

It's still only displaying name and call_time

woodcock · ‎07-01-2019

Check this run-anywhere example to prove the concept works, generally:

index=_* 
 | eval type = case(
    match(sourcetype, "splunkd"), "hello_call_time",
    match(sourcetype, "_|mogo|kv"), "how_call_time",
    true(), null())
| eval call_time = _time, name = date_minute

 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats min(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name

xploresplunk · ‎07-01-2019

Yeah, this works for me. However, when I use my own information and fields it doesnt work. This is a shot in the dark, but maybe because match is expecting a sourcetype? I dont know if I misunderstood

xploresplunk · ‎07-01-2019

The words 'hello' and 'how' are under the events of different fields. Are there any other ways to achieve what Im trying to?

woodcock · ‎07-01-2019

The updated solution using match() should work. Did you notice that I changed my original answer to fix a bug?

xploresplunk · ‎07-01-2019

Yeah, when I use that it only displays the name and call_time

woodcock · ‎07-01-2019

in fact, just post your existing search; something has to be off in the translation.

xploresplunk · ‎07-01-2019

index=xxx source=xxx sourcetype=xxx
 | eval type = case(
    match(conversation, "hello"), "first_hello",
    match(messages,     "how"),   "first_how",
    true(),                       null())
 | multireport
 [ stats first(call_time) AS call_time BY name]
 [ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
 | stats list(*) AS * BY name

woodcock · ‎07-01-2019

that definitely should do it. I stand by the answer and am baffled that it isn't working.

xploresplunk · ‎07-01-2019

Are there any other ways to approach this? I had tried using match before and it never really worked. It only worked whenever I used: search=xxx ... conversation="*hello*"
| stats earliest(_time) as first_hello by name
| eval first_hello=srtftime(...) ]

I just can't figure out how to both searches at the same time because I run in the error I stated on the question. Thank you so much for your help!

woodcock · ‎07-01-2019

You could try this:

index=xxx source=xxx sourcetype=xxx
| eval type = case(
   searchmatch("hello"), "first_hello",
   searchmatch("how"),   "first_how",
   true(), null())
| multireport
[ stats first(call_time) AS call_time BY name]
[ stats earliest(_time) AS call_time BY name type | eval {type}=call_time | fields - type call_time ]
| stats list(*) AS * BY name

xploresplunk · ‎07-01-2019

Still only showing the two same columns

woodcock · ‎07-01-2019

Make sure that your strings in match() do not have asterisks ( * ).

woodcock · ‎07-01-2019

No, sourcetype has nothing to do with it; that's juts a hacky way that I used to fake my data to be somewhat like yours.

jnudell_2 · ‎06-27-2019

Hi @xploresplunk ,

The rule of thumb in Splunk is: "When possible, avoid the use of join unless it's absolutely necessary". This is because join is expensive and clunky, although it can achieve the desired results.

For your particular situation, I don't think join is necessary.

Try something like this:

index= xxx source=xxx sourcetype=xxx

| stats earliest(eval(match(conversation, "\*hello\*"))) as first_hello 

    earliest(eval(match(messages, "\*how\*"))) as first_how 

    latest(call_time) as call_time by name

| table name, call_time, first_hello, first_how

See if this gets you what you're looking for.

Using join, earliest, table and latest.

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Fuel Your Journey: What’s Waiting for You at the .conf26 Acceleration Station

Join the Final Session of the Data Management & Federation Bootcamp Series

From Data to Insight: Announcing the Winners of the Splunk Dashboard Contest

Join the Conversation