Splunk Search

Join two searches with the latest field

New Member

Hello,

Is it possible to perform a join type=left to another search by combining the also the latest field?

Example below

main search
| stats list(time) as events by _time user src
| stats last(events) as latest by _time user src
| rename src as client
ip
| join type=left client_ip latest [search index=xxxxxx earliest=-12h]

Many thanks.

Tags (1)
0 Karma

Esteemed Legend

I made your search "work" but I don't like it at all (for example, are you sure that AND client_message="adding an RR at *" AND client_message="* A *" is correct?):

| tstats `summariesonly` count
FROM datamodel="Web"
WHERE index=XXXX sourcetype=XXXXX
BY  _time span=1s Web.user Web.src Web.dest
| `drop_dm_object_name(Web)`
| stats list(_time) AS events values(dest) AS dest BY _time user src
| eval events=strftime(events,"%m/%d/%Y:%H:%M:%S")
| bucket _time span=1h
| rename COMMENT AS "Walking *backwards* through time, the 'first' we come to is the most-recent event"
| stats dc(dest) AS no_dest first(events) AS first last(events) AS latest BY _time user src
| rename src AS client_ip ]
| join type=left client_ip latest  [search index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infoblox_dns_change" AND client_message="adding an RR at *" AND  client_message="* A *" earliest=-1d
| rename COMMENT AS "Walking *backwards* through time, the 'first' we come to is the most-recent event"
| stats first(client_name) AS latest BY client_ip]
0 Karma

Esteemed Legend

Show us 2 samples data sets and the expected output.

0 Karma

New Member

Hello, this is the full query that I am running.

| tstats `summariesonly` count FROM datamodel="Web" WHERE index=XXXX sourcetype=XXXXX by _time span=1s Web.user Web.src Web.dest
| `drop_dm_object_name(Web)`
| sort 0 _time
| stats list(_time) as events values(dest) as dest by _time user src
| eval events=strftime(events,"%m/%d/%Y:%H:%M:%S")
| bucket _time span=1h
| stats dc(dest) as no_dest earliest(events) as first last(events) as latest by _time user src
| rename src as client_ip
| join type=left client_ip latest [search index=xxxx sourcetype="infoblox:dns" eventtype=infoblox_dns_change client_message="adding an RR at *" client_message="* A *" earliest=-1d
| stats count by _time client_ip client_name
| stats last(client_name) as client_name last(_time) as _time by client_ip
| fields - _time]
0 Karma

Esteemed Legend

Notice that I did not ask for this and you did not provide what I did ask for.

0 Karma

Builder

There's your problem - you have no latest field in your subsearch. You have _time, client_ip, client_name

And I don't know why you're souble-stats'ing in your subsearch, either :

| stats count by _time client_ip client_name
| stats last(client_name) as client_name last(_time) as _time by client_ip

Just do the second stats, but change from last() to latest(), and you should get the same results

0 Karma

New Member

I do not think this is the issue.

If I just pass only the client_ip everything works fine, but I want to manipulate the time range of the subsearch. When I am passing also the latest in the join then it does not work.

Below it is working fine.

| join type=left clientip [search index=xxxx sourcetype="infoblox:dns" eventtype=infobloxdnschange clientmessage="adding an RR at " clientmessage=" A *" earliest=-1d
| stats count by _time client
ip clientname
| stats last(client
name) as clientname last(time) as time by clientip
| fields - _time]

0 Karma

Builder

You can join on as many fields as you want

But doing it on latest, in your example, is probably not what you really mean - though it may be

What are you actually trying to accomplish?

0 Karma

New Member

From the main search I am collecting the earliest and latest time a user is connecting to the internet via proxy within an hour. Then I am inserting the IP address of the laptop of the user to another search (DNS) in order to find the dynamic updates logs and find the hostname of the laptop as this information does not exist in the proxy logs.

However, when this user is connecting via VPN and there is any interruption, the user within an hour may take another IP address and the old one may have been taken from another laptop. As I run the search every hour I want to be sure that the DNS logs will return the correct name and not the latest name (as the latest DNS update will point this IP address to another laptop). For this reason I was thinking to run the 2nd search with a dynamic field (latest) which will be calculated in the main search and it will search in the DNS only up to the last time this user used this IP address.

In general is there any way to dynamically manipulate from the main search the time range (earliest latest) that the 2nd search will run?

0 Karma

Builder

It's absolutely at least part of your problem

You cannot join on a field that doesn't exist - Splunk looks for matches of all fields in the | join command

If you list one, like latest, that's not in the subsearch, nothing will join

0 Karma

New Member

Basically I am trying to replace the latest="03/21/2020:17:27:23" in the below query with joining it as

index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infobloxdnschange" AND clientmessage="adding an RR at *" AND clientmessage="* A " earliest=-1d *latest="03/21/2020:17:27:23"**

| join type=left clientip latest [index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infobloxdnschange" AND clientmessage="adding an RR at " AND client_message=" A *" earliest=-1d ]

The latest exists right?

0 Karma

Builder

No, latest does not exist in the subsearch

Remove it thusly: | join client_ip [...]

From the join docs.splunk page:

Field names must match, not just in name but also in case. You cannot join productid with productID. You must first change the case of the field in the subsearch to match the field in the main search.

0 Karma