Hello,
Is it possible to perform a join type=left to another search by combining the also the latest field?
Example below
main search
| stats list(_time) as events by _time user src
| stats last(events) as latest by _time user src
| rename src as client_ip
| join type=left client_ip latest [search index=xxxxxx earliest=-12h]
Many thanks.
I made your search "work" but I don't like it at all (for example, are you sure that AND client_message="adding an RR at *" AND client_message="* A *"
is correct?):
| tstats `summariesonly` count
FROM datamodel="Web"
WHERE index=XXXX sourcetype=XXXXX
BY _time span=1s Web.user Web.src Web.dest
| `drop_dm_object_name(Web)`
| stats list(_time) AS events values(dest) AS dest BY _time user src
| eval events=strftime(events,"%m/%d/%Y:%H:%M:%S")
| bucket _time span=1h
| rename COMMENT AS "Walking *backwards* through time, the 'first' we come to is the most-recent event"
| stats dc(dest) AS no_dest first(events) AS first last(events) AS latest BY _time user src
| rename src AS client_ip ]
| join type=left client_ip latest [search index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infoblox_dns_change" AND client_message="adding an RR at *" AND client_message="* A *" earliest=-1d
| rename COMMENT AS "Walking *backwards* through time, the 'first' we come to is the most-recent event"
| stats first(client_name) AS latest BY client_ip]
Show us 2 samples data sets and the expected output.
Hello, this is the full query that I am running.
| tstats `summariesonly` count FROM datamodel="Web" WHERE index=XXXX sourcetype=XXXXX by _time span=1s Web.user Web.src Web.dest
| `drop_dm_object_name(Web)`
| sort 0 _time
| stats list(_time) as events values(dest) as dest by _time user src
| eval events=strftime(events,"%m/%d/%Y:%H:%M:%S")
| bucket _time span=1h
| stats dc(dest) as no_dest earliest(events) as first last(events) as latest by _time user src
| rename src as client_ip
| join type=left client_ip latest [search index=xxxx sourcetype="infoblox:dns" eventtype=infoblox_dns_change client_message="adding an RR at *" client_message="* A *" earliest=-1d
| stats count by _time client_ip client_name
| stats last(client_name) as client_name last(_time) as _time by client_ip
| fields - _time]
Notice that I did not ask for this and you did not provide what I did ask for.
There's your problem - you have no latest
field in your subsearch. You have _time, client_ip, client_name
And I don't know why you're souble-stats'ing in your subsearch, either :
| stats count by _time client_ip client_name
| stats last(client_name) as client_name last(_time) as _time by client_ip
Just do the second stats, but change from last()
to latest()
, and you should get the same results
I do not think this is the issue.
If I just pass only the client_ip everything works fine, but I want to manipulate the time range of the subsearch. When I am passing also the latest in the join then it does not work.
Below it is working fine.
| join type=left client_ip [search index=xxxx sourcetype="infoblox:dns" eventtype=infoblox_dns_change client_message="adding an RR at " client_message=" A *" earliest=-1d
| stats count by _time client_ip client_name
| stats last(client_name) as client_name last(_time) as _time by client_ip
| fields - _time]
You can join
on as many fields as you want
But doing it on latest
, in your example, is probably not what you really mean - though it may be
What are you actually trying to accomplish?
From the main search I am collecting the earliest and latest time a user is connecting to the internet via proxy within an hour. Then I am inserting the IP address of the laptop of the user to another search (DNS) in order to find the dynamic updates logs and find the hostname of the laptop as this information does not exist in the proxy logs.
However, when this user is connecting via VPN and there is any interruption, the user within an hour may take another IP address and the old one may have been taken from another laptop. As I run the search every hour I want to be sure that the DNS logs will return the correct name and not the latest name (as the latest DNS update will point this IP address to another laptop). For this reason I was thinking to run the 2nd search with a dynamic field (latest) which will be calculated in the main search and it will search in the DNS only up to the last time this user used this IP address.
In general is there any way to dynamically manipulate from the main search the time range (earliest latest) that the 2nd search will run?
It's absolutely at least part of your problem
You cannot join on a field that doesn't exist - Splunk looks for matches of all fields in the | join
command
If you list one, like latest
, that's not in the subsearch, nothing will join
Basically I am trying to replace the latest="03/21/2020:17:27:23" in the below query with joining it as
index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infoblox_dns_change" AND client_message="adding an RR at " AND client_message=" A " earliest=-1d **latest="03/21/2020:17:27:23"*
| join type=left client_ip latest [index="xxxx" AND sourcetype="infoblox:dns" AND eventtype="infoblox_dns_change" AND client_message="adding an RR at " AND client_message=" A *" earliest=-1d ]
The latest exists right?
No, latest
does not exist in the subsearch
Remove it thusly: | join client_ip [...]
From the join docs.splunk page:
Field names must match, not just in name but also in case. You cannot join product_id with product_ID. You must first change the case of the field in the subsearch to match the field in the main search.