I want to do a join of two searches that have a common field ID and time, but I want to have a condition on time when IDs match.
Consider two tables user-info and some-hits
name ipaddress time
user1 18.104.22.168 t0
user2 22.214.171.124 t1
user1 126.96.36.199 t2
ipaddress hits time
188.8.131.52 10 t0.03
184.108.40.206 40 t0.03
220.127.116.11 46 t0.9
18.104.22.168 60 t1
22.214.171.124 78 t1.5
Here I want to match all records from some-hits with user-info whose ipaddress match, but some-hits.time < user-info.time
So t0.03 should match with t0 . t0.9 should match with t1. t1 and t1.5 should match with t2.
I wanted to know how I can use join to achieve this?
join is the last resort to solve search problems, not the first choice - see docs http://docs.splunk.com/Documentation/Splunk/6.3.0/SearchReference/Join or this https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...
Your problem here is the value of
time is not a number, its a string and therefore Splunk will not do what you expect because it will compare it differently. First you need to remove the
t from the time values and convert it to a numeric value:
eval time=tonumber(trim(time, "t"))
Next you can get the two
time values into new field depending on the
eval user_time=tonumber(trim(like(source, "%user-info%"), "t")) | eval some_time=tonumber(trim(like(source, "%some-info%"), "t"))
And finally use the new time fields to compare them:
your base search here | eval user_time=tonumber(trim(like(source, "%user-info%"), "t")) | eval some_time=tonumber(trim(like(source, "%some-info%"), "t")) | stats count by ipaddress, name, hits, user_time, some_time | where some_time < user_time
This is un-tested so you probably need to tweak it, but it should give you some hints how it can be done.
Thanks for your reply MuS. In my case , both searches have diff index. one of the search is a summary search and the other is a detailed search. The detailed search is my main search (user-info) . The summary search (some-hits) is the second search.
In the link you sent, https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...
the index is the same but different sourcetypes
Can I write a query w/o join in this use case?
it does not matter if your events are in different
sourcetypes you can most likely use
stats instead of
Just use as base search something like this:
index=a Or index=b sourcetype=c OR sourcetype=d | more Splunk Fu ...
ok . I will try it.
I also need to find the total hits for all the matched ipaddress and time event.
in the example above, I am expecting an output like:
name time ipaddress #hits
user1 t0 126.96.36.199 50 (10 + 40)
user2 t1 188.8.131.52 46
user1 t2 184.108.40.206 138 (60 + 78)
Can i calculate sum for every matched combination from two search result sets?
For my above example it would be like this:
your base search here | eval user_time=tonumber(trim(like(source, "%user-info%"), "t")) | eval some_time=tonumber(trim(like(source, "%some-info%"), "t")) | stats sum(hits) AS total_hits by ipaddress, name, hits, user_time, some_time | where some_time < user_time
Thanks for your help.
I tried to use OR for my base search like u said:
(sourcetype=ib:reserved source=ib:user index=ibsecurity) OR index=ibsummary report=si_hits
I first just wanted ot try this out to see what events I get.
I just get results for the summary search ( index=ibsummary report=sihits).
I do not get any events from first search (ib:user)
What i am missing here?
Just to add clarity to your search, what you have is
(sourcetype=ib:reserved AND source=ib:user AND index=ibsecurity) OR index=ibsummary AND report=sihits. In other words, this will match
(sourcetype=ib:reserved AND source=ib:user AND index=ibsecurity) AND report=si_hits
(index=ibsummary AND report=sihits)
What I think you are looking for is
(sourcetype=ib:reserved source=ib:user index=ibsecurity) OR (index=ibsummary report=si_hits)
where report is a key field only found in the summary index. Give that a shot.
Well run each search seperate and see if you get results back:
(sourcetype=ib:reserved source=ib:user index=ib_security)
BTW, as mentioned in your
earliest=-1w question, you should NOT set
inputs.conf see the docs why : http://docs.splunk.com/Documentation/Splunk/6.3.1/admin/Inputsconf
The solution given above does not work for me. I think it was interpreted wrongly.
Here is my use case:
I have search user-history
username ipaddress logintime logout time
user1 220.127.116.11 12.00 12.05
user2 18.104.22.168 12:10 12:50
user1 22.214.171.124 11:40 11:55
I have another search hits-table
Clientid hits time domain
126.96.36.199 2 12:02 fb.com
188.8.131.52 3 12:02 fb.com
184.108.40.206 5 11:45 boo.com
In second search I want to first find total hits by time and domain which gives me
clientid total-count time domain
220.127.116.11 5 12:02 fb.com
18.104.22.168 5 11:45 boo.com
Now i want to match the user-history results with this such that my result table should look like
Here user2 did not have any hits in the time it was logged in. So its record will not be in result table.
The match should be on ipaddress==clientid and time is between logintime and logout time
username ipaddress #hits time
user1 20.,20.20.20 5 _time value from hits-table
user1 22.214.171.124 5 _time value from hits-table
Can this be done without join?
What would be the best approach considering the user-history table would be large