Hi Experts,
The case is: I have 2 join clauses where the source of _time
on the first search uses date_created
, while the source of _time
of second search uses last_updated
What is the impact on my search result if I use the time picker to select the range of the data?
pseudo search as follows:
index=index01 sourcetype=sourcetype_01 | join type=left order_id [search index=index01 sourcetype=sourcetype_02] | table _time order_id status order_created_dt last_updated_dt activity_id activity_status
Sample data:
sourcetype=sourcetype_01
order_id order_created_dt status last_updated_dt
123 9/1/2015 0:00 Processing 9/2/2015 0:10
sourcetype=sourcetype_02
order_id activity_id activity_status last_updated_dt
123 A001 Open 9/2/2015 0:11
123 A002 Completed 9/2/2015 0:15
123 A003 Open 9/2/2015 0:11
props.conf
[sourcetype_01]
INDEXED_EXTRACTIONS=csv
TIMESTAMP_FIELDS ="order_created_dt"
[sourcetype_02]
INDEXED_EXTRACTIONS=csv
TIMESTAMP_FIELDS ="last_updated_dt"
You can do your left join without join
and it will be much faster and will not be subject to event limits like this:
index=index01 (sourcetype=sourcetype_01 OR sourcetype=sourcetype_02) | stats values(*) AS * | where sourcetype=sourcetype_01| table _time order_id status order_created_dt last_updated_dt activity_id activity_status
As far as your time problem, I would definitely use last_updated_dt
for both sourcetypes and then your problem goes away. Think about it: when did the log get generated (the log is the "event", not the order)? It got generated at last_updated_dt
(or thereabouts) so that is the time that you should use.
Hi imanpoeiri,
If the timestamps are extracted correct you can use this search without Problem:
index=index01 sourcetype=sourcetype_01 OR sourcetype=sourcetype_02 | table _time order_id status order_created_dt last_updated_dt activity_id activity_status
Cheers, MuS
Hi @MuS,
As I declared two timestamp field on props.conf. How to ensure the time that will take effect is last_updated_dt
if I modify the time range?
By default the timepicker range will be used for all the searches and sub searches, and will be used to filter the events set based on the _time.
the only exception is if your specify inline timeranges on the main search or the sub searches.
examples :
timerange = yesterday
searchA | join [ searchB]
will use yesterday everywhere
timerange = yesterday
searchA earliest=-1h@h latest=@h| join [ searchB]
will use last hour for searchA, and yesterday for searchB
timerange = yesterday
searchA | join [ searchB earliest=-1h@h latest=@h ]
will use last hour for searchB, and yesterday for searchA
timerange = yesterday
searchA earliest=-1h@h latest=@h | join [ searchB earliest=-1h@h latest=@h ]
will use last hour for searchB and searchB
Now I realized that my timerange will cause my dashboard will not display the correct value of data if I declare like this.
Thanks for your comment @yannK[Splunk] !
post the search, some events and the expected result please
Hey @MuS,
This hasnt implemented yet. That was just initial thought because I use 2 sources of tables and both of table have its own field date that couldnt share the field with one another.
join
should be your last resort - not your first choice. Using stats
you can the same and much faster 😉 Read here why: http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-joi...
Hi @MuS,
Thanks for that. But that still not answer my question.
Then provide sufficient details to answer this question