Is there a way to instruct Splunk to begin searching from a specific time forward instead of backwards from the current time?
I'd like to show the growth or decline of transport methods being used by devices that connect to our system over the span of a year or six months. By transport I mean cellular, POTS, ethernet, etc.
These devices may connect as little as once a month or as many as 50 times. I only want to include each device once, the first time it connects, and count the transport method being used.
timechart would work great if I could get Splunk to start from the oldest event, say 6 months ago, and move forward counting only new devices/transports connecting. Here's an example of the search:
index=foo
| dedup device
| timechart span=1mon count by TRANSPORT
Because Splunk begins with the most recent event and the devices connect continually, the results are skewed.
If I run the report for the last six months, a device that connected for the first time in July but has been connecting monthly since will be counted in December, when I want to count it in July and only July unless the transport changes.
I'm hoping I'm missing something very simple to achieve this. Any help would be greatly appreciated.
Try this:
index=foo | stats earliest(TRANSPORT) as TRANSPORT earliest(_time) as _time by device | timechart span=1mon count by TRANSPORT
To make searches like these run fast and look back further than you retain data, take a look at http://blogs.splunk.com/2011/01/11/maintaining-state-of-the-union/
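As a rough illustration of what this search computes, here is a minimal Python sketch of the same logic: find each device's oldest event, then count devices per month and transport. The function name first_seen_counts and the sample events are made up for illustration, and this is not how Splunk implements stats or timechart internally.

```python
from collections import Counter
from datetime import datetime


def first_seen_counts(events):
    """Count each device once, in the month of its earliest event.

    events: iterable of (timestamp, device, transport) tuples in any order.
    Returns a Counter keyed by ("YYYY-MM", transport).
    """
    earliest = {}  # device -> (timestamp, transport) of its oldest event
    for ts, device, transport in events:
        if device not in earliest or ts < earliest[device][0]:
            earliest[device] = (ts, transport)
    # Rough equivalent of: timechart span=1mon count by TRANSPORT
    return Counter(
        (ts.strftime("%Y-%m"), transport) for ts, transport in earliest.values()
    )


# Hypothetical sample data, newest first (the order Splunk returns events in).
events = [
    (datetime(2016, 12, 3), "dev1", "CELLULAR"),
    (datetime(2016, 11, 9), "dev2", "ETHERNET"),
    (datetime(2016, 9, 14), "dev1", "CELLULAR"),
    (datetime(2016, 7, 21), "dev1", "CELLULAR"),
]
counts = first_seen_counts(events)
```

Here dev1 is counted once, in July, even though it also connected in September and December.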
I haven't had the opportunity to try both methods, but I did notice something interesting with the stats earliest() version.
In one data set, I know there are 2 devices using Ethernet, one in Nov 2016 and one in Dec 2015, the total count should be 2. As the search executes, I see the count of 1 in Nov as the stats table is being built but as the search continues the entire column disappears and isn't added again until the very end, Dec 2015. At that time the Ethernet column is re-added and a count of 1 is present in Dec but a count of 0 is listed in Nov.
Now if I run the same search adding "| search TRANSPORT=ETHERNET" I get the correct results, a count of 1 in Dec and a count of 1 in Nov.
Is this an issue with timechart? Any idea what's going on here?
@martin_mueller. I also think the above is the optimal search query for this particular scenario.
Running the dedup command over an entire index for a year will be expensive compared to using the stats command to get the oldest values of TRANSPORT and _time with earliest(). However, I also feel that the base search should have TRANSPORT="*" AND device="*" added to filter out, up front, the events that lack these two fields, as our further filters work on these two fields. Further, since the query might run over a one-year period, it is an ideal candidate for summary indexing.
The following should also work, but it will be an expensive query because of the reverse and dedup commands when compared to the answer suggested by @martin_mueller. PS: the reverse command is still better than sort -_time when the search returns too many results.
index=foo TRANSPORT="*" device="*"
| reverse
| dedup device
| timechart span=1mon count by TRANSPORT
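To see why the reverse matters here, a small Python sketch of dedup's keep-the-first-event-seen behaviour may help. The dedup function and the event dictionaries below are hypothetical stand-ins, not Splunk code.

```python
def dedup(events, key):
    """Keep only the first event seen for each value of key, like SPL dedup."""
    seen = set()
    kept = []
    for event in events:
        value = event[key]
        if value not in seen:
            seen.add(value)
            kept.append(event)
    return kept


# Hypothetical events for one device, in newest-first order.
newest_first = [
    {"month": "2016-12", "device": "dev1", "TRANSPORT": "CELLULAR"},
    {"month": "2016-09", "device": "dev1", "TRANSPORT": "CELLULAR"},
    {"month": "2016-07", "device": "dev1", "TRANSPORT": "CELLULAR"},
]

# Without reverse, the December event is the one that survives.
without_reverse = dedup(newest_first, "device")
# With reverse, events arrive oldest first, so the July event survives.
with_reverse = dedup(list(reversed(newest_first)), "device")
```

The sketch only shows which event is kept; it says nothing about the cost of reversing a year of events, which is the concern raised in this thread.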
Along the same lines, I would also like to add that Splunk returns the latest results first and then moves towards older logs because of the way data is indexed in buckets. The most recent data is written to hot buckets, which Splunk reads first. Older data rolls to warm, then cold, and finally frozen buckets based on its age. Frozen data is archived or deleted and is not searchable unless it is restored (thawed), so for older data the search reads warm and eventually cold buckets, which is why those events are returned afterwards. Refer to the following Splunk documentation on indexing: http://docs.splunk.com/Documentation/Splunk/latest/Indexer/HowSplunkstoresindexes
Having said this, if you want to improve the performance of a search running over a one-year period, you should consider creating a summary index (using sistats instead of stats to populate it), which will make the report run faster.
dedup device is about as expensive as stats earliest() by device because both have to crawl the entire index once and keep one copy per device in memory... but then dedup doesn't have a "latest" switch, so it's out. (side note: if you have support, feel free to file an enhancement request to allow dedup to keep the last event it sees rather than the first)
reverse | dedup device on the other hand is going to be insanely expensive both in CPU and memory. Additionally, reverse doesn't map-reduce, so all the events and work are going to get sent to your poor search head.
sort 0 + _time | dedup device could also work, but would be insanely expensive as well. Sorting takes superlinear time, and you'll need to go through the data a second time for the dedup.
I wouldn't say sort is worse (or better) than reverse... you'd need to test how well sort map-reduces itself onto many indexers.
As an added benefit, stats latest() can run in batch mode, while neither sort nor reverse nor dedup can.
Adding those wildcard filters isn't going to speed things up because they don't reduce the number of events you have to load off disk (scanCount in the job inspector). Depending on the contents of the index, other filters may be appropriate of course, such as sourcetype=devices.
I believe stats is more performant than dedup because stats is a reducing command and therefore lets the indexers return only the summary data needed rather than pulling the event's payload. dedup is a streaming command and therefore pulls the entire event back from the indexer and then stashes a ton of stuff in memory to reduce.
That's only partially true - indexers can prededup much like they can prestats. If you're looking for one event per device, no indexer needs to return more than one of its events per device. All the search head needs to do is dedup that tiny set.
I didn't catch the part that was not true. Also, isn't it the case that the stats command does not return the _raw event but rather details about the fields needed by the stats command? So this could easily be less pure data over the wire than the raw events required to be returned by dedup.
The part that's not true is that dedup puts a ton of stuff in memory because - much like stats - the indexers can prededup the data to only return a small set to the search head.
In the real world, I've seen cases where dedup is faster than stats, and vice versa.
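The map-reduce idea discussed here can be sketched in Python as a per-indexer partial result followed by a merge on the search head. The names partial_earliest and merge, the integer timestamps, and the two sample indexers are all assumptions for illustration; Splunk's real prestats and prededup internals are not shown in this thread.

```python
def partial_earliest(events):
    """Per-indexer step: keep only the oldest (time, transport) per device."""
    best = {}
    for ts, device, transport in events:
        if device not in best or ts < best[device][0]:
            best[device] = (ts, transport)
    return best


def merge(partials):
    """Search-head step: combine the small per-indexer results."""
    merged = {}
    for partial in partials:
        for device, (ts, transport) in partial.items():
            if device not in merged or ts < merged[device][0]:
                merged[device] = (ts, transport)
    return merged


# Hypothetical events held by two indexers; timestamps are YYYYMMDD integers.
indexer_a = [(20161203, "dev1", "CELLULAR"), (20161109, "dev2", "ETHERNET")]
indexer_b = [(20160721, "dev1", "CELLULAR"), (20161215, "dev2", "ETHERNET")]

result = merge([partial_earliest(indexer_a), partial_earliest(indexer_b)])
```

Each indexer hands back at most one row per device, so the search head only merges a small set no matter how many raw events were scanned.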
Ah - thank you for elaborating good sir! @martin_mueller
@martin_mueller aren't there search optimization rules of thumb?
1) Use Search filters as early as possible in the search and
2) Use dedup in search as early as possible.
Since actual performance is use case specific @g038123 should try both options and check Job Inspector to compare search response time.
+1 on Job Inspector. This is especially relevant given the Search Optimization improvements introduced in 6.5. Some approaches might turn out to be better than they were before.
I realize 6.5 isn't at play here, but I wanted to highlight this for posterity.
Thank you for the response and additional information, it was very helpful.
The query you recommended works perfectly, much appreciated!
Similar but ancient thread at Instruct splunk search to search from oldest event first instead of newest?
I'll come back later with more details, but you might see posts about the reverse command: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Reverse
But be aware that it's likely not what you're looking for. I think you want something more along the lines of the latest or earliest functions of timechart, but I will need to check when I have more time.