Hi Splunk Experts--
I'm confused about the union command and am hoping you can
help. Specifically, I'm struggling to understand what causes the
"things that get unioned" to be truncated-- in my case to 50,000
records.
Here's an example of what confuses me:
Imagine three sets of data-- I've put them in three separate indexes
called union_1, union_2 and union_3. The data sets are very similar:
each has 60,000 records, each consisting of a timestamp, a color and a
hash. Each data set has exactly one event per second and each covers
the same 60,000 seconds (from 2017-01-01 00:00:01 to 2017-01-01
16:40:00). The color is random and the hash is unique across all
180,000 events (60,000 * three data sets).
Here's union_1:
time color hash
------------------------- ------ --------------------------------
2017-01-01 00:00:01 -0800 blue 08decd051408e648b941b5dbb9b1578c
2017-01-01 00:00:02 -0800 yellow 39d98f7f9a98920ee08631c9e6a4e867
2017-01-01 00:00:03 -0800 green 2b34449aae3a941c64dd76d33a6cfc04
...
2017-01-01 16:39:58 -0800 blue b2cc43ab839bf57711a00f8f7a622e97
2017-01-01 16:39:59 -0800 blue e26f577b10d0fa172c122deca813d38f
2017-01-01 16:40:00 -0800 blue c9b0b55e7513963f7b04cf3c424686f2
...and union_2:
time color hash
------------------------- ------ --------------------------------
2017-01-01 00:00:01 -0800 violet c8e68d6c154fc0ca88220a299dba7c55
2017-01-01 00:00:02 -0800 blue 3e18602a1d137ea4bf9157e67c4386ed
2017-01-01 00:00:03 -0800 violet ecdf61cd34cda950bd782e3a6ba51fd6
...
2017-01-01 16:39:58 -0800 violet 5c00f68da1aa343ec0944fbcd42775fc
2017-01-01 16:39:59 -0800 green 2c3a626ff26a05f9895dc1c9ae1d074e
2017-01-01 16:40:00 -0800 red 9b796de25b072d8a48d3e9a7a716c4e9
...and union_3:
time color hash
------------------------- ------ --------------------------------
2017-01-01 00:00:01 -0800 orange 772468eb812735bfa984b91477afe967
2017-01-01 00:00:02 -0800 violet 6d9ebc2ce8b1c79d42793d624daeb402
2017-01-01 00:00:03 -0800 red a31d8811b95b4597f943f268f4068fb0
...
2017-01-01 16:39:58 -0800 yellow 17b43d58e4920f1d2044552acdad5507
2017-01-01 16:39:59 -0800 violet 12425e908448371c38a1f0fe12aedf73
2017-01-01 16:40:00 -0800 indigo ea1fb54c5c2b5fd2161856ea6937226e
You get the idea... 🙂
Now let's run some SPL:
| union maxout=10000000
[ search index=union_1 ]
[ search index=union_2 ]
[ search index=union_3 ]
| stats count by index
This produces what I'd expect-- 60,000 records per "thing that got
unioned":
index count
------- -----
union_1 60000
union_2 60000
union_3 60000
But let's make things a bit more complicated:
| union maxout=10000000
[ search index=union_1 | head 60000 ]
[ search index=union_2 ]
[ search index=union_3 ]
| stats count by index
Wait, what? Adding a head command to the first search causes the
second and third to be truncated to 50000?
index count
------- -----
union_1 60000
union_2 50000
union_3 50000
How about this one?
| union maxout=10000000
[ search index=union_1 ]
[ search index=union_2 | head 60000 ]
[ search index=union_3 ]
| stats count by index
Hmmm... same result:
index count
------- -----
union_1 60000
union_2 50000
union_3 50000
What if we move the head command to the final search?
| union maxout=10000000
[ search index=union_1 ]
[ search index=union_2 ]
[ search index=union_3 | head 60000 ]
| stats count by index
Wow... now only the final search gets truncated:
index count
------- -----
union_1 60000
union_2 60000
union_3 50000
Notes that may or may not be relevant:
Many commands have a similar effect (i.e. cause the same
truncations) as head-- in particular dedup and sort seem to cause
the same problems.
I suspect that these commands (and presumably many others) cause
the subsearch to no longer qualify as a "streaming subsearch"--
(although honestly I can't imagine why head would do this) and
that this fact makes union behave much more like append.
I believe (but am not sure) that the 50000 truncation limit is due
to maxresultrows in limits.conf; that value is currently 50000 for
me.
For context, here's what I want to do:
In general, get a better understanding of how union works and how
it's different from append.
Specifically, union a set of three searches that each produce substantially more
than 50000 records and not experience truncation.
Anybody willing to help me out with this? Would totally appreciate the
benefit of your wisdom 🙂
Thanks!
Hi jsinnott_
At this time, union behaves alternately like multisearch (for distributable streaming subsearches) or append (for subsearches that are not distributable streaming). This is not adequately explained in the doc topic for the union command at present and I'll see what I can do to fix that.
(For more information about the types of streaming search commands, see Command types in the Splunk Enterprise Search Manual.)
Let's take your first search:
| union maxout=10000000
[ search index=union_1 ]
[ search index=union_2 ]
[ search index=union_3 ]
| stats count by index
In this case, all of the searches are distributable streaming, so they are all unioned with multisearch. This is why you see 60k in each.
Your second search uses the head command for one of the subsearches. Because head is centralized streaming rather than distributable streaming, it causes the subsearches that follow it to use the append command. "Under the hood," the search is converted to:
| search index=union_1
| head 60000
| append
[ search index=union_2 ]
| append
[ search index=union_3 ]
| stats count by index
When union is used in conjunction with a search that is not distributable streaming, the default for the maxout argument applies: 50k events. This is mentioned in the doc topic for the union command.
Your third search also ends up being an append search, because the second subsearch is not distributable streaming due to the head command. Here's how it looks "under the hood":
| search index=union_1
| append
[ search index=union_2 | head 60000 ]
| append
[ search index=union_3 ]
| stats count by index
Again, the maxout argument default applies here, limiting the results of the appended searches to 50k events.
In your last example, the first two subsearches are distributable streaming, so they are unioned with multisearch. But the final subsearch has the head command, so it gets unioned with append at the end.
| multisearch
[ search index=union_1 ]
[ search index=union_2 ]
| append
[ search index=union_3 | head 60000 ]
| stats count by index
The maxout argument applies to that last subsearch because it is not distributable streaming due to the head command. So it returns 50k events rather than 60k events.
Note that multisearch has to be the first command. If your union search unpacks in a way that puts append first, you won't get multisearch to follow it.
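As a minimal sketch of that last point, assuming any non-streaming step (head, dedup, sort) can be moved after the union rather than kept inside a subsearch, you can keep every subsearch distributable streaming and call multisearch directly. The dataset field is a hypothetical label added only for illustration; eval is distributable streaming, so it is allowed inside the subsearches:
| multisearch
[ search index=union_1 | eval dataset="one" ]
[ search index=union_2 | eval dataset="two" ]
[ search index=union_3 | eval dataset="three" ]
| stats count by dataset
With the data described in the question, this should behave like the first search, with no subsearch falling back to append. Whether moving a command out of a subsearch preserves the meaning of your search depends on what that command was there to do.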
Kindest regards,
Matt (Splunk Docs Team)
Hi Matt--
Thanks so much for taking time to write this clear and detailed explanation. It's exactly what I needed-- you're my new best friend!
..j
Hi jsinnott_,
Since union is just another subsearch, you will hit many limits with it; some are mentioned here: http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Union#Optional_arguments
In most cases you can just use stats to do the same and will not hit any limits. Read some examples here https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo... or in the March 2016 Virtual .conf session here http://wiki.splunk.com/Virtual_.conf
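A minimal sketch of that approach, assuming the three indexes from the question and that the end goal is an aggregate such as a count per index, is to put all three indexes in one base search so that no subsearch is involved:
index=union_1 OR index=union_2 OR index=union_3
| stats count by index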
Why union truncates events from a later subsearch once more commands are added sounds weird and might be worth opening a bug report.
Hope this helps ...
cheers, MuS
Hello and thanks for this. I really appreciate you taking the time to answer.
..j