Splunk Search

Why do these 2 searches return different results based on the dedup?

tkwaller_2
Communicator

Two simple searches return different results depending on where the dedup is placed. It seems to be functioning in two different ways:

index=dev_tsv md_type="assets" info_owner_orgID="Test" related_vendors="*gibberish*" info_tags="<tagname>"
| dedup id
| stats count by id

But this one returns a different result set than the one above:

index=dev_tsv md_type="assets" info_owner_orgID="Test" related_vendors="*gibberish*"
| dedup id
| search info_tags="<tagname>"
| stats count by id

Any thoughts would be helpful.
Thanks as always!

HiroshiSatoh
Champion

First search:
Only events with info_tags="<tagname>" are retrieved, so dedup runs on those events alone.

Second search:
Events that do not have info_tags="<tagname>" are retrieved as well.
The dedup that follows may discard the event with info_tags="<tagname>" and keep an event for the same id without it. The later search on info_tags then drops that id entirely.

I think the difference in the counts comes from this.

tkwaller_2
Communicator

This I understand, BUT I would think the first search would return the smaller result, and it does not: it returns 146 results, while the second search returns only 94.

The first search returns results from 2 times on the same day, 52 at 12AM and 94 at 3PM, BUT the second returns only one set, 94 at 3PM.

It appears the first search is just ensuring there are no duplicate ids among the events with info_tags, while the second is ensuring we only get the most recent ids with info_tags.

Why would it function 2 different ways?

HiroshiSatoh
Champion

In the second search, the count goes down by the number of events that dedup removes before the info_tags filter runs.

HiroshiSatoh
Champion

search1
ID=1,info_tags=B
ID=2,info_tags=B
ID=3,info_tags=B

search2
ID=1,info_tags=A
ID=1,info_tags=B
ID=2,info_tags=B
ID=2,info_tags=C
ID=3,info_tags=B
ID=3,info_tags=D
↓ dedup
ID=1,info_tags=A
ID=2,info_tags=B
ID=3,info_tags=B
↓ search info_tags=B
ID=2,info_tags=B
ID=3,info_tags=B
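
The trace above can be reproduced outside Splunk with a small Python sketch. It is only a model, not Splunk's implementation: the helper names dedup and search are hypothetical, each event is assumed to carry a single info_tags value, and dedup is assumed to keep the first event it sees for each id, in the order the events arrive.

```python
def dedup(events, field):
    """Keep the first event seen for each value of `field` (simplified model of `dedup <field>`)."""
    seen = set()
    kept = []
    for event in events:
        key = event[field]
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


def search(events, field, value):
    """Keep only the events whose `field` equals `value` (simplified model of `search field=value`)."""
    return [event for event in events if event[field] == value]


# Same sample events as in the trace above, in the same order.
events = [
    {"id": 1, "info_tags": "A"},
    {"id": 1, "info_tags": "B"},
    {"id": 2, "info_tags": "B"},
    {"id": 2, "info_tags": "C"},
    {"id": 3, "info_tags": "B"},
    {"id": 3, "info_tags": "D"},
]

# Search 1: filter on info_tags first, then dedup by id.
filter_then_dedup = dedup(search(events, "info_tags", "B"), "id")

# Search 2: dedup by id first, then filter on info_tags.
dedup_then_filter = search(dedup(events, "id"), "info_tags", "B")

print([e["id"] for e in filter_then_dedup])  # [1, 2, 3]
print([e["id"] for e in dedup_then_filter])  # [2, 3]
```

id 1 disappears from the second result because the event kept by dedup for that id carries info_tags=A, so the later filter has nothing left to match.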

tkwaller_2
Communicator

OK, I can see now what you mean: since it's taking the most recent record and deduping BEFORE filtering on info_tags, it reduces the overall count. That makes sense.

Thanks for that
