I've seen a lot of join, transaction and append SPLs.
Using timechart to show the percentage at each time is hard, but everybody wants to do it.
I think you don't have to use that SPL.
There are best practices, but I don't know of any worst practices.
Are there worst practices for SPL? Or can you tell me what's wrong with this way of using it?
Hi @to4kawa,
I didn't find a worst practice guide, and I agree that it could be useful, especially for newcomers: e.g. people who have worked with SQL and come to Splunk start using the join command in searches!
Anyway, a worst practice is surely the opposite of a best practice, and I didn't find a structured guide to those either, only some hints in a course that I followed at the beginning.
In addition, I don't think that anyone at Splunk can say that there's a worst practice: it isn't a good marketing approach!
In my experience, I try to avoid some features for performance reasons or simply to have more readable code. These are the main worst practices I avoid:
Then there's something else, but less important:
Ciao.
Giuseppe
We can't put together this kind of information.
For my part, I have summarized it in a Japanese blog.
What should we do?
We're all here to learn together.
Splunk>Answers isn't like that now.
Only people who want to solve their own problems ask questions, and they don't know what other people are doing.
I guess there's nothing you can do about it.
I feel Splunk Answers does deliver on "We're all here to learn together."
I have been active on Splunk Answers for the past 4 years, and on every single day I have spent there I have definitely learnt something new. Sometimes we also use Splunk Answers to log our answers, much like a blog but with history, so that version-specific changes can be documented as well.
If you have other expectations or views, you can definitely mention them!
Thank you for replying to something like this rant.
I was fed up with a lot of questions that nobody other than the questioner could understand at all, because these days no one presents the log and there is only a query.
https://answers.splunk.com/answers/822035/how-to-increase-chart-column-number-more-than-3000.html
It's not right to use it for something like this in the first place.
@to4kawa a use case is a use case. We would not recommend more than 1000 data points in a chart, as we would not be able to interpret them. But there could be ML/IoT/research-based use cases where a 4K*3K correlation needs to be visualized. So this is not a question of worst practice, it is a question of use case!
This is the answer where I have made that recommendation: https://answers.splunk.com/answers/821286/horizontal-scroll-bar-in-column-chart.html#answer-820293
I see. That's certainly true.
It is decided by join.
p.s.
| makeresults count=3000
| streamstats count
| transpose 0 header_field=count
| appendpipe [|makeresults count=5000
| streamstats count]
| filldown
Is there a computer that can run this?
That's why we have best practices.
A chain of only Map-Reduce commands can leverage processing of reduced data set on each Search Peer. Divide and Conquer 🙂 https://docs.splunk.com/Documentation/Splunk/latest/Search/Writebettersearches#Parallel_processing_e...
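To make the Divide and Conquer point concrete, here is a minimal sketch. The index example_web, the sourcetype access_combined and the fields status and host are assumptions for illustration only, not something from this thread. The eval is a streaming command that can run on each search peer, and stats can pre-aggregate on the peers, so only the reduced per-host rows need to be merged on the search head:

```spl
index=example_web sourcetype=access_combined
| eval is_error=if(status>=500, 1, 0)
| stats count as requests sum(is_error) as errors by host
| eval error_pct=round(errors/requests*100, 2)
```

If a command that has to run on the search head were placed before the stats, every event would have to be sent there first, which is the situation the linked docs warn about.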
On my personal laptop I ran the following SPL to generate 15M events (job message: "This search has completed and has returned 15,000,000 results by scanning 0 events in 95.403 seconds"):
| makeresults count=5000
| streamstats count as sno
| eval label="event_".sno, event=mvrange(1,3001,1)
| stats count by event label
| eval label=label."_".event,data=random()
| fields label data
https://docs.splunk.com/Documentation/Splunk/latest/Search/Quicktipsforoptimization
It's the opposite of this.
Hello my friend,
Your post has been most insightful. I'd like to add one thing. I recommend the Bloodhound app for Splunk to my customers; it is specifically designed to identify users' bad practices in order to improve performance in their environment. It's a great app for what you're looking for.
https://splunkbase.splunk.com/app/3541/
I didn't know that.
Thank you. @shivanshu1593
No worries, my friend.
Along these lines there is Search Activity, and my own app, Alerts for Splunk Admins, has a few alerts/reports for detecting worst practices such as index=* or similar...
https://conf.splunk.com/watch/conf-online.html?search=worst# has a list of things... but not really focussed on SPL.
https://conf.splunk.com/watch/conf-online.html?search=fn1003# has filtering bad practices and how to avoid them.
https://conf.splunk.com/session/2015/conf2015_MMueller_Consist_Deploying_OptimizingSplunkKnowledge.p... has knowledge objects / CIM normalization bad practices and how to avoid them. [side note, recent versions aren't as bad as they used to be]
On the topic of event correlation there's also this: https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-jo...
Can you tell me what's wrong with this way of using it?
Wrong with what way in particular?
I wouldn't say "don't use transaction", I'd say "use transaction for appropriate cases with smart settings". The docs flowchart you linked to covers some of this.
In addition to the flowchart, if you have a very high cardinality ID with very short durations, a transaction ID startswith=something will be faster than a stats by ID because transaction can discard completed IDs from memory while stats has to keep all of them in memory indefinitely.
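As a rough sketch of the two shapes being compared: the index example_app, the field session_id, the marker string "login" and the maxspan value are all assumptions for illustration. The transaction form can close out each ID as it goes:

```spl
index=example_app sourcetype=session_events
| transaction session_id startswith="login" maxspan=30s
| table session_id duration eventcount
```

The stats form computes roughly the same duration and event count, but holds one row per session_id until the search finishes:

```spl
index=example_app sourcetype=session_events
| stats min(_time) as start max(_time) as end count as eventcount by session_id
| eval duration=end-start
| table session_id duration eventcount
```

Which one wins depends on the data, so it is worth comparing both in the Job Inspector on your own events.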
I also wouldn't say "don't use join", I'd say "use join for appropriate cases". The answers link I posted covers a lot of this.
There are cases where join is the right answer, for example you have a complex search that gets some additional fields from a fast tstats. Trying to merge the two into one OR-stats-search often is counterproductive, and a pattern of search | complex stuff | stats | join [tstats] can be the best solution.
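A minimal sketch of that pattern, with made-up names: the index example_web, the rex extraction into service, and the fields errors and total_events are assumptions for illustration only. The main search does the expensive work and is reduced by stats first, so the join only has to match a small number of rows against the fast tstats result:

```spl
index=example_web sourcetype=access_combined status>=500
| rex field=uri "^/(?<service>[^/]+)"
| stats count as errors by host service
| join type=left host
    [| tstats count as total_events where index=example_web by host]
| eval error_pct=round(errors/total_events*100, 2)
```

Because both sides are already aggregated by the time they meet, the usual subsearch limits on join are much less likely to bite than when joining raw events.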
In short, it depends.
I don't want to ban them all myself.
But there are too many wrong uses of them.
For example,
First: transaction on the mail log.
We can't get results no matter how long it takes.
Thanks @martin_mueller
That's using timechart to compare with a week ago, I guess.
The reason I asked this question this time is,
what do I do with transaction | join?
I say: no, you don't have to use that.
I think it's because a lot of beginners ask questions, but I wonder if we can do something about it.
https://docs.splunk.com/Documentation/Splunk/latest/Search/Abouteventcorrelation
The material is there, but it's not in a place where a novice would look.
Thank you @gcusello
This may be it, but I'll wait a little longer.
help @woodcock
I became who I am because you told me "not to use transaction".
and @kamlesh_vaghela
I remember the first time I tried your query, I thought, "Wow".
Do you have an opinion?