I have been trying to build some analyses in Splunk for a few weeks now. Sometimes I succeed, sometimes I fail. I really appreciate the help from community users - it helps a lot, and the results are sometimes amazing.
Still, I don't feel comfortable yet and I run into a lot of problems with the syntax and rules of the Splunk search language. I am looking for a page/tutorial/blog/YouTube channel, something like "Splunk for DBAs" - relational DBAs! Somewhere to read about the theory, rules and syntax of commands, with examples, such as stats, join, append, timechart and others that can work across multiple tables/indexes, their relations and aggregations. Of course this community is a mine of examples and recipes, but maybe there is a place where such topics are described and explained in a more accessible, structured way.
Any ideas or hints?
K.
https://conf.splunk.com/files/2020/slides/TRU1761C.pdf
Here I found a good PDF ... in fact starcher found it and I found his post.
Hi
Have you already seen this?
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SQLtoSplunk
It could give you some hints about how things are done in Splunk vs. SQL. BUT you shouldn't follow it too closely, as Splunk works in a totally different way than SQL. I suppose there are many .conf presentations which could help you better understand how to work with Splunk. Some other good sources for working with Splunk are:
r. Ismo
Wow, the first link is a good source of knowledge 🙂 Thanks a lot. There is one more SQL pattern I need to implement in Splunk, but it is not covered there.
Maybe you could help. The most efficient way to do an inner join is something like:
index=db OR index=app
| eval join=if(index="db",processId,pid)
| stats sum(rows) sum(cputime) by join
But how do I join two tables on a multicolumn key?
SELECT * FROM mytable1 INNER JOIN mytable2 ON (mytable1.mycolumn = mytable2.mycolumn AND mytable1.mycolumn2 = mytable2.mycolumn2)
I understand what drove Splunk to prepare this page, but it is best avoided. It encourages users to adopt anti-patterns which are not, and should not normally be, used in Splunk.
Splunk is very different from an RDBMS, so it needs another "way of thinking". I find it easier to compare a Splunk search to processing data with a Unix shell (I also suspect that the choice of the pipe sign to delimit the steps in the pipeline is not accidental 😉 ).
And as a rule of thumb, the join command should typically not be used in Splunk. (Yes, there are use cases for it, so it's there, but it's not as common as in SQL.)
I don't know what you mean by "multicolumn key" in this context but you can either use stats with multiple by fields or - if you mean it the opposite way - you can create a synthetic field to split by. Like
| eval splitfield=field1."-".field2."-".field3
| stats count by splitfield
Just watch for cardinality...
EDIT: Oh, I didn't see your SQL example. So you can build such synthetic fields from both kinds of data (possibly using a conditional eval to calculate them separately for each subset), and then run stats by those fields.
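A rough sketch of that idea for the two-column SQL join above, reusing the index=db / index=app search from the question. The second key column (sessionId in db, sid in app) is purely hypothetical, as are the names key1, key2 and sources; swap in whatever your data actually has. Each key is computed with a conditional eval per index, stats groups by both keys, and dc(index) is one way to keep only the key pairs that appear in both indexes, which is roughly what INNER JOIN does:

```
index=db OR index=app
| eval key1=if(index="db", processId, pid)
| eval key2=if(index="db", sessionId, sid)
| stats dc(index) as sources sum(rows) as rows sum(cputime) as cputime by key1 key2
| where sources=2
```

If you prefer the single synthetic field, something like | eval joinkey=key1."|".key2 followed by stats ... by joinkey should group the same way, as long as the separator cannot occur inside the key values. Keep in mind that events where either key is missing drop out of the stats in both variants.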
@PickleRick, thanks for your comment - true, Splunk is completely different from an RDBMS 🙂 For a guy like me who works with Oracle/MSSQL/other DBs, creating suitable "queries" is like torture. Anyway, I need to get some jobs done using Splunk, so I need to ask you for help.
I am surprised that I found a way to link two tables where two columns are the keys - the most ridiculous way (from my point of view), concatenating the two strings/keys, turns out to be correct! 🙂