Hi guys at Itility, I attended your session at .conf 2016. I've been playing around with your R app, and I frequently get inconsistent results back from R in Splunk when using the runRdo custom command. Example below.
The search below occasionally comes back with the correct results and populates Splunk with the test data frame. More often than not, however, it comes back with a null error.
| inputlookup iris.csv
| runRdo script="
    set.seed(1);
    my_iris = dataset[-5];
    species = dataset$species;
    kmeans_iris = kmeans(my_iris,3);
    kmeans_table = table(kmeans_iris$cluster,species);
    test = as.data.frame(kmeans_table);
    return(test);"

#error results
message                                                              session  status
NA/NaN/Inf in foreign function call (arg 1) In call: do_one(nmeth)   0        400

#correct results
Var1  Freq  species
1     50    Iris Setosa
2     0     Iris Setosa
3     0     Iris Setosa
1     0     Iris Versicolor
2     2     Iris Versicolor
3     48    Iris Versicolor
1     0     Iris Virginica
2     36    Iris Virginica
3     14    Iris Virginica
Please let me know what you think.
Thanks for using the R app! (and attending our presentation)
There are a couple of things you need to take into account:
1. Splunk is not consistent in the order of the columns (even when using the table or fields commands). This means that dataset[-5] will not always drop the same column. We haven't found a workaround on the Splunk side yet, but you can refer to columns by name in R instead.
2. Splunk is not aware of any data types and will always send out strings (even when it's obvious that your data is numeric). Our app will try to parse the data as numeric, but when that fails R will receive characters instead of numerics. It is safest to cast data types explicitly in R.
When debugging, you can use the parameter getResults=false, which will give you a link to the console output from R. When you use the str() command in R, the console will show the data types.
So back to your query. This example should work (works on my machine):
| inputlookup iris.csv
| runRdo script="
    # Fix the random seed
    set.seed(1);

    # Store the dataset in a variable
    my_iris = dataset;

    # Separate the species column from the rest
    species = as.factor(my_iris$species);
    my_iris = my_iris[ , !(names(my_iris) %in% c('species'))];

    # Cast data types
    my_iris$petal_length = as.numeric(my_iris$petal_length);
    my_iris$sepal_length = as.numeric(my_iris$sepal_length);
    my_iris$petal_width = as.numeric(my_iris$petal_width);
    my_iris$sepal_width = as.numeric(my_iris$sepal_width);

    # Show summaries in the console, use getResults=false to see the link to the console
    str(species);
    str(my_iris);

    # Perform the kmeans
    kmeans_iris = kmeans(my_iris, 3);
    kmeans_table = table(kmeans_iris$cluster, species);

    # Return a dataframe
    return(as.data.frame(kmeans_table));" getResults=t
I hope this fixes your issue! We'd love to hear how you're using our app, so stay in touch!
Thank you, this works perfectly! I can see now that the column order changes if I run the search multiple times. I will avoid using index references from now on and make sure to cast my data types as well.
Hi, for me nothing is printed after clicking the Run button in the script editor.
Not even an error shows up.
Is OpenCPU mandatory for this? And can we install it on the same machine as the Splunk server?
Please respond ASAP.
Would public.opencpu.org not work?
Actually, we don't have the rights to install OpenCPU as of now,
so we thought we would use a public OpenCPU server.
Sure, that should work. Just be absolutely sure you're willing to send your data and your algorithm to some unfamiliar host and be aware that you cannot use libraries that are not installed on the OpenCPU server that you're using.
But nothing happens when I click the Run button in the Splunk app.
The R console tab just stays hidden. At least some error should show up.
PS: I am using public.opencpu.org only.
External search command 'runrpairs' returned error code 1. Script output = "errormessage=ConnectionError at "/data/splunkaxpclp/lib/python2.7/site-packages/requests/adapters.py", line 375 : HTTPSConnectionPool(host='public.opencpu.org', port=443): Max retries exceeded with url: /ocpu/library/base/R/identity (Caused by : [Errno -2] Name or service not known) "
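The "[Errno -2] Name or service not known" part of that message is a hostname lookup failure: the Splunk server could not resolve public.opencpu.org, so the request never reached OpenCPU. That could explain why nothing shows up in the app. Below is a minimal sketch for checking name resolution from the Splunk host. The function name can_resolve and its resolver parameter are made up for this illustration and are not part of the R app.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Try to resolve `host`; return (True, addresses) or (False, error text).

    `resolver` is a hypothetical hook for this sketch so the lookup can be
    replaced in tests; by default the standard library's getaddrinfo is used.
    """
    try:
        infos = resolver(host, port)
    except socket.gaierror as exc:
        # Same failure class as in the traceback: the name lookup itself failed.
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses
```

Calling can_resolve("public.opencpu.org") on the Splunk server, ideally with Splunk's bundled interpreter (for example via $SPLUNK_HOME/bin/splunk cmd python), should show whether the lookup itself fails. If it returns False, the likely cause is DNS or proxy configuration on that machine rather than the R app, though that is an assumption based only on the error text above.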