It appears that submitting my search through the REST API is somehow causing the leading pipe to be stripped before the inputcsv command. I have this search string in my Python script:
"| inputcsv scale_med_validation_data | apply fastflux_model | where 'predicted(is_attack)' = 1 | eval t = now()+3600*1 | eval report_hour=strftime(t, "%H") | eval report_date=strftime(t, "%m/%d/%Y") | tail 50 | collect index=fastflux_summary"
This works as desired when entered manually through the web interface.
However, when submitted through the REST API, the jobs screen shows the search query missing the leading pipe:
"inputcsv scale_med_validation_data | apply fastflux_model | where 'predicted(is_attack)' = 1 | eval t = now()+3600*1 | eval report_hour=strftime(t, "%H") | eval report_date=strftime(t, "%m/%d/%Y") | tail 50 | collect index=fastflux_summary"
Naturally, this causes the inputcsv to fail, and so none of the REST API jobs succeed. Why might the leading pipe not be making it through here?
Hey @kcnolan13,
I just heard back from our engineering team, and there is indeed an issue with the script as shown in the docs: the check for queries starting with 'search' prepends 'search' without accounting for queries that begin with a pipe. Here is an updated script that should fix the problem. Note this change:

    # If the query doesn't already start with the 'search' operator or another
    # generating command (e.g. "| inputcsv"), then prepend "search " to it.
    if not (searchQuery.startswith('search') or searchQuery.startswith("|")):
        searchQuery = 'search ' + searchQuery

I will update the docs. Let me know how this works for you!
import urllib
import httplib2
from xml.dom import minidom

baseurl = 'https://re-latitude.sv.splunk.com:8089'
userName = 'guest'
password = 'guest'
searchQuery = '| inputcsv foo.csv | where sourcetype=access_common | head 5'

# Authenticate with the server.
# Disable SSL cert validation; Splunk certs are self-signed.
serverContent = httplib2.Http(disable_ssl_certificate_validation=True).request(
    baseurl + '/services/auth/login', 'POST', headers={},
    body=urllib.urlencode({'username': userName, 'password': password}))[1]
sessionKey = minidom.parseString(serverContent).getElementsByTagName('sessionKey')[0].childNodes[0].nodeValue

# Remove leading and trailing whitespace from the search.
searchQuery = searchQuery.strip()

# If the query doesn't already start with the 'search' operator or another
# generating command (e.g. "| inputcsv"), then prepend "search " to it.
if not (searchQuery.startswith('search') or searchQuery.startswith("|")):
    searchQuery = 'search ' + searchQuery

print searchQuery

# Run the search.
# Again, disable SSL cert validation.
print httplib2.Http(disable_ssl_certificate_validation=True).request(
    baseurl + '/services/search/jobs', 'POST',
    headers={'Authorization': 'Splunk %s' % sessionKey},
    body=urllib.urlencode({'search': searchQuery}))[1]
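The core of the fix can be isolated into a small helper (a sketch in Python 3 syntax; the function name normalize_query is mine, not from the docs):

```python
def normalize_query(searchQuery):
    """Prepend 'search ' only when the query has no explicit operator.

    Queries beginning with '|' already name a generating command
    (e.g. inputcsv) and must pass through unchanged.
    """
    searchQuery = searchQuery.strip()
    if not (searchQuery.startswith('search') or searchQuery.startswith('|')):
        searchQuery = 'search ' + searchQuery
    return searchQuery
```

With this check in place, '| inputcsv ...' searches keep their leading pipe, while bare term searches still get 'search ' prepended as before.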
Nice catch, that was it!
Awesome! Glad to hear it.
Try this:
"search | inputcsv ..."
Believe it or not, that hasn't worked either. The "search |" is stripped off and all that shows up in the job viewer query window is still: "inputcsv scale_med_validation_data | apply fastflux_model | where 'predicted(is_attack)' = 1 | eval t = now()+3600*1 | eval report_hour=strftime(t, "%H") | eval report_date=strftime(t, "%m/%d/%Y") | tail 50 | collect index=fastflux_summary"
ok, how about
search index=* | head 1 | eval foo="deleteme" | inputcsv ... | blah blah blah| search NOT foo="deleteme"
(just out of curiosity)
Yeah, I've tried that kind of thing too, but you get this:
"Error in 'inputcsv' command: This command must be the first command of a search."
Good thought though.
Oh crap, that's right.
How about putting the "|inputcsv..." in a macro? Then...
search `foo` | blah blah...
Nice workaround. I'll give it a shot tomorrow and see if it takes.
Something else to try:
search * | head 1 | append [|inputcsv foo.csv | blah ] | blah
Might run into issues if your csv is large (e.g. >50K rows)
So, the macro option doesn't work because unfortunately you still get this:
"Error in 'inputcsv' command: This command must be the first command of a search."
And I'm working with some large CSV files, so the other suggestion isn't ideal for this use case.
Any other tricks up your sleeve?
Have you tried just using curl?
curl -ku 'admin':'changeme' https://myserver:8089/servicesNS/admin/search/search/jobs/export -d search="|inputcsv foo.csv | blah" -d output_mode=csv
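For what it's worth, standard URL encoding preserves the pipe character, so the stripping shouldn't be happening in the request encoding itself. A quick check with the Python 3 stdlib (the example search string is arbitrary):

```python
from urllib.parse import urlencode

# Encode a POST body the same way the search would be submitted.
body = urlencode({'search': '|inputcsv foo.csv | blah',
                  'output_mode': 'csv'})

# The leading pipe is percent-encoded (%7C), not dropped.
print(body)
```

This suggests the pipe is being lost by whatever code builds or pre-processes the query, not by the HTTP layer.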
Hi @kcnolan13,
What endpoint are you using to submit the search?
Have you tried escaping the pipe character?
My base URL is https://xx.xx.xx.xx:8089/
What method of escaping are you referring to? I tried sticking a "\" in front of the leading pipe, but only ended up with a parse error.
Ok, it looks like you are using the correct management port to submit the request. But what endpoint are you using to submit the search? Are you creating a saved search and then retrieving the results? Are you using an SDK or is there anything else about how you are submitting the search that might help troubleshoot?
It might be good to get more context before going further with escaping characters. That might not be the issue.
For extensive troubleshooting, it might also be helpful to contact support.
I'm using a nearly identical Python script to the example shown here:
The important part probably being:
sid = httplib2.Http(disable_ssl_certificate_validation=True).request(
    baseurl + '/services/search/jobs', 'POST',
    headers={'Authorization': 'Splunk %s' % sessionKey},
    body=urllib.urlencode({'search': searchQuery}))[1]
Thanks for the info. I have an active request in with our engineering team to review the Python example here, and I will add your question/issue to it.
In the meantime, in case you're open to alternatives, there is a Python SDK for developers that might be helpful to you, with info on creating and running searches here:
http://dev.splunk.com/view/python-sdk/SP-CAAAEE5
Thanks @frobinson. I'm aware of the SDK, but hoped I could just bang out this small task with a modified version of the example Python script. I hope the developers fix this issue, if it is indeed on their end.
I understand. I've pinged some folks again about this, will post again here if I get an update. Sorry for the confusion!