Getting Data In

Configuring outputs.conf for an "all in one" box

DEADBEEF
Path Finder

I stood up a test instance of Splunk that is an "all in one" system, that is, both indexer and search head. I wrote an app that pulls data via a REST API, but realized I wasn't sure whether it needs a custom outputs.conf if I am "sending" to the same system.

Since it is acting as an indexer, wouldn't it immediately pull the data and then index it without needing a /local/outputs.conf? I wasn't sure and couldn't find any clear documentation explaining this specific scenario.

My script pulls data, but nothing is populating the main index. If I run the script manually, the data prints to stdout as expected.


soutamo
SplunkTrust

Hi

This should work.

inputs.conf

 

[script://$SPLUNK_HOME/etc/apps/<your app name>/bin/json.sh]
disabled = false
index = test
interval = 60.0
sourcetype = json_no_timestamp

 

then script

 

#!/bin/bash
echo '{"a":"a","b":"b","c":1,"d":{"aa":1,"ab":"ba"}}'

 

You have your own sourcetype for that script (h1:json), which is what best practice recommends. Can you share that definition and a sample of your script's output, so the Community can help verify that nothing odd is going on?

One hint: don't set both INDEXED_EXTRACTIONS = json and KV_MODE = json at the same time, or the fields will be extracted twice and show duplicate values!

r. Ismo


DEADBEEF
Path Finder

Your example did work for me, so I guess the problem is my script. I adjusted my .conf files regardless, but still nothing. No idea what the issue may be.

---REVISED ---

inputs.conf

 

[script://$SPLUNK_HOME/etc/apps/TA-HackerOne/bin/hacker_one_pull.sh]
disabled = false
index = test
interval = 180.0
sourcetype = h1:json

 

props.conf

 

[h1:json]
DATETIME_CONFIG=CURRENT
INDEXED_EXTRACTIONS=json
KV_MODE=none
LINE_BREAKER=([\r\n]+)

 

data sample

 

-bash-4.2$ pwd
/opt/splunk/etc/apps/TA-HackerOne/bin
-bash-4.2$ ./hacker_one_pull.sh
{
  "id": "49",
  "type": "report",
  "attributes": {
    "name": "Lorem Ipsum",
    "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
    "external_id": "aaa-123",
    "created_at": "2018-02-27T16:48:23.308Z"
  }
}
{
  "id": "20",
  "type": "report",
  "attributes": {
    "name": "Finibus Bonorum",
    "description": "Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium.",
    "external_id": "aaa-726",
    "created_at": "2019-09-11T08:26:14.625Z"
  }
}
-bash-4.2$

 

This shows up in _internal (but only once, never again)

08-29-2020 20:01:04.689 +0000 INFO  ExecProcessor - New scheduled exec process: /opt/splunk/etc/apps/TA-HackerOne/bin/hacker_one_pull.sh
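
When a scripted input is scheduled but never produces events, it can help to check whether the script printed any errors. As far as I know, ExecProcessor usually writes a script's stderr output to splunkd.log. Below is a minimal sketch that lists the ExecProcessor lines mentioning a given script. The function name exec_messages is made up for this example, and the log path assumes the default layout under $SPLUNK_HOME.

```shell
#!/bin/bash
# Sketch: list ExecProcessor log lines that mention a given scripted input.
# Assumption: splunkd.log lives under $SPLUNK_HOME/var/log/splunk (default layout).

exec_messages() {
    local script_name="$1"
    local log_file="${2:-${SPLUNK_HOME:-/opt/splunk}/var/log/splunk/splunkd.log}"
    # -F treats the patterns as literal strings, not regular expressions.
    grep -F "ExecProcessor" "$log_file" | grep -F -- "$script_name"
}

# Example call (hypothetical usage):
#   exec_messages hacker_one_pull.sh
```

If only the "New scheduled exec process" line comes back, the script probably ran without writing anything to stderr, which is worth double-checking when the script silences its own errors.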

soutamo
SplunkTrust

Hi

Can you try changing your props.conf to:

[h1:json]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = attributes.created_at
TIME_FORMAT = %FT%T.%3Q%Z
disabled = false
pulldown_type = true

Based on my test this should fix it (at least with your examples).

r. Ismo 


DEADBEEF
Path Finder

Thank you very much for all the suggestions and troubleshooting. After the test script worked while mine kept failing, I concluded that the issue was with my script. I broke it down to the bare minimum commands needed to replicate its functionality AND ran it with the splunk binary (rather than just from the command line as the splunk user), and realized that Splunk was running into problems. My original script uses the curl command with the silent flag (-s), so the errors themselves were being hidden from the output.

With the bare-minimum version, I saw Splunk throwing errors because it was unable to access (or for some reason couldn't find) the SSL CA cert path. There were two solutions: either use the -k switch in my curl command, or provide the full path using --cacert. I tested both and both work, but I ended up using solution 2.

Testing using the splunk binary

/opt/splunk/bin/splunk cmd ./json.sh

Solution 1: (-k) skip SSL certificate validation

curl -sgk "https://api.website.com/rest_endpoint" -X GET -u "user:api_token" -H 'Accept: application/json'

 Solution 2: Provide the path to standard certs

curl -sg --cacert /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem "https://api.website.com/rest_endpoint" -X GET -u "user:api_token" -H 'Accept: application/json'
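
For anyone hitting the same thing, here is a rough sketch of how solution 2 could be wrapped so that errors stay visible. It adds curl's -S (show errors even with -s) and --fail (non-zero exit on HTTP errors). The function names pick_ca_bundle and pull_reports are made up for this example, the URL and credentials are the same placeholders as above, and the second bundle path (the usual Debian/Ubuntu location) is an assumption, not something tested in this thread.

```shell
#!/bin/bash
# Sketch of a scripted-input wrapper. URL and credentials are the
# placeholders from this thread, not real values.

# Print the first readable CA bundle among the given candidate paths.
pick_ca_bundle() {
    local candidate
    for candidate in "$@"; do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no readable CA bundle found" >&2
    return 1
}

pull_reports() {
    local ca_bundle
    # The second path is an assumed fallback for Debian-style systems.
    ca_bundle=$(pick_ca_bundle \
        /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem \
        /etc/ssl/certs/ca-certificates.crt) || return 1
    # -s hides the progress meter, -S still prints errors to stderr,
    # --fail turns HTTP 4xx/5xx responses into a non-zero exit code.
    curl -sS -g --fail --cacert "$ca_bundle" \
        "https://api.website.com/rest_endpoint" \
        -X GET -u "user:api_token" -H 'Accept: application/json'
}

# Call pull_reports at the end of the real script:
#   pull_reports
```

With -S in place, a certificate problem prints an error to stderr instead of the script silently returning nothing.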


soutamo
SplunkTrust

Hi

Nice to hear that you solved your problem. As you said, one must test all scripts with "splunk cmd <your script>" or "splunk cmd python <your script>" to ensure that they also work when they are called from inside Splunk! It's not unusual that, for example, Python scripts work on the command line and fail the first time they are tested with Splunk.
Happy Splunking!
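
One way to see why a script behaves differently under Splunk is to print a few environment variables from inside it, then run it once directly and once via splunk cmd and compare. This is only a sketch: the function name dump_env and the list of variables are my own choice for illustration (SSL_CERT_FILE and CURL_CA_BUNDLE are the variables OpenSSL and curl consult for a CA bundle).

```shell
#!/bin/bash
# Sketch: print variables that often differ between an interactive shell
# and a process started by Splunk. The variable list is an assumption.

dump_env() {
    local name
    for name in PATH LD_LIBRARY_PATH SPLUNK_HOME SSL_CERT_FILE CURL_CA_BUNDLE; do
        # ${!name-...} expands the variable whose name is stored in $name,
        # falling back to "(unset)" when that variable does not exist.
        printf '%s=%s\n' "$name" "${!name-(unset)}"
    done
}

# Write to stderr so the dump never ends up in the indexed events.
dump_env >&2
```

Put this at the top of the script, run ./json.sh and /opt/splunk/bin/splunk cmd ./json.sh, and compare the two dumps.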

r. Ismo

thambisetty
Super Champion

You should invoke the script through Splunk's inputs.conf so that the events it prints on execution are collected.

Can you share the inputs.conf that calls your script?

If you haven't created one yet, create one like the example below:

[script://./bin/yourscript.extension]
index=<indexname>
interval = <set frequency >
sourcetype = <set sourcetype>

https://docs.splunk.com/Documentation/Splunk/8.0.5/Admin/Inputsconf

————————————
If this helps, give a like below.

DEADBEEF
Path Finder

I have an inputs.conf, a props.conf, and my script. My script file is in /opt/splunk/etc/apps/<myapp>/bin/hacker_one_pull.sh

[script://./bin/hacker_one_pull.sh]
index = main
interval = 600
sourcetype = h1:json
source = api_hackerone
disabled = false
send_index_as_argument_for_path = false

 


thambisetty
Super Champion

Is there any reason for adding the setting below?

send_index_as_argument_for_path

Can you remove that line, restart the Splunk service, and check again?

————————————
If this helps, give a like below.

DEADBEEF
Path Finder

I was reviewing the docs on inputs because I am not getting data in right now and came across that under the scripted inputs section:

send_index_as_argument_for_path = <boolean>
* Whether or not to pass the index as an argument when specified for
  stanzas that begin with 'script://'
* When this setting is "true", the script passes the argument as
  '-index <index name>'.
* To avoid passing the index as a command line argument, set this to "false".
* Default: true.

Anyway, it's commented out now and I restarted Splunk.

This shows up in _internal (and has been on every restart in my troubleshooting) but still no data is coming in.  I ran the script manually to ensure it is working (as the splunk user) and JSON data is printing to screen so I know it does work.

08-29-2020 05:37:09.605 +0000 INFO  ExecProcessor - New scheduled exec process: /opt/splunk/etc/apps/TA-HackerOne/bin/hacker_one_pull.sh

 


thambisetty
Super Champion

Can you set the time range to "All time" and search with the sourcetype and index given in inputs.conf?

————————————
If this helps, give a like below.

thambisetty
Super Champion

Also, please share your props.conf.

Confirm whether there is a timestamp in the JSON logs that the script prints.

————————————
If this helps, give a like below.

DEADBEEF
Path Finder

inputs.conf

[script://$SPLUNK_HOME/etc/apps/TA-HackerOne/bin/hacker_one_pull.sh]
index = main
interval = 180.0
sourcetype = h1:json
source = api_hackerone
disabled = false
# send_index_as_argument_for_path = false

props.conf (NOTE: CURRENT time is the intended and desired behavior)

[h1:json]
CHARSET=UTF-8
DATETIME_CONFIG=CURRENT
INDEXED_EXTRACTIONS=json
KV_MODE=none
SHOULD_LINEMERGE=false
category=Structured
description=HackerOne JSON data via REST API
disabled=false
pulldown_type=true
LINE_BREAKER=([\r\n]+)

 Sample data (NOTE: attributes.created_at is not the desired timestamp for _time, hence using CURRENT)

 

-bash-4.2$ pwd
/opt/splunk/etc/apps/TA-HackerOne/bin
-bash-4.2$ ./hacker_one_pull.sh
{
  "id": "49",
  "type": "report",
  "attributes": {
    "name": "Lorem Ipsum",
    "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
    "external_id": "aaa-123",
    "created_at": "2018-02-27T16:48:23.308Z"
  }
}
{
  "id": "20",
  "type": "report",
  "attributes": {
    "name": "Finibus Bonorum",
    "description": "Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium.",
    "external_id": "aaa-726",
    "created_at": "2019-09-11T08:26:14.625Z"
  }
}
-bash-4.2$