Why do I have data ingestion issues on a newly set up standalone server?

juju
Explorer

I installed a standalone Splunk instance with https://splunk.github.io/splunk-ansible/
Version 9.0.4 on Ubuntu jammy 22.04.2.


The instance is up and seems to be running fine, but after configuring data ingestion, nothing shows up in search.
I tested 4 different inputs and none of them worked.

I suspect something related to indexing, but I could not identify where to track it down (log file, Splunk table...).
Any advice?

Thanks


* Data receiving on port 9997
* Data input: TCP for syslog data
* Local Log file: /var/log/dpkg.log
* Local systemd-journald


I can find ingestion activity in /opt/splunk/var/log/splunk/metrics.log but not in `index=_* component=metrics | stats count BY index,component,group,host` (no group=tcpin_connections). Splunk had a systemd-journal permission issue until I fixed it by adding the splunk user to the systemd-journal group.
Network ports are listening according to `ss -tunap` and were tested successfully with `nc`.
`curl -vk -u user:pass https://localhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus > TailingProcessor-FileStatus.xml` confirms ingestion for local log file only
`index=_internal source=*metrics.log tcpin_connections` = no results
`| tstats count WHERE index=* OR index=_* BY host` = returns data, but only for the Splunk internal indexes (i.e. _*) and the local Splunk server
`index=_internal component!="Metrics" | stats count BY index,component,group,host` = only _internal/LMStack*/Trial/splunkhost
Settings > Licensing: "Trial license group": current = No licensing alerts/violations, 0% of quota
Settings > Monitoring console: returns nothing on indexing (0 KB/s), historic data only shows internal sources spike at install time.
Splunkd system status is green. I tried restarting a few times, but it did not help.

I checked the following resources but did not find the issue
https://docs.splunk.com/Documentation/Splunk/9.0.4/Forwarding/Receiverconnection
https://docs.splunk.com/Documentation/Splunk/9.0.4/Troubleshooting/Cantfinddata
https://community.splunk.com/t5/Getting-Data-In/What-are-the-basic-troubleshooting-steps-in-case-of-...

 

The only warning I get from the Web console is on Resource usage IOWait, as this is a lab system without production specs. To me, that should only slow things down, not block ingestion.

 

Extract from config

 

 

# more /opt/splunk/etc/apps/search/local/inputs.conf
[monitor:///var/log/dpkg.log]
disabled = false
host = mlsplunk002
index = dpkg
sourcetype = dpkg

[tcp://9514]
connection_host = dns
host = mlsplunk002
index = syslog
sourcetype = syslog

[journald://journald]
interval = 30
journalctl-exclude-fields = __MONOTONIC_TIMESTAMP,__SOURCE_REALTIME_TIMESTAMP
journalctl-include-fields = PRIORITY,_SYSTEMD_UNIT,_SYSTEMD_CGROUP,_TRANSPORT,_PID,_UID,_MACHINE_ID,_GID,_COMM,_EXE
journalctl-quiet = true

[tcp://6514]
connection_host = dns
host = mlsplunk002
index = syslog
sourcetype = syslog

# more /opt/splunk/etc/apps/search/local/props.conf
[dpkg]
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
description = /var/log/dpkg.log
disabled = false
pulldown_type = true

# more /opt/splunk/etc/apps/search/local/indexes.conf
[dpkg]
coldPath = $SPLUNK_DB/dpkg/colddb
enableDataIntegrityControl = 0
enableTsidxReduction = 0
homePath = $SPLUNK_DB/dpkg/db
maxTotalDataSizeMB = 512000
thawedPath = $SPLUNK_DB/dpkg/thaweddb

[syslog]
coldPath = $SPLUNK_DB/syslog/colddb
enableDataIntegrityControl = 0
enableTsidxReduction = 0
homePath = $SPLUNK_DB/syslog/db
maxTotalDataSizeMB = 512000
thawedPath = $SPLUNK_DB/syslog/thaweddb

 

 

 


woodcock
Esteemed Legend

There are many things that could be happening, including:
* The splunk process does not have permissions to read the files.
* The splunk process does not have permissions to traverse the path to the files.
* You have a stray outputs.conf that is trying to send the data elsewhere instead of indexing locally.
* Your event timestamps are too old or too far in the future and are being dropped (i.e. MAX_DAYS_AGO/MAX_DAYS_HENCE).
* Your events are being thrown into the near future and your time picker is not set to see them.
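
For the two timestamp items, the check comes down to a window around the current time. Here is a minimal sketch in Python, assuming the props.conf defaults of MAX_DAYS_AGO = 2000 and MAX_DAYS_HENCE = 2 (please verify these against your own props.conf). The function name `timestamp_in_window` is made up for illustration and is not part of Splunk.

```python
from datetime import datetime, timedelta, timezone

# Assumed defaults; confirm the effective values in your own props.conf.
MAX_DAYS_AGO = 2000
MAX_DAYS_HENCE = 2


def timestamp_in_window(event_time, now,
                        max_days_ago=MAX_DAYS_AGO,
                        max_days_hence=MAX_DAYS_HENCE):
    """Return True if event_time lies within [now - max_days_ago, now + max_days_hence]."""
    oldest = now - timedelta(days=max_days_ago)
    newest = now + timedelta(days=max_days_hence)
    return oldest <= event_time <= newest


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # An event stamped three days ahead falls outside the assumed window.
    print(timestamp_in_window(now + timedelta(days=3), now))
```

If a sample event fails this check, searching with the time picker set to "All time" is a quick way to see where it actually landed.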

You should be able to ask Splunk itself what the problem is with a search like this:
index=_* AND ("dpkg" OR "syslog") AND ("ERR*" OR "FAIL*" OR "CANNOT" OR "UNABLE" OR "TIMEOUT")
| stats count max(_time) AS _time values(log_level) last(_raw) AS raw BY index sourcetype punct

juju
Explorer

No real cause or solution found so far, but the problem did not reproduce on a similarly built setup with bigger specs (AWS t3.large).

richgalloway
SplunkTrust

You'll only see data from port 9997 if you have a forwarder set up to send said data. You didn't mention a forwarder so we'll ignore this source.

You'll only see data from port 9514 if you have a syslog source configured to send data to Splunk. This also was not mentioned so we'll ignore this source, too. Besides, Splunk discourages sending syslog events directly to a Splunk server. Instead, use a dedicated syslog server (syslog-ng or rsyslog) with a forwarder.

That leaves the two file inputs, which should work.

Do the indexes listed in inputs.conf exist in Splunk?  Go to Settings->Indexes to see and to create them as necessary.  If the indexes don't exist then Splunk will drop the events.
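
If you prefer to cross-check this from the conf files rather than the UI, something like the following Python sketch could work. It rests on several assumptions: an input stanza without an `index` setting goes to `main`, `main` always exists, and only the conf text you pass in is considered, so indexes defined in other apps are not seen. The names `parse_conf`, `undefined_indexes` and `BUILTIN_INDEXES` are hypothetical helpers, not Splunk tooling.

```python
import configparser

# Assumption: "main" is where events go when a stanza sets no index,
# and it always exists on a stock install.
BUILTIN_INDEXES = {"main"}


def parse_conf(text):
    """Parse Splunk .conf text; stanza names and $SPLUNK_DB values are kept as-is."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep setting names case-sensitive
    parser.read_string(text)
    return parser


def undefined_indexes(inputs_text, indexes_text, default_index="main"):
    """Return {input stanza: index} for inputs whose index is not defined."""
    defined = set(parse_conf(indexes_text).sections()) | BUILTIN_INDEXES
    inputs = parse_conf(inputs_text)
    missing = {}
    for stanza in inputs.sections():
        index = inputs.get(stanza, "index", fallback=default_index)
        if index not in defined:
            missing[stanza] = index
    return missing
```

An empty result means every input in that file points at an index the sketch knows about. Note that under the assumption above, a stanza such as `[journald://journald]` with no `index` line would be searched for in `main`, not in `dpkg` or `syslog`.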


juju
Explorer

For all of the given sources, data is being sent, either continuously or on demand.

This is validated in metrics.log for port 9997
55823:03-01-2023 00:02:02.367 +0000 INFO Metrics - group=tcpin_connections, 192.168.x.y:54836:9997, connectionType=cooked, sourcePort=54836, sourceHost=192.168.x.y, sourceIp=192.168.x.y, destPort=9997, kb=0.334, _tcp_Bps=11.032, _tcp_KBps=0.011, _tcp_avg_thruput=0.011, _tcp_Kprocessed=1877.581, _tcp_eps=0.032, _process_time_ms=0, evt_misc_kBps=0.000, evt_raw_kBps=0.000, evt_fields_kBps=0.000, evt_fn_kBps=0.000, evt_fv_kBps=0.000, evt_fn_str_kBps=0.000, evt_fn_meta_dyn_kBps=0.000, evt_fn_meta_predef_kBps=0.000, evt_fn_meta_str_kBps=0.000, evt_fv_num_kBps=0.000, evt_fv_str_kBps=0.000, evt_fv_predef_kBps=0.000, evt_fv_offlen_kBps=0.000, evt_fv_fp_kBps=0.000, build=3.5.3-08986e05, version=3.5.3-08986e05, os=linux, arch=x64, hostname=mlcribl002, guid=f2095906-d733-4453-8b2a-327df0005014, fwdType=full, ssl=false, lastIndexer=None, ack=true
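
Lines like this are easier to compare when split into fields. Here is a small Python sketch that turns the `key=value` part of a metrics.log line into a dictionary. It assumes the payload follows the first " - " and that fields are separated by ", ", which matches the line above but is not a documented format guarantee. `parse_metrics_fields` is a made-up helper name.

```python
def parse_metrics_fields(line):
    """Split the key=value payload of a metrics.log line into a dict of strings."""
    # Everything after the first " - " is treated as the payload.
    _, _, payload = line.partition(" - ")
    fields = {}
    for item in payload.split(", "):
        key, sep, value = item.partition("=")
        if sep:  # skip tokens without "=", such as the ip:port:port triple
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = ("03-01-2023 00:02:02.367 +0000 INFO Metrics - "
              "group=tcpin_connections, connectionType=cooked, destPort=9997, kb=0.334")
    fields = parse_metrics_fields(sample)
    print(fields["group"], fields["destPort"], fields["kb"])
```

With the fields separated, values such as `kb`, `hostname` and `fwdType` can be compared across intervals without reading the raw line each time.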

For port 9514, `logger --server localhost --port 9514 "test splunk logger"`date``
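
One caveat, as far as I read the util-linux `logger` man page: with `--server` and neither `--udp` nor `--tcp`, logger tries UDP first, while the input configured above is `[tcp://9514]`. Adding `--tcp`, or sending over a plain TCP socket, removes that doubt. Here is a minimal Python sketch of the latter. The helper name `send_test_event` and the message format are my own choices; host and port are whatever your input listens on.

```python
import socket
from datetime import datetime, timezone


def send_test_event(host, port, message, timeout=5.0):
    """Send one newline-terminated test line over TCP and return the bytes sent."""
    stamp = datetime.now(timezone.utc).isoformat()
    payload = f"{message} {stamp}\n".encode("utf-8")
    # create_connection raises OSError if nothing accepts TCP on host:port.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
    return payload


if __name__ == "__main__":
    # Assumes the [tcp://9514] input from the posted inputs.conf is listening locally.
    send_test_event("localhost", 9514, "test splunk logger")
```

A successful send only shows that the TCP listener accepted the bytes. Whether the event was indexed still has to be checked with a search on the target index.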

For systemd-journald, splunkd.log showed this before the group permission was fixed:
19275:02-27-2023 01:07:10.419 +0000 ERROR ExecProcessor [185621 ExecProcessor] - message from "/opt/splunk/bin/splunkd journald-modinput '$@'" No journal files were opened due to insufficient permissions.
For dpkg.log, splunkd.log showed this after setup:
02-27-2023 01:08:16.737 +0000 INFO TailingProcessor [218543 MainTailingThread] - Adding watch on path: /var/log/dpkg.log.

The indexes exist, as shared in indexes.conf and confirmed in the Web UI.
