Splunk Search

Why is index not populating?

JeffPoretsky
Loves-to-Learn

A Splunk user attempted a search of index="os".

It returns nothing after Dec 23. (Yes, this went unnoticed for this long. We were on a single version of Red Hat until recently.)

Splunk servers are all RHEL 7.9.

Version: 8.2.4
Build: 87e2dda940d1

Clients are all RHEL 7.9 or 8.5.


JeffPoretsky
Loves-to-Learn

1 - probably

2 - if by that you mean /opt/splunk/splunkforwarder, yes that is the default on all clients in our environment

3 - I see almost all of our servers using the search given

4 - haven't touched our config files since installation. I have done splunk updates and OS patching, both using a shutdown/patch-or-update/restart sequence that has been approved directly by Splunk.

 

I expect I will get nowhere here as the answers so far have presumed knowledge that the admin team here was _never_ _given_.  Again. We were supposed to have training on days 4 and 5 of installation. But since days 1 and 2 were taken doing tasks that we were told had to be done before install could happen - even though we asked what to do before the installation - we DID NOT GET TRAINING.

I know the very basics. But nothing more.


isoutamo
SplunkTrust

Sorry for unclear questions 😉

As you have done both OS and Splunk updates, that could be the reason (especially if you are starting those via init.d).

When you run that previous query with a time span after that patching time, are you still seeing those UF hosts (e.g. add earliest=-1d to the first line)?
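For example, something like this (assuming the per-host search over _internal):

index=_internal earliest=-1d host!=<your splunk server>
| stats max(_time) as _time count by host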

If not, then it's possible that those splunk processes are waiting for some confirmation before they reach fully functional mode on the source system. You can check it by logging in to one of them and then trying the following:

sudo -u <your splunk user> bash
/opt/splunkforwarder/bin/splunk status

If it's running normally, this should give you the process id etc.

You could also check what you have in /opt/splunkforwarder/var/log/splunk/splunkd.log.

There should be information about what the UF has done and whether there are any errors preventing splunk from starting.
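For a quick look you could, for example, grep the recent errors and warnings out of it:

grep -iE "ERROR|WARN" /opt/splunkforwarder/var/log/splunk/splunkd.log | tail -n 50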

If it's not running then just start it with 

/opt/splunkforwarder/bin/splunk start --accept-license --answer-yes

Run this as your splunk user.

r. Ismo

 


JeffPoretsky
Loves-to-Learn

Next step.

Ran another one-liner to check splunk status.

While there are a few which do not have the license accepted (which I will fix shortly), all the others are running properly, and the log shows current activity.


PickleRick
SplunkTrust

OK. We're getting somewhere.

Step by step, maybe we'll find something.

@isoutamo's search should give you the number of events (which is a bit less important at the moment) and the latest event you ingested from a given server. Are those times long ago, or are they fairly recent?

As you say you have the /opt/splunk/splunkforwarder directory on those servers (which is good), check whether you have a process called splunkd running, for example by running:

ps auxw | grep splunkd

If you're getting some results, that means the forwarder is running but the events are not being ingested. But if you're not getting results, that means that the forwarder process is not running at all. Maybe during your OS upgrade operations you did a reboot and the system was not configured to start the forwarder automatically?
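You can quickly check whether automatic start is configured, for example with something like this (the systemd unit that enable boot-start creates for a UF is typically called SplunkForwarder; an init.d-style setup drops a script into /etc/init.d instead):

systemctl is-enabled SplunkForwarder
ls /etc/init.d/splunk*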


JeffPoretsky
Loves-to-Learn

Ran an ansible one-liner against all my clients. All are running splunkd.

 

Will check the next suggestion shortly.

 

(work got busy)


isoutamo
SplunkTrust

You should check which nodes had sent data to it before that. Then check from the MC, or with a query over the _internal logs, whether those hosts are still sending any data to your splunk node. If not, then log into those hosts and check whether splunk is running on them, whether there are missing inputs, or whether something else, like a firewall, is blocking sending/receiving events.
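On a forwarder host you can also quickly see where it thinks it is sending data, e.g. (assuming the default UF path):

/opt/splunkforwarder/bin/splunk list forward-server

That lists the configured and currently active forward-server destinations, so you can spot a host that has lost its connection to your indexer.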

r. Ismo


JeffPoretsky
Loves-to-Learn

Huh. On admin nodes I get either a reading error or a message about 0 population, but splunk is running on all splunk servers.


PickleRick
SplunkTrust

I think we're getting some things confused a bit here.

Please correct me if I'm wrong anywhere.

1) Your index=os is supposed to collect data from remote (i.e. not hosting splunk infrastructure) hosts

2) You are using Universal Forwarders installed on those hosts to get events from them

The obvious questions (some of which @isoutamo already asked, but I'm not sure if you interpreted them correctly) are:

1) Have there been any changes introduced to your splunk environment around the time the events stopped being ingested?

2) If you are indeed using Universal Forwarders - are they running on the hosts you are getting events from? Not on your splunk servers! On those non-splunk servers you're pulling your logs from.

3) If your Universal Forwarders are running, verify whether logs are being ingested from those hosts into the _internal index - @isoutamo showed you a search for it.

4) If you have recent events in _internal but don't have recent events in the os index, well... either you're filtering them out on your splunk servers (less probable) or they are simply not being ingested from the source machines (more probable). In either case you'd have to go through your config, starting with looking through inputs.conf on your UFs (see the example below).
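A quick way to see the effective inputs on a forwarder host is btool, e.g. (assuming the default UF path):

/opt/splunkforwarder/bin/splunk btool inputs list --debug

That prints every inputs.conf stanza the UF has actually loaded and the file each setting comes from, so you can check whether the os-related monitors are still there and pointing at the right index.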


isoutamo
SplunkTrust
SplunkTrust

And do you see internal logs from those hosts on your splunk server?

index=_internal host!=<your splunk server>
| stats max(_time) as _time count by host

Or just use the MC on your node (I suppose that you have enabled forwarder monitoring in it).

r. Ismo 


JeffPoretsky
Loves-to-Learn

Finally getting around to this.

 

On all servers where splunk is installed, the splunkd service is running.

 

(an example from grep:
{server}-106 | CHANGED | rc=0 >>
splunk 5131 1 0 Apr07 ? 01:18:04 splunkd -p 8089 start
splunk 5233 5131 0 Apr07 ? 00:00:00 [splunkd pid=5131] splunkd -p 8089 start [process-runner]

)

 

running the

index=_internal host!=utility-log-* | stats max(_time) as _time count by host

returns many servers.

 

Again: the only changes made to the clients and servers have been patching, updating, and rebooting. Yes, _some_ of the servers had not been initialized with --accept-license, but that's been fixed.

 

So it looks like I have no hope of getting this fixed due to the way Splunk does "support".


isoutamo
SplunkTrust

Hi

If you are running it with systemd, then this should work:

  - name: "Start Splunk via service"
    service:
      name: "{{ splunk_service_name }}"
      state: started
    become: yes
    become_user: "{{ privileged_user }}"
    when:
      - splunk.enable_service
      - ansible_system is match("Linux")

Also, after you have installed/updated splunk you must add/update (read: first remove, then add) boot-start again, like:

- name: "Enable service via boot-start - Linux (init)"
  command: "{{ splunk.exec }} enable boot-start -user {{ splunk.user }} --accept-license --answer-yes --no-prompt"
  become: yes
  become_user: "{{ privileged_user }}"
  when:
    - ansible_system is match("Linux") and not splunk_systemd
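Done by hand on a single host, that remove-then-add would be roughly this (assuming the UF is under /opt/splunkforwarder and runs as the splunk user; run as root):

/opt/splunkforwarder/bin/splunk disable boot-start
/opt/splunkforwarder/bin/splunk enable boot-start -user splunk --accept-license --answer-yes --no-prompt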

and if outside of systemd

  - name: "Start Splunk via CLI"
    command: "{{ splunk.exec }} start --accept-license --answer-yes --no-prompt"
    register: start_splunk
    changed_when: start_splunk.rc == 0 and 'already running' not in start_splunk.stdout
    when: not splunk.enable_service
    until: start_splunk.rc == 0
    retries: 5
    delay: 10
    become: yes
    become_user: "{{ splunk.user }}"

These are the same as / based on the tasks in https://github.com/splunk/splunk-ansible.

r. Ismo


JeffPoretsky
Loves-to-Learn

I presume those are all ansible plays.

Thank you.

Now to run those and see how they _will_ fail. 


isoutamo
SplunkTrust
Yep, individual tasks from the splunk_common role.

I cannot recall how close to the originals those are, or whether they already have some modifications.