Getting Data In

Good Data Input .. No Indexing

vbrtrmn
Explorer

I have a data source on the local file system, configured as follows.

Path:

/data/splunk/rrsearch/server-01/processed.1341878400.gz
/data/splunk/rrsearch/server-01/processed.1341964800.gz
/data/splunk/rrsearch/server-02/processed.1341878400.gz
/data/splunk/rrsearch/server-02/processed.1341964800.gz
/data/splunk/rrsearch/server-03/processed.1341878400.gz
/data/splunk/rrsearch/server-03/processed.1341964800.gz
...etc...
  • Path: /data/logs/rrsearch
  • Set Host: Segment on Path / 4
  • Source type: Manual / Baseline Search
  • Index: baseline_search
  • Whitelist: .+processed.+gz$
  • Blacklist: left empty

The Data Inputs - Files & Directories screen shows 620 files.

The problem is that none of the data ever seems to get indexed, although other data in the /data/splunk path does get indexed for other projects. I feel I'm missing one small step; can anyone throw me a bone?

Per @Lamar's request, here is my inputs.conf:

[default]
host = wsi-hub

[monitor:///data/splunk/remote]
host_segment = 4
sourcetype = syslog
blacklist = .*.gz
disabled = 0
host = 

[monitor://$SPLUNK_HOME/var/log/splunk]
blacklist = *.gz
disabled = false

[monitor:///data/logs/rrsearch]
disabled = false
followTail = 0
host = 
host_regex = 
index = baseline_search
whitelist = .+processed.+gz$
sourcetype = Baseline Search
host_segment = 4

In indexes:

Index Name: baseline_search
Max Size: 500,000
Frozen Archive: None 
Current Size: 3,807
Event Count: 54,237,503
Earliest Event: May 13, 2012 7:59:59 PM
Latest Event: Jul 30, 2012 7:59:59 PM
Home Path: /opt/splunk/var/lib/splunk/baseline_search/db
App: search

Lamar
Splunk Employee

I would first clean up your input for the processed files.

There are a few issues with it:
First, the monitor stanza won't pick up the data, since the directory you're monitoring (/data/logs) is invalid; the files you listed live under /data/splunk.
Additionally, I would define the fourth segment in your monitor path, so that host_segment = 4 lands on the server directory.
Lastly, I wouldn't put spaces in my sourcetype, as Splunk doesn't respond well to spaces in sourcetypes.

Fixes below:

[monitor:///data/splunk/rrsearch/*/]
disabled = false
index = baseline_search
whitelist = .+processed.+gz$
sourcetype = Baseline_Search
host_segment = 4

That should get you a little closer to where you want to be.

Hope it helps.

Lamar
Splunk Employee

No problem. Remember to flag this as your answer so that the next group of folks who run into this issue can easily figure out what to do.

Take care.

vbrtrmn
Explorer

Adding a new role worked great!

Splunk will be used by manager/marketing types making reports and such. I wanted to make my search engine data as segregated as possible from any syslog data. The search engine data is scrubbed to disassociate individual IPs from their searches. Some of the data in syslog may contain individually identifiable information which they are strictly forbidden from viewing.

I can view the data because I have ethical standards 🙂

Thanks a lot for taking time to help me with this.

Lamar
Splunk Employee

I would be curious why you decided to segment this data off from your syslog data.

Again, just curious.

Lamar
Splunk Employee

Yeah, you'll probably want to enable this index 'baseline_search' to be searched by default by your user/role.

http://docs.splunk.com/Documentation/Splunk/4.3.3/Admin/Addandeditroles

In particular, these two parameters:

srchIndexesDefault
srchIndexesAllowed
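
As a rough sketch, those two settings might look something like this in authorize.conf. The role name baseline_reports is made up for illustration, and the stanza leaves out the capabilities and inherited roles that a usable role would also need, so treat it as a starting point rather than a finished configuration.

```ini
# authorize.conf (sketch only; "baseline_reports" is a hypothetical role name)
[role_baseline_reports]
# Indexes this role is allowed to search (semicolon-separated if more than one)
srchIndexesAllowed = baseline_search
# Indexes searched when the search string does not name an index
srchIndexesDefault = baseline_search
```

With a role along these lines, a search that omits index= should run against baseline_search. Settings inherited from other roles may widen what the role can see, so check those if the goal is to keep the data segregated.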

vbrtrmn
Explorer

Thanks for the response. The data seems to at least be indexing now (updated in the body above); it just never appears on the Search page. Currently the only "Source type" shown is syslog, though there are seven other enabled data sources with files. Perhaps I am missing some step to get other source types to appear in the search?

lguinn2
Legend

I'll give a nod to Lamar's answer, but I also notice that your whitelist doesn't match the filenames... You have

Whitelist: .+processed.+gz$

Which should be

Whitelist: .+parsed.+gz$

vbrtrmn
Explorer

I put in the file names incorrectly. DOH

Lamar
Splunk Employee

Without being able to see your actual input configuration, I'll take a guess and say that you need to make sure you're searching on index=baseline_search, unless you've set your default indexes to include that one.

Include your inputs.conf and we may be able to get a bit further.

vbrtrmn
Explorer

I finally got sudo access on the server and have updated the question.
