Activity Feed
- Got Karma for Re: What are the expected results if a Splunk deployment server goes down for an extended period of time?. 11-12-2024 12:51 PM
- Karma Re: Why does Splunk continuously attempt to find a user in LDAP after the user has been removed from Active Directory? for Jeremiah. 06-05-2020 12:48 AM
- Karma Re: Does the splunk_archiver app need to be distributed to the search peers via the knowledge bundle? for pradeepkumarg. 06-05-2020 12:48 AM
- Karma Re: Why does a panel in my dashboard form report "Search is waiting for input..." when the search that powers that panel includes two aliased return fields? for woodcock. 06-05-2020 12:48 AM
- Karma Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events? for woodcock. 06-05-2020 12:48 AM
- Karma Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events? for jtacy. 06-05-2020 12:48 AM
- Karma Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events? for jcoates_splunk. 06-05-2020 12:48 AM
- Got Karma for Re: How do I ensure that the timezone of a database input from Splunk DB Connect querying a server in another timezone normalizes and is recorded as UTC in my indexers?. 06-05-2020 12:48 AM
- Got Karma for Is it better to use 'offline' mode or 'maintenance mode' in a multisite indexer cluster when a peer node will be down for an extended amount of time for maintenance?. 06-05-2020 12:48 AM
- Got Karma for Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Re: Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Why is Splunk DB Connect incorrectly tailing my rising column and duplicating events?. 06-05-2020 12:48 AM
- Got Karma for Re: How do I ensure that the timezone of a database input from Splunk DB Connect querying a server in another timezone normalizes and is recorded as UTC in my indexers?. 06-05-2020 12:48 AM
- Got Karma for Can you encrypt password strings with splunk.secret manually?. 06-05-2020 12:48 AM
- Got Karma for How do I ensure that the timezone of a database input from Splunk DB Connect querying a server in another timezone normalizes and is recorded as UTC in my indexers?. 06-05-2020 12:48 AM
- Got Karma for How do I ensure that the timezone of a database input from Splunk DB Connect querying a server in another timezone normalizes and is recorded as UTC in my indexers?. 06-05-2020 12:48 AM
- Got Karma for Will asterisks in the value of a field be treated as wild cards and affect automatic lookup performance?. 06-05-2020 12:48 AM
03-13-2017
12:13 PM
1 Karma
We need to replace one of the local hard disks in a Splunk indexer that is part of our multisite (2-site) indexer cluster. We want to do this without kicking off any bucket fixup activity, because we plan to restore the replaced disk from backups rather than let 5 TB of data replicate across the cluster unnecessarily.
The peer will be down for an extended period while the restore job runs on the new local disk. While the node is down, I know the cluster will remain in a valid state because of the 1:1 replication of searchable/replicated buckets we have between the two sites.
My question: if we're going to take a peer node down for an extended amount of time and we DON'T want bucket fixup activities to occur, should we enable cluster maintenance mode, or should we use the 'offline' command for the downed peer and increase the cluster master's restart_timeout setting to enough hours to cover the maintenance window?
The only thing the Splunk docs really say about 'offline' is that "When you take a peer offline temporarily, it is usually to perform an upgrade or other maintenance for a short period of time." The maintenance mode docs, on the other hand, state that bucket fixup activities are mostly halted for the duration of maintenance mode, though they give no guidance on how long you can safely stay in maintenance mode and still have a valid cluster.
Can you enable maintenance mode for an extended period of time and still have a valid cluster if you're normally forwarding data to both site1 and site2 indexers simultaneously?
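For context, a rough sketch of the two approaches as I understand them (the timeout value below is just an example, not a recommendation):
# Option A: maintenance mode, run on the cluster master (halts most bucket fixup)
splunk enable maintenance-mode
# ...replace the disk and restore from backup on the peer...
splunk disable maintenance-mode
# Option B: take the peer down with 'offline' and raise the master's grace period
# before fixup starts, e.g. in server.conf on the cluster master:
# [clustering]
# restart_timeout = 28800
splunk offline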
03-03-2017
10:02 AM
I submitted a ticket with Splunk support and based on their preliminary examination of this issue they believe it may be a bug.
03-03-2017
09:47 AM
4 Karma
Adding the ORDER BY clause fixed this issue. Even though the database was being written with a rising column, the data wasn't stored in order, so each time DB Connect queried without the ORDER BY clause the checkpoint was updated with whatever value happened to be at the end of the result set (which may or may not have been the highest ID). Forcing the query to ORDER BY did the trick, though there appear to be some limitations: when I tried to backfill from a year ago, the query kept timing out. I was only able to get it to run with the ORDER BY clause by querying less than one day's worth of data from the table.
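For reference, the query now reads as follows in inputs.conf (field names obfuscated as in my original question):
query = SELECT ID,Time,Field2,Field2,Field3,Field4,Field5,Field56,Field7 FROM "my_table"."dbo"."my_querytable" WHERE ID > ? ORDER BY ID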
02-27-2017
06:26 AM
1 Karma
The ID column is definitely unique. Interestingly enough, I tried changing the query this morning to include "ORDER BY ID" and it worked; the query didn't time out this time (my checkpoint value was only an hour old, whereas my previous attempt used a checkpoint value from a year ago). Since making this change I haven't seen any duplicate events. It almost sounds like the database table I'm querying doesn't have its ID column ordered correctly (if that's a thing, I'm not a DBA by any stretch), which is causing some weirdness with the checkpoint value when DB Connect queries the table. I'm going to check with the DBA who maintains it today to see if my suspicions are correct.
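In case it helps anyone else, a quick way to see how the table is indexed (MS SQL, table name obfuscated as in my question):
-- Lists the indexes on the table; if ID is not the clustered index key, SQL Server
-- makes no guarantee about row order without an explicit ORDER BY.
EXEC sp_helpindex 'dbo.my_querytable';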
02-27-2017
05:44 AM
2 Karma
So after pulling the majority of my hair out last week I finally figured this out.
TL;DR: DBConnect does not play nice with a lightweight forwarder.
The short version of the long story: I did not realize the server we were trying to use as the DBConnect Splunk server was running Splunk with the lightweight forwarder app enabled. DBConnect appeared to be working in every other respect, and since we do all of our parsing at a heavy forwarding tier, it took me a while to determine the ultimate root cause of the timestamping issue.
The lightweight forwarder doesn't do any parsing, which should then have been handled by the heavy forwarding tier. The interesting thing was that all of the default time fields that are usually extracted with the data, the same time fields Splunk uses to adjust _time based on timezone, were null. The only thing I can ascertain is that DBConnect wasn't playing nice with the lightweight forwarder app enabled and wrote out the events with null time fields. That caused ALL of the timezone configurations, regardless of where they lived, to fail: _time was never properly recorded because there were no default time values for the calculation to be performed on in the first place.
Once I discovered this, I abandoned the lightweight forwarder server and installed DBConnect on a full Splunk instance on another server. With the exact same configurations, everything worked as expected: the default time fields were populated and the timezone configurations were correctly applied, altering the value of _time as intended.
02-26-2017
08:57 AM
1 Karma
I'm trying to tail a database table using a rising column labelled "ID". Splunk DB Connect is working in the sense that it can query the database and return results, but it appears to be updating the checkpoint sporadically and out of order, which is causing duplicate events to be indexed.
The database is MS SQL.
inputs.conf:
[mi_input://my_querytable]
connection = My_Database
enable_query_wrapping = 1
index = my_index
input_timestamp_column_fullname = (003) my_querytable.Time.datetime
input_timestamp_column_name = Insert_Time
interval = 120
max_rows = 100000
mode = advanced
output_timestamp_format = yyyy-MM-dd HH:mm:ss
query = SELECT ID,Time,Field2,Field2,Field3,Field4,Field5,Field56,Field7 FROM "my_table"."dbo"."my_querytable" WHERE ID > ?
sourcetype = my_sourcetype
tail_rising_column_checkpoint_value = 1675918831
tail_rising_column_fullname = (001) my_querytable.ID.bigint
tail_rising_column_name = ID
ui_query_catalog = my_catalog
ui_query_mode = advanced
ui_query_schema = dbo
ui_query_table = my_querytable
disabled = 0
fetch_size = 20000
Here is what the logs look like when the query runs every 2 minutes and returns 100,000 records via 20,000-row pulls.
2017-02-26T16:42:43+0000 [INFO] [mi_input.py], line 193: action=start_executing_dbinput dbinput="mi_input://my_querytable"
2017-02-26T16:43:01+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:43:01+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1401390595
2017-02-26T16:43:16+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:43:17+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1413595646
2017-02-26T16:43:19+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:43:20+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1399518391
2017-02-26T16:43:22+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:43:23+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1399707472
2017-02-26T16:43:25+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:43:26+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1524449693
2017-02-26T16:43:26+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=0
2017-02-26T16:43:26+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1524449693
2017-02-26T16:43:26+0000 [INFO] [mi_input.py], line 211: action=complete_dbinput dbinput="mi_input://my_querytable"
2017-02-26T16:44:43+0000 [INFO] [mi_input.py], line 193: action=start_executing_dbinput dbinput="mi_input://my_querytable"
2017-02-26T16:45:30+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:45:31+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1533335128
2017-02-26T16:45:33+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:45:34+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1533500333
2017-02-26T16:45:36+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:45:37+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1528798868
2017-02-26T16:45:39+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:45:40+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1680663600
2017-02-26T16:45:42+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=20000
2017-02-26T16:45:43+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1675416914
2017-02-26T16:45:43+0000 [INFO] [modular_input_event_writer.py], line 93 : action=print_csv_from_jdbc_to_event_stream dbinput="mi_input://my_querytable" input_mode=advanced events=0
2017-02-26T16:45:43+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1675416914
2017-02-26T16:45:43+0000 [INFO] [mi_input.py], line 211: action=complete_dbinput dbinput="mi_input://my_querytable"
Here is the chronological order of just the checkpoint being updated. Notice how it jumps back and forth in the rising column:
2017-02-26T16:43:01+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1401390595
2017-02-26T16:43:17+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1413595646
2017-02-26T16:43:20+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1399518391
2017-02-26T16:43:23+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1399707472
2017-02-26T16:43:26+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1524449693
2017-02-26T16:43:26+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1524449693
2017-02-26T16:45:31+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1533335128
2017-02-26T16:45:34+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1533500333
2017-02-26T16:45:37+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1528798868
2017-02-26T16:45:40+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1680663600
2017-02-26T16:45:43+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1675416914
2017-02-26T16:45:43+0000 [INFO] [mi_input.py], line 109: action=rising_column_checkpoint_updated dbinput="mi_input://my_querytable" checkpoint=1675416914
I've tried changing the query to something like this, but the query constantly times out, even when only trying to return 100 rows.
SELECT ID,Time,Field2,Field2,Field3,Field4,Field5,Field56,Field7 FROM "my_table"."dbo"."my_querytable" WHERE ID > ? ORDER BY ID
I also tried changing the max rows / fetch size to 1000/1000 so the input was forced to grab only 1,000 rows every two minutes and update the checkpoint value only once per run. The checkpoint value still kept bouncing around and did not progress linearly.
Anyone have any idea why I'm seeing this kind of behavior and if there is anything I can do about it?
02-25-2017
07:59 AM
To make things even weirder, none of the default datetime "date_*" fields are present in the database input data.
Completely missing:
date_hour, date_mday, date_minute, date_month, date_second, date_wday, date_year, date_zone
It almost looks like Splunk isn't performing any of the timestamping operations it's supposed to after it has pulled the data from the database.
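A quick way to see this (the index and sourcetype here are the ones from my database input):
index=my_index sourcetype=my_sourcetype | head 10 | table _time date_hour date_minute date_second date_zone
Every date_* column comes back empty.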
02-25-2017
07:05 AM
While pulling my hair out over this problem I discovered something that's pretty odd. ALL of my database inputs are being indexed with a blank date_zone field, regardless of any props.conf timestamping or timezone changes, whether on the host running DBConnect or on the heavy forwarders. The date_zone field is blank, not even a "0" or "local".
I'm not sure what to make of that. I thought the date_zone field should always be generated/present when the event is parsed/indexed.
02-24-2017
12:47 PM
So I've tried a handful of configuration combinations and haven't had any luck so far. I'm starting to think that either DBConnect somehow overrides Splunk's native timestamp extraction configurations or there is a bug. Below are the combinations I've tried thus far, as well as my DB input from the DBConnect inputs.conf.
DBConnect Server: No Timezone specified in props.conf (custom app)
Heavy Forwarders: Timezone by Sourcetype in props.conf (custom app)
Result: Doesn't change indexed _time
DBConnect Server: Timezone by sourcetype in props.conf (custom app)
Heavy Forwarders: Timezone by sourcetype in props.conf (custom app)
Result: Doesn't change indexed _time
DBConnect Server: No Timezone specified in props.conf, DATETIME_CONFIG = NONE instead (custom app)
Heavy Forwarders: Timezone by sourcetype in props.conf (custom app)
Result: Doesn't change indexed _time
DBConnect Server: Timezone by sourcetype in props.conf (system local)
Heavy Forwarders: Timezone by Sourcetype in props.conf (custom app)
Result: Doesn't change indexed _time
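For reference, the "Timezone by sourcetype" stanza deployed in each of these tests looks roughly like this (the sourcetype matches the input below):
[myCustomSourcetype]
TZ = US/Eastern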
[mi_input://my_DB_input]
connection = Database
disabled = 0
enable_query_wrapping = 1
index = DatabaseIndex
input_timestamp_column_fullname = (001) my_querytable.Time.datetime
input_timestamp_column_name = Time
interval = 120
max_rows = 10000
mode = advanced
output_timestamp_format = yyyy-MM-dd HH:mm:ss
query = SELECT Time,Field1,Field2,Field3,Field4,Field5,Field6,Field7,Filed8,Field9 FROM "db"."db"."my_querytable" WHERE ID > ?
sourcetype = myCustomSourcetype
tail_rising_column_fullname = (002) my_querytable.ID.bigint
tail_rising_column_name = ID
ui_query_catalog = my_catalog
ui_query_mode = advanced
ui_query_schema = db
ui_query_table = my_querytable
tail_rising_column_checkpoint_value = 2214228803
fetch_size = 1000
02-23-2017
12:31 PM
I just read the "incorrect_timestamp_behavior" portion of that page you linked. That sounds very much like what I'm seeing. I'm surprised I missed that in the docs. I'll give that a shot and let you all know if it works.
02-23-2017
12:28 PM
So I changed the timestamp output format to "yyyy-MM-dd HH:mm:ss" and added the props.conf stanzas on both the heavy forwarder and the server running DB Connect. It's still interpreting the timestamp literally and indexing it as EST.
I'll keep messing around with different configurations and see if I can find the Goldilocks zone.
02-23-2017
08:44 AM
2 Karma
I'm using Splunk DB Connect to pull data from an MS SQL database sitting on a server in the US Eastern time zone. The Splunk server running Splunk DB Connect is configured for UTC. The timestamp column I'm using to extract the timestamp from each event is in Eastern time. All of the other data I forward to Splunk is normalized to UTC, but I'm having trouble getting this Eastern timestamp from the database correctly indexed as a UTC _time. We also use an intermediary heavy forwarder to receive the events from the Splunk DB Connect server before they are forwarded to the indexers. It looks something like this:
MS SQL Database Server (EST) <--- Splunk DB Connect Server (UTC) ---> Intermediary Heavy Forwarder (UTC) ---> Indexer pool (UTC)
I've tried adding the following props.conf stanzas to both the Splunk DB Connect server and the heavy forwarder, but the events are still being indexed with an Eastern timezone timestamp.
[source::mi_input://database_input1]
TZ = US/Eastern
[source::mi_input://database_input2]
TZ = US/Eastern
We are using the output time stamp format of epoch with the following inputs.conf stanza.
output_timestamp_format = epoch
Could this be causing Splunk to automatically assume the epoch time is already in UTC? Perhaps I'm not fully understanding the function of the TZ setting.
How can I get Splunk to index the event from the database with a converted timestamp from EST to UTC?
Using Splunk 6.5.1 and Splunk DB Connect 2.4.0
02-20-2017
05:36 PM
This did the trick. I looked all over the docs regarding this issue before posting here. Can you explain why the double-dollar-signs are necessary in this particular circumstance? Is this referenced anywhere in the documentation?
Thanks for the help!
02-18-2017
07:24 AM
2 Karma
I have a field in one of my datasets labelled user. We perform automatic lookups globally based on the user field to return a variety of information about the identified user. Recently I noticed that when searching this particular index in anything other than Fast Mode, the results take an extremely long time to return. Upon further investigation, I believe the cause is a combination of the automatic lookup and the fact that some of the user fields in the dataset have the value *****.
The device we're receiving logs from masks the user field value with asterisks. When a search returns results, Splunk appears to attempt a lookup on ***** via the automatic lookup, and it severely affects the performance of the search. It's as if Splunk is interpreting the asterisks as wildcards and iterating over the entire lookup file (which is quite large).
For example, a five-second window containing no events with user = ***** returns in 2.648 seconds when searching in Smart Mode with the automatic lookup enabled. A similar five-second window that includes a single user = ***** field/value pair takes 8.549 seconds. Increase the search time frame and the performance difference becomes much greater.
A 1 hour search with Fast Mode and no field extractions or lookups: "This search has completed and has returned 3,162 results by scanning 3,162 events in 2.822 seconds"
The same 1 hour search with Smart Mode utilizing automatic lookups: "This search has completed and has returned 3,162 results by scanning 3,162 events in 256.991 seconds"
For the second test search with Smart mode enabled and automatic lookups the job inspector shows the duration of command.search.lookups as 1,413.19 seconds.
Interestingly enough, all of the events with user = ***** are given the same lookup value for user, even though the original value was all asterisks. This makes me think the automatic lookup is interpreting the asterisks as wildcards and defaulting to some seemingly arbitrary value from the lookup table. It also appears to iterate over the entire lookup table when it encounters these asterisk-filled fields.
Has anyone else seen something like this? Should Splunk be interpreting fields with asterisks in them as wild cards?
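One hypothetical way I've been thinking about to confirm the asterisks are the culprit: temporarily disable the automatic lookup, null out the masked values, and run the lookup manually (the index, lookup, and output field names below are placeholders):
index=my_index
| eval user=if(match(user, "^\*+$"), null(), user)
| lookup my_user_lookup user OUTPUT department
Nulling the masked values before the lookup should avoid any wildcard-style scan of the lookup table.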
02-18-2017
06:53 AM
I'm trying to create a dashboard form with a text input box that populates a token that is used in a number of searches on the dashboard. I recently created a new panel on the dashboard and could not get it to work using the tokens from the text input.
After some trial and error I found that the search doesn't want to start when the subsearch attempts to return two aliased fields. Here is what my search looks like:
index=myindex [|inputlookup MyLookup.csv.gz | search Field=$token1$ | return customField1=$FieldA customField2=$FieldA]
If I remove one of the aliased fields from the return command in the subsearch, the search starts without issue and finishes. If I keep both aliased fields in the return command as seen above, the search never starts after entering data in the text box. The rest of the panels on the dashboard load, but this panel just sits there stating "Search is waiting for input...". If I run this search in a regular search window with a hardcoded value in place of the token, it runs without a problem and returns the results I want.
Can you not return two aliased fields in a subsearch when it is part of a dashboard form or is this a bug? Has anyone else had this issue? I'm currently running 6.5.1 on the Search Head.
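Update: the double dollar signs suggested in the replies did the trick. Presumably the working version escapes the literal dollar signs so the dashboard no longer treats them as token delimiters, something like:
index=myindex [|inputlookup MyLookup.csv.gz | search Field=$token1$ | return customField1=$$FieldA customField2=$$FieldA]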
02-15-2017
07:38 AM
2 Karma
Blacklisting the entire splunk_archiver app from bundle replication doesn't appear to have any repercussions on my production search head. We don't archive data to Hadoop; if you do, this app may need to be distributed to maintain that functionality.
You can blacklist the app with this distsearch.conf stanza:
[replicationBlacklist]
label_here = apps/splunk_archiver/...
02-14-2017
09:50 AM
So I just found something interesting with btool.
/opt/splunk/etc/apps/splunk_archiver/default/distsearch.conf [replicationWhitelist]
/opt/splunk/etc/apps/splunk_archiver/default/distsearch.conf javabin = apps/splunk_archiver/java-bin/...
It looks like the Splunk devs purposely want the java-bin .jar files to be replicated... though I have no idea why.
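For reference, I spotted this with something along the lines of:
splunk btool distsearch list --debug | grep splunk_archiver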
02-14-2017
07:15 AM
Thanks for the response. I'm going to blacklist the entire splunk_archiver app on one of my dev search heads to see if it has any negative impact. I can't imagine that it will, considering the archiver app's functionality should only be needed on the servers that actually index data.
02-14-2017
06:56 AM
1 Karma
We recently upgraded to Splunk 6.5.1 and noticed a fairly large increase in the size of the knowledge bundle replicated from the search head to our search peers. After some digging, it appears that the splunk_archiver app has grown significantly since the version we were on previously, 6.3.5.
In Splunk 6.3.5 the splunk_archiver app totaled 108 KB in the knowledge bundle tar. In Splunk 6.5.1 it totals 73 MB, mostly due to some large .jar files.
Since the splunk_archiver app is a native Splunk component and already lives on the indexers, does the search head really need to distribute the entire app to the search peers? That's an extra 73 MB of data that doesn't seem necessary to distribute.
Would it be safe to assume that we can blacklist the splunk_archiver app from distribution in the knowledge bundle without breaking anything?
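For anyone wanting to check their own bundle, this is roughly how I sized the apps inside the replicated bundle on the search head (the path is approximate):
tar -tvf /opt/splunk/var/run/*.bundle | sort -k3 -rn | head -20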
01-27-2017
06:28 AM
I was speaking more along the lines of the Splunk file integrity check that runs at startup. It checks all native/installed Splunk files against the manifest located in /opt/splunk, which lists every Splunk file with its associated hash.
I'm going to try removing the line item for GeoLite2-City.mmdb from the manifest file on one of my development boxes and see if Splunk complains.
01-26-2017
07:36 PM
After upgrading to Splunk 6.5.1 we began receiving an error message in the GUI stating "File Integrity checks found 1 files that did not match the system-provided manifest. See splunkd.log for details." After some digging, it turned out to be the file /opt/splunk/share/GeoLite2-City.mmdb. This is the free MaxMind GeoLite2 City database file used by the iplocation command.
We actually update this file monthly with each new release of GeoLite2-City.mmdb. I'm guessing that since this file ships with Splunk, it's being checked against the file manifest and failing the integrity check due to a checksum mismatch.
Is there any way to exclude a file from this integrity check?
Looking at the docs regarding the integrity check and Health Monitoring console I couldn't find anything regarding exclusion of files.
docs.splunk.com/Documentation/Splunk/6.5.1/Admin/ChecktheintegrityofyourSplunksoftwarefiles
docs.splunk.com/Documentation/Splunk/6.5.1/DMC/Customizehealthcheck
01-03-2017
01:43 PM
Masa,
Thanks for the info. We haven't quite made it to 6.5 yet and are still sitting on a 6.3 deployment. It's interesting to see that this is a bug in the newer releases.
According to the Splunk docs, in a distributed search environment the search head should distribute only the knowledge bundle delta: "...Splunk Enterprise uses delta-based replication to keep the bundle compact, with the search head usually only replicating the changed portion of the bundle to its search peers."
docs.splunk.com/Documentation/Splunk/6.5.1/DistSearch/Limittheknowledgebundlesize
12-16-2016
01:48 PM
While digging through my search head logs, I stumbled upon some WARN messages from the DistributedBundleReplicationManager component saying that "Asynchronous bundle replication" "took too long (longer than 10 seconds)". The knowledge bundle the search head is currently replicating is about 250 MB in size.
Doing a historical lookback, I found that this warning has been occurring approximately 50 times an hour for the last month or so, with each message reporting a configuration bundle of roughly 250 MB. Most of the larger files in the bundle are lookup files, yet the majority of those lookups are static and change rarely, if ever. It was my understanding that the search head and search peers keep file hashes of the knowledge bundle components and replicate only the delta of the bundle. Looking at these messages, it appears the search head is replicating the full 250 MB knowledge bundle multiple times per hour.
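For reference, this is roughly the search I used to count the warnings:
index=_internal sourcetype=splunkd component=DistributedBundleReplicationManager "took too long" | timechart span=1h count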
Is this the expected behavior? I know that most of /etc/apps is replicated as well, but nothing I'm aware of changes regularly enough to require the search head to send the entire replication bundle 50 times an hour.
What is the threshold for blacklisting large lookup files? Do the search peers even need the lookup files? I thought lookups were applied at search time on the search head.
11-22-2016
08:15 AM
1 Karma
I'm currently troubleshooting some data inputs from a universal forwarder that forwards to an intermediate heavy forwarder tier, which in turn forwards to my indexer tier. It was my understanding that universal forwarders should not do any parsing; however, when I look at the universal forwarder's splunkd.log, I'm seeing quite a lot of "Failed to parse timestamp" and "The TIME_FORMAT specified is matching timestamps outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE." messages on the universal forwarder.
If the UF is supposed to send streams of data and skip any parsing operations, why am I seeing these errors on the UF?
Sample logs I'm seeing on the Universal Forwarder:
11-22-2016 01:37:15.717 +0000 WARN DateParserVerbose - The TIME_FORMAT specified is matching timestamps (ZERO_TIME) outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: removed
11-22-2016 01:37:15.717 +0000 WARN DateParserVerbose - Failed to parse timestamp. Defaulting to timestamp of previous event (Tue Nov 22 01:36:58 2016). Context: removed
07-27-2016
10:12 AM
Finally figured this out. I was pushing a configuration bundle that provided the following as a global setting for indexes.conf:
[default]
maxHotSpanSecs = 7776000
frozenTimePeriodInSecs = 31536000
maxTotalDataSizeMB = 100000
However, since these settings were already defined PER INDEX in /opt/splunk/etc/system/default/indexes.conf, those index-specific stanzas took precedence over my globally defined [default] settings.
Per Splunk indexes.conf documentation on global settings: "If an attribute is defined at both the global level and in a specific stanza, the value in the specific stanza takes precedence."
I added these values as index-specific stanzas in indexes.conf in another deployed configuration bundle that handles internal index configurations. Once that bundle was deployed, its index-specific stanzas took precedence over the ones in /system/default/indexes.conf, and everything is now configured the way I want it.
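For anyone hitting the same thing, the per-index override ends up looking like this (using _internal purely as an example index; the values are the same ones from my global stanza above):
[_internal]
maxHotSpanSecs = 7776000
frozenTimePeriodInSecs = 31536000
maxTotalDataSizeMB = 100000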