We have an older Red Hat 5.6 box running Splunk Universal Forwarder 5.0.2, monitoring a few directories that contain many *.gz files. The system seems to be keeping up well enough, but metrics.log has started logging a lot of blocked=true entries, mostly from the "aeq" queue. Here's a sample:
[root@linux1621 splunk]# grep "name=aeq" metrics.log | tail
08-08-2014 20:01:03.681 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=63, smallest_size=0
08-08-2014 20:01:34.683 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=0
08-08-2014 20:02:05.562 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=5
08-08-2014 20:02:36.564 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=22
08-08-2014 20:03:07.565 +0000 INFO Metrics - group=queue, name=aeq, max_size_kb=500, current_size_kb=482, current_size=15, largest_size=61, smallest_size=0
08-08-2014 20:03:38.564 +0000 INFO Metrics - group=queue, name=aeq, max_size_kb=500, current_size_kb=482, current_size=15, largest_size=15, smallest_size=15
08-08-2014 20:04:09.402 +0000 INFO Metrics - group=queue, name=aeq, max_size_kb=500, current_size_kb=0, current_size=0, largest_size=61, smallest_size=0
08-08-2014 20:04:40.403 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=0
08-08-2014 20:05:11.403 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=1
08-08-2014 20:05:42.404 +0000 INFO Metrics - group=queue, name=aeq, blocked=true, max_size_kb=500, current_size_kb=499, current_size=61, largest_size=61, smallest_size=7
These seem to correlate with messages in splunkd.log showing the BatchReader unable to hand data off to the parsingQueue:
08-08-2014 20:05:42.836 +0000 INFO BatchReader - Continuing...
08-08-2014 20:05:43.044 +0000 INFO BatchReader - Could not send data to output queue (parsingQueue), retrying...
08-08-2014 20:05:43.708 +0000 INFO BatchReader - Continuing...
08-08-2014 20:05:44.057 +0000 INFO BatchReader - Could not send data to output queue (parsingQueue), retrying...
08-08-2014 20:05:44.394 +0000 INFO BatchReader - Continuing...
08-08-2014 20:05:45.363 +0000 INFO BatchReader - Could not send data to output queue (parsingQueue), retrying...
08-08-2014 20:05:46.339 +0000 INFO BatchReader - Continuing...
08-08-2014 20:05:47.939 +0000 INFO BatchReader - Could not send data to output queue (parsingQueue), retrying...
08-08-2014 20:05:48.251 +0000 INFO BatchReader - Continuing...
08-08-2014 20:05:48.459 +0000 INFO BatchReader - Could not send data to output queue (parsingQueue), retrying...
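I haven't yet confirmed whether parsingqueue (or the tcpout queue) also reports blocked=true in metrics.log at the same timestamps; I was planning to check with something like the following, though the queue names in the grep are my guess:

[root@linux1621 splunk]# grep "group=queue" metrics.log | grep -iE "name=(parsingqueue|tcpout)" | grep "blocked=true" | tail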
I tried increasing maxKBps in the [thruput] stanza of limits.conf (doubled it from 1024 to 2048), but the same messages returned right after a restart.
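For reference, the change amounts to this on the forwarder (the maxKBps setting lives in the [thruput] stanza):

[thruput]
# doubled from 1024; requires a splunkd restart to take effect
maxKBps = 2048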
CPU and RAM on this system look fine: load average stays below 1.00 most of the time, and memory is mostly buffers/cache with no swapping.
What is "aeq" and where are it's parameters adjusted? Can we increase the max_size_kb (presumably to 1024)?
Or is this a red herring and we need to look elsewhere?