02-10-2020
11:15 AM
Yep, thanks.
02-08-2020
10:02 AM
The fix ended up being two-fold:
1. Making sure mapred-site.xml sets mapreduce.framework.name to the value yarn.
2. Pulling yarn-site.xml directly from what was on the EMR master node (/usr/lib/hadoop-yarn/yarn-site.xml). Specifically, the yarn.application.classpath for EMR 5.28.0 is:
$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,/usr/lib/hadoop-lzo/lib/*,/usr/share/aws/emr/emrfs/conf,/usr/share/aws/emr/emrfs/lib/*,/usr/share/aws/emr/emrfs/auxlib/*,/usr/share/aws/emr/lib/*,/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar,/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar,/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar,/usr/share/aws/emr/cloudwatch-sink/lib/*,/usr/share/aws/aws-java-sdk/*
Setting this resolved my latest issue.
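For reference, the mapred-site.xml change looks like this in standard Hadoop property syntax (a minimal sketch; adjust paths for your own install):

```xml
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local job runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```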
02-08-2020
09:57 AM
Thank you so much for this. It's 2020 and this helped solve my issue. If you're using EMR, SSH to your master node, cd /usr/lib/hadoop-yarn/, and look at yarn-site.xml for yarn.application.classpath; use what's there in your Hadoop client's yarn-site.xml. Mine turned out to be:
$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,/usr/lib/hadoop-lzo/lib/*,/usr/share/aws/emr/emrfs/conf,/usr/share/aws/emr/emrfs/lib/*,/usr/share/aws/emr/emrfs/auxlib/*,/usr/share/aws/emr/lib/*,/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar,/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar,/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar,/usr/share/aws/emr/cloudwatch-sink/lib/*,/usr/share/aws/aws-java-sdk/*
02-07-2020
12:07 PM
Testing locally with the Hadoop CLI, I'm running into issues. I feel like the problem stems from something in yarn-site.xml or mapred-site.xml, but I'm not really sure where to look.
02-07-2020
12:06 PM
[provider:my-hadoop-provider]
vix.command = $SPLUNK_HOME/bin/jars/sudobash
vix.command.arg.1 = $HADOOP_HOME/bin/hadoop
vix.command.arg.2 = jar
vix.command.arg.3 = $SPLUNK_HOME/bin/jars/SplunkMR-hy2.jar
vix.command.arg.4 = com.splunk.mr.SplunkMR
vix.env.HADOOP_CLIENT_OPTS = -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.env.HADOOP_HEAPSIZE = 512
vix.env.HADOOP_HOME = /opt/hadoop
vix.env.HUNK_THIRDPARTY_JARS = $SPLUNK_HOME/bin/jars/thirdparty/common/avro-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/avro-mapred-1.7.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-compress-1.10.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/commons-io-2.4.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/libfb303-0.9.2.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/parquet-hive-bundle-1.6.0.jar,$SPLUNK_HOME/bin/jars/thirdparty/common/snappy-java-1.1.1.7.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-exec-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-metastore-1.2.1.jar,$SPLUNK_HOME/bin/jars/thirdparty/hive_1_2/hive-serde-1.2.1.jar
vix.env.JAVA_HOME = /usr
vix.env.MAPREDUCE_USER =
vix.family = hadoop
vix.fs.default.name = hdfs://ip-172.29.29.29.ec2.internal:8020/
vix.mapred.child.java.opts = -server -Xmx512m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapred.job.map.memory.mb = 2048
vix.mapred.job.queue.name = default
vix.mapred.job.reduce.memory.mb = 512
vix.mapred.job.reuse.jvm.num.tasks = 100
vix.mapred.reduce.tasks = 2
vix.mapreduce.application.classpath = $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*, $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*, /usr/lib/hadoop-lzo/lib/*, /usr/share/aws/emr/emrfs/conf, /usr/share/aws/emr/emrfs/lib/*, /usr/share/aws/emr/emrfs/auxlib/*, /usr/share/aws/emr/lib/*, /usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar, /usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar, /usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar, /usr/share/aws/emr/cloudwatch-sink/lib/*, /usr/share/aws/aws-java-sdk/*
vix.mapreduce.framework.name = yarn
vix.mapreduce.job.jvm.numtasks = 20
vix.mapreduce.job.queuename = default
vix.mapreduce.job.reduces = 3
vix.mapreduce.map.java.opts = -server -Xmx512m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.map.memory.mb = 2048
vix.mapreduce.reduce.java.opts = -server -Xmx512m -XX:ParallelGCThreads=4 -XX:+UseParallelGC -XX:+DisplayVMOutputToStderr
vix.mapreduce.reduce.memory.mb = 512
vix.mode = report
vix.output.buckets.max.network.bandwidth = 0
vix.splunk.heartbeat = 1
vix.splunk.heartbeat.interval = 1000
vix.splunk.heartbeat.threshold = 60
vix.splunk.home.datanode = /tmp/splunk/$SPLUNK_SERVER_NAME/
vix.splunk.home.hdfs = /tmp/splunk/mysh.abc.corp.com/
vix.splunk.search.column.filter = 1
vix.splunk.search.debug = 1
vix.splunk.search.mixedmode = 1
vix.splunk.search.mr.maxsplits = 10000
vix.splunk.search.mr.minsplits = 100
vix.splunk.search.mr.poll = 2000
vix.splunk.search.mr.splits.multiplier = 10
vix.splunk.search.recordreader = SplunkJournalRecordReader,ValueAvroRecordReader,SimpleCSVRecordReader,SequenceFileRecordReader
vix.splunk.search.recordreader.avro.regex = \.avro$
vix.splunk.search.recordreader.csv.regex = \.([tc]sv)(?:\.(?:gz|bz2|snappy))?$
vix.splunk.search.recordreader.sequence.regex = \.seq$
vix.splunk.setup.onsearch = 1
vix.splunk.setup.package = current
vix.yarn.resourcemanager.address = hdfs://ip-172.29.29.29.ec2.internal:8032/
vix.yarn.resourcemanager.scheduler.address = hdfs://ip-172.29.29.29.ec2.internal:8030/
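As a quick sanity check on the recordreader regexes in the stanza above (illustrative Python, not anything Splunk itself runs — the file names are made up):

```python
import re

# Regexes copied from the vix.splunk.search.recordreader.*.regex settings above.
avro_re = re.compile(r"\.avro$")
csv_re = re.compile(r"\.([tc]sv)(?:\.(?:gz|bz2|snappy))?$")
seq_re = re.compile(r"\.seq$")

def reader_for(path):
    """Return which record reader would claim a file, or None."""
    if avro_re.search(path):
        return "avro"
    if csv_re.search(path):
        return "csv"
    if seq_re.search(path):
        return "seq"
    return None

print(reader_for("events.csv.gz"))  # csv
print(reader_for("raw.txt"))        # None
```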
02-06-2020
01:24 PM
So my vix.mapreduce.framework.name was blank; I set it to yarn and now I get a different error. @rdagan_splunk or @jhornsby_splunk, any ideas on this one?
Exception - java.io.IOException: Error while waiting for MapReduce job to complete, job_id=job_1576770149627_0070, state=FAILED, reason=Application application_1576770149627_0070 failed 2 times due to AM Container for appattempt_1576770149627_0070_000002 exited with exitCode: 1
Doing more research now for additional logs.
Searching my Hadoop cluster's logs, I see Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster, which led me to https://stackoverflow.com/questions/50927577/could-not-find-or-load-main-class-org-apache-hadoop-mapreduce-v2-app-mrappmaster.
The changes here don't seem to be doing anything.
02-06-2020
11:06 AM
Ok, that makes sense. I'll research that further after solving this issue. Thanks.
02-06-2020
11:00 AM
Gotcha, yeah, so that's all set. I saw Splunk writing to /tmp/hadoop-splunk, so I think all permissions are set correctly. Side question: do you know of a setting to restrict how much data Splunk can write to this directory? We saw it filling up the drive without cleanup.
02-06-2020
10:39 AM
Do you know what vix.splunk.home.datanode is supposed to look like? Should it be an hdfs:// path?
02-05-2020
04:28 PM
Are there any others that must be configured? It's sort of a bare install of the client.
02-05-2020
04:18 PM
yarn-site.xml looks like the following, but I get the same error with or without it:
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>hdfs://masternode:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hdfs://masternode:8030</value>
</property>
</configuration>
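(Note, in hindsight: per the Hadoop defaults, yarn.resourcemanager.address and yarn.resourcemanager.scheduler.address are plain host:port values, not hdfs:// URIs — hdfs:// is the filesystem scheme used by fs.default.name. A minimal sketch, assuming masternode resolves from the search head:)

```xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>masternode:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>masternode:8030</value>
  </property>
</configuration>
```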
02-05-2020
08:21 AM
Some other logs that stood out:
02-05-2020 15:29:07.564 DEBUG ERP.hadoop-cluster - RestStorageService - Response xml: true
02-05-2020 15:29:07.564 DEBUG ERP.hadoop-cluster - RestStorageService - Response entity: null
02-05-2020 15:29:07.564 DEBUG ERP.hadoop-cluster - RestStorageService - Response entity length: ??
02-05-2020 15:29:07.564 DEBUG ERP.hadoop-cluster - RestStorageService - Releasing error response without XML content
02-05-2020 15:29:07.565 DEBUG ERP.hadoop-cluster - RestStorageService - Rethrowing as a ServiceException error in performRequest: org.jets3t.service.ServiceException: Request Error., with cause: org.jets3t.service.impl.rest.HttpException
02-05-2020 15:29:07.565 DEBUG ERP.hadoop-cluster - RestStorageService - Releasing HttpClient connection after error: Request Error.
02-05-2020 15:29:07.565 DEBUG ERP.hadoop-cluster - Jets3tProperties - s3service.disable-dns-buckets=false
02-05-2020 15:29:07.565 DEBUG ERP.hadoop-cluster - Jets3tProperties - s3service.s3-endpoint=s3.amazonaws.com
02-05-2020
08:09 AM
Looking through, I'm trying to find exactly which permissions you're referring to. I don't see any communication over ports that aren't allowed through the firewall. Splunk can search EMR and return data, just not run MapReduce queries. So if you have a good spot to check specifically for MapReduce, let me know. Thanks.
02-05-2020
07:52 AM
The provider is using
Hadoop 2.x (yarn)
with EMR-5.28.0
with Hadoop 2.8.5
Hive 2.3.6
Pig 0.17.0
Hug 4.4.0
My search head has:
- Hadoop CLI 2.8.5
- OpenJDK 1.7.0
Some additional logs:
02-05-2020 15:29:10.388 INFO ERP.hadoop-cluster - Job - The url to track the job: http://localhost:8080/
02-05-2020 15:29:10.388 INFO ERP.hadoop-cluster - AsyncMRJob - Done submitting job.name=SPLK_ec2.server.com_1580916537.266_0, url=http://localhost:8080/
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - SplunkMR - jobClient
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - java.lang.NoSuchFieldException: jobClient
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at java.lang.Class.getDeclaredField(Class.java:1961)
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at com.splunk.mr.SplunkMR.getJobClient(SplunkMR.java:592)
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at com.splunk.mr.AsyncMRJob.run(AsyncMRJob.java:136)
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at java.lang.Thread.run(Thread.java:748)
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - AsyncMRJob -
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - java.lang.NullPointerException
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at com.splunk.mr.AsyncMRJob.run(AsyncMRJob.java:137)
02-05-2020 15:29:10.389 ERROR ERP.hadoop-cluster - at java.lang.Thread.run(Thread.java:748)
02-05-2020 15:29:10.389 DEBUG ERP.hadoop-cluster - OutputProcessor - received: null
02-05-2020 15:29:10.394 INFO ERP.hadoop-cluster - AsyncMRJob - start killing MR job id=job_local413061526_0001, job.name=SPLK_ec2.server.com_1580916537.266_0, _state=FAILED
02-05-2020 15:29:10.396 INFO ERP.hadoop-cluster - LocalJobRunner$Job - OutputCommitter set in config null
02-05-2020 15:29:10.401 INFO ERP.hadoop-cluster - FileOutputCommitter - File Output Committer Algorithm version is 1
02-05-2020 15:29:10.401 INFO ERP.hadoop-cluster - FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
02-05-2020 15:29:10.401 INFO ERP.hadoop-cluster - LocalJobRunner$Job - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
02-05-2020 15:29:10.411 DEBUG ERP.hadoop-cluster - ProtobufRpcEngine$Invoker - Call: mkdirs took 6ms
02-05-2020 15:29:10.444 DEBUG ERP.hadoop-cluster - LocalJobRunner$Job - Starting mapper thread pool executor.
02-05-2020 15:29:10.444 DEBUG ERP.hadoop-cluster - LocalJobRunner$Job - Max local threads: 1
02-05-2020 15:29:10.444 DEBUG ERP.hadoop-cluster - LocalJobRunner$Job - Map tasks to process: 2
02-05-2020 15:29:10.445 INFO ERP.hadoop-cluster - LocalJobRunner$Job - Waiting for map tasks
02-05-2020 15:29:10.445 INFO ERP.hadoop-cluster - LocalJobRunner$Job$MapTaskRunnable - Starting task: attempt_local413061526_0001_m_000000_0
02-05-2020 15:29:10.454 DEBUG ERP.hadoop-cluster - SortedRanges$SkipRangeIterator - currentIndex 0 0:0
02-05-2020 15:29:10.469 DEBUG ERP.hadoop-cluster - LocalJobRunner - mapreduce.cluster.local.dir for child : /tmp/hadoop-splunk/mapred/local/localRunner//splunk/jobcache/job_local413061526_0001/attempt_local413061526_0001_m_000000_0
02-05-2020 15:29:10.471 DEBUG ERP.hadoop-cluster - Task - using new api for output committer
02-05-2020 15:29:10.475 INFO ERP.hadoop-cluster - FileOutputCommitter - File Output Committer Algorithm version is 1
02-04-2020
01:36 PM
I'm trying to do a simple | stats count over a virtual index and receiving errors. Thoughts on where to look for this one?
Splunk 7.3.3 / Splunk 8.x to an EMR cluster with a master and two slave nodes. It still produces a count, but I assume it's much slower than if it were running MapReduce.
Exception - com.splunk.mr.JobStartException: Failed to start MapReduce job. Please consult search.log for more information. Message: [ Failed to start MapReduce job, name=SPLK_searchhead1.abc.corp.com_1580844860.138_0 ] and [ null ]
Edit - other testing performed:
- Upgraded the JDK from 1.7 to 1.8. No change to what works/doesn't work.
- After adding vix.mapreduce.framework.name=yarn to indexes.conf and mapreduce.framework.name=yarn to yarn-site.xml, I get Exception - failed 2 times due to AM Container for appattempt_...
- I've tested outside of Splunk and still receive the AM Container error: yarn jar hadoop-streaming.jar streamjob -files wordSplitter.py -mapper wordSplitter.py -input input.txt -output wordCountOut -reducer aggregate
01-22-2020
01:22 PM
Probably unsupported, but you can take an 8.0 install and copy /opt/splunk/bin/jars/SplunkMR-hy2.jar from it into your 7.3.3 install to fix this issue.
01-22-2020
01:05 PM
Never mind - it seems 8.0 does in fact resolve the issue. I just tested.
01-22-2020
09:43 AM
Thanks Jo. Just to confirm, this is currently unresolved even in the latest release of Splunk? If so, is there any fix planned for the 7.3.x chain in, say, 7.3.5? Thanks again.
Or did you mean it was patched, but it never made it into the release notes?
01-21-2020
09:46 AM
Hey, thanks for the response. Any chance you can post which release notes item directly corrects this? I need to read up on what's causing it. Thanks!
01-21-2020
07:45 AM
Without a virtual index enabled, running | metadata type=sourcetypes index=* returns correctly.
After adding a virtual index that uses a Hadoop provider, the command fails because it can't find sourcetype details for the virtual index. Searching the virtual index directly, however, returns correct sourcetype details.
What is necessary for the metadata command to return successfully? Is there a file I need next to the data to dictate the sourcetype info? Can I remove this index from the metadata results without having to manually specify every index I want in the command?
Error:
01-15-2020 20:57:40.884 ERROR metadata - No 'sourcetype' key found in results. Cannot merge metadata.
01-15-2020 20:57:40.884 INFO PreviewExecutor - Finished preview generation in 0.002741056 seconds.
01-15-2020 20:57:40.901 INFO ReducePhaseExecutor - Ending phase_1
01-15-2020 20:57:40.901 INFO UserManager - Unwound user context: x@y.com -> NULL
01-15-2020 20:57:40.901 ERROR SearchOrchestrator - Phase_1 failed due to : Error in 'metadata': No 'sourcetype' key found in results. Cannot merge metadata.
01-15-2020 20:57:40.901 INFO ReducePhaseExecutor - ReducePhaseExecutor=1 action=CANCEL
01-15-2020 20:57:40.901 INFO DispatchExecutor - User applied action=CANCEL while status=0
01-15-2020 20:57:40.901 ERROR SearchStatusEnforcer - sid:md_1579121855.178190 Error in 'metadata': No 'sourcetype' key found in results. Cannot merge metadata.
Version info:
Splunk 7.3.3
Hadoop cli 2.8.4
AWS EMR emr-5.28.0
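A workaround sketch I've been considering (not a fix): approximate the metadata output with tstats, which only scans native indexes and so shouldn't trip over the virtual index. It isn't a drop-in replacement for metadata, but it returns the same fields:

```
| tstats count as totalCount min(_time) as firstTime max(_time) as lastTime
    where index=* by sourcetype
```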
<setup>
<block title="Title of page">
<text>All fields are required.</text>
</block>
<block title="Add new credentials" endpoint="storage/passwords" entity="_new">
<input field="name">
<label>Account ID</label>
<type>text</type>
</input>
<input field="password">
<label>API Key</label>
<type>password</type>
</input>
<text>
<![CDATA[ <script type="text/javascript">
$(function() {
$('label[for*="password_id_confirm"]').html("Confirm API Key")
});
</script> ]]>
</text>
</block>
</setup>
04-29-2019
09:20 AM
All I can find in the docs is:
https://docs.splunk.com/Documentation/Splunk/latest/Data/Extractfieldsfromfileswithstructureddata
No support for mid-file renaming of header fields
Some software, such as Internet Information Server, supports the renaming of header fields in the middle of the file. Splunk software does not recognize changes such as this. If you attempt to index a file that has header fields renamed within the file, the renamed header field is not indexed.
04-29-2019
09:14 AM
We have a single Splunk instance with a custom scripted input that pulls down JSON and uses indexed extractions.
New fields were added to the JSON that aren't getting extracted. We want to remove the headers Splunk already knows about (the fields it extracts) so that it can start over and pick up the newly added fields. Is there any method of doing this?
Or are our only options to 1) change the sourcetype or 2) use search-time extractions?
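For option 2, the search-time route would look roughly like this in props.conf (a sketch; my_json_sourcetype is a placeholder, and it assumes the events are well-formed JSON):

```
[my_json_sourcetype]
# No INDEXED_EXTRACTIONS here: parse each event as JSON at search
# time instead, so newly added fields are picked up automatically
# without re-indexing.
KV_MODE = json
```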
08-02-2018
07:37 PM
I would recommend using the KV store to maintain state if you want to know the current state of all your machines. Every x minutes, check for any new events and overwrite the existing value for each host with its color/status: inputlookup the KV store, find all new statuses, dedup to get the latest values, then outputlookup to save any changes. Then, when you want to view status, you only have to inputlookup the KV store for your display.
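A sketch of that scheduled search, assuming a lookup definition named host_status backed by a KV store collection with host and status fields (all of these names, and the machine_events index, are placeholders):

```
index=machine_events earliest=-15m
| stats latest(status) as status, latest(_time) as _time by host
| inputlookup append=t host_status
| dedup host
| outputlookup host_status
```

New events come first, so dedup host keeps the fresh status per host and falls back to the stored one; the dashboard then just runs | inputlookup host_status.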