Lookup CSV file doesn't load completely

NOCSSMS
Explorer

Hello,

We're running Splunk 8.0.3 with a 2 GB/day license and want to load a CSV file with 332928 lines so that we can use it to enrich our events:

[root@splunk lookups]# pwd
/opt/splunk/etc/apps/search/lookups
[root@splunk lookups]# wc -l int_des.csv
332928 int_des.csv
[root@splunk lookups]# ls -l int_des.csv
-rw------- 1 root root 23997247 Jan 26 15:51 int_des.csv

The problem is that the CSV apparently isn't loaded completely. When we query it like this:

| inputlookup int_des

we only get 173201 records. Is there some limit we're hitting?

This is our limits.conf from /opt/splunk/etc/system/local/limits.conf:

[root@splunk lookups]# cat  /opt/splunk/etc/system/local/limits.conf
[search]
allow_batch_mode = 1
allow_inexact_metasearch = 0
always_include_indexedfield_lispy = 0
default_allow_queue = 1
disabled = 0
enable_conditional_expansion = 1
enable_cumulative_quota = 0
enable_datamodel_meval = 1
enable_history = 1
enable_memory_tracker = 0
force_saved_search_dispatch_as_user = 0
load_remote_bundles = 0
log_search_messages = 0
read_final_results_from_timeliner = 1
record_search_telemetry = 1
remote_timeline = 1
search_retry = 0
timeline_events_preview = 0
track_indextime_range = 1
track_matching_sourcetypes = 1
truncate_report = 0
unified_search = 0
use_bloomfilter = 1
use_metadata_elimination = 1
use_search_evaluator_v2 = 1
write_multifile_results_out = 1
############################################################################
# Concurrency
############################################################################
# This section contains settings for search concurrency limits.
# The total number of concurrent searches is
# base_max_searches + #cpus*max_searches_per_cpu

# The base number of concurrent searches.
base_max_searches = 6

# Max real-time searches = max_rt_search_multiplier x max historical searches.
max_rt_search_multiplier = 10

# The maximum number of concurrent searches per CPU.
max_searches_per_cpu = 10

[lookup]
# Maximum size of static lookup file to use a in-memory index for.
max_memtable_bytes = 262144000
# Maximum reverse lookup matches (for search expansion).
max_reverse_matches = 50
# Default setting for if non-memory file lookups (for large files)
# should batch queries.
# Can be overridden using a lookup table's stanza in transforms.conf.
batch_index_query = true
# When doing batch request, what's the most matches to retrieve?
# If more than this limit of matches would otherwise be retrieved,
# we will fall back to non-batch mode matching.
batch_response_limit = 5000000
# Maximum number of lookup error messages that should be logged.
max_lookup_messages = 20
# time to live for an indexed csv
indexed_csv_ttl = 300
# keep alive token file period
indexed_csv_keep_alive_timeout = 30
# max time for the CSV indexing
indexed_csv_inprogress_max_timeout = 300

Where should we look for relevant logs? What can we do to troubleshoot further?

Thanks


bowesmana
Champion

Is it always 173201?

What if you delete one of the lines between 1 and that number - does the number go down by 1 or stay constant?

What if you delete 10? 

In other words, is some odd character causing Splunk to stop at a particular line?

Does the search.log in the job inspector for the inputlookup command show anything useful?

Also, I note that you're using a lookup definition in inputlookup rather than the CSV directly. Just to be sure: have you set up int_des.csv as the lookup file to be used by that definition?

Have you tried

| inputlookup int_des.csv

to reference the lookup file directly - does that give the same result?
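
To check the odd-character theory outside Splunk, one option is to compare the physical line count (what wc -l reports) with the number of records a CSV parser actually sees. Below is a minimal sketch using Python's standard csv module. That is not Splunk's parser, so its record count is only a hint, and count_lines_and_records is a hypothetical helper name, not anything Splunk ships.

```python
import csv
import io


def count_lines_and_records(text):
    """Return (physical_lines, parsed_records) for a string of CSV data.

    physical_lines counts newline characters, like `wc -l`.
    parsed_records counts rows as seen by Python's csv module, which is
    only a stand-in here for whatever parser Splunk uses internally.
    """
    physical_lines = text.count("\n")
    # newline="" keeps line breaks intact so that the csv module can
    # treat newlines inside quoted fields as part of the field.
    reader = csv.reader(io.StringIO(text, newline=""))
    parsed_records = sum(1 for _ in reader)
    return physical_lines, parsed_records


if __name__ == "__main__":
    # Made-up sample: the third line opens a quote that is never closed.
    sample = 'id,desc\n1,ok\n2,"broken\n3,fine\n'
    lines, records = count_lines_and_records(sample)
    print(f"physical lines: {lines}, parsed records: {records}")
```

For the real file you could pass in the contents of int_des.csv, read with open(path, newline=""), and assuming you know its encoding. If the two numbers differ, a quoted field probably spans several lines, either on purpose or because a quote was never closed.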

 


NOCSSMS
Explorer

Thank you for your suggestion. It seems that around that line there was malformed content (unescaped quotes) that was probably causing problems for the CSV parser. We'll work on better sanitization of the data first.
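
For anyone hitting the same problem, here is a rough sketch of the kind of pre-check that can flag suspicious lines before the file goes into the lookups directory. It assumes double quotes are the only quoting character and simply reports lines containing an odd number of them. The function name find_unbalanced_quote_lines is made up for this example. A legitimate quoted field that spans several lines is flagged too, so treat the output as a list of candidates to inspect rather than as definite errors.

```python
def find_unbalanced_quote_lines(lines, quotechar='"'):
    """Return 1-based numbers of lines with an odd count of quotechar.

    `lines` can be any iterable of strings, for example an open file.
    Escaped quotes written as two quote characters ("") keep the count
    even, so they are not reported.
    """
    suspects = []
    for number, line in enumerate(lines, start=1):
        if line.count(quotechar) % 2 == 1:
            suspects.append(number)
    return suspects


if __name__ == "__main__":
    # Made-up sample rows; the third one has an unclosed quote.
    sample = [
        'id,desc\n',
        '1,"GigabitEthernet0/1 uplink"\n',
        '2,"rack 12" panel\n',
        '3,"say ""hi"""\n',
    ]
    print(find_unbalanced_quote_lines(sample))
```

To run it against the actual lookup, open the CSV with the encoding your data uses and pass the file object in place of the sample list.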
