I have a CSV file, "resources.csv", that I use as a lookup. The actual file has about 30,000 lines, but the Splunk search only returns about 15,000 results. I'm using the following command to view the lookup table:
|inputlookup resources.csv
The CSV file is updated by a script that runs each morning. I have restarted the search head that reads this lookup file. Nothing has worked; I still get only about 15,000 results from the inputlookup command.
Any suggestions?
http://answers.splunk.com/answers/139821/inputlookup-not-returning-all-the-rows-in-csv-file.html
Seems that double quotes affect this. I had six entries with double quotes. I removed them and now I'm getting all my results. It looks like about 15,000 lines sat between the lines with quotes.
So, why would double quotes affect a CSV lookup file? Rows are delimited by line breaks, and columns by commas.
Double quotes are used to quote values in CSV. See https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules_and_examples
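A quick way to see why this drops rows: in CSV, a double quote opens a quoted field, and everything up to the matching close quote, line breaks included, belongs to that field. A minimal Python sketch (host names and layout are made up for illustration):

```python
import csv
import io

# A stray, unclosed double quote makes the CSV parser treat everything
# up to the next quote as part of a single field -- including newlines.
data = (
    'host_name,location\n'
    'hostA,"rack 1\n'    # opening quote with no close on this line
    'hostB,rack 2\n'     # these lines get swallowed into hostA's field
    'hostC,rack 3"\n'    # this quote finally closes the field
    'hostD,rack 4\n'
)

rows = list(csv.reader(io.StringIO(data)))
for row in rows:
    print(row)
# Only 3 rows come back instead of 5: the header, one merged row
# containing hostB and hostC inside hostA's location field, and hostD.
```

With unbalanced quotes thousands of lines apart, entire stretches of the file collapse into one giant field, which matches losing roughly 15,000 rows between the six quoted entries.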
It is either hitting a limit or you are not looking at the file you think you are. How is the file generated? How does it get onto the search head (or does it)? The only other option is that you have tripped over a bug. If you have investigated it thoroughly, I would open a support case.
I'm creating it with a bash shell script that uses the sqlcmd command provided by the Microsoft MS-SQL driver (this is on a RHEL 6 box) to query the database and write the file. The output overwrites the previous CSV in a lookup folder every morning.
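If the raw sqlcmd output can be post-processed before it lands in the lookup folder, Python's csv module will escape embedded double quotes correctly (by doubling them and quoting the field) instead of leaving them unbalanced. A sketch with made-up rows:

```python
import csv
import io

# Rows whose values contain literal double quotes (hypothetical data).
rows = [
    ['host_name', 'description'],
    ['2AH39911B', 'rack "A" primary'],
    ['2AH39912C', 'plain description'],
]

buf = io.StringIO()
writer = csv.writer(buf)   # default dialect doubles embedded quotes
writer.writerows(buf and rows)
print(buf.getvalue())
# Embedded quotes come out as "" inside a quoted field, so any CSV
# reader recovers every row instead of merging lines.
```

The same escaping rule (a literal `"` becomes `""` inside a quoted field) can be applied in the bash script itself if adding Python to the pipeline isn't an option.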
Does your lookup have more than 1 line per event? Splunk-generated lookup files with multivalued fields often have this property. If your lookup file has a primary key, you can try to find the set difference between the lookup file and what inputlookup returns and see if there is any pattern as to which rows are missing.
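Assuming the inputlookup results can be exported to a CSV (e.g. from the UI or with outputcsv), the set difference is a few lines of Python. The inline sample data stands in for the two files; the `host_name` key column is an assumption:

```python
import csv
import io

def key_set(csv_text, key='host_name'):
    """Collect the primary-key column from CSV text into a set."""
    return {row[key] for row in csv.DictReader(io.StringIO(csv_text))}

# Stand-ins for the two sources: in practice, read resources.csv from
# disk and an export of the inputlookup results (names are assumptions).
on_disk = key_set('host_name\nhostA\nhostB\nhostC\n')
from_splunk = key_set('host_name\nhostA\nhostC\n')

missing = sorted(on_disk - from_splunk)
print(missing)
# A pattern in the missing keys (e.g. one contiguous block of the file)
# points at the row that breaks parsing.
```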
Only one line per event in my CSV lookup file. I'm trying to find some rhyme or reason to which rows are missing.
Edit: I thought that this might apply but it doesn't. Thanks @steveyz
First of all, I can't believe your name is @jizzmaster.
Secondly, check limits.conf - I'm wondering if this is what you're hitting?
[lookup]
max_memtable_bytes = <integer>
* Maximum size of static lookup file to use an in-memory index for.
* Defaults to 10000000 in bytes (10MB)
I figured the username should fit with the product ...
max_memtable_bytes is 10MB, but my CSV is only 4.5MB.
Then it would be more like cavemaster, I'd think, but I do (personally) appreciate your humor.
That limits.conf setting does not affect inputlookup; it only affects the performance optimization for performing lookups. inputlookup is basically inputcsv, but reading from the lookup directories rather than the dispatch directory.
Also, when performing additional searches through Splunk on this lookup file, fields are missing. When I search the actual file, the rows I'm looking for are there.
This is the command I use to search through the lookup table:
|inputlookup resources.csv |search host_name=2AH39911B
And yes, the "host_name" field exists, as does the specific host_name I'm searching for. Still, no results in the Splunk search.