How come Splunk only can read 10000 lines from my csv? I need 9000000!

christianubeda · ‎02-07-2019

Hi team!

I have a problem.

I want to match two fields. The first is src_ip from an index (traffic events); the second is an IP from a CSV.

My CSV has 9,000,000 lines, and inputlookup can only read the first 10,000 lines...

How can I do it?

woodcock · ‎02-07-2019

It is hard to answer when you do not show us your search. Why would you not share your SPL?

woodcock · ‎02-07-2019

DO NOT USE JOIN. It has limits. Try this:

index=cesa_paloalto sourcetype="pan:traffic" type=TRAFFIC vendor_action=allow
| lookup ipsmalware2.csv ip AS src_ip OUTPUT ip AS keepMeIfNonNull
| where isnotnull(keepMeIfNonNull)
| stats values(src_ip)
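
A hedged variant of the search above, assuming the CSV header is spelled Ip (as the eval src_ip=Ip in the original query suggests). Field names are case-sensitive, so the lookup field has to match the header exactly. The output name matched_ip is only an illustrative choice, and counting per src_ip instead of using values() is a readability preference, not a requirement.

```
index=cesa_paloalto sourcetype="pan:traffic" type=TRAFFIC vendor_action=allow
| lookup ipsmalware2.csv Ip AS src_ip OUTPUT Ip AS matched_ip
| where isnotnull(matched_ip)
| stats count BY src_ip
```

Because lookup runs against each event in the main search rather than inside a subsearch, the subsearch result cap discussed in this thread should not apply to it.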

christianubeda · ‎02-07-2019

This is the query:

index=cesa_paloalto sourcetype="pan:traffic" type=TRAFFIC vendor_action=allow | join src_ip [| inputlookup append=t ipsmalware2.csv | eval src_ip=Ip]
| stats values(src_ip)

I inserted 5 test IPs into my CSV, with these results:

Line 1 OK
Line 9999 OK
Line 10004 FAIL
Line 12000 FAIL
Line 222333 (last one) FAIL

Vijeta · ‎02-07-2019

Do you have 9,000,000 unique IPs in your lookup? If not, can you dedup on the IP field, like this:

index=cesa_paloalto sourcetype="pan:traffic" type=TRAFFIC vendor_action=allow | join src_ip [| inputlookup append=t ipsmalware2.csv | dedup Ip | eval src_ip=Ip]
| stats values(src_ip)

christianubeda · ‎02-07-2019

Hi,

Yes, I have 9,000,000 unique IPs.

Vijeta · ‎02-07-2019

How many unique IPs does your index return within the time range you are searching?

Vijeta · ‎02-07-2019

If that is fewer than 10K, you can try the search below, which puts the index search in the subsearch instead of the lookup:

| inputlookup append=t ipsmalware2.csv | eval src_ip=Ip | join type=inner src_ip [search index=cesa_paloalto sourcetype="pan:traffic" type=TRAFFIC vendor_action=allow | stats count BY src_ip]

christianubeda · ‎02-07-2019

Hi Vijeta,

I only get matches if the IP from my CSV is in the first 10,000 lines. I actually have 9,000,000 lines in my CSV, so it didn't work.

If I run | inputlookup append=t ipsmalware2.csv on its own, I see all the lines. The problem comes when I try to match them...

Vijeta · ‎02-07-2019

That is not because of the matching; it is a limitation of subsearches. The subsearch returns only 10K results, which is why all the rest appear as not matched.
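
One way to confirm this, as a sketch: run the lookup at the top level, outside any subsearch, and count what comes back. It assumes the CSV header is Ip, as in the query posted in this thread; rows_in_lookup and unique_ips are just illustrative output names.

```
| inputlookup ipsmalware2.csv
| stats count AS rows_in_lookup, dc(Ip) AS unique_ips
```

If rows_in_lookup shows the full 9,000,000, the file itself is being read completely and the truncation is happening in the subsearch. The exact ceiling is governed by the limits.conf settings of the deployment; 10,000 is the figure observed here.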

Vijeta · ‎02-07-2019

Are you using the lookup in a subsearch? That is probably what limits it to 10K results, since there is a maximum on subsearch results. Can you dedup on the IP field, or avoid the subsearch altogether? It would also help if you could paste your query here.
