Getting Data In

How to enrich GeoIP at index time?

BeefSupreme
New Member

I am sure this is a pretty common use case, mainly because IP addresses move: the data is not static, so for security retro-hunts, or even just searching a few days of data, the geo data needs to be stored statically in the event and can't be a search-time lookup. Honestly, I can't even think of a use case where you would want geo data to be a search-time lookup, but I'm sure some exist.

Elasticsearch has a couple of options to do this (i.e., ingest nodes or Logstash), so I am sure a million people are doing this in Splunk. If someone could point me at the documentation, I would appreciate it.

The closest thing I could find is ingest-time eval, but I'm not sure how that would do GeoIP enrichment.
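
For reference, this is roughly the shape of an ingest-time eval in transforms.conf (the stanza and field names below are just placeholders, not a GeoIP config):

    # transforms.conf
    [add_event_length]
    INGEST_EVAL = event_length=len(_raw)

    # props.conf
    [my_sourcetype]
    TRANSFORMS-ingest = add_event_length

As far as I can tell it only gives you eval functions, and I don't see anything like iplocation among them.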


VatsalJagani
SplunkTrust

There is an option for lookups at index time (https://docs.splunk.com/Documentation/Splunk/8.2.4/Data/IngestLookups), but the documentation says it supports only CSV lookups.
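
Per that doc page, the ingest-time lookup is driven by the lookup() eval function inside an INGEST_EVAL. A rough sketch of the shape (the lookup file and field names are made up for illustration, and src_ip would have to already exist as an indexed field at parse time):

    # transforms.conf
    [geo_csv_enrich]
    INGEST_EVAL = src_country=json_extract(lookup("ip_geo.csv", json_object("ip", src_ip), json_array("country")), "country")

    # props.conf
    [my_sourcetype]
    TRANSFORMS-geo = geo_csv_enrich

One catch: a plain CSV lookup does exact matching, so mapping arbitrary IPs against ranges or CIDR blocks this way may not work.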


I would suggest giving it a try with the geo lookup directly, and also writing a scripted lookup (a simple Python script that performs the geo lookup) to see if that works. That is the best option you have compared to alternatives like data models, which will take a lot of resources.
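
If you try the scripted route, the classic shape is an external lookup script that reads CSV on stdin and writes CSV on stdout. A minimal sketch, assuming the MaxMind geoip2 Python package and a local GeoLite2 database (the path and field names are assumptions):

    #!/usr/bin/env python
    # External lookup: reads CSV with an "ip" column on stdin,
    # fills in the "country" column, writes CSV back on stdout.
    import csv
    import sys

    import geoip2.database  # assumption: installed via pip

    DB_PATH = "/opt/geoip/GeoLite2-Country.mmdb"  # hypothetical location

    def main():
        reader = csv.DictReader(sys.stdin)
        writer = csv.DictWriter(sys.stdout, fieldnames=reader.fieldnames)
        writer.writeheader()
        with geoip2.database.Reader(DB_PATH) as db:
            for row in reader:
                try:
                    row["country"] = db.country(row["ip"]).country.iso_code or ""
                except Exception:
                    row["country"] = ""  # private/unresolvable IPs stay blank
                writer.writerow(row)

    main()

And the matching transforms.conf stanza would be something like:

    [geoip_external]
    external_cmd = geoip_lookup.py ip country
    fields_list = ip, country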


If that does not work, then I would suggest enriching in a middle layer between the log source and Splunk.


moliminous
Path Finder

The CSV lookup at index time only works against a CSV file, so it would not help in this case.

The iplocation command in SPL is actually a scripted lookup, but just like your scripted-lookup suggestion, it would still run at search time and would not help in this case.
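
For context, the search-time usage is just this (the index and field names are illustrative):

    index=firewall_logs
    | iplocation src_ip
    | table _time, src_ip, Country, City

It resolves against whatever GeoIP database Splunk has when you run the search, not what was true when the event was indexed.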


moliminous
Path Finder

Splunk only offers GeoIP via the iplocation command at search time.
You could add the data using a third-party product before Splunk ingests it, but as far as Splunk doing it natively, the closest options you have (that I'm aware of) are:

  • Custom Accelerated Data Model
  • Modify Existing Accelerated Data Models to store those fields
  • Create a lookup table - probably using the KV store due to the high number of records (a minimal config sketch follows this list)
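
For the lookup-table option, a minimal KV store definition might look like this (the collection and field names are illustrative):

    # collections.conf
    [geo_cache]
    field.ip = string
    field.country = string

    # transforms.conf
    [geo_cache_lookup]
    external_type = kvstore
    collection = geo_cache
    fields_list = _key, ip, country

You would then populate it on a schedule, for example with | outputlookup geo_cache_lookup at the end of a scheduled search.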

It depends on your intended use cases for it.
For the use cases you mentioned, it sounds like it would be used more for investigations in the original logs, in which case you'd have to tie it to time.

For that purpose I would recommend using an Accelerated Data Model, though it wouldn't contain the raw logs.
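
For example, if you built a custom data model whose attributes include the geo fields, acceleration effectively freezes those values as of summary-build time, and you could query it like this (the data model and field names here are hypothetical):

    | tstats count from datamodel=Geo_Events where nodename=Events by Events.src_ip, Events.src_country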

If you really need it in the raw logs, I would have either Logstash or Cribl enrich the raw events before they are ingested into Splunk, or before they go off to raw log storage, depending on your needs.
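
For the Logstash route, the geoip filter is roughly this (the source field name and the target are per your pipeline, so treat them as placeholders):

    filter {
      geoip {
        source => "src_ip"
        target => "geo"
      }
    }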
