Best Way to handle Field Names Changing

katzr · ‎12-07-2017

Hello,

I have 2 dashboards built off of a data source with specific fields, but my data source is changing so the fields will be named differently with the same values in them.

What is the best/most efficient way to handle this change? I was thinking I could use coalesce or possibly rename, but I have 30+ panels of data with many different fields, so that would be a lot of work to change all of the panels and there is a lot of room for error there.

Is there a better way to map fields together so the reporting can remain consistent?

Thanks for the help!

woodcock · ‎12-07-2017

If it is only your problem, change your dashboard to use coalesce(newNameHere, oldNameHere). If it is also a problem for other people in other places, consider a field alias to map the names together.
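As a rough sketch of the coalesce approach, assuming the dashboard panels already reference oldNameHere (the index and sourcetype placeholders below are assumptions, not taken from the original question), a panel search could look something like this:

```
index=<your_index> sourcetype=<your_sourcetype>
| eval oldNameHere = coalesce(newNameHere, oldNameHere)
| stats count by oldNameHere
```

Because coalesce returns the first non-null argument, events from the new source populate oldNameHere from newNameHere, while historical events keep their existing value, so the rest of the panel search can stay as it was.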

Lamar · ‎12-07-2017

Another thing to consider with coalesce() is that you can pass more than two fields to it, i.e., coalesce(newNameHere, oldNameHere, oldName2Here, oldName3Here). Additionally, because coalesce doesn't use the index (it is evaluated during the search), it can be a bit clunky on high-cardinality searches over long time frames. YMMV.

Choose what works best for your use case.

katzr · ‎12-07-2017

@lamar if I use FIELDALIAS, I can map the new field to an existing field name, correct? I won't need to create a new alias and then map the old/new field to it, correct?

Lamar · ‎12-07-2017

Nope.

Let's say you have a field (we'll call it old, but something that has existed in your data and you would like to continue using it because all of your dashboards use it) called src_ip.

NOTE: When onboarding data with a TA (add-on) for a type of data that is CIM compliant, you'll notice that it leverages FIELDALIAS quite heavily, simply because it's the cleanest way to package up the mapping of known data into a format that works with CIM.

But you now have data that comes in with fields called IP, source or source_address. You could create field aliases for each of those, which removes the need to modify the props or transforms entries for each of those data sources (because they have new field names).

props.conf - Example

[<sourcetype>]
FIELDALIAS-translate_ip = ip AS src_ip
FIELDALIAS-translate_source = source AS src_ip
FIELDALIAS-translate_source_address = source_address AS src_ip

The example above assumes that all of your data sources come in under the same sourcetype. But you could do the same thing against different sourcetypes, sources or hosts.
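For illustration only, here is a minimal sketch of the same aliases scoped by source and by host instead of by sourcetype. The source path and host pattern are hypothetical placeholders, not values from this thread:

```
# props.conf - sketch; the path and host pattern below are made up
[source::/var/log/firewall/*.log]
FIELDALIAS-translate_source_address = source_address AS src_ip

[host::fw-*]
FIELDALIAS-translate_ip = ip AS src_ip
```

Scoping an alias to the narrowest stanza that covers the affected data keeps it from being evaluated against events that never carry the original field.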

katzr · ‎12-13-2017

@Lamar @woodcock I ended up using field aliases and they are significantly hurting my performance. Would coalesce possibly hurt my performance less?

Lamar · ‎12-13-2017

@katzr -- Could you share with us what you've done in your configs? Additionally, can you share how FIELDALIAS is hurting performance?

EDIT: One other thing -- if you could run the following search on data that has FIELDALIAS configurations against it and then open the search job inspector, that would help us better understand where Splunk is spending its time:

<search> | head 100

You'll specifically be looking for command.search.fieldalias in the command.search sub-table.

katzr · ‎12-13-2017

@Lamar yes below is the sourcetype I applied my Field Alias to. The problem is that my panels in my dashboard are taking much longer to load.

[itcc:snow]
INDEXED_EXTRACTIONS = csv
TRUNCATE = 50000
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = Opened
TIME_FORMAT = %Y-%m-%d %H:%M:%S
FIELDALIAS-Incident Metric Table = "Assigned To" AS assigned_to "Business service" AS business_service Category AS category Closed AS closed_at "Closed By" AS closed_by "Closure code" AS close_code Company AS caller_id_company "Configuration item" AS cmdb_ci "Contact type" AS contact_type Country AS caller_id_country Email AS caller_id_email "Employee Type" AS caller_id_u_employee_type "Employee number" AS caller_id_employee_number "Incident Type" AS u_prob_type "Job Band" AS caller_id_u_job_band Knowledge AS knowledge Language AS caller_id_preferred_language Location AS location "Made SLA" AS made_sla Name AS caller_id_name Number AS number Opened AS opened_at "Opened by" AS opened_by Organization AS caller_id_u_organization "Password Last Reset" AS caller_id_u_password_last_reset Region AS caller_id_location_u_pg_region Resolved AS resolved_at "Resolved by" AS resolved_by "Service offering" AS service_offering "Service type" AS u_service_type "Short Description" AS short_description Sponsor AS caller_id_u_pg_sponsor Status AS state Subcategory AS subcategory Value AS assignment_group

Lamar · ‎12-13-2017

Do you need the original field names? If not, since you're using INDEXED_EXTRACTIONS, you could just apply your own field names at index time. FWIW, after doing this you can then run your analytic searches against this data using tstats, which should be remarkably faster.

EDIT: If you stay with FIELDALIAS, I would also suggest separating those aliases so that each one has its own FIELDALIAS entry. Keep in mind that if you try coalesce for this effort, your search bloat will go up dramatically unless you put the coalesce in a macro.
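To illustrate the macro idea, here is a minimal sketch using three of the field pairs from the config above. The macro name normalize_snow_fields is hypothetical, and a real version would need one coalesce per field the dashboards use:

```
# macros.conf - sketch; the macro name is an assumption
[normalize_snow_fields]
definition = eval assigned_to=coalesce(assigned_to, 'Assigned To'), category=coalesce(category, Category), state=coalesce(state, Status)
```

Each panel would then call the macro instead of repeating the eval, for example:

```
index=<your_index> sourcetype=itcc:snow
| `normalize_snow_fields`
| stats count by category
```

Field names that contain spaces are wrapped in single quotes inside eval, which is why 'Assigned To' is quoted and Category is not.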

katzr · ‎12-13-2017

Essentially all of these new field names are the field names of my historical data, so I couldn't use tstats for any of those because they are just normal indexed data, correct?

So would applying my own field names at index time help my performance if I don't use tstats?

Also, where would I specify the field names at index time, and how do I do that?

Thanks for the help!

Lamar · ‎12-13-2017

Because you're using INDEXED_EXTRACTIONS for this data, Splunk automatically writes the extracted fields to the tsidx files alongside the itcc:snow data, so tstats can search them.

Example tstats search against your data:

| tstats count as total where index=<your_index> sourcetype=itcc:snow by _time, Company | timechart sum(total) by Company

If you applied your own field names in the beginning (read: index time) you wouldn't have to do the coalesce or FIELDALIAS calisthenics to get your data to look the way you want. 😉

Example configuration (on forwarder):
[itcc:snow]
INDEXED_EXTRACTIONS = csv
TRUNCATE = 50000
SHOULD_LINEMERGE = false
TIMESTAMP_FIELDS = Opened
HEADER_FIELD_LINE_NUMBER = 30
FIELD_NAMES = assigned_to, business_service, category, closed_at, closed_by, close_code, company ...

woodcock · ‎12-07-2017

You can map anything to anything.

somesoni2 · ‎12-07-2017

You could set up a field alias so that the new fields have an equivalent alias available under the old field names. This way your dashboard can stay the same.

Lamar · ‎12-07-2017

Try using a FIELDALIAS.

https://docs.splunk.com/Documentation/Splunk/7.0.0/Knowledge/Addaliasestofields
