My use case: I want to create a timechart of the number (count) of requests to a system, split by "connection type": that is, how the requests arrived at the system.
The connection type is represented in the log as a field named conn_type
containing a fixed-length string of 8 hexadecimal digits. For example, the value "0000000A"
indicates that the request is from system XYZ.
I want the timechart legend to show descriptive labels, not these hex values. For example, instead of "0000000A", I want the legend to show something like "From system XYZ".
I could replace all of the original values with readable values before creating the timechart. For example:
... | eval conn_type=case(conn_type=="0000000A", "From system XYZ", ...) | timechart ...
but I'd prefer to use a technique that doesn't involve processing every input event for the timechart. That seems like too much processing.
I'd prefer to rename the fields after the timechart
command, like this:
... | timechart count by conn_type | rename "0000000A" as "From system XYZ", "0000000B" as "Entered on the command line" ...
That works, but I wonder whether, rather than coding this rename
command inline, I could use a CSV file as a lookup and dynamically build the rename
command. Hence this question.
For example, given the CSV file connection_types.csv
with the following structure:
conn_type,description
0000000A,From system XYZ
0000000B,Entered on the command line
...
(There are a dozen or so connection types.)
can I use a subsearch to build a rename
command as a string, as done by this search string:
|inputlookup connection_types.csv | table conn_type description | eval rename_phrase=conn_type + " as " + "\"" + description + "\"" | stats values(rename_phrase) as rename_phrases | eval search="rename " + mvjoin(rename_phrases,", ") | fields search
and then - here's the trick - run that returned string as a command?
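(For the two sample CSV rows above, that subsearch would produce a search field containing the string:

```
rename 0000000A as "From system XYZ", 0000000B as "Entered on the command line"
```

so the generated command matches the inline rename I wrote by hand.)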
For now, I'll use rename
after the timechart
- it works - but I'm curious to know whether what I've described here is possible.
Hi Graham,
If a lookup before the timechart isn't what you're looking for (I guess you have many events but only a few conn types),
have you considered using a custom search command?
Here's a simple script that might work for you (it may still need some tuning for your environment):
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import splunk.Intersplunk as sis

(keywords, options) = sis.getKeywordsAndOptions()

conn_types = {
    '0000000A': 'From system XYZ',
    '0000000B': 'Entered on the command line',
    '0000000C': 'from C',
    '0000000D': 'others',
}

def main():
    results = sis.readResults(None, None, True)
    for row in results:
        # Iterate over a snapshot of the keys, since we modify the dict in place.
        for key in list(row):
            new_key = conn_types.get(key.strip())
            if new_key:
                row[new_key] = row[key]
                del row[key]
    sis.outputResults(results)
    return 0

try:
    main()
except Exception as e:
    import traceback
    stack = traceback.format_exc()
    sis.generateErrorResults("Error '{e}'. {s}".format(e=e, s=stack))
You can place this script in $SPLUNK_HOME/etc/apps/search/bin/ as, say, renamecolumns.py,
and add/edit $SPLUNK_HOME/etc/apps/search/local/commands.conf:
[renamecolumns]
filename = renamecolumns.py
then run a search like this:
source=*test.log | timechart count by conn_type | renamecolumns
Basically, this reads the results and modifies the column names for you,
and of course you can read the mapping from a CSV file if you'd like.
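As a sketch of that CSV variant, the hardcoded dict could be replaced by a loader run at startup. This assumes a two-column file with a conn_type,description header row, like the connection_types.csv from the question; the helper name load_conn_types is my own:

```python
import csv

def load_conn_types(path):
    """Return a dict mapping conn_type values to their descriptions,
    read from a CSV file with a conn_type,description header row."""
    mapping = {}
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            mapping[row['conn_type'].strip()] = row['description']
    return mapping
```

In the script above you would then write conn_types = load_conn_types(...) with the path to your copy of the CSV, instead of the literal dict.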
But if you have many events and/or a large lookup,
you might need to test which is faster: the custom command or the lookup.
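For comparison, the lookup-before-timechart approach could look something like this; it assumes a lookup definition (here named connection_types, my choice of name) has been set up for connection_types.csv:

```
source=*test.log | lookup connection_types conn_type OUTPUT description | timechart count by description
```

This relabels each event before aggregation, so the legend shows the descriptions directly.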
HTH,
Bill