Hi,
I have a log like the following:
rid=iqwenoasd service=CP scopes=add-w,oot-s fields=birthdate,emails,identifier issuer=AWS CA TE empty=
My expected key-value pairs are:
rid=iqwenoasd
service=CP
scopes=add-w,oot-s
fields=birthdate,emails,identifier
issuer=AWS CA TE
empty=null
Since the log contains some special cases (a value with space characters, an empty value), how can I use REGEX on the search servers to achieve that? Using ([^=]+)=(.+) doesn't seem to work.
In transforms.conf:
[cp_logs_report]
REGEX = ([^=]+)=(.+)
FORMAT = $1::$2
In props.conf:
[my_sourcetype]
REPORT-cp-logs-ext = cp_logs_report
In the end I used REGEX = (\S+)=([^=]*)(?:\h+|$)
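For anyone who wants to check the two patterns outside Splunk, here is a minimal Python sketch run against the sample event. It is only an approximation: Splunk uses PCRE, and Python's re module does not support \h, so [ \t] is substituted for it here (that substitution is my assumption, not part of the original config).

```python
import re

# Sample event copied from the question.
log = ("rid=iqwenoasd service=CP scopes=add-w,oot-s "
       "fields=birthdate,emails,identifier issuer=AWS CA TE empty=")

# First attempt: (.+) is greedy, so the first match swallows the rest of the line
# and only one key-value pair comes back.
greedy = re.findall(r"([^=]+)=(.+)", log)

# Final pattern, with \h (PCRE horizontal whitespace) replaced by [ \t]
# because Python's re module has no \h.
pairs = re.findall(r"(\S+)=([^=]*)(?:[ \t]+|$)", log)

for key, value in pairs:
    print(f"{key}={value!r}")
```

The second pattern works because [^=]* first runs up to the next "=" and then backtracks to the last whitespace before the next key, so "AWS CA TE" stays in one value and the trailing "empty=" yields an empty string.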
Hi @hchen11,
did you try a regex like this?
| rex "rid\=(?<rid>[^ ]+)\s+service\=(?<service>[^ ]+)\s+scopes\=(?<scopes>[^ ]+)\s+fields\=(?<fields>[^ ]+)\s+issuer\=(?<issuer>.+)\s+empty\=(?<empty>.*)"
You can use it in SPL or in a field extraction.
You can test the regex at https://regex101.com/r/dNUKjy/1
Then, to get NULL in empty, you can use fillnull:
| fillnull empty value="NULL"
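As a rough illustration of what the rex and fillnull steps above do, here is a small Python sketch against the sample event. Note the assumptions: Python writes named groups as (?P<name>...) instead of (?<name>...), and the fillnull step is only imitated by treating the empty capture as a missing value, which is how I expect Splunk to handle it.

```python
import re

# Sample event copied from the question.
log = ("rid=iqwenoasd service=CP scopes=add-w,oot-s "
       "fields=birthdate,emails,identifier issuer=AWS CA TE empty=")

# Same structure as the rex pattern above, using Python's (?P<name>...) syntax.
pattern = re.compile(
    r"rid=(?P<rid>[^ ]+)\s+service=(?P<service>[^ ]+)\s+scopes=(?P<scopes>[^ ]+)"
    r"\s+fields=(?P<fields>[^ ]+)\s+issuer=(?P<issuer>.+)\s+empty=(?P<empty>.*)"
)

match = pattern.search(log)
fields = match.groupdict() if match else {}

# Stand-in for `| fillnull empty value="NULL"`: an empty capture is treated
# as a missing value and replaced with the string "NULL".
if not fields.get("empty"):
    fields["empty"] = "NULL"

for name, value in fields.items():
    print(f"{name}={value}")
```

Unlike the generic key=value pattern, this one depends on the fields appearing in exactly this order, so it would need adjusting for events where some fields are missing.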
Ciao.
Giuseppe
Hi @gcusello ,
I would like to use REGEX in transforms.conf rather than in the Splunk Web UI.
Another question: should I use index-search on the index server (search head)?
Hi @hchen11,
what do you mean by index-search?
Do you want to extract fields at index time?
If you configure the above extraction as a field extraction, the definitions live in props.conf and transforms.conf and you don't need to use the GUI or SPL; but obviously they are applied at search time, not at index time.
Ciao.
Giuseppe
Hi @gcusello ,
Yes, what I mean is extracting fields at index time.
What should I do if I want to use it at index time? Should I just change REPORT-* to TRANSFORMS-* and put these configs on the indexer servers?
Hi @hchen11,
see https://docs.splunk.com/Documentation/Splunk/8.0.5/Data/Configureindex-timefieldextraction but anyway, why do you want to extract fields at index time?
This is a good idea if you don't index many logs and you have many users and searches; is this your situation?
Ciao.
Giuseppe
Hi @gcusello,
we don't want to impact performance when searching, and that's the main reason we want to do the extraction at index time. We have around 50 fields in our logs; is that too many?
Hi @hchen11,
the number of fields is one of the parameters, but not the first; the main ones are the available resources, users, and searches.
Anyway, extracting fields at index time requires more resources and time at index time; if you have many events to index, it isn't a good idea.
Ciao.
Giuseppe
Hi @gcusello ,
I had not carefully considered before whether we need to extract at index time. Could you provide more details about how I should make that decision?
About resources, we have around 20 index servers and 3 search servers. About users and searches, around 3 users will use search and run about 100 searches daily on average. Furthermore, not all fields exist in every event.
This discussion is very useful to me; I hope I can get more suggestions from you.
Hi @hchen11,
designing the architecture of an infrastructure is a job for an Architect, who analyzes the requirements (users, scheduled searches, logs to index, apps, etc.) to define the correct number of servers, their resources, and their configurations.
As I said, extracting fields at index time is a good practice if you don't have a high indexing load, so you can put additional load on the indexers at index time and gain advantages at search time; but if you have many logs (and with 10 Indexers I think this is your situation) and only three users, it's probably better to extract fields at search time.
As I said, this should be deeply analyzed by an Architect.
Ciao.
Giuseppe