Solved: How to index arbitrary number of fields and do tstats operations on them?

hettervik · ‎10-17-2017

Hi,

I've got these strange XML logs, where each log has (among other things) a username and an arbitrary number of hashes, each stored in its own XML field. A simplified version of the log is shown below.

[...]<user>hettervi</user><hash1>sdflkjsdf</hash1><hash2>sdfoiujkalw</hash2>[...]<hashn>powkerldsf</hashn>

There are usually no more than 13-14 hashes per event, and what I'm trying to do is count by user and hash. To do this I've used the foreach command with the mvappend function to combine the XML fields into one multivalue field, and then count by that new multivalue field, as shown in the search below.

| foreach hash* [ eval hashes=mvappend(hashes, '<<FIELD>>')]
| stats count by hashes user

The problem is that this is quite slow, mostly due to the large volume of logs. I've looked into making a multivalue indexed field so that I can use tstats instead of stats, or using an accelerated data model with a multivalue field for the hashes, but as far as I can tell this isn't possible. Any idea how I can make this search faster, e.g. by doing some indexing and tstats magic?

martin_mueller · ‎10-20-2017

something like

[extract_hashes]
REGEX = <hash\d+>([^<]+)
FORMAT = hash::$1
REPEAT_MATCH = true
WRITE_META = true

and obviously, props.conf TRANSFORMS-extract_hashes = extract_hashes
then you might be able to do | tstats count where foo by user hash

give that a shot in a sandbox.
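To get a feel for what the stanza above would extract before trying it in Splunk, here is a minimal Python sketch that applies the same regex repeatedly to one event, roughly what REPEAT_MATCH = true asks for. It only imitates the matching; it is not how Splunk writes index-time fields. The function names and the `<user>` regex are assumptions made for illustration, and the sample event uses hash3 in place of the symbolic hashn, since `<hash\d+>` only matches numbered tags.

```python
import re
from collections import Counter

# Same pattern as REGEX in the stanza above.
HASH_RE = re.compile(r"<hash\d+>([^<]+)")
# Assumption: the user is pulled out separately; this regex is illustrative only.
USER_RE = re.compile(r"<user>([^<]+)")


def index_time_tokens(raw_event):
    """Return the hash::value pairs a repeat-matching extraction would yield."""
    # findall returns every non-overlapping match, not just the first one.
    return ["hash::" + value for value in HASH_RE.findall(raw_event)]


def count_by_user_hash(raw_events):
    """Count (user, hash value) pairs across events, like 'count by user hash'."""
    counts = Counter()
    for raw in raw_events:
        user_match = USER_RE.search(raw)
        if user_match is None:
            continue  # skip events without a user
        for value in HASH_RE.findall(raw):
            counts[(user_match.group(1), value)] += 1
    return counts


sample = (
    "<user>hettervi</user><hash1>sdflkjsdf</hash1>"
    "<hash2>sdfoiujkalw</hash2><hash3>powkerldsf</hash3>"
)
tokens = index_time_tokens(sample)
counts = count_by_user_hash([sample, sample])
```

Each event ends up with one hash::value pair per hash tag, so the same key repeats with different values for a single event.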

hettervik · ‎10-24-2017

Thanks, this worked perfectly (in the sandbox)! Using this config we get indexed multivalue hash fields for the events, which I didn't even know was possible. How do the multivalue fields get stored in the metadata? Anyhow, I've requested that the config be implemented in prod now, which should speed up my search drastically.

lfedak_splunk · ‎10-17-2017

Hey @hettervi, if they solved your problem, remember to "√Accept" an answer to award karma points 🙂

hettervik · ‎10-18-2017

I will, but it isn't solved quite yet. I'm in Europe, so expect some answer lag from my side. 🙂

DalJeanis · ‎10-17-2017

I'm assuming you want count by the value of the hash, not the name of the hash. If not, you can adjust the last line.

Try this -

| fields user hash*
| untable user hashname hashvalue
| stats count by user hashvalue
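For readers unsure what hashname and hashvalue refer to: they are not fields in the data but the two output columns untable creates, one holding the original field name (hash1, hash2, ...) and one holding its value. A rough Python sketch of that wide-to-long reshaping followed by the count is below. It only illustrates the idea, not Splunk's implementation, and the function names and sample rows are made up for the example.

```python
from collections import Counter


def untable(rows, key_field):
    """Yield (key, field_name, field_value) for every non-key field of each row."""
    for row in rows:
        for name, value in row.items():
            if name != key_field:
                yield row[key_field], name, value


def count_by_user_hashvalue(rows):
    """Count rows per (user, hash value), ignoring which hashN field held it."""
    return Counter((user, value) for user, _name, value in untable(rows, "user"))


# Hypothetical events after '| fields user hash*', one dict per event.
events = [
    {"user": "hettervi", "hash1": "sdflkjsdf", "hash2": "sdfoiujkalw"},
    {"user": "hettervi", "hash1": "sdflkjsdf"},
]
long_rows = list(untable(events, "user"))
counts = count_by_user_hashvalue(events)
```

The count is keyed on the value alone, so the same hash showing up as hash1 in one event and hash2 in another is still counted together.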

hettervik · ‎10-18-2017

Hi. Thanks, but this wasn't quite what I was looking for. I've looked in the docs, and I can't quite get this command to fit my data; that is, I have no field for hash names. The multivalue field I created contains all the hash values, which are originally stored in independent fields (hash1, hash2, ... , hashn).

Anyhow, what I was really looking for was a way to index the fields so that I can do accelerated searches on them, like tstats. I'm already getting the right results, no worries, but it takes too long. I don't think I can just index the fields in the traditional way, as this would still require me to retrieve all the events to the search heads before doing calculations, so the field indexing would have no real effect.
