I'm trying to extract the total word count from field1 but am unable to find the correct solution. The format is:
field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18, 'articlesByCountry': {'CA':124, 'US':50, 'AUS':19, 'NZ':2}, 'publishersbyCountry':{'CA':124, 'US':50, 'AUS':19}}
There's much, much more to this field than I listed above, but I am only interested in the total word count. Any idea how to extract this information?
I've tried |rex field=field1 "'totalWordCount': * " but get the error message: The regex "totalWordCount':*' does not extract anything. It should specify at least one name group. Format: (?<name>...).
I'm still new to Splunk, so bear with me!
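For what it's worth, the error message is pointing at a missing named capture group: the part of the pattern you want to keep has to be wrapped in (?<name>...) so that rex knows which field to create. Below is a minimal sketch of that idea, checked outside Splunk with Python's re module purely for convenience. Note the assumptions: Python spells the group (?P<name>...) where rex uses (?<name>...), and the sample string is shortened from the one in the question.

```python
import re

# Sample value shortened from the field1 content in the question.
field1 = "{'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18}"

# Python's re module writes a named group as (?P<name>...);
# the rex equivalent would be (?<totalWordCount>\d+).
pattern = re.compile(r"'totalWordCount':\s*(?P<totalWordCount>\d+)")

match = pattern.search(field1)
# Convert the captured digits to an integer, or None if nothing matched.
total_word_count = int(match.group("totalWordCount")) if match else None
print(total_word_count)
```

The \s* between the colon and the digits allows for the space that appears after 'totalWordCount': in the sample.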
Hi
There seems to be a space before the number; I guess that's why the regex was failing. Try it now, and try it twice: once with the backslashes before the single quotes removed, and once with them included.
index=publishing*
|rex max_match=0 field=_raw "{\'totalWordCount\':\s(?<totalWordCount>\d+)"
|table totalWordCount
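To illustrate the whitespace point: in the sample, totalWordCount has a space after the colon while keys such as totalUsers do not, so \s* (zero or more whitespace characters) is likely a more forgiving choice than a single \s. A small sketch in Python's re module, used here only to test the pattern outside Splunk. The no-space string is a hypothetical variant and not taken from the actual data, and the group name n is made up for brevity.

```python
import re

# First string follows the sample in the question; the second is a
# hypothetical variant with no space after the colon.
with_space = "{'totalWordCount': 44891, 'totalUsers':49}"
without_space = "{'totalWordCount':44891, 'totalUsers':49}"

# \s requires exactly one whitespace character; \s* accepts zero or more.
strict = re.compile(r"'totalWordCount':\s(?P<n>\d+)")
lenient = re.compile(r"'totalWordCount':\s*(?P<n>\d+)")

results = {
    "strict/with_space": strict.search(with_space) is not None,
    "strict/without_space": strict.search(without_space) is not None,
    "lenient/with_space": lenient.search(with_space) is not None,
    "lenient/without_space": lenient.search(without_space) is not None,
}
for name, matched in results.items():
    print(name, matched)
```

Only the strict pattern on the no-space variant fails to match, which is why the lenient form is the safer default if the spacing is not guaranteed.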
Hi @kkosiur
Try this:
|rex max_match=0 field=_raw "\"totalWordCount\":(?<totalWordCount>\d+)" |table totalWordCount
You can try the regex above, but one question out of curiosity: this looks like JSON-format data, right? If you assign your data the sourcetype _json, shouldn't it extract the field as something like field1.totalWordCount, so that you can simply read the count? Why use a regex at all?
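One caveat on the JSON idea: the sample in the question uses single quotes around its keys, and strict JSON requires double quotes. The sketch below shows a strict parser (Python's json module, chosen only as a convenient stand-in) rejecting the sample as posted and accepting it once the quotes are swapped. Whether Splunk's own JSON extraction tolerates single quotes is not something this demonstrates, and is_strict_json is a hypothetical helper name.

```python
import json

# Shortened from the field1 sample in the question (single-quoted keys).
sample = "{'totalWordCount': 44891, 'totalUsers':49}"

def is_strict_json(text):
    """Return True if text parses as strict JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

single_quoted_ok = is_strict_json(sample)
# Same content with the single quotes swapped for double quotes.
double_quoted_ok = is_strict_json(sample.replace("'", '"'))
print(single_quoted_ok, double_quoted_ok)
```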
Hmm, that search didn't work. It returned the error message 'Error in SearchOperator:rex': Usage: regex [field=<field>]<regex>.
Unfortunately, I have no control over how the data is ingested, so it seems I am stuck with rex.
Hi @kkosiur
OK, that's because of the double quotes in my regex; your data has single quotes. I have changed it:
|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)" |table totalWordCount
Try it now, and in case of an error, can you paste the complete search you are running? That would be helpful.
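As a rough check of the quoting point, here is a sketch that runs three variants of the pattern against the sample text using Python's re module (again only a stand-in for testing outside Splunk; named groups are written (?P<name>...) there). It covers matching only and does not explain a rex usage error. The dictionary keys are made-up labels.

```python
import re

# Raw text shortened from the sample in the question.
raw = "field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20}"

patterns = {
    # Double quotes around the key: the data uses single quotes.
    "double_quoted": r'"totalWordCount":(?P<totalWordCount>\d+)',
    # Single quotes, but no allowance for the space after the colon.
    "single_no_space": r"'totalWordCount':(?P<totalWordCount>\d+)",
    # Single quotes with optional whitespace before the digits.
    "single_optional_space": r"'totalWordCount':\s*(?P<totalWordCount>\d+)",
}

matched = {name: re.search(p, raw) is not None for name, p in patterns.items()}
for name, ok in matched.items():
    print(name, ok)
```

On this sample only the last variant matches, because the space after 'totalWordCount': has to be accounted for as well as the quote style.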
Hi @venky1544, this search returned the same error message again. I am running:
index=publishing*
|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)"
|table totalWordCount