Getting Data In

How to extract wordcount from Array?

kkosiur
Loves-to-Learn Lots

I'm trying to extract the total word count from field1 but am unable to find the correct solution. The format is: 

field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18, 'articlesByCountry': {'CA':124, 'US':50, 'AUS':19, 'NZ':2}, 'publishersbyCountry':{'CA':124, 'US':50, 'AUS':19}}

There's much, MUCH more to this field than I listed above, but I am only interested in the total word count. Any idea how to extract this information?

I've tried |rex field=field1 "'totalWordCount': * " but get the error message: "The regex "totalWordCount':*' does not extract anything. It should specify at least one name group. Format: (?<name>...)."

I'm still new to Splunk, so bear with me!


venky1544
Builder

Hi 

There seems to be a space before the number; I guess that's why the regex was failing. Try the search below twice: once with the backslashes before the single quotes removed, and once with them included.

index=publishing*

|rex max_match=0 field=_raw "{\'totalWordCount\':\s(?<totalWordCount>\d+)"

|table totalWordCount

 


venky1544
Builder

Hi @kkosiur

Try this:

 |rex max_match=0 field=_raw "\"totalWordCount\":(?<totalWordCount>\d+)" |table totalWordCount

You can try the above regex, but one question out of curiosity:

This looks like JSON-format data, right? If you assign the sourcetype _json to your data, Splunk should extract the field as field1.totalWordCount and you can simply read the count. Why use a regex?
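For readers who cannot change the sourcetype, a search-time variant of the JSON idea can be sketched as below. The sample in the question uses single quotes, so it is not strict JSON as posted. This sketch assumes field1 is already an extracted field and that no value inside it contains an apostrophe. The name field1_json is a hypothetical helper field, and the search has not been checked against the poster's data.

```
index=publishing*
| eval field1_json=replace(field1, "'", "\"")
| spath input=field1_json path=totalWordCount output=totalWordCount
| table totalWordCount
```

The eval step swaps every single quote for a double quote so that spath can parse the result as JSON. If the assumption about apostrophes does not hold for the full field, the rex approach is likely the safer route.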

 


kkosiur
Loves-to-Learn Lots

Hmm, that search didn't work. It returned the error message: 'Error in SearchOperator:rex': Usage: regex [field=<field>]<regex>.

Unfortunately, I have no control over how the data is ingested, so it seems like I am stuck with rex.


venky1544
Builder

Hi @kkosiur 

OK, that's because of the double quotes in my regex; your data has single quotes. I have changed it:

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)" |table totalWordCount

Try it now, and in case of an error, can you paste the complete search you are running? That would be helpful.

 

 


kkosiur
Loves-to-Learn Lots

Hi @venky1544, this search returned the same error message again. I am running:

index=publishing*

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)"

|table totalWordCount
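For comparison, here is a minimal sketch of the same extraction written without escaping the single quotes and with an optional-whitespace match before the number. It assumes field1 exists as an extracted field, as in the original question; if it does not, field=_raw would be the alternative. It has not been run against the poster's data.

```
index=publishing*
| rex field=field1 "'totalWordCount':\s*(?<totalWordCount>\d+)"
| table totalWordCount
```

Here \s* tolerates zero or more spaces after the colon, which covers both the 'totalWordCount': 44891 form shown in the sample and a form with no space. The named group (?<totalWordCount>\d+) is what rex requires in order to create the field. max_match is left out on the assumption that each event contains the key only once.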
