How to extract wordcount from Array?

kkosiur
Loves-to-Learn Lots

I'm trying to extract the total word count from field1 but am unable to find the correct solution. The format is: 

field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18, 'articlesByCountry': {'CA':124, 'US':50, 'AUS':19, 'NZ':2}, 'publishersbyCountry':{'CA':124, 'US':50, 'AUS':19}}

There's much, MUCH more to this field than I listed above, but I am only interested in the total word count. Any idea how to extract this information?

I've tried |rex field=field1 "'totalWordCount': * " but get the error message "The regex 'totalWordCount':*' does not extract anything. It should specify at least one named group. Format: (?<name>...)."

I'm still new to Splunk, so bear with me!


venky1544
Builder

Hi 

There seems to be a space before the number, which I guess is why the regex was failing. Try the search below twice: once with the backslashes before the single quotes removed, and once with the backslashes included.

index=publishing*

|rex max_match=0 field=_raw "{\'totalWordCount\':\s(?<totalWordCount>\d+)"

|table totalWordCount
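
For anyone who wants to check the pattern without touching the index, here is a minimal, self-contained sketch. It is an illustration under assumptions, not a confirmed fix: the eval line is a trimmed, hypothetical copy of the sample from the question, the extraction targets field1 rather than _raw (the question says the value lives in field1), and \s* is used so the optional space after the colon does not matter.

```
| makeresults
| eval field1="{'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20}"
| rex field=field1 "'totalWordCount':\s*(?<totalWordCount>\d+)"
| table totalWordCount
```

If the pattern matches, the table should show 44891 in totalWordCount. Once that works on the sample, the same rex line can be appended to the real base search.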

 


venky1544
Builder

Hi @kkosiur

Try this:

 |rex max_match=0 field=_raw "\"totalWordCount\":(?<totalWordCount>\d+)" |table totalWordCount

You can try the regex above, but I have a question out of curiosity: this is JSON-format data, right? If the data were assigned the _json sourcetype, the field should be extracted automatically as field1.totalWordCount and you could simply read the count. Why use a regex at all?
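
If changing the sourcetype is not an option, a search-time alternative to rex might look like the sketch below. It rests on two assumptions: that field1 holds the whole object as a string, and that single quotes appear only as delimiters (the sample uses single quotes, so it is not strict JSON until they are swapped for double quotes). The eval line with sample data and the helper field name field1_json are hypothetical and only there to make the search self-contained.

```
| makeresults
| eval field1="{'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20}"
| eval field1_json=replace(field1, "'", "\"")
| spath input=field1_json path=totalWordCount output=totalWordCount
| table totalWordCount
```

The blanket quote replacement would corrupt any value that contains an apostrophe, so for a field with many free-text values the rex approach is probably the safer choice.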

 


kkosiur
Loves-to-Learn Lots

Hmm, that search didn't work. It returned the error message "Error in 'SearchOperator:rex': Usage: regex [field=<field>]<regex>".

Unfortunately, I have no control over how the data is ingested, so it seems like I am stuck with rex.


venky1544
Builder

Hi @kkosiur 

OK, that's because of the double quotes in my regex; your data has single quotes. I have changed it:

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)" |table totalWordCount

Try it now, and in case of an error, can you paste the complete search you are running? That would be helpful.

 

 


kkosiur
Loves-to-Learn Lots

Hi @venky1544, this search returned the same error message again. I am running:

index=publishing*

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)"

|table totalWordCount
