How to extract wordcount from Array?

kkosiur
Loves-to-Learn Lots

I'm trying to extract the total word count from field1 but am unable to find the correct solution. The format is: 

field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18, 'articlesByCountry': {'CA':124, 'US':50, 'AUS':19, 'NZ':2}, 'publishersbyCountry':{'CA':124, 'US':50, 'AUS':19}}

There's much, much more to this field than I listed above, but I am only interested in the total word count. Any idea how to extract this information?

I've tried |rex field=field1 "'totalWordCount': * " but I get this error message: "The regex "totalWordCount':*' does not extract anything. It should specify at least one name group. Format: (?<name>...)."

I'm still new to Splunk, so bear with me!


venky1544
Builder

Hi,

There seems to be a space before the number, which I guess is why the regex was failing. Try the search below twice: once without the backslashes before the single quotes, and once with them.

index=publishing*

|rex max_match=0 field=_raw "{\'totalWordCount\':\s(?<totalWordCount>\d+)"

|table totalWordCount
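
To check the pattern independently of the indexed data, here is a run-anywhere sketch. It assumes a shortened copy of the sample value from the question sits in a field named field1. makeresults only fabricates a single test event, and \s* allows for zero or more spaces after the colon. Escaping the single quotes should not be needed inside the double-quoted regex.

```
| makeresults
| eval field1="{'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20}"
| rex field=field1 "'totalWordCount':\s*(?<totalWordCount>\d+)"
| table totalWordCount
```

If this returns a value but the search against index=publishing* does not, the difference is likely in the real events (for example, double quotes or different spacing) rather than in the regex itself.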

 


venky1544
Builder

Hi @kkosiur

Try this:

 |rex max_match=0 field=_raw "\"totalWordCount\":(?<totalWordCount>\d+)" |table totalWordCount

You can try the above regex, but one question out of curiosity: this is JSON-format data, right? If you assign the sourcetype _json to your data, the field should be extracted as field1.totalWordCount and you can simply get the count. Why use the regex?
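
If the sourcetype cannot be changed, a search-time alternative might look like the sketch below. The sample uses single quotes, which strict JSON does not allow, so this sketch first swaps them for double quotes and then parses the result with spath. The field name json is made up for illustration, and the quote swap would corrupt any value that itself contains an apostrophe.

```
| makeresults
| eval field1="{'totalWordCount': 44891, 'totalUsers':49, 'articlesByCountry': {'CA':124, 'US':50}}"
| eval json=replace(field1, "'", "\"")
| spath input=json path=totalWordCount output=totalWordCount
| table totalWordCount
```

Against real data, drop the first two lines and start from the base search instead.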

 


kkosiur
Loves-to-Learn Lots

Hmm, that search didn't work. It returned the error message: 'Error in SearchOperator:rex': Usage: regex [field=<field>]<regex>.

Unfortunately, I have no control over how the data is ingested, so it seems like I am stuck with rex.


venky1544
Builder

Hi @kkosiur 

OK, that's because of the double quotes in my regex; your data has single quotes. I have changed it:

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)" |table totalWordCount

Try it now. If you still get an error, can you paste the complete search you are running? That would be helpful.

 

 


kkosiur
Loves-to-Learn Lots

Hi @venky1544, this search returned the same error message again. I am running:

index=publishing*

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)"

|table totalWordCount
