Getting Data In

How to extract wordcount from Array?

kkosiur
Loves-to-Learn Lots

I'm trying to extract the total word count from field1 but am unable to find the correct solution. The format is: 

field1: {'totalWordCount': 44891, 'totalUsers':49, 'usUsers':20, 'publishers':18, 'articlesByCountry': {'CA':124, 'US':50, 'AUS':19, 'NZ':2}, 'publishersbyCountry':{'CA':124, 'US':50, 'AUS':19}}

There's much, MUCH more to this field than I listed above, but I am only interested in the total word count. Any idea how to extract this information?

I've tried |rex field=field1 "'totalWordCount': * " but get the error message "The regex "totalWordCount':*' does not extract anything. It should specify at least one named group. Format: (?<name>...)."

I'm still new to Splunk, so bear with me!

0 Karma

venky1544
Builder

Hi,

There seems to be a space before the number; I guess that's why the regex was failing. Try it now, and try it twice: once with the backslashes before the single quotes removed, and once with them included.

index=publishing*

|rex max_match=0 field=_raw "{\'totalWordCount\':\s(?<totalWordCount>\d+)"

|table totalWordCount

 


 

0 Karma

venky1544
Builder

Hi @kkosiur,

Try this:

 |rex max_match=0 field=_raw "\"totalWordCount\":(?<totalWordCount>\d+)" |table totalWordCount

You can try the regex above, but one question out of curiosity:

Isn't this JSON-format data? If you assign the sourcetype _json to your data, the field should be extracted as field1.totalWordCount and you can simply read the count. Why use a regex?

 

0 Karma

kkosiur
Loves-to-Learn Lots

Hmm, that search didn't work. It returned the error message 'Error in SearchOperator:rex': Usage: regex [field=<field>]<regex>.

Unfortunately, I have no control over how the data is ingested, so it seems like I am stuck with rex.
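
For what it's worth, JSON-style extraction can also be attempted at search time, without touching ingestion. The sample is single-quoted, so it is not strict JSON as shown; the sketch below swaps the quotes first and then applies spath. It assumes field1 is already an extracted field holding the whole structure and that no value inside it contains an apostrophe. field1_json is a hypothetical scratch field name.

```spl
index=publishing*
| eval field1_json=replace(field1, "'", "\"")
| spath input=field1_json path=totalWordCount output=totalWordCount
| table totalWordCount
```

If field1 is not extracted on its own, this will return nothing, and a rex against _raw is the more direct route.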

0 Karma

venky1544
Builder

Hi @kkosiur,

OK, that's because of the double quotes in my regex; your data has single quotes. I have changed it:

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)" |table totalWordCount

Try it now, and in case of an error, can you paste the complete search you are running? That would be helpful.

 

 

0 Karma

kkosiur
Loves-to-Learn Lots

Hi @venky1544, this search returned the same error message again. I am running:

index=publishing*

|rex max_match=0 field=_raw "\'totalWordCount\':(?<totalWordCount>\d+)"

|table totalWordCount
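
One thing to check in the search above: the sample in the original post has a space between the colon and the number ('totalWordCount': 44891), and this pattern has nothing to match it. A minimal sketch that tolerates optional whitespace and drops the backslashes before the single quotes (they should not need escaping inside a double-quoted rex pattern), assuming the text really appears in _raw in the form shown:

```spl
index=publishing*
| rex field=_raw "'totalWordCount':\s*(?<totalWordCount>\d+)"
| table totalWordCount
```

If field1 is already an extracted field, field=field1 can be used in place of field=_raw. Whether this also clears the usage error has not been tested here.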

0 Karma