Solved: Trying to create a new field called hashtag

sohampb · ‎03-25-2013

I am a novice experimenting with the free version of Splunk, and I have a Twitter feed in a text file. Part of it looks like this:

Name: The Last Word
Screen Name: TheLastWord
Text: .@lawrence anchors from LA tonight where it's in the 60s. In NYC, it's in the 30s and is supposed to snow. #luckyguy #lastword
Created At: Mon Mar 25 18:23:26 +0000 2013
Source: web
Id: 316254010745188352

(My version has no built-in twitter sourcetype, so I had to create a new sourcetype.)

Now I realize that the regex to extract hashtags is #[^#\s]*\s, but how do I get Splunk to create a new field called hashtag, so that I can report on top hashtags etc.?

Thanks!

btate · ‎03-25-2013

The examples on the rex doc might be useful: http://docs.splunk.com/Documentation/Splunk/5.0.2/SearchReference/Rex

Example 1 creates two new fields, 'from' and 'to'. You capture your matches with parentheses (as in normal regex) and name the captured field inside angle brackets, prefixed by a '?', within the capture parentheses, e.g. (?<from>...).
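
Applied to the hashtag question, that pattern might look like the sketch below. The field name hashtag and the character class are illustrative choices, not taken from the docs example; by default rex keeps only the first match in each event.

```
index=main sourcetype="twitter"
| rex "#(?<hashtag>[^#\s]+)"
| top hashtag
```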

sohampb · ‎03-25-2013

Thanks a lot, this solved it. As I said, I am a novice. I used index=main sourcetype="twitter" | rex "#[^#\s]*\s(?P<HASHTAG>[^ ]+)" | search HASHTAG="*" and it worked.

kristian_kolb · ‎03-25-2013

And if there is more than one hashtag per event?

Yes, you can do it in rex as well: add max_match=x to your rex statement, where x is the maximum number of matches to extract per event.
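
A minimal sketch of that, assuming the same index and sourcetype used earlier in the thread. Here max_match=0 asks rex for every match rather than a fixed number, and the result is a multivalue hashtag field; the regex itself is an illustrative choice.

```
index=main sourcetype="twitter"
| rex max_match=0 "#(?<hashtag>[^#\s]+)"
| top hashtag
```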

/k

kristian_kolb · ‎03-25-2013

To extract hashtag as a multivalue field, i.e. one where a single event can contain several occurrences of the same field name, you should do it through a REPORT field extraction. This is a configuration directive in props.conf that refers to a section of transforms.conf, like so:

props.conf

[twitter]
REPORT-get_tags = twitter_tags

transforms.conf

[twitter_tags]
REGEX = #(\S+)\s
FORMAT = hashtag::$1
MV_ADD = true
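
With an extraction like this in place, the field should be usable directly at search time, with no rex needed. A sketch, assuming the index=main used earlier in the thread:

```
index=main sourcetype="twitter" hashtag=*
| top limit=10 hashtag
```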

/k
