Hello,
Since I often search for a specific expression in a large set of events, I would like to index it.
Every single instance that I am running has the following format:
instance-name.generic-name.subdomain.domain.com
In this expression, only domain.com is static and will never change.
I would like to extract generic-name for all of my events. 
props.conf
[generic-name]
TRANSFORMS-generic-name = generic-name
transforms.conf
[generic-name]
REGEX = (?<instancename>[^\.]+)\.(?<gname>[^\.]+)\.(?<subdomain>[^\.]+)\.(?<domain>[^\.]+)\.
fields.conf
[gname]
INDEXED = True
I am wondering whether the reason I am not receiving anything in the Splunk dashboard is my configuration files or my regular expression.
Thank you in advance for your help.
Update: I have tried all of the following regexes and there is still no result. I don't receive any data in my sourcetype.
 
I've decided to add a totally separate answer here, since if I'm right, your regex is fine (it was just the markup bug we're dealing with now that confused everyone) but your transforms syntax is off.
Create an indexed field:
[extracted-gname]
REGEX =  whatevercomesbeforeit [^\.]+\.(?<gname>[^\.]+)\.[^\.]+\..+
FORMAT = gname::$1
[extracting-from-host]
SOURCE_KEY = MetaData:Host
REGEX =   [^\.]+\.(?<gname>[^\.]+)\.[^\.]+\..+
FORMAT = gname::$1
[indexed-gname]
REGEX =  whatevercomesbeforeit [^\.]+\.(?<gname>[^\.]+)\.[^\.]+\..+
FORMAT = gname::$1
WRITE_META = true
Okay, I am going to try this.
This is transforms.conf, right?
On props.conf I have:
[generic-name]
TRANSFORMS-generic-name = extracting-from-host
TRANSFORMS-generic-name = extracted-gname
TRANSFORMS-generic-name = indexed-gname
And fields.conf is still:
 [gname]
 INDEXED = True
Oh, yes, the field is not on _raw, it is on host. 
For example, I have those events:
5/6/15 
3:40:17.000 PM  
Script-name=SMTP-RELAY | Status = OK | Proc=Postfix is running | SMTP=Connection to SMTP port succeed
host = instancename3.generic-name.subdomain source = /opt/splunk/bin/scripts/smtp-relay.pl sourcetype = generic-name
5/6/15 
3:40:06.000 PM  
Script-name=SMTP-CHAIN| Status=KO | Description=Email not received from X instance
host = instancename2.generic-name.subdomain  source = /opt/splunk/bin/scripts/smtp-chain.pl sourcetype = generic-name
Oh, while pasting those events I noticed that the host looks like instancename.generic-name.subdomain; the domain.com part is no longer there (if we are extracting from this field). So the regex is a bit shorter:
[^\.]+\.(?<gname>[^\.]+)
And yes, the general idea would be that I have, like host and sourcetype, a field called gname on those events.
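To make the difference concrete, here is a minimal sketch (not Splunk itself, just Python's `re` module) that runs the original four-segment regex and the shortened one against a full name and against the three-segment host value. The `fqdn` string and the helper `extract_gname` are assumptions made up for illustration, and Python spells named groups `(?P<name>...)` rather than `(?<name>...)`.

```python
import re

# Python's re module spells named groups (?P<name>...); the patterns are
# otherwise the ones quoted in this thread.
FOUR_SEGMENTS = re.compile(
    r"(?P<instancename>[^\.]+)\.(?P<gname>[^\.]+)\."
    r"(?P<subdomain>[^\.]+)\.(?P<domain>[^\.]+)\."
)
SHORT = re.compile(r"[^\.]+\.(?P<gname>[^\.]+)")


def extract_gname(pattern, value):
    """Return the gname capture, or None when the pattern does not match."""
    match = pattern.search(value)
    return match.group("gname") if match else None


fqdn = "instancename3.generic-name.subdomain.domain.com"  # assumed full name
host = "instancename3.generic-name.subdomain"             # host value as displayed

print(extract_gname(FOUR_SEGMENTS, fqdn))  # generic-name
print(extract_gname(FOUR_SEGMENTS, host))  # None: the host has only two dots
print(extract_gname(SHORT, host))          # generic-name
```

The four-segment pattern needs four literal dots, so it can only match when the domain.com suffix is present; the shortened pattern only needs the first dot.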
 
Yes, if you want to extract the new field gname as an index-time extraction, you use TRANSFORMS in props.conf (rather than REPORT, which would be search time), and you follow my example for [indexed-gname].
Note the last bit: WRITE_META = true.
Here is the definition from the transforms.conf documentation:
    WRITE_META = [true|false]
    * NOTE: This attribute is only valid for index-time field extractions.
    * Automatically writes REGEX to metadata.
    * Required for all index-time field extractions except for those where DEST_KEY = _meta (see 
      the description of the DEST_KEY attribute, below)
    * Use instead of DEST_KEY = _meta.
    * Defaults to false.
That will do it. You have no need to mess with fields.conf.
That said, it is not recommended to REPLACE the host or source metadata fields.
By all means, make a new index-time field called gname.
But if you mess with the metadata fields that Splunk uses to tell you where things came from at index time, you'll be super sorry later if anything in there changes, like the instance name: all you will have is host=generic-name and no evidentiary history of where the data actually came from. I say this from watching it destroy a customer's ability to look at anything granularly, because all they had was the "name", which was also the source, the host, and the name of the app. A giant well-meaning mess.
Okay, I am starting to understand all of that, thank you!
Now I have data! But it is matching what is in _raw and not what is in fields such as host.
My regex looks good in regex101: [^.]+\.(?<gname>[^.]+)\.[^.]+
Event that is matched:
5/6/15
6:46:34.000 PM
Host=instancename.generic-name.subdomain.domain.com | Status=OK | Message=Connection to SMTP Port (25) succeeded
host = instancename.generic-name.subdomain source = /opt/splunk/etc/apps/myapp/bin/smtp.pl sourcetype =smtpchecker
I believe that it is matched due to Host=instancename.generic-name.subdomain.domain.com.
When I run what you advised:
sourcetype=smtpchecker | rex field=Host "[^.]+\.(?<gname>[^.]+)\.[^.]+" | head 10 | table Host
The result is:
Host
instancename.generic-name.subdomain.domain.com
But I get the same result with [^.]+\.(?<gname>[^.]+)
I am going to play around with the regex and try to match it. The problem might also be that it is not parsing the host field, even though SOURCE_KEY = MetaData:Host should be good.
 
To test, you want to do this (you grabbed the original field and not the new one):
sourcetype=smtpchecker | rex field=Host "[^.]+\.(?<gname>[^.]+)\.[^.]+" | head 10 | table Host gname
I traditionally use both, and I should have specified that earlier:
a column representing Host and a column representing gname, so you'll be able to see that you grabbed the stuff you want, right next to what you grabbed it from.
Then, because I'm paranoid that way, I would keep increasing head 10 to head 100 and so on, to make sure that the value of Host doesn't vary, such as losing the domain. But you can totally just grab the first part to avoid that: if the regex is anchored in the specific field, you don't have to give all that extra info.
Thank you, the regex is fine. It matches what I want when I run sourcetype=smtpchecker | rex field=Host "[^.]+\.(?<gname>[^.]+)\.[^.]+" | head 10 | table Host gname.
But in practice, I cannot search sourcetype=generic-name gname=*. The only results that I get are for this kind of event:
Host=instancename.toextract.subdomain.domain.com | Status=OK | Message=Connection to SMTP Port (25) succeeded|
host = instancename.toextract.subdomain source = /opt/splunk/etc/apps/myapp/bin/smtp-check.pl sourcetype = generic-name
But not for this event which is in the same sourcetype:
Script=SMTP-RELAY CHECK | Status = OK | Postfix=Postfix is running | SMTP=Connection to SMTP port succeed
host = instancename2.toextract.subdomain source = /opt/splunk/bin/scripts/smtprelay.pl sourcetype = generic-name
I think the reason is that the _raw data matches in the first case but not in the second.
Another example:
sourcetype=othersourcetype gname=* | stats count by gname
00 | Load_5_min=0   29
03 | Load_5_min=0   2
049041748046875e-05|Minimum=8   1
05 | Load_5_min=0   2
059906005859375e-06|
07 | Load_5_min=0
Those are not the gname I am looking for. 
They are coming from this kind of event:
Script=load-check | Status=Ok | Load_1_min=0.02 | Load_5_min=0.02 | Load_15_min=0.00
host = instancename.nametoextract.subodmain source = /opt/splunkforwarder/bin/scripts/load-check.shy sourcetype =othersourcetype
I think that MetaData:Host is not working well. I am actually beginning to despair...
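As a quick check of where those odd values could come from, here is a small sketch in Python's `re` module (an assumption for illustration; it is not how Splunk runs the transform) that applies the shortened regex from this thread to the raw text of the load-check event and to its host value. The helper name `gname_from` is made up. Python writes named groups as `(?P<name>...)`.

```python
import re

# Shortened regex from this thread, with Python's (?P<name>...) spelling.
GNAME = re.compile(r"[^.]+\.(?P<gname>[^.]+)")

# Raw event text and host value copied from the load-check example above.
raw = ("Script=load-check | Status=Ok | Load_1_min=0.02 | "
       "Load_5_min=0.02 | Load_15_min=0.00")
host = "instancename.nametoextract.subodmain"


def gname_from(value):
    """Return the gname capture for value, or None if nothing matches."""
    match = GNAME.search(value)
    return match.group("gname") if match else None


print(gname_from(raw))   # 02 | Load_5_min=0
print(gname_from(host))  # nametoextract
```

The value captured from the raw text has the same shape as the gname values in the stats output, which is consistent with the regex being applied to _raw rather than to the host. It does not prove that this is what the indexer is doing.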
 
Have you named two fields in the same sourcetype the same thing? I'm confused...
Oh no, my bad, I was confused myself. The field is different from the sourcetype. I have edited.
 
And now we know why your original regex wasn't working...
It is always a good idea to test inline with the rex command:
...|rex field=whatever "yourregex"|head 10|table yourfield
 
I would also like to backtrack from my comment on the "indexed_value" problem. If you are using index-time extractions ("TRANSFORMS-" or "EXTRACT-") then it cannot be the problem.
Is there a difference between TRANSFORMS and EXTRACT? I have been using TRANSFORMS so far, as in the documentation.
 
No, EXTRACT- is inline and TRANSFORMS- is split (and gives you more nuanced options such as MV_ADD, etc.)
 
I know. I'm questioning whether indexed extractions are the right tool for the job.
Set this in props.conf:
[your_sourcetype]
...
EXTRACT-gname = ^[^.]+\.(?<gname>[^.]+) in host
See if that works, and see if that selects the correct events (scanCount vs resultCount).
 
If that's good in terms of scanCount vs resultCount and you want to get rid of the ugly host=*.some-gname.* you can do this field extraction:
<some regex> in host
That'll extract your gname from the host field to let you search using gname=some-gname backed by the host field.
You are talking about search-time extraction, while I am asking about index time.
 
If you're trying to search on a part of the host you could do this:
index=foo sourcetype=generic-name host=*.some-gname.*
Should be pretty quick in terms of identifying the right events because host already is indexed. Loading the events is a different matter of course, so look at scanCount vs eventCount to check if your search is well-targeted or not.
 
You are missing some parts in your regex:
YOURS: (capturing group not capturing anything, just naming the field):
[^\.]+\.(?<gname>)[^\.]\.domain\.com
MINE: (capturing group now contains the generic segment):
[^\.]+\.(?<gname>[^\.]+)\.[^\.]+\..+
In case it's not clear, here is the segment zoomed in. Note the position of the closing paren; also, without the + the character class matches exactly once, not one or more times:
yours: (?<gname>)[^\.]
mine:  (?<gname>[^\.]+)
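The difference between the two patterns can be checked outside Splunk. The following is a minimal sketch using Python's `re` module, which spells named groups `(?P<name>...)`; the one-character hostname `short_host` is a hypothetical value invented only to force the first pattern to match.

```python
import re

# (?P<name>...) is Python's spelling of the (?<name>...) groups above.
YOURS = re.compile(r"[^\.]+\.(?P<gname>)[^\.]\.domain\.com")
MINE = re.compile(r"[^\.]+\.(?P<gname>[^\.]+)\.[^\.]+\..+")

host = "instancename.generic-name.subdomain.domain.com"
short_host = "a.b.c.domain.com"  # hypothetical: one-character segments

# The empty group followed by a single [^\.] cannot span "subdomain".
print(YOURS.search(host))                             # None
# When it does match, the named group captures nothing at all.
print(repr(YOURS.search(short_host).group("gname")))  # ''
print(MINE.search(host).group("gname"))               # generic-name
```

So even on a host where the first pattern matches, gname comes back empty, because the capturing group closes before the character class.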
Thank you for your answer!
I am still not receiving any result from your search. 
Actually I have also tried it on regexpr.com and you are matching everything with your regexp. 
Maybe I am missing something but it does not seem to work. 
 
Try regex101.com; it will show you what you are capturing and what you are not, and it will also walk you through the regex. You can see it working here:
https://regex101.com/r/zH0tS1/1
 
It is your REGEX; try this one:
(?<instancename>[^\.]+)\.(?<gname>[^\.]+)\.(?<subdomain>[^\.]+)\.(?<domain>[^\.]+)
