Data cloning - is it an exact clone?

MoniGreeth
Engager

Hi All,

The Splunk documentation says the following:

"Data cloning
To perform data cloning, specify multiple target groups, each in its own stanza. In data cloning, the forwarder sends copies of all its events to the receivers in two or more target groups. Data cloning usually results in similar, but not necessarily exact, copies of data on the receiving indexers."

As you can see in the last sentence of that quote, the copies are not necessarily exact, which is bugging me. Will the UF forward an exact copy of the events to each configured indexer, or will there be some difference?

For example, say we have two indexers, IDX1 and IDX2, configured for cloning on the UF. If we get 1000 events, will IDX1 and IDX2 each have 1000 events, or might there be some difference?
If anyone can share some information to clarify this, I would appreciate it.

Thanks
Moni

1 Solution

yannK
Splunk Employee

With cloning, the same stream of events is sent to two indexers, so the copies "should" be identical.

However, because the events are parsed on the indexers, each indexer may have different rules in props.conf, or a different timezone, and so may parse the events differently (for example, different timestamp detection, extra parsing rules, or different line breaking).
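To illustrate the point about parsing rules, here is a minimal sketch of a props.conf stanza that could be deployed unchanged to both indexers so that they break lines and extract timestamps the same way. The sourcetype name mytest is borrowed from the sample configuration in this thread. The values are assumptions for a syslog-style file with timestamps like "Mar  5 12:34:56", not a recommendation for your data.

```ini
# props.conf (same file on every indexer that receives the cloned stream)
# Illustrative values only; adjust to the actual log format.
[mytest]
# Pin the timezone so it does not depend on each indexer's local setting.
TZ = UTC
# Explicit timestamp rules, so neither indexer falls back to auto-detection.
TIME_PREFIX = ^
TIME_FORMAT = %b %d %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 15
# Explicit line breaking: one event per line, no line merging.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
```

If the two indexers carry different stanzas for the same sourcetype, the same raw stream can end up with different event boundaries or timestamps, which is one way the copies stop being exact.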

If you want identically formatted data, use the cluster replication feature. The copy is made at the "bucket" level after indexing, so the copies will be identical.

A remark on cloning: if you have outputs.conf settings such as blockOnCloning=false and dropClonedEventsOnQueueFull, then if one of the indexers is not reachable, the forwarder can decide not to wait for it and send only to the remaining one (for the high-availability use case, where you want the data even when a node is dead). In that case the unreachable indexer can end up with fewer events than the other.
http://docs.splunk.com/Documentation/Splunk/6.1.3/Admin/Outputsconf
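As a rough sketch of where those two settings sit, the outputs.conf below clones to two target groups and lets the forwarder keep going when one group is unavailable. The group names and server addresses are reused from the sample configuration in this thread. The value 5 and the placement under the global [tcpout] stanza are assumptions, so check the outputs.conf spec for your Splunk version before relying on them.

```ini
# outputs.conf on the forwarder (illustrative sketch, not a tested configuration)
[tcpout]
# Listing more than one target group here clones every event to each group.
defaultGroup = tcpout_group1,tcpout_group2
# false: do not block the pipeline waiting on the cloned groups.
blockOnCloning = false
# Seconds to wait on a full queue for a group before dropping new events
# for that group (assumed value).
dropClonedEventsOnQueueFull = 5

[tcpout:tcpout_group1]
server = 10.1.1.197:9997,10.1.1.198:9997

[tcpout:tcpout_group2]
server = myhost1.splunk.com:9997,myhost2.splunk.com:9997
```

With settings like these, a group that stays unreachable can miss events, so the two copies may differ in event count as well as in parsing.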

the_wolverine
Champion

Sample configuration?

outputs.conf:
[tcpout:tcpout_group1]
server=10.1.1.197:9997,10.1.1.198:9997
autoLB=true

[tcpout:tcpout_group2]
server=myhost1.splunk.com:9997,myhost2.splunk.com:9997
autoLB=true

inputs.conf:
[monitor:///var/log/messages]
index=test
sourcetype=mytest
_TCP_ROUTING = tcpout_group1,tcpout_group2


sanjeevdixit
Explorer

Hi,
Did you get any answer to this question?
