Two BizTalk Servers in a group: slow performance

This is going to be a long question, so please bear with me.

Scenario

I have 5-6 applications on BizTalk Server 2010 (Enterprise edition) using the following adapters:

  1. AS2
  2. WCF-Custom
  3. HTTP
  4. nSoftware FTP
  5. WCF-SQL (Ordered)
  6. POP3
  7. SMTP
  8. File

Environment

  1. 2 BizTalk nodes (Active/Active) connected to the same group.
  2. 2 SQL nodes (clustered).
  3. 3 hosts (REC, Process and Send), with host instances on both BizTalk nodes.

Issue

Running both nodes at the same time slows processing down drastically, whereas with only one node active at a time messages are processed very quickly.

Observation

So far I have figured out a few things (correct me if I am wrong):

  1. Several adapters are not safe to run in multiple host instances. These include (there may be others) the POP3, FTP, MSMQ/MSMQT and database adapters in polling scenarios.
  2. The SSO Master Secret Server must be clustered, because if the first node (the Master Secret Server by default) goes down, the other node works from its cached copy of the secret only until it is restarted.

Question

What is the best way to design this? Is it a must to have some BizTalk components clustered?

Thanks


June 7th, 2013 1:57pm

You do not mention the configuration of your BizTalk Servers and/or SQL Servers (number of CPUs, number of cores, memory). The typical design strategy I follow for processing hosts is to have a host for send/receive/processing PER APPLICATION or, from an optimization standpoint, at least one host for each adapter type (this provides better isolation).

Having multiple active host instances for receive locations is not recommended because of access conflicts and/or message duplication issues (an instance that is configured but disabled is OK).

SSO should be clustered (and the ideal location for this is the SQL cluster).

I'm not able to comprehend why/how (in an active/active BizTalk front-end configuration) you enable the clustered host on BOTH nodes. And when you say performance slows down dramatically, have you monitored performance (in terms of CPU, memory, paging and associated SQL utilization)?

Regards.

June 7th, 2013 2:57pm

Can you tell exactly which process slows down when both hosts are enabled? Are you seeing errors or warnings in the event logs?

I'll assume the BizTalk hosts are not clustered (i.e., Windows clustering) right now. If that's the case, your environment is pretty close to a recommended configuration.

1. Correct, those adapters are not safe to run in multiple host instances in certain circumstances, such as Ordered Delivery, or depending on the behavior of the target system, such as non-locking FTP servers. Those are the only reasons you would cluster a BizTalk host.

2. Correct, the Master Secret Server has to be clustered for a fully HA setup. The preferred method is to cluster the MSS with the SSODB. This is allowed by the BizTalk Enterprise license. See http://msdn.microsoft.com/en-us/library/aa561823.aspx

June 7th, 2013 3:21pm

Many thanks for the reply, Skankycheil.

Here is my response.

Configuration of the BizTalk Servers

Node A and B

CPU: AMD 6176 SE 2.3 GHz (2 processors), dual core.

Memory: 8 GB.

BizTalk Server 2010, SQL Server 2008 R2, Windows Server 2008 R2

"Typical design strategy I follow for processing hosts is to have a processing host for send/receive/processing PER APPLICATION"

Are you saying that if I have 5 applications then I should have

REC_HOST_1, Process_Host_1, Send_Host_1
REC_HOST_2, Process_Host_2, Send_Host_2
REC_HOST_3, Process_Host_3, Send_Host_3
REC_HOST_4, Process_Host_4, Send_Host_4
REC_HOST_5, Process_Host_5, Send_Host_5

and then host instances of the above hosts on both servers?

"I'm not able to comprehend why/how (in an active/active BizTalk front-end configuration) you enable the clustered host on BOTH the nodes"

I'll try to explain what I have done.

I installed BizTalk on node A and then installed BizTalk on node B.

During configuration of node A, I configured everything to point to my SQL server.

During configuration of node B, I joined the existing group.

I created host instances for both node A and node B.

When I enable all host instances on node A and node B, processing is very slow, but if only one node's host instances (either A or B) are active, performance is very fast.

I have not done anything else in the BizTalk environment other than the actions mentioned above (no clustering of hosts etc.).

"And when you say performance slows down dramatically, have you monitored performance (in terms of CPU, memory, paging and associated SQL utilization)?"

The resource utilization on all servers was normal, but I'll give it another go to confirm this.

June 7th, 2013 3:26pm

Thanks for the reply, boatseller.

"Can you tell exactly which process slows down when enabling both hosts? Are you seeing errors or warnings in the event logs?"

I was not able to determine this; the instances remain active in the admin console. I didn't see any warnings or errors in the admin console.

"I'll assume the BizTalk hosts are not clustered (i.e., Windows clustering) right now. If that's the case, your environment is pretty close to a recommended configuration."

Yes, the hosts are not clustered.

Thanks

June 7th, 2013 3:51pm

I do not see anything wrong at all with your BizTalk host machine configurations. What about your SQL Servers (do they have similar configurations, or do you have TWO servers clustered for SQL with BizTalk installed on the same machines)?

Also, you say resource utilization on all servers was normal. Then what leads you to deduce that performance is slow? If there is performance degradation, it should show up in some measurable parameter such as IO, memory or CPU, either on the BizTalk or on the SQL servers. For example, undersized SQL nodes would cause performance bottlenecks, which would in turn lead to high CPU utilization on BizTalk (as most of the time you'd be in a resource wait state). The bottleneck on SQL could be disk IO if you've put all databases on the same disk, and so on.

Regards.

June 7th, 2013 4:06pm

Orchestration instances? You could be experiencing a flood condition, with throttling kicking in. As in, the system can keep up with one Receive but not two.

Check these counters: http://msdn.microsoft.com/en-us/library/aa578302.aspx
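
A minimal sketch of how the throttling counters could be sampled with the Windows `typeperf` tool while the 90-file test runs. It assumes the counters live under the "BizTalk:Message Agent" performance object with one instance per BizTalk host; the host names, sampling interval and output file name are illustrative assumptions, not values confirmed in this thread.

```python
from typing import List

# Counter names under the "BizTalk:Message Agent" performance object.
THROTTLING_COUNTERS = [
    "Message delivery throttling state",
    "Message publishing throttling state",
]


def typeperf_args(host_names: List[str], interval_s: int = 5,
                  samples: int = 60, out_csv: str = "throttling.csv") -> List[str]:
    """Build a typeperf command line that samples throttling state per host."""
    counters = [
        f"\\BizTalk:Message Agent({host})\\{name}"
        for host in host_names
        for name in THROTTLING_COUNTERS
    ]
    # -si: seconds between samples, -sc: number of samples,
    # -f/-o: output format and file, -y: answer yes to overwrite prompts.
    return ["typeperf", *counters, "-si", str(interval_s),
            "-sc", str(samples), "-f", "CSV", "-o", out_csv, "-y"]


if __name__ == "__main__":
    # Host names taken from this thread; adjust to the real host names.
    args = typeperf_args(["REC_HOST_1", "Process_Host_1", "Send_Host_1"])
    print(" ".join(f'"{a}"' if " " in a else a for a in args))
    # On a BizTalk node the command could then be run with:
    #   import subprocess; subprocess.run(args, check=True)
```

A throttling state that stays at 0 for every host during the slow run would suggest that throttling is not the cause; a nonzero value points at the throttling condition to investigate.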

June 7th, 2013 4:07pm

The SQL Servers are clustered.

"Also, you say resource utilization on all servers was normal. Then what leads you to deduce that performance is slow?"

Some time ago we got a flood of 200-300 files, a few KB each.

The task involved was to read the file, update SQL records in 3 tables and send an email.

We had 500-600 instances running and they were being processed one by one. As soon as I turned off the host instances on node A, all instances were processed in a few seconds. I initially thought it was a problem with node A, but the next time this happened I switched node B off instead, and this time node A processed all instances in a few seconds.

"If there is performance degradation, it should show up in some measurable parameter such as IO, memory or CPU, either on the BizTalk or on the SQL servers."

I'll run a small test and try to capture SQL usage.

"The bottleneck on SQL could be disk IO if you've put all databases on the same disk, and so on."

I checked the SQL server and we have separate drives for the databases:

1) SQLDB (all BizTalk and internal DB MDF files)
2) SQLLOG (all LDF files)
3) SQLTEMPDB (tempdb MDF and LDF files)

June 7th, 2013 4:25pm

Yes, boatseller.

I had this issue initially, but after creating different hosts for Send, Rec and Process we had better utilisation and throttling never kicked in.

But we still had the performance issue I just mentioned above.

May I know what you mean by this?

"As in, the system can keep up with one Receive but not two."

June 7th, 2013 4:30pm

So, as boatseller mentioned, you should now try with the REC host enabled on only ONE node. Your processing and send hosts can run on both nodes.

The reason for the slowdown could be that the FILE receive adapter is a locking adapter: when two nodes simultaneously try to pick up the same file, one of them gets the lock and the other backs off and retries. With REC running on BOTH nodes, if you cut your FILE receive adapter BATCH SIZE to 1, you should also see a change.
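
The contention described above can be illustrated with a small simulation. This is only an analogy, not the FILE adapter's actual implementation: two threads stand in for the receive hosts on node A and node B, and "claiming" a file is modelled as an atomic rename, so exactly one node wins each file and the other records a lost race. The function names and the `.claimed-by-` suffix are made up for the sketch.

```python
import os
import tempfile
import threading
from collections import Counter


def poll_and_claim(node: str, inbox: str, stats: Counter, lock: threading.Lock) -> None:
    """Try to claim every unclaimed file in inbox; count wins and lost races."""
    for name in sorted(os.listdir(inbox)):
        if ".claimed-by-" in name:
            continue  # the other node already took this one
        src = os.path.join(inbox, name)
        dst = os.path.join(inbox, f"{name}.claimed-by-{node}")
        try:
            os.rename(src, dst)  # only one node can rename the original
            outcome = "claimed"
        except FileNotFoundError:
            outcome = "lost_race"  # the other node got there first
        with lock:
            stats[(node, outcome)] += 1


def simulate(file_count: int = 92) -> Counter:
    """Drop file_count small files and let two 'nodes' compete for them."""
    stats: Counter = Counter()
    lock = threading.Lock()
    with tempfile.TemporaryDirectory() as inbox:
        for i in range(file_count):
            with open(os.path.join(inbox, f"msg{i:03}.txt"), "w") as handle:
                handle.write("x")
        threads = [
            threading.Thread(target=poll_and_claim, args=(node, inbox, stats, lock))
            for node in ("NodeA", "NodeB")
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return stats


if __name__ == "__main__":
    for key, count in sorted(simulate().items()):
        print(key, count)
```

Every lost race is wasted work on one node. Whether that overhead is large enough to explain the slowdown seen here is exactly what the single-receive-host test is meant to show.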

Regards.

June 7th, 2013 4:57pm

OK, I just tested with 92 files for one application.

These are the tasks performed by this app:

1) Get the file from a network drive (Rec_Host)
2) Validate the status of the file against SQL (Send_Host)
3) Send the file via AS2 (HTTP Send_Host)
4) Update SQL tables (Send_Host)

REC_HOST_1, Process_Host_1, Send_Host_1 running on node A

Process_Host_2, Send_Host_2 running on node B

When I process these files from a single node (A), everything is processed in approx. 20 seconds.

When I enable both nodes, processing slows down after 60-70 files, and I had 28 instances that were stuck and were being processed one at a time, each taking approx. 8-10 seconds.

All 28 stuck instances were performing steps 3 and 4.

While these 28 instances were being processed, CPU and memory usage was normal on all servers.

All these files are 3 KB flat files.

Now what could be the cause of this slowdown?


June 7th, 2013 5:45pm

So it sounds like the problem is on the Send side then? Steps 3 and 4.

You could very well be flooding the target systems. For 3, the service on their end might not be able to cope with so many simultaneous connections. For 4, locking is a common culprit.

Either way, it's easy to test. On those Send Ports, check the Ordered Delivery box. This limits the Send Port to a single instance*, which performs each operation synchronously. If that improves performance, then it would follow that target flooding is the problem.

We can then offer some more advice.

*You can leave both hosts running; the routing engine will just pick one. There's really no way to influence it, but it shouldn't matter.

June 7th, 2013 6:34pm

What happens when you run with

Node A - Rec 1

Node B - Processing 2 and Send 2

If the issue is on the Send side and you're indeed flooding the destination systems, then instead of one receive, try ONLY ONE Send.

Regards.

June 8th, 2013 10:28am

Thanks Sankycheil

I'll try this first thing Monday morning.

Cheers

June 9th, 2013 10:59am

Just to note, simply running only one Send host will not prevent overloading the target systems, because those adapters will still attempt multiple connections internally.

Ordered Delivery is an engine-level option which limits any Send operation to 1 batch of 1 message at a time.
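
To illustrate why the Ordered Delivery test is a useful diagnostic, here is a small sketch (not BizTalk code) that measures how many sends are in flight at once against a stand-in target. With a limit of one, the target never sees more than one call at a time, which is the effect described above. The function name and the sleep-based fake send are assumptions for the illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(message_count: int, max_in_flight: int) -> int:
    """Send fake messages with a concurrency cap; return the observed peak."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def send(_message: int) -> None:
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)  # stand-in for the call to the target system
        with lock:
            in_flight -= 1

    # The pool size plays the role of the number of simultaneous send operations.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        list(pool.map(send, range(message_count)))
    return peak


if __name__ == "__main__":
    print("serialized sends, peak in flight:", peak_concurrency(28, 1))
    print("parallel sends, peak in flight:", peak_concurrency(28, 8))
```

If the target copes better with the serialized pattern than with the parallel one, flooding is the likely culprit; if throughput is unchanged, the bottleneck is probably elsewhere.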

June 9th, 2013 6:42pm

Thanks for the replies, guys.

I just tried various combinations of enabled and disabled hosts, and it appears that processing time is lowest when only one host of each type is active at a time.

What I mean is: Rec 1 or Rec 2, Process 1 or Process 2, Send 1 or Send 2, on any node (node A, node B or both).

So, for example:

REC1 + Process 1 + Send 1
REC2 + Process 1 + Send 1
REC1 + Process 1 + Send 1
REC2 + Process 2 + Send 1
REC1 + Process 1 + Send 2
REC2 + Process 1 + Send 2

etc.

If more than one host of the same type (REC, Process or Send) is enabled, then it slows down.


June 10th, 2013 1:03pm

To investigate this scenario, it is best to create host instances per adapter handler type (for example, for SQL, create HostInstance-SQL).

From NLB, drain one server and keep the other active. Then enable one application, keep the other applications stopped, and check the time spans for messaging and processing using Tracked Service Instances and Tracked Message Events. Run the same test with both servers active. That way you can find out where the process slows down and which messages take longer to process.

Use the IIS log to see when each web request was received and its processing time, and SQL Profiler for the exact time each SQL request arrived.

Are any extra calls made to the database when both servers are active? To check this, you can use performance counters in perfmon.exe to see how many persistence points are created per process/application run with one versus two active nodes.

June 10th, 2013 3:53pm

Hi!

First of all, do not create one dedicated host per task per application; that is not recommended and not best practice. I recommend splitting up the work according to what is being done and creating new hosts when needed.

I see you state you have 3 hosts, for sending, receiving and processing. What about tracking?

You should always have one tracking host for each MessageBox, with redundancy.

So that we can assist you further, can you please run the MessageBox Viewer and share the errors it reports in this thread?

http://blogs.technet.com/b/jpierauc/archive/2007/12/18/msgboxviewer.aspx

Best regards

Tord Glad Nordahl
Bouvet ASA, Norway
http://www.BizTalkAdmin.com | @tordeman

Please indicate Mark as Answer if this post has answered the question.

June 10th, 2013 9:47pm

Thanks for the reply, Tord.

I have the BizTalkServerApplication host, and I have enabled tracking for this host.

All the associated SQL jobs are running as expected to maintain the MessageBox.

Apologies, I didn't mention that, as I was concentrating only on my application and hosts.

I used MessageBox Viewer a few weeks back and didn't find any major issue, but I will certainly run it tomorrow morning and post the results here.

A question: do I have to configure any settings in Windows Network Load Balancing?

Also, we make extensive use of a network shared drive for fetching and archiving files. Can that be a performance bottleneck? If so, why does it work perfectly fine from a single node?


June 11th, 2013 12:30am

There are a bunch of potential bottlenecks in BizTalk. I wonder if you are in a throttling state and that is the reason for the delay in your environment.

If you use the file adapter extensively against file shares, be aware of Event ID 5649, caused by network retry exhaustion.

You can also set up Performance Monitor to monitor your environment for a week (or during problems, delays etc.) and parse that log file through PAL to get an even better idea of how your environment is performing. It is close to impossible to pinpoint your problem given the lack of information.

When it comes to NLB, be aware that BizTalk in a multi-server environment uses its built-in load balancing, so an extra NLB is not needed; however, for IIS and applications using it, putting an NLB in front may be recommended.

When it comes to slow performance, around 70% of cases are related to poor or slow-performing disks.

You may want to ensure that the pagefile and the TEMP and TMP folders are not on the Windows drive.

You should also ensure that the BizTalk databases (especially the MessageBox) aren't too big. There isn't really a defined size limit; it depends on the hardware and general performance of your environment.

I wrote a white paper a couple of weeks ago on proactive management of BizTalk. You may want to read it to get more in-depth knowledge of how BizTalk works and how to ensure your environment is performing as it should. It can be downloaded here: http://go.biztalk360.com/biztalk-server-proactive-management/

Best regards

Tord Glad Nordahl
Bouvet ASA, Norway
http://www.BizTalkAdmin.com | @tordeman

Please indicate Mark as Answer if this post has answered the question.

June 11th, 2013 1:15pm

+ NON CRITICAL WARNINGS - Need to pay attention to, or just suggestions/recommendations

Columns: Item Caption | Item Value | URLs | Rule ID

General
  Errors during Collect | 23 - Check the STATUS Log file produced or error messages reported above in this HTML file ! | - | -

BizTalk Databases - General
  SSO DB | Not Clustered (this DB is critical as it keep encrypted ports properties) ! | How to cluster SSO | 35
  LOG Db Growth for BizTalkMgmtDb | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=10240 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for BizTalkMsgBoxDb | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=102400 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for BizTalkDTADb | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=102400 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for BAMPrimaryImport | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=10240 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for BAMArchive | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=10240 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for BizTalkRuleEngineDb | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=1024 KB ?! | BizTalk Server Database Optimization | 39
  LOG Db Growth for SSODB | Not recomended to use percentage (Current =10%,Def=10%) - Recommended for this Db=1024 KB ?! | BizTalk Server Database Optimization | 39

BizTalk Databases - Files
  LDF files Location for BizTalkDTADb and BizTalkMsgBoxDb | Same Drive (can cause disk contention) ! | BizTalk Server Performance Characteristics, BizTalk Operation Guide - p101 | 44
  MDF files Location for BizTalkDTADb and BizTalkMsgBoxDb | Same Drive (can cause disk contention) ! | BizTalk Server Performance Characteristics, BizTalk Operation Guide - p101 | 43

Throttling Settings
  Some Settings were changed | see the corresponding Query Report to see the changes ! | - | 58

BizTalk Hosts
  Total Host Instances Running | 0 ! | - | 67
  Host "BizTalkServerApplication" | Run Receive Location+Send Port+Orchestration (Recommended to dedicate this host to run either ReceiveLocations, Orchestrations, or SendPorts) ! | BizTalk Performances WhitePaper, BizTalk Operation Guide - p109 | 104
  Host "BizTalkServerInbound_App1" | Run Receive Location+Send Port+Orchestration (Recommended to dedicate this host to run either ReceiveLocations, Orchestrations, or SendPorts) ! | BizTalk Performances WhitePaper, BizTalk Operation Guide - p109 | 104
  Host "BizTalkServerInbound_App2" | Run Receive Location+Send Port+Orchestration (Recommended to dedicate this host to run either ReceiveLocations, Orchestrations, or SendPorts) ! | BizTalk Performances WhitePaper, BizTalk Operation Guide - p109 | 104
  Dedicated Tracking host(s) | No (better to have some host(s) dedicated ONLY for tracking) ! | BizTalk Performances WhitePaper, BizTalk Operation Guide - p60 | 105

Adapters
  Adapter nsoftware.FTP v3 | Custom or Third-party adapter ! | - | 73
  Adapter nsoftware.OFTP v3 | Custom or Third-party adapter ! | - | 73
  Adapter nsoftware.SFTP v3 | Custom or Third-party adapter ! | - | 73
  Non WCF SQL adapter used in some Receive Locations | Prefer to use the WCF one which is more performant ! | Download BizTalk Adapter Pack 2.0, Use Stored procs with the WCF SQL adapter | 83

Ports, Pipelines & Orchs
  Total Receive Locations Disabled | 4 ! | - | 79
  MsgBody Tracking for some ReceivePorts (5 RLs impacted) | Yes (MsgBody can accumulate in MsgBox db if 'TrackedMessages_Copy' job is not running to send them to DTA) ! | How to Copy Tracked Messages into DTA | 77
  Total Send Ports Stopped | 4 ! | - | 94
  Tracking Events for ALL Orchs | Yes (if Orchs are complex and numerous, it can impact seriously performance and cause quick DTA DB increase) ! | - | 100
  "Start and End Shapes" tracking events for Orch. | Enabled. Not recommended in Production as they can impact perfomances. ?! | - | 395

June 11th, 2013 1:39pm

Columns: Item Caption | Item Value | URLs | Rule ID

DTA Tables
  DTA Orphaned Instances (Incompleted Instances in DTA but not in Msgbox) | 65 (Large number can impact DTA Size and so perfs) - one possible cause is described in KB 978796 - Contact Microsoft CSS for more info ! | Get More Info on 'TERMINATOR' Tool to clean or repair this issue | 110

BizTalk Jobs
  BizTalk Job 'MessageBox_Message_Cleanup_BizTalkMsgBoxDb' | Should be Disabled (Job executed in a loop by 'MessageBox_Message_ManageRefCountLog' to clean proceeded msg in MsgBox Db - MsgBox Db can grow if not running) ! ! | Jobs Description | 118

Other Checks
  Server A | Running in VMware Virtual Platform (Supported on ESX >=3.5) ! | Support policy for Microsoft software running in non-Microsoft hardware virtualization software, BizTalk supportability on a virtual machine, Support Policy Wizard | 173
  Server B | Running in VMware Virtual Platform (Supported on ESX >=3.5) ! | Support policy for Microsoft software running in non-Microsoft hardware virtualization software, BizTalk supportability on a virtual machine, Support Policy Wizard | 173
  Server A | Running in VMware Virtual Platform (Supported on ESX >=3.5) ! | Support policy for Microsoft software running in non-Microsoft hardware virtualization software, BizTalk supportability on a virtual machine, Support Policy Wizard | 173
  Special TCPIP Settings | SynAttackProtect enabled on SQL Server UKESIRTISQL03 - can generate General Network errors ! | 'General Network' errors, BizTalk Operation Guide - p88 | 210

Tunning
  'maxconnection' property | Is not present in some BizTalk process config files - You can configure the number of concurrent connections that the SOAP adapter opens for a particular destination server by adding "maxconnection" entry ! | SOAP Adapter Configuration and Tuning Parameters | 404

EventLog
  MsgBox | Communication problems with MsgBox found - Check the Network, SQL or MSDTC ! | - | 256

SQL Servers
  Current Error log on A | Error(s) found in SQL Error log - Check if these issues were fixed ! | - | 378

June 11th, 2013 1:39pm

MsgBoxViewer report (in parts because of the size limit on a single post)

Note

1) CU5 is already installed in the BizTalk environments.

2) I used the BizTalk Terminator tool and executed the option "Repair Refcounts for All Messages".

WARNINGS REPORT (See "How to maintain and troubleshoot BizTalk Server databases") - CRITICAL WARNINGS - Need to be fixed asap

Columns: Item Caption | Item Value | URLs | Rule ID

BizTalk Jobs
  BizTalk Job 'Monitor BizTalk Server (BizTalkMgmtDb)' | Failed (Job to find known issues in all MessageBox and DTA) ! !! | Jobs Description | 118
  'Monitor BizTalk Server' Job | This job failed - maybe a known issue if CU 1 for BizTalk 2010 is not installed !! | "Monitor BizTalk Server" SQL Server Agent job might not work after you add seven or more BizTalk hosts to a BizTalk group | 117

June 11th, 2013 1:41pm

I see the following issues that IMHO need addressing as a priority:

  1. Special TCPIP Settings
     SynAttackProtect enabled on SQL Server UKESIRTISQL03 - can generate General Network errors !
  2. MsgBox
     Communication problems with MsgBox found - Check the Network, SQL or MSDTC !
  3. Current Error log on UKESIRTISQL03
     Error(s) found in SQL Error log - Check if these issues were fixed !

It is possible that, because of SynAttackProtect on SQL, connections are being refused, and that this is causing MSDTC transaction timeouts and failures when more than ONE front-end is connecting and functioning.

Regards.

June 11th, 2013 2:54pm

It actually seems like SynAttackProtect is the reason why performance decreases when two hosts are running: the SQL machine thinks you're DDoSing it.

Disable that, then verify MSDTC; you can do this by running DTCTester and DTCPing.

Also check the event logs for any errors.

Best regards

Tord Glad Nordahl
Bouvet ASA, Norway
http://www.BizTalkAdmin.com | @tordeman

Please indicate Mark as Answer if this post has answered the question.

June 11th, 2013 4:11pm

Thanks Tord and Shanky for the replies.

I have disabled SynAttackProtect in the registry on the SQL Server only and am now working on DTCPing.

I can DTC-ping BizTalk node A from SQL, but not node B.

I cannot ping SQL from either node A or node B.

I'm having the classic RPC issue.

I have enabled all the Network DTC settings, and there is no firewall between these servers.

Error(0x6D9) at dtcping.cpp @303
-->RPC pinging exception
-->1753(There are no more endpoints available from the endpoint mapper.)
RPC test failed

June 12th, 2013 12:06pm

So now the source of your issue is the MSDTC connectivity between node B and SQL. Did you set up node B by cloning node A (or did you do a fresh installation from the OS upwards)?

I have seen MSDTC issues crop up because people set up system A, take a mirrored disk and create system B by renaming the host. This does not change the base system GUID (the correct way is to SYSPREP), and when these cloned systems join the domain and MSDTC is set up, the identical base system GUID causes such issues.

Regards.

June 12th, 2013 12:41pm

Hi!

It seems like there may be some RPC or MSDTC problems. Please verify that MSDTC is not blocked by the firewall; you may also want to take a look at updating the registry. Ensure that all machines have a unique CID for MSDTC.

Remember to also add an exception for port 135 in the firewall. I've seen this exact case, where the firewall was disabled on all machines but for some reason port 135 was blocked on my SQL machine, making BizTalk unusable.
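
A quick way to confirm whether port 135 (the RPC endpoint mapper) is reachable from each machine is a plain TCP connect test. The sketch below uses only the Python standard library; the server names in the usage comment are placeholders for the nodes in this thread. Note that MSDTC also uses dynamically assigned RPC ports, so a successful check here is necessary but not sufficient.

```python
import socket


def tcp_port_open(host: str, port: int = 135, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage, run from each BizTalk node and from the SQL node
# (the server names are placeholders):
#   for server in ("NodeA", "NodeB", "NodeC"):
#       print(server, tcp_port_open(server))
```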

Also ensure that MSDTC is configured correctly; you can take a look at the following TechNet Wiki article.

Best regards

Tord Glad Nordahl
Bouvet ASA, Norway
http://www.BizTalkAdmin.com | @tordeman

Please indicate Mark as Answer if this post has answered the question.

June 12th, 2013 1:44pm

Hello,

The entire day was spent trying to sort out this MSDTC and RPC issue, but no luck.

Here is the scenario:

BizNode A and BizNode B have BizTalk Server.

SQLNode C has SQL Server.

The DTC test is successful only when I ping from node C to node A.

In any other direction, it does not work.

I tried all the options in this article:

http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q306843&wa=wsignin1.0

When I updated the RPC registry settings for dynamic ports, the DTC service failed to start after a reboot, with the following error:

Windows could not start the Distributed Transaction Coordinator on Local Computer. For more information, review the System Event Log. If this is a non-Microsoft service, contact the service vendor, and refer to service-specific error code -1073737669.

Only after I deleted the RPC registry settings did the DTC service come back up.

All the servers can resolve each other; I verified that by pinging each one by name.

All the firewalls between the servers are disabled.


I checked the HKEY_CLASSES_ROOT -> CID -> MSDTC key.

Both BizTalk nodes A and B had the same GUID, so I changed the GUID of B.

I still cannot ping from the SQL node. When I tried changing the GUID of node A, RPC from SQL node C, which was working before, failed, so I reverted the change.

I guess I have tried everything that I can think of...
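
Comparing CIDs by eye across several machines is error-prone, so here is a small sketch that flags any GUID appearing on more than one node. It assumes the GUIDs have already been collected by hand on each machine (for example from the output of `reg query HKCR\CID`); the function name is illustrative and the GUIDs in the usage comment are made up.

```python
from collections import defaultdict
from typing import Dict, List


def find_shared_cids(cids_by_node: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return {cid: [nodes]} for every CID that appears on more than one node."""
    nodes_by_cid = defaultdict(set)
    for node, cids in cids_by_node.items():
        for cid in cids:
            # Normalise so "{ABC...}" and "abc..." compare equal.
            nodes_by_cid[cid.strip("{}").lower()].add(node)
    return {
        cid: sorted(nodes)
        for cid, nodes in nodes_by_cid.items()
        if len(nodes) > 1
    }


# Example usage with made-up GUIDs:
#   find_shared_cids({
#       "NodeA": ["{11111111-2222-3333-4444-555555555555}"],
#       "NodeB": ["{11111111-2222-3333-4444-555555555555}"],
#       "NodeC": ["{99999999-8888-7777-6666-555555555555}"],
#   })
# An empty result means every node has unique CIDs.
```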

June 12th, 2013 6:56pm

Well, I restarted the SQL server and tried again, and I guess I have fixed it.

Here are the DTCPing results:

From Node B -> C

Invoking RPC method on Node C
RPC test is successful
++++++++++++RPC test completed+++++++++++++++
Please start PING from Node C to complete the test
++++++++++++Start DTC Binding Test +++++++++++++
Trying Bind to Node C
Binding success: Node B-->Node C
++++++++++++DTC Binding Test END+++++++++++++


From Node A -> C


Invoking RPC method on Node C
RPC test is successful
++++++++++++RPC test completed+++++++++++++++
++++++++++++Start DTC Binding Test +++++++++++++
Trying Bind to Node C
Binding success: Node A-->Node C
++++++++++++DTC Binding Test END+++++++++++++


From Node C -> A
Invoking RPC method on Node A
RPC test is successful
++++++++++++RPC test completed+++++++++++++++
Please start PING from Node A to complete the test
++++++++++++Start DTC Binding Test +++++++++++++
Trying Bind to Node A
Binding success: Node C-->Node A
++++++++++++DTC Binding Test END+++++++++++++


From Node C -> B

Invoking RPC method on Node B
RPC test is successful
++++++++++++RPC test completed+++++++++++++++
++++++++++++Start DTC Binding Test +++++++++++++
Trying Bind to Node B
Binding success: Node C-->Node B
++++++++++++DTC Binding Test END+++++++++++++

June 12th, 2013 7:12pm

I just ran the test by dropping 90 files, and performance is still slow when hosts on both nodes are active... :(

June 12th, 2013 7:58pm

What size are these files? Can you set up performance logging on the BizTalk machines and parse the logs through PAL after you've dropped the 90 files? - Tord

Edit: the reason the DTC service failed is that the registry settings were added incorrectly. The data type, capitalization and values all have to be correct; if not, it will fail with the error you referred to.
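
As a sanity check before rebooting, the values entered under the RPC port-range key can be validated against what the Microsoft guidance for restricting RPC dynamic ports documents: a `Ports` value of type REG_MULTI_SZ and `PortsInternetAvailable` / `UseInternetPorts` values of type REG_SZ set to `Y`, all under `HKLM\Software\Microsoft\Rpc\Internet`. The sketch below checks a hand-entered description of those values; it does not read the registry, and the function name and input shape are assumptions for illustration.

```python
import re
from typing import Dict, List, Tuple

# Value names and types documented for HKLM\Software\Microsoft\Rpc\Internet.
EXPECTED_TYPES = {
    "Ports": "REG_MULTI_SZ",
    "PortsInternetAvailable": "REG_SZ",
    "UseInternetPorts": "REG_SZ",
}
PORT_ENTRY = re.compile(r"^\d{1,5}(-\d{1,5})?$")  # "5000" or "5000-5100"


def check_rpc_internet_values(values: Dict[str, Tuple[str, object]]) -> List[str]:
    """Return the problems found in {name: (registry_type, data)}."""
    by_name = {name.lower(): entry for name, entry in values.items()}
    problems = []
    for name, expected_type in EXPECTED_TYPES.items():
        if name.lower() not in by_name:
            problems.append(f"{name}: value is missing")
            continue
        reg_type, data = by_name[name.lower()]
        if reg_type != expected_type:
            problems.append(f"{name}: type is {reg_type}, expected {expected_type}")
        elif name == "Ports":
            entries = data if isinstance(data, list) else [data]
            bad = [entry for entry in entries if not PORT_ENTRY.match(str(entry))]
            if bad:
                problems.append(f"Ports: malformed entries {bad}")
        elif data != "Y":
            # Strict on purpose: the post above says capitalization matters.
            problems.append(f"{name}: data is {data!r}, expected 'Y'")
    return problems
```

An empty list means the three values match the documented shape; anything else is worth fixing before the next reboot.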
June 12th, 2013 9:46pm

Did we rule out flooding of the target systems?
June 13th, 2013 12:40am

Did we rule out flooding of the target systems?

Yes. I was running another application on our server which reads an email attachment, drops the attachment in a network folder, updates SQL tables and sends an email.

So no target system was involved, and performance was still slow when both servers were actively participating.


June 13th, 2013 6:19pm

What size are these files? Can you set up performance logging on the bizTalk machines and parse them through PAL after you've done the dropping of 90 files? - Tord

Edit: the reason why the DTC Failed is because you've added the registry settings wrong, the data type, capitalized letters and values have to be correct, if not it will fail with the error you referred to.

Cheers, Tord.

The files were just 2 KB flat files. I am going through your white paper now and will update here once I am done with performance logging and parsing with PAL. (This might take a while; ETA Monday evening.)

Apologies, as I was pulled away to resolve a production issue.


June 13th, 2013 6:23pm

Hi!

I got the performance logs sent to my email. They indicated problems related to the spool table and the host queue length for the hosts.

In a setup like this, a slowdown is usually due to the disks on the SQL side.

BizTalk uses the databases heavily and generates a ton of disk read and write operations, so it's important to keep these disks as fast as possible to handle the load coming from BizTalk. With one BizTalk machine running, the load may be acceptable for the SQL server; when two machines are "hammering" the SQL server, however, the disks often end up being the problem.

The Spool table keeps a footprint and references for all messages going through BizTalk. When this table grows, it can drastically decrease overall performance in the environment, since queries take longer to retrieve information.

The host queue length is the number of messages queued for that specific host. These may be delayed because the disks are too slow to handle the load. A temporary solution is to create more hosts and split the tasks between them, but obviously you need to investigate further to find the actual cause of this problem (which is usually disk related in a scenario like this).

I hope the problem gets resolved. Feel free to update this thread and let us know whether you need any more help or the problem is resolved.

Best regards

Tord Glad Nordahl
Bouvet ASA, Norway
http://www.BizTalkAdmin.com | @tordeman

Please indicate Mark as Answer if this post has answered the question.

June 14th, 2013 12:08pm

OK, I generated the PAL report as suggested, and it might be an issue with disk input/output.

I am working with the network team, as we know that we are using a shared network drive.

I will post an update on my investigation here in 3-4 working days.

Thanks

Bharat


June 14th, 2013 12:11pm

Hi,

Did you get any results? If you have any further questions about your problem, please tell us.

Regards,

June 20th, 2013 6:16am


Just to update: we are getting a powerful new physical box to host various virtual machines.


I don't have much detail about the new hardware yet, but it has 8 dual-core CPUs, 64 GB of RAM and 2 physical disks (RAID 1+0).

Since it's a new machine, we will run our test using it as a single physical SQL box and see how that impacts performance.

I will be able to place the MessageBox database on the second drive, as suggested by best practice.

I would appreciate any suggestions to improve on what I am planning to do.

Thanks


June 26th, 2013 2:06pm

This topic is archived. No further replies will be accepted.
