Today I want to talk about a very common problem that can occur when we are invoking BizTalk Orchestrations exposed as synchronous services (Request-Response Receive ports):
“System.Net.WebException: The operation has timed out”
In my case, I was trying to invoke an orchestration exposed as a WCF service. Of course, this can be a very straightforward problem, most of the time easy to detect and probably also easy to fix… but sometimes BizTalk likes to play with us and throws a few good surprises our way…
Typical Causes and Solutions
This problem is typically associated with network issues or a lack of error handling inside orchestrations:
- You are trying to invoke an external system inside the orchestration and it takes too long to respond, so naturally we get a timeout error.
- For a web service that takes a long time to return a response, you can try setting the SOAP.ClientConnectionTimeout context property to address this problem.
- A high volume of requests can overload the external service with too many concurrent calls, or you can hit the limit of maximum connections to a certain address; naturally, this affects performance and the service will probably take too long to respond.
- There’s a nice post from Richard Seroter on how to avoid service timeouts in high-volume orchestration scenarios, where you can find several ways to address this.
- This error can also occur when BizTalk performance degrades and it starts responding slowly. If the BizTalk jobs are not properly configured and running, the BizTalk databases can grow excessively, everything responds very slowly, and we get stuck with this error.
- Validate the jobs and the databases; if necessary, configure the jobs, clean the databases with the Terminator tool, and probably also shrink the databases to resolve the problem.
- You can also get this error because you don’t handle errors inside your orchestrations. For example, if you don’t handle WCF fault messages from your external service, or you invoke C# code that raises an exception and you are not handling these situations in your orchestration, the orchestration will be suspended and the client will get a timeout exception.
- You can prevent this by handling errors inside orchestrations. See BizTalk Training – Handling Exceptions inside orchestration
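As a reference for the timeout point above: when calling a service through the classic SOAP adapter, the timeout can be raised per message in a Message Assignment shape. A minimal sketch in XLANG/s, assuming hypothetical message names `OutboundMsg` and `InboundMsg` (the property value is in milliseconds):

```
// Message Assignment shape (XLANG/s)
// OutboundMsg / InboundMsg are hypothetical names for this sketch.
OutboundMsg = InboundMsg;
// Raise the SOAP client connection timeout for this call to 5 minutes
// (value in milliseconds).
OutboundMsg(SOAP.ClientConnectionTimeout) = 300000;
```

For WCF send ports, the rough equivalent is the sendTimeout value on the adapter’s binding configuration, which you can change in the send port’s transport properties.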
Nothing fits the problem. What can it be?
But… what about when all of this has been checked and none of it fits the problem at hand? What can it be? Before explaining, let me describe my scenario:
- I have a simple demo service that receives a small message, invokes an external service, and returns the response to the source system.
- Because of API limitations, I decided to invoke this external service from C# code and handle exceptions inside the orchestration.
- By using HAT and debugging in the DEV environment, I knew that the external service was giving me a known error, which I was catching and re-throwing in order to create a response with the error description inside the orchestration, to be returned to the source system.
So, nothing too fancy; very simple stuff. However, every time I tested it, the orchestration got stuck in the MessageBox in a suspended state… the external service was invoked, the error was caught in the code, logged, and the exception was re-thrown to the orchestration, and after that, nothing… the engine seemed unaware of what was happening. Crazy, I know.
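The catch-log-rethrow pattern I used can be sketched in C# roughly like this (ServiceClient, Invoke, and the event source name are hypothetical placeholders, not the real API from my solution):

```
using System;
using System.Diagnostics;

public static class ExternalServiceHelper
{
    // Invokes the external service and re-throws any failure so the
    // orchestration's Catch Exception block can build a fault response.
    // ServiceClient/Invoke stand in for the real client API in this sketch.
    public static string CallExternalService(string request)
    {
        try
        {
            using (var client = new ServiceClient())
            {
                return client.Invoke(request);
            }
        }
        catch (Exception ex)
        {
            // Log first, then re-throw: the orchestration catches this
            // exception and returns the description to the source system.
            EventLog.WriteEntry("MyApplication", ex.Message,
                EventLogEntryType.Error);
            throw new ApplicationException("Fault Response: " + ex.Message, ex);
        }
    }
}
```

The key design point is that the helper never swallows the exception: it must reach the orchestration so the exception handler there can produce the error response for the caller.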
When I investigated the Event Viewer to try to find more information I found this description:
“xlang/s engine event log entry: Uncaught exception (see the ‘inner exception’ below) has suspended an instance of service ‘MyOrchestration(728766c9-9df2-609b-004e-fa2e7c3079c4)’.
The service instance will remain suspended until administratively resumed or terminated.
If resumed the instance will continue from its last persisted state and may re-throw the same unexpected exception.
InstanceId: 4dfd6041-d52f-4110-8d0d-92efc48f0c38
Shape name: InvokeExternalService Shape
ShapeId: eff276fd-f289-4778-ba47-ff66309ae8c6
Exception thrown from: segment 2, progress 8
Inner exception: Fault Response: My Error Description”
The first thing I thought was that I had incorrectly published some resource (DLL)… but after validating and publishing the solution again, the problem persisted.
Cause
The Orchestration Designer can play some tricks on developers. So be very careful when you copy shapes from one orchestration to another!
What I did was open a similar solution and copy the main scope (the body of the orchestration) into a new solution I had created. This also copies all the shapes inside the scope… and of course, I changed the shapes to fit my new requirements, deleting some shapes and changing the code inside others. Why? To be faster and not lose much time creating the main skeleton of the orchestration.
But be aware that the Orchestration Designer doesn’t like some of these operations (to me, this is a bug in the Orchestration Designer), and for some reason the designer doesn’t interpret the copied shapes correctly (the “refactoring” or the “graphical interpretation”). The solution compiles fine, but at runtime we get stuck with the error “The service instance will remain suspended until administratively resumed or terminated. If resumed the instance will continue from its last persisted state and may re-throw the same unexpected exception.”
I don’t know if anyone else has come across this behavior before, but I have already experienced it twice.
Solution
To solve this strange behavior you need to redesign the same Orchestration flow by:
- Drag new shapes onto the Orchestration Designer and copy the exact same code from the existing shapes into the new ones… you can even give them the same names!
- At the end, delete the existing shapes (the ones that had been copied).
- Compile and deploy the project again.
Without doing anything else, this solved my problem.
Hi Sandro,
I have noticed this “broken copy” behavior of the Orchestration Editor with Send shapes.
Also, the port types in the Orchestration View do not show the fully qualified .NET name. And when we copy them, they keep the old names (names with the previous orchestration name as a prefix, something like that), and we get a naming problem.
I worked this out by looking at the XLANG/s orchestration code and checking the full names.
All of these are bugs, 100%.
Hi Leonid,
Nice to know and thanks for the comment!
I like the name you gave it and we can call this the “broken copy” bug 😀
Thanks Sandro, this information is very clear.