TFS 2010 Build: Sporadic Failures in the Build Process

We have a situation where our builds have stopped running reliably. Roughly one build in three fails with either a TF215096 or a TF215097 error.
If we restart the build controller, builds work again, until the next failure.

The errors we get:

TF215096: An error occurred while connecting to controller vstfs:///Build/Controller/1: There was no endpoint listening at ht*p://XXXX that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.


TF215096: An error occurred while connecting to controller XXX - Controller: Could not connect to ht*p://XXX. TCP error code 10061: No connection could be made because the target machine actively refused it 192.168.XXX.XXX:XXX.


TF215097: An error occurred while initializing a build for build definition \XXX: Team Foundation services are not available from server ht*p://XXX. Technical information (for administrator): The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.


TF215097: An error occurred while initializing a build for build definition \YYY: An error occurred while receiving the HTTP response to ht*p://XXX. This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details.

The server logs provide little information; at least, we did not find anything in them that helps us resolve the situation. Various web searches were also fruitless.

Has anyone had these or similar problems? Any ideas on how or where to look for a resolution?
Thank you in advance for any input!

3 answers

Today is a happy day, since we managed to figure it out. Sorry @Duat for unchecking the "answer" box, but it turned out that the problem is very different from what you (and everyone else) predicted.

In my last update, I was about to escalate this question to MS when we realized that our firewall was misconfigured for name resolution. We therefore assumed it was the culprit and waited for that to be fixed. After it was fixed, we STILL had the same problems, so we reviewed the situation again.

We isolated the problem to one part of our build process, more specifically to a custom code activity included in our build solution.

I had implemented a code activity that runs in the final stages of each build. It collects BuildDetails about the current build and appends them as a new row to "BuildLog.xls".
The implementation uses Microsoft.Office.Interop.Excel.
This Excel sheet is on another server (NOT on the servers where the controller/agents are located).

While developing this activity I ran into problems, but once I had finished, no EXCEL instances were left behind. So I considered it done and settled.

By trial and error, we noticed that when this activity does not run, there are no problems. With the activity enabled, the first build after a build controller restart succeeded, and every subsequent build had a certain chance of failing. Once any build failed, no further build succeeded until the build controller was restarted again.

I have only a vague idea of what the problem is (the Excel call goes through DCOM, the TFS services are WCF: how would they interfere?! Why does it sometimes succeed and sometimes fail?). The diagnostics provided did not help either; in fact, they misled us, which dragged this out for several months.
If I ever find the time, I would like to reproduce the error cleanly and turn it into a Server Fault question...


After removing this activity, it works! I then searched SO and found this one, where J. Saunders comments: "In general, you should never use Office Interop from a server environment."
It is ironic that as soon as you get to the bottom of any complex problem, the whole universe seems to know about it except you ...
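
For anyone who needs a similar build log without Office Interop, one option is to append to a plain CSV file, which Excel can still open. Below is a minimal sketch of that idea, written in Python rather than as a build activity. The column names, the function name `append_build_record`, and the file name are hypothetical; the post does not list which BuildDetails fields the original activity wrote.

```python
import csv
import os

# Hypothetical columns; the fields of the original BuildLog.xls are not given in the post.
FIELDS = ["build_number", "definition", "status", "finished_utc"]


def append_build_record(log_path, record):
    """Append one build record as a CSV row; write the header if the file is new or empty."""
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    # newline="" lets the csv module control line endings itself.
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(record)
```

Writing the file this way needs no Excel process on the build server, so there is no COM automation involved in the build at all. Whether that fits depends on how the log is consumed afterwards.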


Yes, it looks like you have connection problems. You can try enabling SOAP tracing on both the build machine and the server (if possible) to see whether any error shows up. If that still does not give you any new information, contact Microsoft for help by filing a Connect bug.
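
Before turning on tracing, it may be worth checking whether the controller's endpoint accepts TCP connections at all, since error code 10061 in the question means the connection was actively refused. Here is a minimal sketch in Python. The function name `can_connect` is made up for this example, and the host and port must be replaced with the controller's actual values.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections (Winsock error 10061 on Windows),
        # timeouts, and name resolution failures.
        return False
```

Running such a check from the TFS server against the controller right after a failed build would show whether the controller's listener is gone entirely or whether the failure happens higher up in the stack.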


I'm not sure if this will help you, but I ran into similar problems with build agents and ended up just removing and re-creating the agent. You could try removing your controller/agent and adding it back. It is a brute-force solution, but a good starting point. If it does not solve the problem, at least you can rule out the controller/agent as the cause and look at network/server issues instead.


Source: https://habr.com/ru/post/912442/

