We have a long-standing commercial product that has been working well, and I have never seen this problem before. A client program sends data to a server. Because of firewalls in customer environments, we sometimes let the end user specify a range of outgoing ports to bind to. In the case I'm seeing, however, no range is configured and we bind to port 0. From everything I've read, that means the OS picks a random port. What I can't understand is what that means for the kernel/OS: if I ask for a random port, how can it already be in use? Strictly speaking, it is the combination of source IP, source port, destination IP and destination port that makes a connection unique. I believe the same source port can be reused when talking to a different destination IP address, but maybe that doesn't matter here.
In addition, this does not happen on all client systems, only on a few, so it could be a load problem. I'm told the affected systems are pretty busy.
Here is the code we use. I left out some #ifdef code for Windows and, for brevity, what we do after the bind.
_SocketCreateClient(Socket_pwtP sock, SocketInfoP sInfo)
{
    int nRetries;
    unsigned short port;
    BOOL success = FALSE;
    BOOL gotaddr = FALSE;
    char buf[INET6_ADDRSTRLEN] = "";
    int connectsuccess = 1;
    int ipv6compat = 0;
    struct timeval time;
    nRetries = sInfo->si_nRetries;
    sock->s_hostName = strdup(sInfo->si_hostName);
    LogWrite(LogF, LOG_WARNING, "Socket create client");
    LogWrite(LogF, LOG_WARNING, "Number of retries = %d", nRetries);
    ipv6compat = GetIPVer();
    if (ipv6compat == -1)
        gotaddr = GetINAddr(sInfo->si_hostName, &sock->s_sAddr.sin_addr);
    else
        gotaddr = GetINAddr6(sInfo->si_hostName, &sock->s_sAddr6.sin6_addr);
    if (!gotaddr) {
        if (sInfo->si_logInfo && (sInfo->si_nRetries == 1))
        {
            LogWrite(LogF, LOG_ERR,
                "unable to resolve ip address for host '%s'", sInfo->si_hostName);
        }
        sock = _SocketDestroy(sock);
    }
    else {
        if (ipv6compat == 1)
        {
            LogWrite(LogF, LOG_DEBUG2, "Before call to inet_ntop");
            inet_ntop(AF_INET6, &sock->s_sAddr6.sin6_addr, buf, sizeof(buf));
            LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_addr from GetINAddr6: %s", buf);
            LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_sAddr6.sin6_scope_id from if_nametoindex: %d", sock->s_sAddr6.sin6_scope_id);
            LogWrite(LogF, LOG_DEBUG2, "Value of sock->s_type: %d", sock->s_type);
        }
        while (sock && sock->s_id == INVALID_SOCKET) {
            int socketsuccess = FALSE;
            if (ipv6compat == -1)
                socketsuccess = sock->s_id = socket(AF_INET, sock->s_type, 0);
            else
                socketsuccess = sock->s_id = socket(AF_INET6, sock->s_type, 0);
            if ((socketsuccess) == INVALID_SOCKET) {
                GETLASTERROR;
                LogWrite(LogF, LOG_ERR, "unable to create socket: Error %d: %s", errno,
                    strerror(errno));
                sock = _SocketDestroy(sock);
            }
            else
            {
                port = sInfo->si_startPortRange;
                while (!success && port <= sInfo->si_endPortRange) {
                    int bindsuccess = 1;
                    if (ipv6compat == -1)
                    {
                        sock->s_sourceAddr.sin_port = htons(port);
                        bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr,
                            sizeof(sock->s_sourceAddr));
                    }
                    else {
                        sock->s_sourceAddr6.sin6_port = htons(port);
                        inet_ntop(AF_INET6, &sock->s_sourceAddr6.sin6_addr, buf, sizeof(buf));
                        LogWrite(LogF, LOG_DEBUG,
                            "attempting bind to s_sourceAddr6 %s ", buf);
                        bindsuccess = bind(sock->s_id, (struct sockaddr *) &sock->s_sourceAddr6,
                            sizeof(sock->s_sourceAddr6));
                    }
                    if (bindsuccess == -1) {
                        GETLASTERROR;
                        LogWrite(LogF, LOG_ERR,
                            "unable to bind port %d to socket: Error %d: %s. Will attempt next port if protomgr port rules configured(EAV_PORTS).", port, errno, strerror(errno));
                        port++;
                    }
                    else {
                        if (port != 0)
                        {
                            if (sInfo->si_sourcehostName) {
                                LogWrite(LogF, LOG_DEBUG,
                                    "bound outbound address %s:%d to socket",
                                    sInfo->si_sourcehostName, port);
                            }
                            else {
                                LogWrite(LogF, LOG_DEBUG,
                                    "bound outbound port %d to socket", port);
                            }
                        }
                        success = TRUE;
                    }
                }
            }
        }
    }
    return(sock);
}
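To isolate the port-0 case from the rest of the function, it can be reduced to a minimal sketch like the one below. This is not our production code: the helper name bind_ephemeral is made up for illustration, it covers IPv4 TCP only, and it assumes the source address is INADDR_ANY (how s_sourceAddr is filled in is not shown above). It also calls getsockname() to find out which port the kernel actually chose, which our code does not currently log.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of our product: bind an IPv4 TCP socket
 * to port 0 on INADDR_ANY and report the port the kernel picked.
 * Returns the socket descriptor, or -1 with errno set on failure. */
int bind_ephemeral(unsigned short *chosen_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* assumption: no specific source address */
    addr.sin_port = htons(0);                  /* 0 = let the kernel choose the port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        int saved = errno;                     /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    *chosen_port = ntohs(addr.sin_port);       /* the port actually assigned */
    return fd;
}
```

Calling bind_ephemeral(&port) on an idle machine should return a valid descriptor and a non-zero port. On the affected systems, a failure with errno set to EADDRINUSE would reproduce what we see without involving the rest of our code.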
The errors we see in our log file are below. It makes two attempts and both fail:
protomgr [628453]: ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).
protomgr [628453]: ERROR: cannot connect port to socket: Error 98: address is already in use. Consider increasing the number of EAV_PORTS if this msg is from protomgr.
protomgr [628453]: ERROR: unable to bind port 0 to socket: Error 98: Address already in use. Will attempt next port if protomgr port rules configured(EAV_PORTS).
protomgr [628453]: ERROR: cannot connect port to socket: Error 98: address is already in use. Consider increasing the number of EAV_PORTS if this msg is from protomgr.
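Since the affected systems are busy, one thing I can check is how many ports the kernel has available for automatic assignment. The sketch below assumes Linux (error 98 is EADDRINUSE on most Linux architectures), where the range used for port-0 binds is exposed in /proc/sys/net/ipv4/ip_local_port_range. Both helper names are made up for illustration, and this only reports the size of the configured range, not how many of those ports are currently in use.

```c
#include <stdio.h>

/* Hypothetical helper: parse text such as "32768\t60999\n" into the low
 * and high ends of the port range. Returns 0 on success, -1 on bad input. */
int parse_port_range(const char *text, int *low, int *high)
{
    if (sscanf(text, "%d %d", low, high) != 2)
        return -1;
    if (*low < 1 || *high > 65535 || *low > *high)
        return -1;
    return 0;
}

/* Hypothetical helper: read the range from procfs (Linux only) and return
 * the number of ports in it, or -1 if the file is missing or malformed. */
int ephemeral_port_count(void)
{
    char line[64];
    int low, high;
    FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof(line), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    if (parse_port_range(line, &low, &high) != 0)
        return -1;
    return high - low + 1;   /* range is inclusive at both ends */
}
```

If ephemeral_port_count() returns a number close to the count of sockets the busy systems hold open at once, that would support the load theory. A much larger number would suggest the cause is elsewhere.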