Today I ran into an interesting problem at a customer site. The customer complained that no mail had gone out for some time; messages were getting stuck in the outgoing queue. The Exchange 2010 server was configured to send all outgoing mail to a relay server operated by the customer's provider, which is not an unusual configuration. The provider's relay server responded with the SMTP error 503 "bad sequence of commands", so a non-delivery report (NDR) was sent to the client.
In such cases, it is always advisable to take a close look at the NDR to get a starting point for what is going wrong. The important part of the NDR in this case was the following:
Diagnostic information for administrators:
Generating server: EXSRV.frankysweb.local
frank@frankysweb.de
relay.provider.de #503 5.5.1 Error: Bad sequence ##
Note that it is not EXSRV.frankysweb.local that reports SMTP error 503 to the client, but the server relay.provider.de. So far, so good, but what does error code 503 actually mean? According to the reply code definitions of the SMTP protocol, error 503 stands for "bad sequence of commands". To make sense of that, you need to know how a mail is delivered. The SMTP protocol specifies the following command sequence:
HELO Hostname
MAIL FROM:
RCPT TO:
DATA
QUIT
This representation is somewhat simplified. For example, a server can send the EHLO command instead of HELO, and there are a few other commands, such as those used for authentication. It is therefore advisable for Exchange administrators to familiarize themselves with the SMTP protocol.
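The ordering rule behind error 503 can be sketched as a small state machine. This is a simplified illustration in Python, not how any particular relay server is implemented: the function name check_sequence and the example host and addresses are made up for this sketch, and AUTH, RSET and the message body that follows DATA are deliberately left out.

```python
def check_sequence(commands):
    """Return (command, reply_code) pairs for a simplified SMTP session.

    Assumption: only HELO/EHLO, MAIL FROM, RCPT TO, DATA and QUIT are
    modelled. Anything else (including AUTH) is answered with 500 here.
    """
    greeted = False      # HELO or EHLO has been seen
    have_sender = False  # MAIL FROM has been accepted
    have_rcpt = False    # at least one RCPT TO has been accepted
    replies = []
    for line in commands:
        verb = line.strip().upper()
        if verb.startswith(("HELO", "EHLO")):
            # A new greeting resets any mail transaction in progress.
            greeted, have_sender, have_rcpt = True, False, False
            replies.append((line, 250))
        elif verb.startswith("MAIL FROM:"):
            if greeted and not have_sender:
                have_sender = True
                replies.append((line, 250))
            else:
                replies.append((line, 503))  # bad sequence of commands
        elif verb.startswith("RCPT TO:"):
            if have_sender:
                have_rcpt = True
                replies.append((line, 250))
            else:
                replies.append((line, 503))  # RCPT TO without MAIL FROM
        elif verb.startswith("DATA"):
            if have_rcpt:
                replies.append((line, 354))
                # The transaction is treated as finished after DATA.
                have_sender = have_rcpt = False
            else:
                replies.append((line, 503))
        elif verb.startswith("QUIT"):
            replies.append((line, 221))
        else:
            replies.append((line, 500))  # command not modelled in this sketch
    return replies


if __name__ == "__main__":
    session = ["EHLO exsrv.example", "RCPT TO:<user@example.com>"]
    for command, code in check_sequence(session):
        print(code, command)
```

In the example session the greeting is accepted, but the RCPT TO that arrives without a preceding MAIL FROM is answered with 503.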
Back to the problem: error 503 says that something is wrong with the command sequence of the server that wants to send the mail. Unfortunately, it does not say what exactly. To find out, you can install a sniffer and log the traffic on port 25, for example. That is what I did here, to see which commands were sent in the wrong order or which command was not sent at all. Here is an excerpt from the capture log of Microsoft's Network Monitor:
I have shortened the log a little to make it easier to read. So let's take a look at what happens:
Line 141: EHLO
Line 144: AUTH LOGIN
Line 151: Authentication successful (response from the relay server)
Line 152: RCPT TO
Line 153: SMTP error 503
What stands out in this sequence is that the MAIL FROM: command is missing from the transmission. So the commands are not in the wrong order; one command is missing entirely. The provider's relay server does not accept this and responds with error 503.
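The same check can be done mechanically on the commands read from a capture. The following Python sketch is only an illustration: EXPECTED_ORDER and missing_before are hypothetical names, the expected order is the simplified sequence from this post, and the observed list is typed in by hand from the log lines above rather than parsed from a capture file.

```python
# Simplified command order from this post; EHLO is treated like HELO.
EXPECTED_ORDER = ["HELO", "MAIL FROM", "RCPT TO", "DATA"]


def missing_before(observed, command):
    """Return the expected commands that were not seen before `command`."""
    normalized = ["HELO" if c == "EHLO" else c for c in observed]
    if command not in normalized or command not in EXPECTED_ORDER:
        return []
    # Everything the client sent before the command in question.
    seen = set(normalized[:normalized.index(command)])
    # Everything the simplified sequence requires before that command.
    required = EXPECTED_ORDER[:EXPECTED_ORDER.index(command)]
    return [c for c in required if c not in seen]


if __name__ == "__main__":
    # Client commands as they appear in the shortened log above.
    observed = ["EHLO", "AUTH LOGIN", "RCPT TO"]
    print(missing_before(observed, "RCPT TO"))  # ['MAIL FROM']
```

Commands outside the simplified sequence, such as AUTH LOGIN, are simply ignored by this check.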
So now we know where the problem lies and can think about what can cause this problem:
Did I do something wrong when setting up the send connector?
Have I entered the correct access data for the provider's relay server?
Have I set the HELO entry for the send connector correctly?
Have I entered the correct host name or IP for the relay server?
In this case, I was able to answer all of these questions with yes. Nevertheless, it usually helps to retrace the path in your head. When configuring the send connector, it is not possible to influence the SMTP protocol itself, and why would you want to?
The cause must therefore lie elsewhere:
What other software is installed on the Exchange Server?
How does the Exchange Server connect to the Internet? Are there proxies or application layer firewalls?
Is there any additional software that protects Exchange from viruses or SPAM?
In my case, I was able to answer all of these questions with no, except for the last one: Is there any additional software that protects Exchange from viruses or SPAM?
A virus scanner was installed on the Exchange server, in my case GData Antivirus. Although the customer assured me that the virus scanner was compatible and correctly configured (exclusion of processes and file extensions), I deactivated the virus scanner's processes as a test.
The end of the story: After switching off the virus scanner, the mail delivery worked straight away and the SMTP command MAIL FROM: was also transmitted again. The cause was therefore the virus scanner. The customer will now check the configuration again.
Why am I writing a long post about a small virus scanner problem? I could also have written: If error 503, then disable virus scanner.
However, I thought I would rather show an approach to solving such problems than simply advise you to deactivate some process or other. I have encountered similar problems with application layer firewalls that do not run directly on the Exchange server. In one such case, this approach saved me a lot of trouble, as I was able to prove that the transmission of the mail on the Exchange server side was still OK, but that a firewall filtering the traffic was a little too eager.