Troubleshooting Mail Delivery Problems

Product:
ModusMail (with some support for ModusGate)

Version & Build: All


 
Problem:

Local users are not receiving mail from the outside world, or users cannot send mail through your server to the outside world.  This assumes that users are able to POP their mailboxes but, for whatever reason, no new mail is coming in or going out.

This how-to applies mainly to ModusMail but can be used to troubleshoot some ModusGate issues.
 


Step #1 - Check to see if your server is responding on port 25

You should repeat this test at least twice: once from the Modus server and once from another PC on the same network or, ideally, from a machine outside of your network (via a dialup or high-speed connection).

 

  • From a Command Prompt, type telnet <mail server IP> 25 <enter>
  • Your mail server’s banner should appear:

220 <domain> ModusMail ESMTP Receiver version x.xxx.xx ready

  • If the banner fails to appear within 30 seconds, you could be using RBL/DNSBL lookups or reverse DNS lookup and your DNS server or one of the blacklist servers is not responding quickly enough
  • Solution:

    • In the Console, go to Security – Properties – Sender Validation & Accreditation
    • If Perform a lookup for the SMTP host in the DNS is enabled, disable it
      • Stop/start the SMTPRS service to clear the cache
    • From a Command Prompt, type telnet <mail server IP> 25 <enter>
      • If the banner appears quickly, this problem has been resolved by disabling Reverse DNS lookups
    • If the problem persists, in the Console, go to Security – Properties – Real-Time Blacklist and click on RBL Servers
    • Remove one of the RBL servers you are using
      • Stop/start the SMTPRS service to clear the cache
    • From a Command Prompt, type telnet <mail server IP> 25 <enter>
    • Keep removing RBL servers until you determine which RBL server is causing the problem
      • Stop/start the SMTPRS service each time and replace the RBL servers that were removed during the testing (those not causing the problem)
  • If the banner appears immediately, it is possible that your IP address is whitelisted or cached
    • Repeat the telnet step from a PC outside of your network (if possible)
  • After you have tested that the SMTP banner responds quickly but people still do not receive email, proceed to the following step 
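
The banner check above can also be scripted.  The following is a minimal sketch, assuming Python is available on the test machine; the function names are illustrative and not part of Modus, and the 30-second threshold is simply the figure used in this step.  Note that the timeout applies to each network operation, so the total wait can be slightly longer.

```python
import socket
import time


def read_smtp_banner(host, port=25, timeout=30.0):
    """Connect to host:port and return (banner_line, seconds_waited).

    Raises an OSError (for example socket.timeout) if the connection
    fails or no greeting arrives within the timeout.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # Read until the first line of the greeting is complete.
        while b"\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    elapsed = time.monotonic() - start
    banner = data.split(b"\n", 1)[0].decode("ascii", "replace").strip()
    return banner, elapsed


def banner_is_healthy(banner, elapsed, slow_after=30.0):
    """True if the greeting starts with SMTP code 220 and arrived promptly."""
    return banner.startswith("220") and elapsed < slow_after
```

For example, banner, elapsed = read_smtp_banner("<mail server IP>") followed by banner_is_healthy(banner, elapsed) mirrors the manual telnet test.  Run it from the same machines recommended above.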
 
Step #2 - Use Telnet to Send an Email to a User
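
A manual test over telnet generally follows the standard SMTP command sequence.  As a rough sketch (the host name and addresses below are hypothetical placeholders, and the expected reply codes are the generic ones from the SMTP standard rather than anything specific to Modus), the lines to type after the 220 banner appears can be listed like this:

```python
def smtp_test_dialogue(helo_name, sender, recipient):
    """Return (line to type, expected reply code or None) pairs, in order."""
    return [
        (f"HELO {helo_name}", 250),
        (f"MAIL FROM:<{sender}>", 250),
        (f"RCPT TO:<{recipient}>", 250),
        ("DATA", 354),                      # server asks for the message text
        ("Subject: Telnet delivery test", None),
        ("", None),                         # blank line ends the headers
        ("This is a test message.", None),
        (".", 250),                         # a lone dot ends the message
        ("QUIT", 221),
    ]


# Hypothetical names; substitute a real local mailbox when testing.
for line, code in smtp_test_dialogue("test.example", "tester@example.com", "user@example.com"):
    print(line if code is None else f"{line:<40} (expect {code})")
```

A reply code other than the expected one at any stage indicates where the server rejected the transaction.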
 




  • Check the recipient’s mailbox
  • If the user received the email, try sending an email to this user from an outside mail service (e.g. Hotmail)
  • If the person does not receive the email, something is preventing mail traffic from getting to your server from outside of your network OR the domain you are trying to send mail to from the outside world has a DNS configuration problem with regards to the MX record settings.
  • To rule out an invalid DNS setting, perform an nslookup of the domain that you are trying to deliver mail to from the outside world.
  • Example:




  • In the above example, we try to resolve vircom.com
  • set q=mx tells nslookup to look for MX records
  • The result in the above example shows that mail goes to gate.vircom.com (pref level 0) and, if gate.vircom.com goes down, the mail goes to smtp.vircom.com (pref level 1)
  • It is always possible that your DNS server knows the destination domain but the outside world does not
    • You can do a lookup using a foreign DNS server by using nslookup - <remote DNS server> to see if the outside world sees your domains
  • Example:

nslookup - 207.96.243.93

  • This allows you to do a lookup using Vircom’s DNS server instead of your own to find out if your domain resolves on outside DNS servers
  • Assuming that you reach this point and it is certain that DNS is properly configured for your domain, then there is a network issue between the outside world and your ModusMail server.  Chances are you have some sort of mail firewall that is causing the problems (e.g. Cisco Pix firewall)
  • If the user did not receive the email, proceed to the following step
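
When comparing MX results from your own DNS server and from an outside one, it can help to reduce the nslookup output to a sorted list of mail hosts.  The sketch below assumes the output lines have the shape "MX preference = N, mail exchanger = host", which is how Windows nslookup usually prints them; adjust the pattern if your output differs.  The sample text reuses the hosts and preference levels described in this step.

```python
import re

# Assumed line shape: "<domain>  MX preference = <n>, mail exchanger = <host>"
MX_LINE = re.compile(
    r"MX preference\s*=\s*(\d+),\s*mail exchanger\s*=\s*(\S+)", re.IGNORECASE
)


def parse_mx_records(nslookup_output):
    """Return [(preference, host), ...] with the preferred host first."""
    records = [
        (int(pref), host.rstrip("."))
        for pref, host in MX_LINE.findall(nslookup_output)
    ]
    return sorted(records)


sample = (
    "vircom.com\tMX preference = 1, mail exchanger = smtp.vircom.com\n"
    "vircom.com\tMX preference = 0, mail exchanger = gate.vircom.com\n"
)
print(parse_mx_records(sample))
```

If the list obtained through the outside DNS server is empty or differs from the one returned by your own server, the MX records are the likely cause of the delivery problem.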


Step #3 - Determine What Happened to the Email

 

In the previous step, you tried to send an email to one of your users using telnet.  You ascertained that there was no network issue and that the domain was configured properly for DNS.  At this point, the message you sent manually is, most likely, stuck in one of the queues.

 

How the Modus server works:

  • Email comes into Modus on Port 25
  • SMTPRS (SMTP Receiver Service) takes the email and stores it in the spool\invirus folder as an MSG and RCP pair.  The MSG file is the body of the message and the RCP file is the “envelope” (the actual SMTP transaction as recorded during the “mail from” and “rcpt to” phase)
  • MODUSCAN scans the message and, if it is clean, moves both the MSG and RCP files to the spool\incoming folder
  • If the message is not clean, it is moved to the spool\spam or spool\virus folder.  Afterwards, MODUSADM takes the messages from the spool\virus and spool\spam folders and quarantines them
  • The SMTPDS (SMTP Delivery Service) has a sub-thread (process) that continuously checks the spool\incoming folder.  As soon as it sees an MSG/RCP pair, this sub-thread grabs the MSG file and moves it to spool\holding.  The RCP file is moved to spool\domains\<destination.domain>
  • Modus does this to make it possible to try to deliver messages to multiple destinations at the same time.  Messages bound to local domains have their RCP files go to a special folder called spool\domains\$local$
  • Finally, the SMTPDS main delivery threads loop through the various domains under spool\domains\<domainname> and attempt to deliver the messages to their destination.  Local deliveries have priority over remote deliveries.  Messages (the MSG file) are moved to the end-user's Inbox folder.  The RCP file is deleted

The above explains how the spool works.  The following will attempt to explain why the message was not delivered:

  • Check the ..\spool\invirus folder:
  • Go to …\Vircom\Modus<Mail or Gate>\Spool\Invirus and check if there is a large backlog of mail (greater than several hundred MSG/RCP pairs)
    • Use Windows Find to determine if the message you sent is here
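
As an alternative to counting files by hand and using Windows Find, the check can be scripted.  This is a minimal sketch, assuming that the MSG and RCP files of a pair share the same base file name (an assumption, not something stated above) and that the text you search for appears unencoded in the MSG file.  The function names are illustrative.

```python
from pathlib import Path


def count_pairs(folder):
    """Count messages that have both an MSG and an RCP file in folder."""
    folder = Path(folder)
    msg = {p.stem.lower() for p in folder.iterdir() if p.suffix.lower() == ".msg"}
    rcp = {p.stem.lower() for p in folder.iterdir() if p.suffix.lower() == ".rcp"}
    return len(msg & rcp)


def find_message(folder, needle):
    """Return names of MSG files whose content contains needle (case-insensitive)."""
    needle = needle.lower().encode("utf-8")
    hits = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() == ".msg" and needle in path.read_bytes().lower():
            hits.append(path.name)
    return hits
```

Pass the full path of the Invirus folder to both functions.  A count in the hundreds or more suggests a backlog, and find_message with the subject line of your test email shows whether that message is stuck here.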

Backlogs with the MODUSCAN engine can be caused by the following:

 
 

a) The Quarantine is too slow or is corrupted

  • The server simply cannot put more mail into the Quarantine because of slowness or a connectivity outage
  • If you are using the native Quarantine database (available with Modus) it is possible that you have exceeded the maximum file size of the database.  Since the default Quarantine database uses Microsoft Access, if the Quarantine reaches a size of 2 gigabytes, it crashes and causes MODUSCAN to spike to 100% CPU usage.  Processing of messages slows considerably.
  • Check the mailstore.mdb file size in …\Vircom\Modus<Mail or Gate>\mailbox\@quarantine.  If it is at 2GB or close to it, proceed with the following:
    • In the Console, stop all services
    • In Windows Explorer, go to …\Vircom\Modus<Mail or Gate>\mailbox\@quarantine
    • Rename the mailstore.mdb file to mailstore.old
    • Rename the inbox folder to inbox.old
    • In the Console, start all services
    • Assuming that this was the cause of the problem, Modus should process the backlog quickly
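
The file-size check on mailstore.mdb can be automated, for example as part of a scheduled task.  The sketch below is only an illustration: the 2 GB figure comes from the text above, while the 90% margin used to decide what counts as "close to it" is an assumption you should tune.

```python
import os

ACCESS_LIMIT_BYTES = 2 * 1024 ** 3  # the 2 GB Access limit mentioned above


def near_access_limit(path, limit=ACCESS_LIMIT_BYTES, margin=0.9):
    """True if the file at path has reached margin (default 90%) of limit."""
    return os.path.getsize(path) >= limit * margin
```

Call near_access_limit with the full path to mailstore.mdb; a True result means the renaming procedure above is worth carrying out before the database stops working.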

      

NOTE: Versions 3.x and above provide the option to compact the Access database to prevent reaching the 2 GB limit.  This option is on, by default, when upgrading from an older version if Access is used.  Go to System – Properties – Quarantine Database to change the time Modus compacts the database.  Stop and start the MODUSADM service after any changes.

This step is not a permanent solution.  A long term fix would be to switch to a SQL Server quarantine database.  For more information, see
How-To: Use SQL Server Instead of the Built in MDB Database for Quarantine Storage.
 
Another work-around would be to delete spam instead of quarantining it.  Go to Spam – Preferences – Options and, under When a Spam is detected, select Delete the message immediately.
 
 
 

b) Sievestore Database is Corrupt

The SieveStore database sievestore.mdb, located in …\Vircom\ModusMail\SieveData, contains all of the Sieve script updates.  To determine if the sievestore.mdb file is corrupt, search for the word corrupt in the latest OPR*.LOG and ERR*.LOG files in the Log folder.
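
The log search can be scripted as follows.  This is a minimal sketch that assumes "latest" means the most recently modified file matching each prefix and that the logs are plain text; the function names are illustrative.

```python
from pathlib import Path


def latest_log(log_dir, prefix):
    """Return the most recently modified <prefix>*.LOG file, or None."""
    candidates = [
        p for p in Path(log_dir).iterdir()
        if p.name.upper().startswith(prefix.upper()) and p.suffix.upper() == ".LOG"
    ]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def corrupt_lines(log_dir, prefixes=("OPR", "ERR")):
    """Return (file name, line) pairs mentioning 'corrupt' in the latest logs."""
    hits = []
    for prefix in prefixes:
        log = latest_log(log_dir, prefix)
        if log is None:
            continue
        for line in log.read_text(errors="replace").splitlines():
            if "corrupt" in line.lower():
                hits.append((log.name, line.strip()))
    return hits
```

An empty result from corrupt_lines on the Log folder suggests the SieveStore database is not the cause; any hits should be read in context before resetting the database.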

If the database is corrupt, there is no way to recover your custom sieve scripts unless you have backed them up (by pasting them into a text file). 

To reset the sievestore database:

 

  • In the Console, stop all services
  • In Windows Explorer, go to the …\Sievedata folder
  • Rename sievestore.mdb to sievestore.old
  • Download the empty sievestore.mdb file (attached to this article) to your Modus server
    • Copy it to the …\Sievedata folder
  • Open the Registry Editor and go to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\MODUSADM\Parameters
  • Double-click on SieveUpdateLUSN and set the Data value to 0 (zero)
  • Exit the Registry Editor
  • In the Console, start all services
  • Go to Spam – Properties – Auto-Updates and click on Update Now
  • Click on Update Now until no new updates are received
  • Go to Spam – Preferences – Options and click on Categories to ensure that they are OK
  • At this point, MODUSCAN should be catching up with itself
 

c) Sieve script catching messages sent to the test user

Check the quarantine to see if the message you sent in the previous step was caught.  If it was, check your custom sieve scripts to see if there is a script that is capturing more messages than it should.  Make sure your quarantine is not set to Delete Spam.  If it is, the test messages will be deleted.
 
 
 

d) Third-Party anti-virus is blocking messages prematurely and locking files

Some customers run a third-party anti-virus on the same machine as Modus which is, normally, a good thing.  However, since some versions of Modus provide virus scanning and infected messages are moved to a folder first, it’s possible that the third-party anti-virus locks the file while Modus is working on it and causes problems for Modus.

Make sure the following folders are not scanned by your anti-virus:

  • …Vircom\Modus<Mail or Gate>\spool folder and sub-folders
  • …Vircom\Modus<Mail or Gate>\mailbox\@quarantine folder and sub-folders
  • c:\winnt\temp folder and sub-folders


e) Slow lookup on the user setting validation with extended database format


When a message comes in and is scanned by the MODUSCAN engine, Modus checks if the user has overridden the System/Domain settings and has opted not to scan messages for spam, viruses or attachments.  If you are using registry (i.e. Generic) mailboxes, this is not an issue.  However, if you are using the Extended Database format (either in ModusMail or ModusGate Cluster), it is possible that the query takes too long to do the lookup. 

 

A possible reason for this slowdown is the ODBC call settings in the registry.  Therefore, the first step is to change the following settings:

 

  • Open the Registry Editor and go to HKEY_LOCAL_MACHINE\Software\Vircom\VopMail
    • Double-click on autodelmailboxes and set the Value data to 1
  • Go to HKEY_LOCAL_MACHINE\Software\Vircom\VopMail\ databasemailboxconfig\generic
    • Double-click on caseInsensitiveSearch and set the Value data to 0
    • Double-click on LowercaseSearch and set the Value data to 0
  • Stop and start the Modus services

  

If the above changes do not resolve the problem, temporarily switch to the standard Generic format.  The “standard” generic format only makes use of four fields.  Proceed with the following:

 

  • In the Console, go to AUTH – Properties – ODBC Database
  • Remove the check-mark at Use Extended Database
  • Click on Table Configuration and enter the following information:
    • Table Name: VopMail
    • Column Name mailbox: USERNAME in the table
    • Column Name User Name: FULLNAME in the table
    • Column Name Domain: DOMAIN in the table
    • Column Name Password: VOPPASS in the table
  • Click on Close and Apply
  • Stop and start the Modus services

 

If the backlog in spool\spam & spool\virus begins to decrease, this resolved the problem.  Contact Support with your findings to see if there are any long term solutions or to evaluate any optimizations that could be done with the SQL database.

 
 
 
f) Hardware performance is taxed by the demand on the scan engine

It can happen that the load generated by the mail traffic coming into your server has reached a point where MODUSCAN cannot keep up.  In that case, the MODUSCAN engine shows 100% CPU usage in Task Manager.  Contact Vircom Technical Support for additional help if this is the case.  Please ensure that you provide the information Support needs to troubleshoot effectively.
 


Step #4 - Check the …Vircom\Modus<Mail or Gate>\spool\incoming folder

The spool\incoming folder will rarely get backed up.  If it does get backed up, it is usually because Modus is too busy working on traffic under spool\domains and spool\holding.  SMTPDS handles the spool\incoming folder as well as the spool\holding and spool\domains folders.  Search this folder to see if the message is stuck.  However, if there is a backlog, chances are the problem is actually with the …\spool\domains folder (see next step).



Step #5 - Check the ...\spool\holding and ...\spool\domains folders

The …\spool\holding folder stores the messages that are bound to local and outbound delivery.  Normally, this folder can contain from a few hundred messages to a few thousand, depending on the size of your installation.  If there are more than 2000 messages, there is a problem.  Also important is what goes on in the ...\spool\domains folder since this is where Modus coordinates mail delivery.

Most important are the local deliveries, so go to the ...\spool\domains\$local$ folder to determine what is going on.  If there is a large backlog of mail, something is preventing the timely processing of messages going to your local domains.

In the folder, you should see one of four types of files.  These are, actually, all of the same type (envelope files) but the extension of the file indicates what processing has completed so far.

 

  • .RCO files are recently arrived .RCP files that have yet to be scheduled for delivery
  • .RCP files are scheduled for delivery and are awaiting processing
  • .LCK files are .RCP files that are currently being delivered (or Modus is attempting to deliver them)
  • .DEF files have already been attempted once for delivery and are awaiting retry (deferred files)
If, when refreshing the folder in Windows Explorer, the number of files does not decrease (or increases), even after clicking on Deliver Now in the Console under System – Properties – Mail Delivery, there is a problem.
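
Instead of refreshing the folder by eye, you can take two snapshots of the envelope files and compare them.  The following is a minimal sketch; the state descriptions restate the list above, and the rule used by backlog_is_stuck (total did not decrease) is a simplification of the check described in this step.

```python
from collections import Counter
from pathlib import Path

ENVELOPE_STATES = {
    ".rco": "arrived, not yet scheduled",
    ".rcp": "scheduled, awaiting processing",
    ".lck": "delivery in progress",
    ".def": "deferred, awaiting retry",
}


def envelope_counts(folder):
    """Count envelope files in folder by extension (upper-cased, no dot)."""
    counts = Counter()
    for path in Path(folder).iterdir():
        ext = path.suffix.lower()
        if ext in ENVELOPE_STATES:
            counts[ext[1:].upper()] += 1
    return counts


def backlog_is_stuck(before, after):
    """True if envelope files remain and their total did not decrease."""
    return sum(after.values()) > 0 and sum(after.values()) >= sum(before.values())
```

Call envelope_counts on the folder, click Deliver Now, wait a few minutes, call it again and pass both results to backlog_is_stuck.  A large and growing DEF count in particular points to deliveries that keep failing.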


Possible cause for the backlog:


Database authentication failing for local domains

If you use database mailboxes and if the authentication database is down or corrupt or your system DSN is invalid, delivery to local mailboxes will fail.  To test this, in the Console, go to Users and click on a valid user.  If there is an error message, there is a problem with user authentication.

Ensure that:

System DSN points to correct database

  • On the server, go to Administrative Tools > Data Sources (ODBC) and check your system DSNs to ensure that they are still valid
    • In the case of SQL authentication, ensure that the datasource username & password are valid as well

 

Modus uses the correct DSN in the Modus Authentication subsystem

  • In the Console, go to Auth – Properties and select the database type that Modus is trying to authenticate against and make sure that it is pointing to the correct DSN name with the proper DSN username & password (if applicable)

 

You have the correct schema/stored procedures

  • Depending on the type of database Modus is authenticating against (RODOPI, Platypus or the extended database format), ensure that you are running the correct stored procedures for your particular version of Modus
  • If you recently upgraded your software, it is possible that the new package requires new fields (extended database) or new stored procedures
  • For RODOPI or Platypus users, check with RODOPI or Boardtown (makers of the products) to see if new stored procedures are available
  • For the extended database format, contact Vircom’s Support for the latest database schema