[ngw] GW Crashing

Matt Schlawin mschlawin at FVLHS.ORG
Sat May 2 12:20:42 UTC 2020

Hello everyone,
With all the distance learning going on, this is a terrible time for GW to start acting up.  Here is what I have been dealing with.

I'm running GW 18.1 on SLES 12.4.  All volumes are XFS.  I am running one PO, MTA, and GWIA on this box for about 100 users.  I have separate servers for GMS, SMG and WebAccess.

The server had been running solid for months, but yesterday the entire OS locked up.  It would not even ping.  When I rebooted, it got stuck in a boot loop.  I booted into recovery and it seemed to be OK for a while, but then the entire server spontaneously rebooted several times yesterday.  Eventually I was able to get it to boot and run in normal mode.

I did find some old stuck files in wpcsin and wpcsout that I moved out.  I then had no inbound email because my MTA was not processing email.  I found almost 300 emails in <domain>/mslocal/mshold/<postoffice>/4.  When I moved them out, everything worked again.

It ran all day yesterday, but this morning the PO had crashed.  I booted from a rescue disk and ran xfs_repair on all the volumes to make sure there was no file corruption.

The server came up, but the PO would not start.  I followed TID 7017465 and moved ngwdfr.db out of the ofmsg directory, and that worked.

The server has been running for about an hour, but I am not optimistic it will stay running.  I still have almost 300 messages again in <postoffice>/ofmsg.  I'm guessing they should not be there?  I also found over 400 *.ckl files in <postoffice>/wpcsout/chk.  Not sure about those either.
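
For keeping an eye on whether those directories keep filling up, here is a minimal Python sketch that counts the files in a directory and reports the age of the oldest one.  It is not a GroupWise tool, and it only reads file metadata.  The function name summarize_queue and the example paths are placeholders of my own; substitute the real domain and post office paths.

```python
import time
from pathlib import Path


def summarize_queue(directory, pattern="*"):
    """Return (file_count, oldest_age_seconds) for regular files in
    `directory` whose names match `pattern` (non-recursive).

    oldest_age_seconds is None when nothing matches.
    """
    now = time.time()
    mtimes = []
    for path in Path(directory).glob(pattern):
        try:
            if path.is_file():
                mtimes.append(path.stat().st_mtime)
        except FileNotFoundError:
            # The agent may process and delete a file while we are looking.
            continue
    if not mtimes:
        return 0, None
    return len(mtimes), now - min(mtimes)


# Example use (placeholder paths, not real ones):
#   count, age = summarize_queue("/path/to/postoffice/wpcsout/chk", "*.ckl")
#   print(count, "files; oldest is", age, "seconds old")
```

Running this every few minutes against the hold and queue directories should show whether the counts are growing or whether the same old files are simply sitting there.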

Can someone point me in the right direction on what to do next?  The server is running and email is flowing, but based on yesterday and today, I don't think this will last.

