Tuesday, July 24, 2007

PURSe Portlets 1.1.0 Released

I just finished tagging and releasing a new version of the PURSe Portlets. There are several great things in this release. The one I am most proud of is also the one least apparent to the casual observer: the PURSe Portlets are now synced up with the mainline PURSe codebase. In PURSe Portlets 1.0.x, I had made modifications to the PURSe 1.0 release, but now that I am a PURSe committer, I've committed some of those modifications and bugfixes to the PURSe trunk. I've also developed an install script that installs PURSe and its dependencies, including a minimal deployment of Globus. This should take a lot of the pain out of getting started with PURSe for the new user.

Additional new features include:
  • Added an AJAX "Check availability of username" feature to the registration portlet
  • Forgot password portlet now resets the user's password instead of revealing it
  • Administrative portlet has a paged table of users
  • Administrative portlet also allows adding a single user or multiple users at once
  • Portal/portlet "branding" greatly simplified and aligned with PURSe
See the release notes for more information, or the download page to get it now.

Monday, July 09, 2007

Apache MaxClients reached, all connections in CLOSE_WAIT

Our Extreme Lab web server (Solaris running Apache 1.3) has recently developed a problem. Twice yesterday it got into a state where it had hit its MaxClients limit (of 128, apparently) and was then unable to service any further requests. Running netstat -f inet showed that all existing connections were in the CLOSE_WAIT state. I can't tell at this point whether there is some denial of service attack going on or just a problem with the server that prevents it from finally closing these connections.

Update: Googling around, and with some help from our local Unix guru, Rob Henderson, I found out that connections stuck in the CLOSE_WAIT state usually indicate that the server side is having trouble closing the connection: the client has already closed its end, but the server process has not yet closed its socket. Rob says that when he sees this problem it is usually because an NFS server is down. Requests come in for something on that NFS mount, and the process handling each one hangs there for a long time. With that NFS server remounted, things are much better. Whereas before I was seeing steadily increasing numbers of CLOSE_WAIT connections, I now see none.
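
For anyone who wants to watch for the same symptom, here is a rough Python sketch that tallies connections by TCP state from netstat output. It is not what I actually ran; it assumes the state is the last column on each connection line (as it is in the netstat -f inet output I was looking at), and the function name, the list of state names, and the sample output are all made up for illustration.

```python
from collections import Counter

# TCP state names as netstat prints them. Spelling varies a little between
# platforms, so a few variants are listed. This set is an assumption for the
# sketch, not an exhaustive list.
TCP_STATES = {
    "LISTEN", "SYN_SENT", "SYN_RCVD", "SYN_RECV", "ESTABLISHED",
    "FIN_WAIT_1", "FIN_WAIT_2", "FIN_WAIT1", "FIN_WAIT2",
    "CLOSE_WAIT", "CLOSING", "LAST_ACK", "TIME_WAIT", "CLOSED",
}


def count_tcp_states(netstat_output):
    """Count connections per TCP state in a block of netstat text.

    Assumes the state is the last whitespace-separated field on each
    connection line; header and separator lines are skipped because their
    last field is not a known state name.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


# Hypothetical sample output (hosts and numbers are invented).
SAMPLE = """\
TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q  State
-------------------- -------------------- ----- ------ ----- ------ -------
www.example.org.80   a.example.net.51234  64240      0 49640      0 CLOSE_WAIT
www.example.org.80   b.example.net.40022  64240      0 49640      0 CLOSE_WAIT
www.example.org.22   c.example.net.33871  64240      0 49640      0 ESTABLISHED
"""

counts = count_tcp_states(SAMPLE)
print(counts["CLOSE_WAIT"], "connections in CLOSE_WAIT")
```

To use it against a live server, you could feed it the text from subprocess.run(["netstat", "-f", "inet"], capture_output=True, text=True).stdout and compare the CLOSE_WAIT count against your MaxClients setting. A count that keeps climbing toward the limit is the pattern I was seeing before the NFS mount was fixed.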