  • MPC Status Page: Archive (2009 January-December)

    This page describes enhancements to or problems that have occurred with the MPC and scripts and the fixes that have been made.

    Recent problems are listed elsewhere. Index of other older problems.


    Older Enhancements and Resolved Problems

    • Internal Network Problems?
      2009 Dec. 30: 21:55. There was a hang of (at least part of) the CfA's internal network from about 18:44:15 to 21:29:30. All high-level network protocol traffic between the MPC's computers froze. Low-level protocols continued to function, allowing the MPC's cluster to remain intact. It is very probable (though we didn't actually check) that some or all of the web services were off-line during this period. We have queried this event with the Computation Facility but have not yet received a response.

    • Problem with MPCOBS
      2009 Dec. 19: 18:50. Two independent reports have been received of errors (specifically, stack dumps) being returned by the MPCOBS service. We are investigating.
      • 19:15. We've located the problem and fixed it. We've also made the error message more "friendly" in case a similar problem arises in the future.

    • Maximum version limit reached in MPES
      2009 Oct. 21: 10:00. Some debugging output that was added in order to fix the Oct. 1 problem was left in. This caused one debugging output file to reach the maximum allowable version number. This code has now been deactivated.

    • Missing uncertainty information in MPES pages
      2009 Oct. 11: 01:40. A problem that caused the uncertainty information to be omitted from MPES pages returned by the MPEph.COM script has been fixed. This problem first appeared when the Oct. 1 problem was fixed.

    • Strange behavior of MPC webserver
      2009 Oct. 1: 20:30. We have noticed strange behavior of the MPC webserver today. On five occasions, cgi jobs started executing but then simply hung. Once the maximum number of allowable processes is in the hung state, the webserver stops responding to requests. We are investigating.
      • Oct. 3: Debugging/logging output has been enabled in the command file that calls the executable, the C front-end of the executable and the Fortran code. The jobs that are stalling create log files from the command file and the C front-end, but the Fortran log files, although created, are empty. The reasons for this are still baffling.
      • Oct. 3: 23:45. We've found the reason the script is hanging. An array used by a library routine is being overwritten. Why this overwriting is occurring is not yet clear, particularly why it has only surfaced in the past few days.
      • Oct. 4: 00:30. We believe that the hanging script problem is now fixed and we understand why the problem appeared suddenly a few days ago.

    • Free access to previously-restricted MPC data
      2009 Oct. 1: 00:20. The steps necessary to remove the subscription blocks to previously-restricted MPC data have been started. An issue has arisen that requires a fresh mind. The blocks will be removed in the morning. Please note that access blocks will be put in place when we are updating the files or when we need to do other work on the pages.

    • MPEC e-mail problems
      2009 Sept. 30: 09:00. A problem with a number of recent MPEC mailings has been identified that caused many (all?) outside subscribers to not receive the circular. The underlying immediate problem has been fixed and a procedure has been modified to try and prevent this happening again. The problem affected only the three most recent circulars.

    • Incomplete files on the MPCORB anon-ftp site
      2009 Sept. 29: 20:30. A user reported that a specific file on the MPCORB anon-ftp site was incomplete. A check showed a large number of blank files. We are checking the logs.
      • 20:40. A look at the OpenVMS-side logs shows that, part way through the copying of the files, the error message "write failed" started appearing. No further information was returned by the sftp system, so it is unclear where the problem lies. It is unlikely that the available disk space dropped to zero (the ftp directory sits on a 3 TB device and there are currently 585 GB free). We are running the copy routine manually. We will check the situation after the next DOU MPEC.
      • 21:05. Some of the zero-length files have been fixed. The remaining problem cases will have to await the issuance of the next DOU MPEC.
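
      A check of the kind described above (looking for blank files after a copy) can be sketched as follows. This is a minimal illustration, not the procedure actually used by the MPC; the function name and the idea of scanning a local directory tree are assumptions made for the example.

```python
from pathlib import Path
from typing import List


def find_zero_length_files(directory: str) -> List[Path]:
    """Return every regular file under `directory` whose size is 0 bytes."""
    root = Path(directory)
    # rglob("*") walks the whole tree; directories are skipped by is_file().
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )
```

      Running such a scan immediately after each copy would flag incomplete transfers before a user has to report them.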

    • Problem with NEOCP comments
      2009 Sept. 28: 11:45. A number of users reported problems with the latest version of the NEOCP comments cgi script rejecting all comments. An updated executable has been put in place.

    • CfA anon-ftp disk full
      2009 Sept. 19: 14:15. The disk used for storage of the files on the anon-ftp server (e.g., the MPCORB files) is full. Until space is cleared, updates of MPCORB will be incomplete.
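
      One way to catch a full disk before an update is written is to check the free space first. The sketch below is an assumption-laden illustration, not the MPC's procedure: the function name, the 5% safety margin and the use of a local path are all hypothetical.

```python
import shutil


def has_room_for(path: str, bytes_needed: int, margin: float = 0.05) -> bool:
    """True if the filesystem holding `path` can take `bytes_needed`
    and still keep `margin` (a fraction of total capacity) free."""
    usage = shutil.disk_usage(path)      # named tuple: total, used, free
    reserve = int(usage.total * margin)  # space to leave untouched
    return usage.free - bytes_needed >= reserve
```

      An update script could call this with the size of the files about to be copied and skip (or report) the update when it returns False, rather than writing incomplete files.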

    • Accessing mpn.arc via browsers
      2009 Sept. 8: 10:40. Since the mpn.arc file available for download in the ECS is now over 4 GB in size, a number of browsers fail to download the entire file. That these browsers are limited to 32-bit filesizes is clear from the fact that the partial downloads consist of (filesize - 4 GB) bytes. Until the affected browsers can cope with >32-bit filesizes, this file will have to be downloaded via ftp.
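
      The arithmetic behind this diagnosis can be sketched as follows. This is an illustration only, assuming that an affected browser keeps the file length in an unsigned 32-bit counter that wraps at 2^32 bytes; the function name and the example size are hypothetical and are not the actual size of mpn.arc.

```python
# Illustrative sketch: how a 32-bit size counter truncates a >4 GB download.
# Assumption: the client stores the length in an unsigned 32-bit integer.

FOUR_GIB = 2 ** 32  # 4,294,967,296 bytes: the first size that no longer fits


def size_seen_by_32bit_client(true_size: int) -> int:
    """Return the file size as it appears after wrapping at 32 bits."""
    return true_size % FOUR_GIB


# Hypothetical file of 4 GiB plus 150,000,000 bytes.
true_size = FOUR_GIB + 150_000_000
partial = size_seen_by_32bit_client(true_size)

print(partial)                          # 150000000
print(partial == true_size - FOUR_GIB)  # True
```

      A file smaller than the limit passes through unchanged, which is why the problem only appeared once the file grew past 4 GB.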

    • Inability to access cfa-ftp.harvard.edu
      2009 Sept. 6: 17:30. An outside user reported his inability to access the MPCORB files on cfa-ftp.harvard.edu. We can confirm that there seems to be a problem with ftp access as we see the same problem from our home machines. We have queried this issue with the Computation Facility (of course, it's a long holiday weekend...).
      • Sept. 7: 14:00. It seems that the outage was intermittent. Access to the MPCORB files on cfa-ftp is again possible.

    • Network Interruptions (September 4)
      2009 Aug. 26: 17:00. We have been informed that there will be a network outage from 17:15-21:00 EDT on Friday, September 4. The CF will be upgrading core networking equipment. The whole complex will be off-line during the upgrade.

    • SMTP E-mail Interruption (August 7)
      2009 Aug. 5: 14:00. We have been informed that the CF's SMTP server will be off-line for approximately 30 minutes beginning at 06:05 EST on Friday, August 7, to allow OS patches to be installed. Attempts to send e-mail to us during this period will fail, but the mail should queue up on the sending computer for a subsequent resend attempt.

    • MPN.ZIP File In MPCAT-OBS
      2009 July 9: 17:50. The procedure that builds the zip'ed version of the MPN.ARC file in the ECS service MPCAT-OBS has failed as MPN.ARC is now over 4 GB in size. We had a similar problem ~2.5 years ago when the filesize grew over 2 GB. We are attempting to use gzip to compress the file. The file extension will remain .zip for the foreseeable future.

    • Possible Disruptions (July 11)
      2009 July 1: 11:26. We have been informed that there will be a scheduled power shutdown at a remote CfA site on Saturday, July 11, from 07:00 to 13:00. A number of CF machines are located at this site and they will begin to be powered down at 16:00 on July 10. It is unknown at this point if this outage will affect internal MPC operations or external services.

    • Network Interruptions (July 9)
      2009 June 30: 16:12. We have been informed that Harvard will be doing work on a core router from 05:00 to 07:00 on Thursday, July 9. This will cause the CfA to be off-line for periods during the work window.

    • Network Outage
      2009 June 28: 22:00. All connections to/from the CfA network failed around 15:35 when a router at Harvard failed. Connectivity was restored several hours later.

    • June 26 DOU MPEC
      2009 June 26: 11:15. It seems that the fall-out from the June 24 problem continues. Two versions of last night's DOU MPEC were partially prepared. The correct circular number should be 2009-M38, but the version numbered 2009-M39 was mailed out. Neither process completed. We are fixing the bits that failed.
      • 12:15. It has been decided to redo the preparation of last night's DOU MPEC. It was simply too fiddly to fix just those bits that failed. The DOU MPEC will be MPEC 2009-M38.

    • Mis-mailing of old DOU MPEC
      2009 June 25: 11:05. The problems overnight with access to the CF cluster caused a script running there to become "confused". It proceeded to send out MPEC 2008-F58, rather than MPEC 2009-M35. We (think we) have killed the processes that are mailing out the old circular, but a large number of subscribers may receive MPEC 2008-F58.

    • MPEC Issuance and Website Updating
      2009 June 24: 22:17. We are currently unable to issue MPECs, apparently a result of the OS upgrades mentioned in the item below. We have no automated transfer of files from our cluster to the CF webserver or the CF cluster. The problem also affects the automated updating of all our pages that are hosted on the CF webserver.
      • June 25: 09:30. The problem was apparently cleared some time this morning. We were not notified that it was fixed.

    • CF Webserver Outage on June 24
      2009 June 24: 12:17. We have been informed that the CF will be doing OS upgrades on the NetApp filers which, amongst other things, serve up the CF webpages. The CF webserver will be off-line during this period. The upgrades will start between 17:15 and 17:30 and are expected to take 45 minutes. Access to MPC cgi scripts will continue to be possible during this outage via our mirror pages.

    • URLs in the "New Look"
      2009 June 18: 12:45. It is worth remarking that, except as noted below, all URLs in the old look MPC website continue to work unchanged in the new look. The exceptions are:
      • The MPC Status reports covering the period 2003-2004, which were previously stored in monthly files, have been amalgamated into six-monthly files. This cuts down on the number of files.

    • Checker Services
      2009 May 15: 21:20. A user reported that MPChecker failed to locate a recently-designated NEO. The object was missing from the datafile used for the NEO/NEOCMT/MPChecker services. The problem has been traced to a disk that is used by the procedure that generates the special-epoch elements needed for these services getting filled up. The problem has apparently existed for the past three days. A slight addition to the procedure has been made to try and prevent this happening in the future. The affected files are being regenerated.
      • 21:38. The "problem" object mentioned by the user is now found by NEOChecker.

    • Webserver
      2009 May 6: 10:00. We have received a number of reports about perceived problems with the MPC webserver overnight. The webserver has been functioning normally; it was simply being swamped by a very large number of MPChecker requests from a machine at ESO.

    • Non-emailed MPECs
      2009 May 2: 11:00. Two subscribers have reported non-delivery of MPECs 2009-J04 through -J07. Later MPECs were delivered without problem. It seems that the machine that mails out the circulars was rebooted (probably to do patch installation) around the time that the non-delivered circulars were issued. The mailing of these circulars is being redone. Some subscribers who are at the start of the mailing lists may thus receive a second copy. We can't easily avoid this problem.

    • AUTOACK Problem Overnight
      2009 April 22: 10:40. A problem with an observer submitting some previously-reported observations caused AUTOACK to stall around 02:43 this morning. AUTOACK kept claiming that the stalled batch had been removed, but it never got cleared. The procedure has been modified to prevent this happening again.

    • Lists of NEOs and TNOs
      2009 March 18: 22:00. We are aware that there are some problems with the program that generates the lists of NEOs and TNOs on our site. A new version of the program is in preparation and we expect it to be on-line by early next week.

    • SMTP E-mail Interruption (March 12)
      2009 March 10: 11:20. We have been informed that the CF's SMTP server will be off-line for approximately 30 minutes beginning at 06:05 EST on Thursday, March 12, to allow OS patches to be installed. Attempts to send e-mail to us during this period will fail, but the mail should queue up on the sending computer for a subsequent resend attempt.

    • MPC Webserver Problem
      2009 Mar. 9: 16:00. The MPC webserver is experiencing hangs for an unknown reason. Stopping and restarting the webserver clears the problem but within minutes the webserver stops serving pages. We are investigating.
      • Mar. 10: 12:10. It appears that the hangs are being caused by "resource wait for system communication services" (RWSCS). An obvious cause of this is errors in cluster communications caused by failing intranet components. Investigations are continuing.
      • 14:30. The problem is proving hard to track down. It is possible that it is related to one of the cgi scripts and the fact that the Minor Planet Centre is in "preparation" mode. We should go back to "processing" mode late tonight. If the problem disappears at that time, it will give us a big hint that the afore-mentioned cgi script is the culprit.
      • 10:30. The MPC went back to "processing" mode in the early morning. The MPC webserver has been stable since late last night.

    • CF Webserver Outage on Feb. 26
      2009 Feb. 19: 12:17. We have been informed that the CF will be doing OS upgrades on the NetApp filers which, amongst other things, serve up the CF webpages. The upgrades will start between 17:15 and 17:30 and are expected to take 30-40 minutes. Access to MPC cgi scripts will continue to be possible during this outage via our mirror pages.

    • Issue with Missing Orbits During MPC Preparation
      2009 Feb. 8: 11:45. An alternate fix has been found and implemented for the long-standing problem of elements for objects that are being numbered being unavailable in the MPES for a period during MPC preparation.

    • Funding for MPC restored
      2009 Feb. 3: 13:00. Funding for the MPC has been restored. The previously-posted statement on this topic (dated Jan. 15) has been removed. The files associated with the January MPCs are being posted right now.

    • Version Number Overflow
      2009 Jan. 7: 11:00. A number of cgi scripts started failing overnight when they could no longer produce temporary output files due to those files reaching the maximum version number. These temporary output files were used for debugging and are no longer needed. They are being deleted.
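
      The failure mode can be sketched as follows. This is a simplified model, not the actual file-system behavior or the MPC's code: it assumes the scripts run on OpenVMS, where file version numbers stop at 32767, and the function name is hypothetical.

```python
MAX_VERSION = 32767  # assumed ceiling: the OpenVMS file version limit


def next_version(existing_versions):
    """Return the version number a newly written copy would receive,
    or raise if the highest existing version is already at the ceiling."""
    highest = max(existing_versions, default=0)
    if highest >= MAX_VERSION:
        raise OSError("version limit reached; old versions must be deleted")
    return highest + 1
```

      In this sketch, deleting all existing versions leaves an empty list, so numbering restarts at 1 and output files can be written again.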

    • Recent MPECs
      2009 Jan. 2: 13:23. Recent MPECs (i.e., those issued after MPEC 2009-A05) are not accessible due to a protection problem on the directory containing the 2009 circulars. We are attempting to fix the problem.
      • 13:44. We believe the protection problem is fixed and the missing circulars have been replaced.