Dear colleagues, On Monday, 9/19/2011, from 9:00am to 12:00pm the SDSC Project Storage services, SDSC Cloud Storage, Data Oasis parallel file systems, and commodity internet connections to HPC hosts will be unavailable. SDSC will be adding a second Arista switch to the Data Oasis and Cloud/Project Storage infrastructure. This will provide additional redundancy and failover …
Monthly Archive: September 2011
SDSC PM Notification, 9/20/2011 6:00-9:00pm, SDSC Networking
On Tuesday, September 20th, 2011 from 6:00pm-9:00pm we will be performing a major upgrade to the SDSC network environment. In preparation for the upcoming migration to the new Juniper MX-960 network equipment we must upgrade from HSRP, a Cisco proprietary router redundancy protocol, to VRRP, an IETF industry standard protocol. These new and more reliable Juniper …
SDSC Event Summary, 9/9/11, 9/8 Power Outage
On Thursday 9/8/11 at approximately 3:40pm San Diego Gas and Electric experienced a power outage that affected large portions of Southern California. At that time the SDSC datacenter switched to UPS backup power and the generator was activated. The UPS backup provided time to cleanly shut down core systems, and systems on generator power were …
SDSC/UCSD/San Diego Power Outage Sept 8 & 9, 3:40PM – 6:00AM
6:00 AM Sept 9th: Good Morning, SDGE says most of the county is back up and we again appear to be stable. At this time cooling is online and we are beginning to bring services back up. If you are a colocation customer, please feel free to begin turning on your equipment also. 1:34 AM …
SDSC Incident Notification, Network Interruption, 9/7/2011 8:30pm
Title: SDSC Incident Notification, Network Interruption, 9/7/2011 8:30pm Description: Additional hardware issues with the Colocation Arista switches have caused limited network outages. Staff is investigating and more information will be posted as available. Update 9/8, 01:18: Problem resolved. Links between East/West Arista and Thunder were cycled after spanning-tree issues were seen. Secondary links to …
SDSC Incident Notification, Network Interruption, 9/7/2011 3:45pm
At approximately 3:41 PM 9/7/2011, one of our Colocation Arista switches experienced a hardware reset. The heartbeat from the FocalPoint chip reset, causing the process manager on both peer switches to restart the process and terminate all MLAG connections. This self-healing feature reset the interfaces and the primary switch election took place, returning the mastership to …
SDSC Incident Notification, External Connectivity, 9/6/2011 3:45pm
At approximately 2:45 pm today, 9/6/2011, SDSC experienced a routing issue that prevented external connections to SDSC as well as internal connectivity to many hosts. SDSC ENS staff has resolved the issue and all network connectivity has been restored as of 3:40 pm. We apologize for this unexpected interruption.
SDSC Incident Notification, 9/3/2011, Commvault media agent ‘cvma1.sdsc.edu’
Title: SDSC Incident Notification, 9/3/2011, Commvault media agent ‘cvma1.sdsc.edu’ Description: Connection between Commvault media agent ‘cvma1.sdsc.edu’ and the archival storage system is currently unavailable. Backup jobs against this media agent are paused and will resume automatically once the media agent is returned to operation. Jobs running against ‘cvma2.sdsc.edu’ continue to operate normally. Date: 2011-09-03
SDSC Incident Notification, SAM QFS Filesystems, 9/3/11, 12:25pm
Last night the SDSC SAM-QFS archival storage system showed additional errors; those have been resolved and all file systems are operational. We apologize for the inconvenience these outages may have caused. If you are still having problems, please contact SDSC operations at 858-534-5090 or operations@sdsc.edu. SDSC Operations staff are available 24/7 to assist you …
SDSC Incident Notification, 9:47am 9/02/2011, Sam-QFS /Archive Filesystems
Title: SDSC Incident Notification, 9:47am 9/02/2011, Sam-QFS /Archive Filesystems Description: At approximately 8:30am 9/2/2011 SDSC began seeing issues with a component of the SAM-QFS system that is affecting the /archive/science and /archive/users shares. The system has failed over to the secondary server; however, archiving and staging files to/from tape is unavailable. You may experience delays when …