On Wednesday, 15 August 2018, at approximately 1pm, the primary VMware cluster at SDSC experienced a complete outage. During the outage all VM guests were unable to access disk resources, and eventually needed to be powered off to allow cluster recovery. At 10pm the cluster was stable, at which point all VM guests were placed …
Category Archive: Incident Notification
SDSC Outage Notification – multiple networks in building/datacenter – 15 Aug 2018, 13:15
[Update – 22:59] All guests are online and appear to be acting normally. Please contact support with any questions or concerns. [Update – 22:00] Engineers are slowly booting guests and monitoring stability of the VMware cluster. Additional updates will be posted once all guests are on and the environment is operating normally. [Update – 18:42] …
SDSC Outage Notification – Project Storage Hotel Node – 2 Aug 2018, 15:00
At approximately 15:00 on 2 August 2018 a single project storage hotel node froze and required a hard reboot. Following the reboot, some NFS clients may experience stale NFS file handles, which a remount should correct. The services appear up and functioning. System administrators may need to be contacted to perform remounts. Please contact …
SDSC Outage Notification – Project Storage Hotel Node – 6 June 2018, 04:00-07:00
At approximately 04:00 on 6 June 2018 a single project storage hotel node lost connectivity with the attached disk. At approximately 07:00 a storage engineer reattached the disk pool and repaired the disk export. The services appear up and functioning. Please contact support if problems continue. The affected exports are listed below. ps-004:/ps-data/roylab ps-013:/ps-data/dsc ps-015:/ps-data/ucm_engineering …
SDSC Outage Notification – Citrix remote desktop – 12 Apr 2018, 18:30
The Citrix remote desktop and application service is currently not working. Engineers are investigating and this post will be updated as more information is gathered.
SDSC Outage Notification – 21 Feb 2018, 12:45-13:00
[Update 13:22] During routine system rack maintenance, engineers removed a power distribution unit supplying a network switch believed to be connected to redundant power. The switch was incorrectly single-connected and lost power during this work. The switch has been reconnected to verified dual power and all other storage switches will be verified for correct …
SDSC Outage Notification – intranet.sdsc.edu (Sharepoint) – 21:12, 19 Oct 2017
[Update: 22:43, 19 Oct 2017] The Sharepoint site is now up and sites have been verified. -- The SDSC Sharepoint site, intranet.sdsc.edu, is unavailable. Engineers are investigating the cause.
SDSC Incident Alert – Partial West Datacenter Power Outage – 11:40, 19 Oct 2017
[Update: 12:32, 19 Oct 2017] Power and services have been restored. -- A power circuit breaker has failed, causing loss of power to six racks in the West datacenter. The affected rack customers include: LJI, UCLA, UCSD Library. Campus electricians have been notified.
SDSC Outage Notification – Sharepoint slow, some features unavailable – 12:00, 17 May 2017
[Update: 14:38] Engineers have found a number of stuck Sharepoint jobs from last night and are attempting to kill those jobs. Performance and feature outages are still unchanged. -- At approximately noon on 17 May 2017 we discovered that the hosted Sharepoint services were slow to render pages, and certain features like search were …
SDSC Outage Notification – virtual machine guests – 6 Mar 2017, 10:00
[Update – 6 Mar 2017, 10:25] Networking has been fixed and virtual guests are back online. Engineers are continuing to investigate any remaining issues. -- At approximately 10:00 on 6 Mar 2017, during routine network recabling on a redundant link, a VMware hypervisor node stopped responding. System engineers are investigating and updates will be posted. …