The Register had a chance to conduct a brief interview with the Windows Azure general manager, Mike Neil, about what caused the recent global Azure failure. The trouble began with an update pushed to the Red Dog front-end software, which customers interface with and which communicates with load balancers for resource scheduling. The update started to break the ability of some admins to move VMs from staging to production. While the problems were limited and intermittent, they occurred in all regions of the globe, which did not speak well of the system's partitioning. Microsoft has realized that Red Dog is a single point of failure and will be working to change that in the future. The interview also covers some of the other underlying technologies.

