Summary (translated from Portuguese):
A KDE contributor wrote on his blog about what "almost became The Great KDE Disaster Of 2013": the project came very close to losing all of its Git repositories. The KDE Project hosts 1500 code repositories for a range of open source applications related to the desktop environment, and nearly all of them were wiped out last week through a combination of a software fault and a problematic mirroring setup.
KDE uses a main Git server at git.kde.org and a number of mirrored servers that provide Git access to anonymous web users. The developers were treating those mirrors as backups of the main repository, which turned out to be a mistake.
When the developers shut down the virtual machines on the main server to apply security updates to the host, something went wrong and corrupted the VMs' filesystems, leaving the repositories unusable.
Because the mirror servers were configured to sync with the main server without checking for corruption, they went ahead and synchronised against the damaged repositories.
By the time the maintainers discovered what was happening, it was already too late. The secondary servers had already deleted most of the repositories: the corruption meant most of them were no longer listed in the projects file, so the removals were treated as legitimate. At that point the developers believed they had lost every repository. They were lucky, however: a new server set up for a migration to another data centre turned out to hold a pristine copy of all the repositories. It had not synced the corrupt data because its synchronisation window coincided with the period in which the main server was down for the update.
That coincidence, together with the unrelated data centre migration, saved all of KDE's Git repositories, which could otherwise have been lost entirely.
The KDE developers are now looking into ways of avoiding such a disaster in the future.
Original:
KDE contributor Jeff Mitchell has written a blog post on what he says "almost became The Great KDE Disaster Of 2013", with the project narrowly avoiding the loss of all of its Git repositories. The KDE Project hosts 1500 code repositories for a number of open source applications that are affiliated with the desktop environment, and all of them were nearly wiped out last week due to a combination of a software fault and a problematic mirroring setup. The KDE developers are now looking into ways of avoiding such a potential disaster in the future.
KDE uses a main Git server at git.kde.org and a number of mirrored servers that provide the Git access for anonymous web users. The developers were treating these servers as backups of the main repository which turned out to be a mistake. When the developers shut down the virtual machines on the main server to perform security updates on the host, something in the process caused filesystem corruption in the VMs that made the Git repositories stored there unusable. Since the secondary servers for the anonymous Git access were configured to sync with the main server and did not check for corruption, they proceeded to synchronise their repositories with the faulty repositories on the main server.
When the KDE maintainers discovered this, it was already too late. The secondary servers had already deleted most of the corrupted repositories as the corruption meant most of them were not listed in the projects file any more; at this point Git assumed they had been deleted legitimately. Mitchell describes this situation as "too perfect a mirror" as the secondary servers had inadvertently duplicated the destroyed data on the main system. At this point, the KDE developers thought they had lost all of their Git repositories.
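The "too perfect a mirror" failure can be illustrated with a small guard. This is only a sketch, not KDE's actual sync script: the function name `plan_mirror_deletions` and the `max_drop_fraction` threshold are hypothetical, and the idea is simply that a mirror should refuse to delete repositories when the upstream project list shrinks suspiciously.

```python
def plan_mirror_deletions(local_repos, upstream_projects, max_drop_fraction=0.1):
    """Return the local repositories that may be deleted from a mirror.

    Hypothetical helper: local_repos are the names held by the mirror,
    upstream_projects are the names read from the main server's list.
    Raises RuntimeError instead of deleting when the list looks damaged.
    """
    local = set(local_repos)
    upstream = set(upstream_projects)

    # An empty upstream list is far more likely to be corruption than intent.
    if not upstream:
        raise RuntimeError("upstream project list is empty; refusing to delete")

    missing = local - upstream
    # Refuse when more than the allowed fraction of repositories would vanish.
    if local and len(missing) / len(local) > max_drop_fraction:
        raise RuntimeError(
            f"{len(missing)} of {len(local)} repositories missing upstream; "
            "refusing to delete without manual confirmation"
        )
    return sorted(missing)
```

With a check like this, a single removed project would still propagate to the mirrors, while a list that suddenly loses most of its entries would stop the sync and wait for a human.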
The developers got lucky, however, as a new server they had set up for data centre migration was discovered to have retained a pristine copy of all of the project's repositories. This server did not manage to sync with the corrupt copy of the Git data as its synchronisation window fell into the timespan in which the main server was down for the security update. This lucky coincidence and the completely unconnected data centre migration the server had been prepared for therefore saved all of KDE's Git repositories.
In a follow-up post, Mitchell explains that the developers had other backups in place, specifically tarballs of the source code for all of the hosted projects, but that these backups do not retain the Git metadata and other important information. He also details the changes the project will be making to its backup strategy, which mostly involves including sanity checks in the processes that clone the Git repositories to the mirror servers. By adding those checks, corrupted data will not be duplicated in the future and missing project files will not cause repositories to be dropped automatically when they should not have been. As part of this the new main server will also store the Git information on a ZFS filesystem that can restore previous versions of the data through its internal snapshotting mechanism.
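One way such a sanity check might look is to run `git fsck` on each source repository before mirroring it. The sketch below is an assumption about the general approach, not the checks KDE adopted: the helper names `repo_is_healthy` and `repos_safe_to_mirror` are hypothetical, and the paths are assumed to be Git directories (for example bare repositories).

```python
import subprocess


def repo_is_healthy(git_dir, timeout=600):
    """Return True if `git fsck` finds no problems in the given Git directory.

    Hypothetical helper: git_dir is assumed to point at a bare repository
    or a .git directory. Any failure to run the check counts as unhealthy.
    """
    try:
        result = subprocess.run(
            ["git", "--git-dir", str(git_dir), "fsck", "--no-dangling"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # git is missing, not executable, or the check ran too long.
        return False
    return result.returncode == 0


def repos_safe_to_mirror(git_dirs):
    """Keep only the repositories that pass the integrity check."""
    return [d for d in git_dirs if repo_is_healthy(d)]
```

A mirror script could then sync only the repositories returned by `repos_safe_to_mirror` and report the rest, so that a corrupted source is flagged rather than copied.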
Source: http://h-online.com/-1829776