Sad story: the world's largest TV show/movie collection, including unreleased films, pilots, etc., at ~5 PB, has been lost. People in Hollywood would mail this person stuff to make sure it didn't get lost. Welp, at least that copy is lost now.

A bad storage controller wiped it all out. I don't know the specific controller or configuration, but ZFS was used, and I suspect the root cause was a ZFS scrub: the scrub made ZFS think there were errors to correct, so it tried to repair them, but that just kept corrupting more and more data, because even the corrected writes were wrong by the time they hit the disk.

It was being backed up, but the backups got corrupted too, because the problem wasn't noticed in time.

Replies

@AngelCelt well, when rclone reads the files to check for changes and they no longer match (they differ on every read, even), it's pretty easy to see how it happened.

He had 10 Gbit fiber and was backing up to our many-hundred-exabyte cluster, and he was probably away traveling while his pipe was saturated, just cranking out corrupted data to overwrite the old copies of the files.
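
For what it's worth, here is a rough sketch of the kind of pre-upload sanity check that could catch this failure mode. It is not what he or rclone actually did. It assumes a hypothetical local manifest that records each file's mtime, size, and SHA-256 at the last known-good backup. The function name `classify` and the manifest layout are made up for illustration. The idea: a file whose contents changed while its mtime and size did not is more likely rotting than edited, so it should not overwrite the backup copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(path: Path, manifest: dict) -> str:
    """Decide whether a file looks safe to upload.

    `manifest` is a hypothetical dict mapping str(path) to
    (mtime_ns, size, sha256) recorded at the last known-good backup.
    """
    # Two reads that disagree mean the storage itself is unreliable.
    # Caveat: the second read may be served from the OS page cache,
    # so this check alone can miss on-disk corruption.
    first = sha256_of(path)
    second = sha256_of(path)
    if first != second:
        return "unstable"

    known = manifest.get(str(path))
    if known is None:
        return "new"

    mtime_ns, size, digest = known
    if digest == first:
        return "unchanged"

    stat = path.stat()
    if (stat.st_mtime_ns, stat.st_size) == (mtime_ns, size):
        # Contents changed but metadata did not: treat as suspected
        # corruption and keep the old backup copy.
        return "suspect"
    return "modified"
```

A backup job built on this would upload only "new" and "modified" files and stop and alert on "unstable" or "suspect", instead of replacing good copies while nobody is watching.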