Chance of hard drive or memory corruption?

I have several hundred computers running an application. On one computer, I have seen two instances of a single bit being set incorrectly in rows that I read out of SQLite. If this were my dev computer, I would assume I have a bug somewhere, but with this many installations there is certainly some point at which I will start seeing rare hardware-based errors.

It obviously depends on how much IO I do, but are there any rules of thumb for when there is a decent chance of seeing this kind of thing? For example, for TCP packets, one paper found that silent, undetected corruption occurs in "about 1 in 16 million to 10 billion packets."
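To put the quoted TCP figure in perspective, here is a rough sketch of the arithmetic. It assumes each packet is corrupted independently at a fixed rate, which is a simplification, and the function names are made up for illustration.

```python
import math

# Bounds taken from the quote: 1 in 10 billion (best) to 1 in 16 million (worst).
RATE_LOW = 1 / 10e9
RATE_HIGH = 1 / 16e6


def expected_corrupt_packets(packets, rate):
    """Expected number of silently corrupted packets out of `packets`."""
    return packets * rate


def prob_at_least_one(packets, rate):
    """Probability of at least one silent corruption, assuming independence."""
    # 1 - (1 - rate) ** packets, computed in a numerically stable way.
    return -math.expm1(packets * math.log1p(-rate))


if __name__ == "__main__":
    for n in (1_000_000, 16_000_000, 1_000_000_000):
        print(
            n,
            expected_corrupt_packets(n, RATE_LOW),
            expected_corrupt_packets(n, RATE_HIGH),
            round(prob_at_least_one(n, RATE_HIGH), 3),
        )
```

The same calculation works for any per-operation error rate, so it can be reused for disk or memory reads once a rate estimate is available.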

Unfortunately, running memory or disk checks on the machine in question is unlikely to happen.

+3
5 answers

When I notice strange things, my strategy is this:

  • check if there is an error in the code
  • check if there is an error in your library / tool (SQLite, here)
  • check if there is an error in the compiler
  • and then check for hardware failures
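For the SQLite step in the list above, a minimal sketch of two checks that can be run from the application itself, without a separate memory or disk test. `PRAGMA integrity_check` is a real SQLite command; the helper names and the per-row CRC scheme are assumptions for illustration, not something the original poster described.

```python
import sqlite3
import zlib


def sqlite_integrity_report(db_path):
    """Run SQLite's built-in structural check; a healthy database yields ['ok']."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


def row_crc(values):
    """Hypothetical application-level checksum over one row's values.

    Store it next to the row on write and recompute it on read; a mismatch
    means the stored values changed after they were written.
    """
    return zlib.crc32(repr(tuple(values)).encode("utf-8"))


if __name__ == "__main__":
    print(sqlite_integrity_report(":memory:"))
    print(row_crc((1, "alice", 42)))
```

The integrity check looks for structural damage such as malformed pages or inconsistent indexes. It will generally not notice a single flipped bit inside a stored value, which is why the sketch adds a per-row checksum.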

In my 10-year career, 99.99% of the errors I have seen were software-related.

Hope this helps.

+4

- . CRC - / . , . ECC, , , ECC . , , , , .

+2

On ECC: "[...] DRAM [...] 7 orders of magnitude, ranging from 10^-10 to 10^-17 error/bit·h [...]. [7] [11] [12]"
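As a rough conversion of the two rates quoted above (10^-10 and 10^-17), assuming the unit is errors per bit per hour, the following sketch scales them to one gibibyte of RAM. The function names are illustrative only.

```python
BITS_PER_GIB = 8 * 2**30      # bits in one gibibyte
HOURS_PER_YEAR = 24 * 365.25


def errors_per_gib_hour(rate_per_bit_hour):
    """Expected bit errors per hour in 1 GiB at the given per-bit hourly rate."""
    return rate_per_bit_hour * BITS_PER_GIB


def years_between_errors_per_gib(rate_per_bit_hour):
    """Mean time between bit errors in 1 GiB, in years."""
    return 1 / errors_per_gib_hour(rate_per_bit_hour) / HOURS_PER_YEAR


if __name__ == "__main__":
    for rate in (1e-10, 1e-17):
        print(rate, errors_per_gib_hour(rate), years_between_errors_per_gib(rate))
```

Under that assumption, the high end comes to a little under one bit error per hour per gibibyte and the low end to roughly one per 1,300 years per gibibyte. Multiplying by installed memory and the number of machines gives a fleet-wide estimate.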

, 100 2 , , . ( RAM. TCP-, , , ..). , - , .

+1

, , .

, , - , , , - . , , memchecker , , . , .

0

. (~ 7 ) bluescreen, . , . / / bluescreens. , .

, WAN TCP- , , CRC. , .

0

Source: https://habr.com/ru/post/1697913/

