At present, BCH is the most commonly used ECC code in SSDs. When data is written, the ECC module inside the controller runs a calculation over the data and generates an ECC signature. This step is generally very fast, so it has little effect on the performance of the SSD as a whole. The ECC signature is usually stored in the spare area (SA) that follows the data in each NAND page. When data is read back from NAND, the ECC module also reads the stored ECC signature, recomputes a signature from the data it received, and compares the two; a mismatch means the data contains an error.
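The write and read paths described above can be sketched roughly as follows. This is only an illustration: instead of BCH it uses a much simpler Hamming-style signature (the XOR of the positions of all set bits), and the names ecc_signature, program_page and read_is_clean, the 16-byte page size, and the dictionary standing in for a NAND page are all assumptions made for the example.

```python
PAGE_BYTES = 16  # tiny "main area" for illustration; real pages are KiB-sized


def ecc_signature(data: bytes) -> int:
    """XOR together the 1-based positions of every set bit (Hamming-style)."""
    sig = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                sig ^= byte_index * 8 + bit + 1
    return sig


def program_page(data: bytes) -> dict:
    """Write path: keep the data in the main area, the signature in the spare area."""
    return {"main": bytearray(data), "spare": ecc_signature(data)}


def read_is_clean(page: dict) -> bool:
    """Read path: recompute the signature and compare it with the stored one."""
    return ecc_signature(bytes(page["main"])) == page["spare"]


page = program_page(bytes(range(PAGE_BYTES)))
clean_before = read_is_clean(page)   # nothing has changed yet
page["main"][3] ^= 0x10              # simulate one bit flipping inside the NAND
clean_after = read_is_clean(page)    # the signatures no longer match
```

A real controller does this in hardware and also has to cope with errors in the spare area itself, which this sketch ignores.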
Correcting errors in received data is more complicated than detecting them. The first step is to check whether the received data contains errors, which is as fast as generating the ECC signature on the write path. If error bits are detected, the controller must then run the correction algorithm itself (for BCH, working out the positions of the error bits), which costs performance, but only when errors are actually detected. The result of that step is then used to repair the errors that were detected.
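The two-step flow, a cheap check followed by correction only when needed, might look like the sketch below. It reuses a toy single-bit-correcting Hamming-style code rather than BCH, so the correction step here is trivial; in a BCH decoder that step typically involves solving for an error-locator polynomial and searching for its roots, which is where the extra latency comes from. The function names and the sample payload are assumptions, and the stored signature is assumed to be error-free.

```python
def position_xor(data: bytes) -> int:
    """Signature: XOR of the 1-based positions of all set bits."""
    sig = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                sig ^= i * 8 + bit + 1
    return sig


def decode(data: bytearray, stored_sig: int) -> str:
    """Step 1: fast check. Step 2: locate and repair, only if step 1 fails."""
    syndrome = position_xor(data) ^ stored_sig
    if syndrome == 0:
        return "clean"                      # fast path, no correction work
    pos = syndrome - 1                      # 0-based index of the suspect bit
    if pos >= len(data) * 8:
        return "uncorrectable"              # points outside the data
    data[pos // 8] ^= 1 << (pos % 8)        # flip the suspect bit back
    return "corrected"


original = b"SSD page payload"
sig = position_xor(original)

buf = bytearray(original)
first_read = decode(buf, sig)               # no error injected

buf[5] ^= 0x04                              # inject a single-bit error
second_read = decode(buf, sig)              # syndrome points at the flipped bit
```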
It must be emphasized that ECC decoding can fail, so the ECC system must be architected carefully to keep such failures as rare as possible. The number of error bits that can be corrected depends on the design of the ECC algorithm.
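To give a feel for how correction strength is a design choice, the sketch below estimates the parity cost of a binary BCH code. It relies on two standard properties: the codeword length is at most 2^m - 1 bits for a code over GF(2^m), and correcting t bit errors needs at most m*t parity bits. The function name and the two example configurations are assumptions chosen for illustration, not figures from any particular controller.

```python
def bch_parity_bits(data_bits: int, t: int):
    """Return (m, parity_bits) for a binary BCH code protecting data_bits
    with correction strength t, using the upper bound parity <= m * t."""
    m = 1
    # Grow the field until data plus parity fits in 2^m - 1 bits.
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m, m * t


# Example: 1 KiB of data per codeword, correcting up to 40 bit errors.
m_big, parity_big = bch_parity_bits(8 * 1024, 40)
bytes_big = (parity_big + 7) // 8      # spare-area bytes consumed

# Example: 512 bytes of data per codeword, correcting up to 8 bit errors.
m_small, parity_small = bch_parity_bits(8 * 512, 8)
bytes_small = (parity_small + 7) // 8
```

Raising t buys more correctable bits per codeword but consumes more of the spare area and makes the decoder larger, which is the trade-off the designer has to settle.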
If the errors cannot be corrected, this is usually reported as an ECC failure, which the user sees as a read failure. Sometimes the ECC cannot even detect that an error is present, in which case corrupted data is returned as if it were good.
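Both failure modes can be reproduced with a toy code that corrects only one bit per page. The sketch below is an assumption-laden illustration (a Hamming-style position-XOR signature, hypothetical helper names, an 8-byte page), not a model of BCH, but the pattern is the same: once the number of flipped bits exceeds what the code was designed for, the decoder may report a failure, correct the wrong bit, or see nothing at all.

```python
def position_xor(data: bytes) -> int:
    """Signature: XOR of the 1-based positions of all set bits."""
    sig = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                sig ^= i * 8 + bit + 1
    return sig


def read_with_ecc(data: bytearray, stored_sig: int) -> str:
    syndrome = position_xor(data) ^ stored_sig
    if syndrome == 0:
        return "ok"
    pos = syndrome - 1
    if pos >= len(data) * 8:
        return "ecc_failure"                # reported to the host as a read failure
    data[pos // 8] ^= 1 << (pos % 8)
    return "corrected"


def flip(data: bytearray, *positions: int) -> None:
    """Flip bits given by 1-based position, to simulate NAND errors."""
    for p in positions:
        data[(p - 1) // 8] ^= 1 << ((p - 1) % 8)


original = b"\x5a" * 8                      # 64 data bits, positions 1..64
sig = position_xor(original)

a = bytearray(original)
flip(a, 1, 64)                              # two errors, syndrome 65 is out of range
status_a = read_with_ecc(a, sig)            # detected but not correctable

b = bytearray(original)
flip(b, 1, 2)                               # two errors, syndrome 3 looks valid
status_b = read_with_ecc(b, sig)            # decoder "fixes" bit 3: miscorrection

c = bytearray(original)
flip(c, 1, 2, 3)                            # three errors whose positions XOR to 0
status_c = read_with_ecc(c, sig)            # decoder sees nothing wrong
```

Cases b and c are the dangerous ones: the read appears to succeed even though the returned data differs from what was written.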
NAND reliability has to be secured in several ways. ECC can only repair a limited number of error bits when an error occurs. If a large-area error affects an entire page or even an entire block, the data can only be recovered through redundancy protection such as RAID.
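As a rough illustration of RAID-style redundancy, the sketch below keeps one XOR parity page per stripe and rebuilds a page that has been lost entirely. The stripe layout (three data pages plus one parity page), the page contents, and the function name xor_pages are assumptions for the example; actual SSDs vary in how they arrange stripes across dies and blocks.

```python
def xor_pages(pages):
    """Byte-wise XOR of equally sized pages."""
    out = bytearray(len(pages[0]))
    for page in pages:
        for i, value in enumerate(page):
            out[i] ^= value
    return bytes(out)


# One stripe: three data pages on different dies, plus one parity page.
stripe = [b"die0-page", b"die1-page", b"die2-page"]
parity = xor_pages(stripe)

# Suppose the whole page on die 1 becomes unreadable (far beyond ECC's reach).
survivors = [stripe[0], stripe[2], parity]
rebuilt = xor_pages(survivors)              # XOR of the rest recovers the lost page
```

Single-parity XOR can rebuild only one lost page per stripe, so it complements per-page ECC rather than replacing it.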
Enterprise products have even stricter requirements that go beyond ECC, namely data integrity checking along the data path. Before data is written to NAND, every bus and FIFO data buffer inside the SSD that it passes through should be checked so that errors are detected.
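One way to picture this kind of data path checking is to attach a checksum to the data as it enters the drive and verify it after each internal hop. The sketch below uses CRC-32 from Python's zlib module purely as a stand-in; the stage names, the IntegrityError exception, and the corrupt_in_fifo switch used to simulate a fault are all hypothetical and do not describe any specific controller.

```python
import zlib


class IntegrityError(Exception):
    """Raised when data no longer matches the checksum it arrived with."""


def tag(payload: bytes):
    """Attach a CRC-32 to the payload where it enters the drive."""
    return payload, zlib.crc32(payload)


def check(tagged, stage: str):
    """Verify the payload against its CRC after passing through a stage."""
    payload, crc = tagged
    if zlib.crc32(payload) != crc:
        raise IntegrityError(f"corruption detected at {stage}")
    return tagged


def write_path(payload: bytes, corrupt_in_fifo: bool = False) -> bytes:
    tagged = tag(payload)
    tagged = check(tagged, "host bus")
    if corrupt_in_fifo:
        # Simulate a bit flipping while the data sits in a FIFO buffer.
        data, crc = tagged
        tagged = (bytes([data[0] ^ 0x01]) + data[1:], crc)
    tagged = check(tagged, "FIFO buffer")
    return tagged[0]        # only now would the data go to ECC encoding and NAND


good = write_path(b"user data")
try:
    write_path(b"user data", corrupt_in_fifo=True)
    caught = ""
except IntegrityError as exc:
    caught = str(exc)
```

The point of checking before the NAND write is that ECC would otherwise faithfully protect data that was already corrupted on its way in.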