Commit e14c7f1

Update README.md
1 parent 6348840 commit e14c7f1

1 file changed

+2
-2
lines changed

paddle/fluid/recordio/README.md

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
 ## Background
 
-RecordIO is a file format as a container of records. This package is a C++ implementation of https://github.com/paddlepaddle/recordio, which originates from https://github.com/wangkuiyi/recordio.
+The RecordIO file format is a container for records. This package is a C++ implementation of https://github.com/paddlepaddle/recordio, which originates from https://github.com/wangkuiyi/recordio.
 
 ## Fault-tolerant Writing
 
-For the initial design purpose of ReocrdIO within Google, which was logging, RecordIO groups record into *chunks*, whose header contains an MD5 hash of the chunk. A process that writes logs is supposed to call the Writer interface to add records. Once the writer accumulates a handful of them, it groups a chunk, put the MD5 into the chunk header, and appends the chunk to the file. In the case that the process crashes unexpectedly, the leftover could be that the last chunk in the file was half-written. This doesn't prevent the process, after restarted, continue writing to the same RecordIO file, because the reader will be able to identify incomplete chunks and skip them.
+For the initial design purpose of RecordIO within Google, which was logging, RecordIO groups record into *chunks*, whose header contains an MD5 hash of the chunk. A process that writes logs is supposed to call the Writer interface to add records. Once the writer accumulates a handful of them, it groups a chunk, put the MD5 into the chunk header, and appends the chunk to the file. In the event the process crashes unexpected, the last chunk in the RecordIO file could be incomplete/corrupt. The RecordIO reader is able to recover from these errors when the process restarts by identifying incomplete chucks and skipping over them.
 
 ## Reading Ranges
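
The fault-tolerant writing scheme described in the changed paragraph (records grouped into chunks, an MD5 hash in each chunk header, a reader that ignores a half-written final chunk) can be illustrated with a minimal Python sketch. The chunk layout, the names `write_chunk` and `read_records`, and the choice to hash only the payload are assumptions made for illustration; they are not the actual RecordIO on-disk format or the C++ package's API. The sketch only handles a damaged chunk at the end of the file and does not try to resynchronize after a bad chunk in the middle.

```python
import hashlib
import io
import struct

# Hypothetical chunk layout, for illustration only (NOT the real RecordIO format):
#   16-byte MD5 of the payload | 4-byte little-endian payload length | payload
# The payload is a sequence of (4-byte little-endian length, bytes) records.
HEADER = struct.Struct("<16sI")


def write_chunk(stream, records):
    """Group the given records into one chunk and append it to the stream."""
    payload = b"".join(struct.pack("<I", len(r)) + r for r in records)
    stream.write(HEADER.pack(hashlib.md5(payload).digest(), len(payload)))
    stream.write(payload)


def read_records(stream):
    """Yield records from intact chunks; stop at a truncated or corrupt chunk."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # clean end of file, or a half-written header
        digest, size = HEADER.unpack(header)
        payload = stream.read(size)
        if len(payload) < size or hashlib.md5(payload).digest() != digest:
            return  # half-written or corrupt chunk: ignore it
        offset = 0
        while offset < size:
            (n,) = struct.unpack_from("<I", payload, offset)
            offset += 4
            yield payload[offset:offset + n]
            offset += n


# Simulate a crash that leaves the last chunk two bytes short.
buf = io.BytesIO()
write_chunk(buf, [b"a", b"bc"])
write_chunk(buf, [b"def"])
truncated = buf.getvalue()[:-2]
print(list(read_records(io.BytesIO(truncated))))  # prints [b'a', b'bc']
```

Because the hash is checked before any record in a chunk is returned, a reader in this sketch never sees a partially written record; it either gets every record of a chunk or none of them.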
