1、 Storage file structure

  • topic: can be understood as the name of a message queue
  • partition: to achieve scalability, a very large topic can be distributed across multiple brokers (i.e. servers); a topic can be divided into multiple partitions, each of which is an ordered queue
  • segment: a partition is physically composed of multiple segments
  • message: each piece of data actually stored in a segment file is a message
  • offset: each partition consists of a series of ordered, immutable messages that are continuously appended to it. Each message in the partition has a sequential serial number called the offset, which uniquely identifies that message within the partition
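To make the hierarchy concrete, the nesting of these concepts can be sketched roughly as follows (illustrative Python only; the class and field names are not Kafka's own):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:            # one record appended to a segment
    offset: int           # unique, monotonically increasing within the partition
    payload: bytes

@dataclass
class Segment:            # one .log file (plus its .index/.timeindex files)
    base_offset: int      # offset associated with the start of this segment
    messages: List[Message] = field(default_factory=list)

@dataclass
class Partition:          # one ordered, append-only queue: an on-disk directory
    segments: List[Segment] = field(default_factory=list)

@dataclass
class Topic:              # a named message queue, split across partitions/brokers
    name: str
    partitions: List[Partition] = field(default_factory=list)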


2、 Partition

In this example the topic name is test-topic and it is created with 3 partitions. After the topic is created, data is stored under the default log directory /tmp/kafka-logs, with one directory per partition named "topic name-partition number" (replicas aside), as follows:

//Distributed on different broker nodes  
test-topic-0  
test-topic-1  
test-topic-2
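For reference, the same layout can also be produced programmatically. The sketch below assumes the third-party kafka-python package (not part of Kafka itself) and the broker address used later in this article; it simply creates test-topic with 3 partitions, after which the three test-topic-N directories appear under /tmp/kafka-logs:

# Requires: pip install kafka-python (third-party client, an assumption here)
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="192.168.60.201:9092")
# 3 partitions, replication factor 1 (replicas are ignored in this article)
admin.create_topics([NewTopic(name="test-topic", num_partitions=3, replication_factor=1)])
admin.close()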

Question 1: Why partition?

This is done for performance. If a topic's messages were not partitioned and lived on a single broker, every consumer would have to read from that one broker, and that single node would become the performance bottleneck. With partitions, the messages sent by producers are spread across partitions on different brokers, so consumers can read from different partitions on different brokers in parallel, which achieves horizontal scaling.
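To make this concrete, here is a minimal, illustrative sketch of how a producer maps keyed messages to partitions. Kafka's real default partitioner uses a murmur2 hash of the key and treats keyless records differently; the CRC32 below is only a stand-in for the "hash the key, then mod by the partition count" idea:

import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in for the producer-side partitioner (Kafka really uses murmur2).
    return zlib.crc32(key) % num_partitions

for user_id in (b"user-1", b"user-2", b"user-3"):
    print(user_id.decode(), "->", "test-topic-%d" % choose_partition(user_id, 3))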

Question 2: What is inside a partition directory?

As shown below, each partition actually holds many files. Conceptually we call them segments: each partition is composed of multiple segments, where an .index (index file), a .log (data file) and a .timeindex (time index file) together make up one segment.
test-topic-0  
├── 00000000000000000001.index  
├── 00000000000000000001.log  
├── 00000000000000000001.timeindex  
├── 00000000000000001018.index  
├── 00000000000000001018.log  
├── 00000000000000001018.timeindex  
├── 00000000000000002042.index  
├── 00000000000000002042.log  
├── 00000000000000002042.timeindex

Question 3: Why do we need segments in addition to partitions?

The directory above shows multiple segments. Since we already have partitions, why split each one into multiple segments? If segments were not introduced, a partition would correspond to a single log file that keeps growing as messages keep arriving. Kafka messages are never updated; they are written sequentially. Cleaning up old messages would then mean deleting the front portion of that one ever-growing file, which conflicts with Kafka's sequential-write design. With multiple segments, old data can be cleaned up by deleting entire segment files, which keeps writes to every segment strictly sequential.
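A minimal sketch of why this makes cleanup cheap: the log rolls to a new segment once the active one reaches a size threshold, and retention simply drops the oldest whole files, so every file is only ever appended to. The thresholds below are made up for illustration (they stand in for log.segment.bytes and a retention limit); this is not Kafka code:

SEGMENT_BYTES = 1024
MAX_SEGMENTS = 3

segments = []          # each entry models one .log file: {"base_offset", "size"}
next_offset = 0

def append(record: bytes):
    global next_offset
    # Roll to a new segment file once the active (last) one would overflow.
    if not segments or segments[-1]["size"] + len(record) > SEGMENT_BYTES:
        segments.append({"base_offset": next_offset, "size": 0})
    segments[-1]["size"] += len(record)   # writes only ever append to the active file
    next_offset += 1

def cleanup():
    # Retention never rewrites a file: it simply deletes the oldest whole segments.
    while len(segments) > MAX_SEGMENTS:
        old = segments.pop(0)
        print("delete %020d.log" % old["base_offset"])

for _ in range(100):
    append(b"x" * 100)   # 100 records of 100 bytes -> roughly 10 segments
cleanup()
print("remaining:", ["%020d.log" % s["base_offset"] for s in segments])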

3、 Segment storage

The core files in a segment are the index file and the log (data) file. The index file exists so that data can be located more efficiently. So what exactly is stored in the index and data files, and how is a message found quickly?

3.1 Use Kafka's own script to send test data

sh kafka-producer-perf-test.sh --topic test-topic --num-records 50000000 --record-size 1000 --throughput 10000000  --producer-props bootstrap.servers=192.168.60.201:9092

3.2 Use the Kafka script to dump the index file

sh kafka-run-class.sh kafka.tools.DumpLogSegments --files /tmp/kafka-logs/test-topic-0/00000000000000001018.index --print-data-log
offset: 1049 position: 16205  
offset: 1065 position: 32410  
offset: 1081 position: 48615  
offset: 1097 position: 64820  
offset: 1113 position: 81025  
offset: 1129 position: 97230

Dumping the index shows that the index file stores offset/position pairs: the offset identifies a specific message, and the position is the physical address at which that message is stored in the log file.

Question 1: The data above shows that Kafka does not index every offset; in this dump an index entry appears only every 16 offsets. Why are the offset numbers in the index file not consecutive?

Because the index file does not index every message in the data file. Instead it uses sparse indexing, writing an index entry only after a certain number of bytes of data has been appended. This keeps the index file from taking up too much space, so it can be held in memory. The drawback is that a message without its own index entry cannot be located in the data file in one step; a short sequential scan is needed, but the range of that scan is very small.
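A minimal sketch of the sparse indexing idea: an (offset, position) entry is written only after a configurable number of bytes has been appended, which is why most offsets never appear in the .index file. The 4096-byte interval below mirrors the default of log.index.interval.bytes, but the code itself is only illustrative:

INDEX_INTERVAL_BYTES = 4096      # mirrors log.index.interval.bytes (default 4096)

index = []                       # sparse index: list of (offset, position) pairs
position = 0                     # byte position in the .log file
bytes_since_last_entry = INDEX_INTERVAL_BYTES   # force an entry for the first message

def append(offset: int, record: bytes):
    global position, bytes_since_last_entry
    if bytes_since_last_entry >= INDEX_INTERVAL_BYTES:
        index.append((offset, position))        # only every ~4 KB gets an entry
        bytes_since_last_entry = 0
    position += len(record)
    bytes_since_last_entry += len(record)

# 1000-byte records, as produced by the perf test above:
for off in range(1049, 1149):
    append(off, b"x" * 1000)
print(index[:4])   # [(1049, 0), (1054, 5000), (1059, 10000), (1064, 15000)]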

3.3 Use the Kafka script to dump the log file

sh kafka-run-class.sh kafka.tools.DumpLogSegments --files /tmp/kafka-logs/test-topic-0/00000000000000001018.log --print-data-log


The log (data) file is not a raw dump of payload bytes; it consists of a sequence of messages, and each message wraps the actual message data in the following fields:

  • 8-byte offset: each message in the partition has an ordered ID number called the offset, which uniquely identifies the message's position within the partition (i.e. which message of the partition this is)
  • 4-byte message size: the size of the message
  • 4-byte CRC32: CRC32 checksum of the message
  • 1-byte "magic": the version of the Kafka message format this message was written with
  • 1-byte "attributes": flags such as the compression type or encoding type
  • 4-byte key length: the length of the key; when it is -1, the K-byte key field is omitted
  • K-byte key: optional key
  • value bytes payload: the actual message data
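Below is a sketch of reading one message according to the field layout listed above. It is illustrative only: it follows the listed fields literally, while real log files include further details (for example a value-length field, and newer Kafka versions use a batched record format), and the CRC here is left as 0 rather than computed:

import struct

def parse_message(buf: bytes, base: int = 0):
    # 8-byte offset and 4-byte message size (big-endian), as listed above.
    offset, size = struct.unpack_from(">qi", buf, base)
    # 4-byte CRC32, 1-byte magic, 1-byte attributes, 4-byte key length.
    crc, magic, attributes, key_len = struct.unpack_from(">IBBi", buf, base + 12)
    pos = base + 22
    key = None
    if key_len != -1:                       # a key length of -1 means "no key"
        key = buf[pos:pos + key_len]
        pos += key_len
    value = buf[pos:base + 12 + size]       # the rest of the message is the payload
    return offset, crc, magic, attributes, key, value

# Build one toy message with these fields and parse it back (CRC left as 0):
payload = b"hello"
msg = struct.pack(">qiIBBi", 1066, 4 + 1 + 1 + 4 + len(payload), 0, 0, 0, -1) + payload
print(parse_message(msg))   # (1066, 0, 0, 0, None, b'hello')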

Question 1: How does a consumer find a message by its offset?

Suppose we want to read the message with offset = 1066; it is found in the following two steps.

  1. Find the segment file
    00000000000000000001.index is the first file. The second file, 00000000000000001018.index, starts at message offset 1019 = 1018 + 1; similarly, the third file, 00000000000000002042.index, starts at offset 2043 = 2042 + 1. Subsequent files are likewise named after and sorted by their starting offsets, so a binary search over the file list by offset quickly locates the specific file. For offset = 1066 this resolves to 00000000000000001018.index|log
  2. Find the message within the segment file
    With the segment file located in step 1, for offset = 1066 we first look up the metadata in 00000000000000001018.index, which yields the physical position in 00000000000000001018.log of the nearest indexed message, offset 1065. We then scan 00000000000000001018.log sequentially from that position until offset = 1066 is reached. Every message has a fixed format, so it is easy to tell where the next message begins (a sketch of this two-step lookup follows below)
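Putting the two steps together, here is a minimal illustrative sketch in Python; the function names and in-memory lists are made up and stand in for the real on-disk files, but the search logic follows the description above:

import bisect

def find_segment(base_offsets, target):
    # Step 1: binary search the sorted segment base offsets (taken from the file
    # names) for the last segment whose base offset is <= the wanted offset.
    i = bisect.bisect_right(base_offsets, target) - 1
    return base_offsets[i]

def find_position(sparse_index, target):
    # Step 2a: binary search the sparse (offset, position) entries of the .index
    # file for the greatest indexed offset <= the wanted offset.
    offsets = [off for off, _ in sparse_index]
    i = bisect.bisect_right(offsets, target) - 1
    return sparse_index[i]

def scan_log(records_from_position, target):
    # Step 2b: read the .log sequentially from that position until the wanted
    # offset appears; the scan spans at most one sparse-index interval.
    for off, value in records_from_position:
        if off == target:
            return value
    return None

# Toy data mirroring the example above: segment files 1, 1018, 2042; offset 1066 wanted.
print(find_segment([1, 1018, 2042], 1066))                                  # -> 1018
print(find_position([(1049, 16205), (1065, 32410), (1081, 48615)], 1066))   # -> (1065, 32410)
print(scan_log([(1065, b"m1065"), (1066, b"m1066")], 1066))                 # -> b'm1066'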
