I'm confused about the LEB128 (Little Endian Base 128) format. In the AOSP source file Leb128.java, the read functions, whether signed or unsigned, return an int. I know that an int in Java is 4 bytes, i.e. 32 bits. But the maximum length of a LEB128 value in AOSP is 5 bytes, which carry 5 x 7 = 35 bits of payload. So where do the other 3 bits go?
Thanks for your reply.
Each byte of data in LEB128 only accounts for 7 bits in the actual output - the remaining (high) bit is used to indicate whether or not more bytes follow.
From Wikipedia:
To encode an unsigned number using unsigned LEB128 first represent the number in binary. Then zero extend the number up to a multiple of 7 bits (such that the most significant 7 bits are not all 0). Break the number up into groups of 7 bits. Output one encoded byte for each 7 bit group, from least significant to most significant group.
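The procedure quoted above can be sketched in Java roughly as follows. This is only an illustration, not the code from AOSP's Leb128.java; the class name Uleb128EncodeSketch and the method encode are hypothetical, and the int argument is assumed to be interpreted as an unsigned 32-bit value.

```java
import java.io.ByteArrayOutputStream;

public class Uleb128EncodeSketch {
    // Hypothetical helper: encodes `value`, read as an unsigned 32-bit
    // integer, into unsigned LEB128 bytes (least significant group first).
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int remaining = value;
        do {
            int group = remaining & 0x7f;   // take the low 7 bits
            remaining >>>= 7;               // unsigned shift to the next group
            if (remaining != 0) {
                group |= 0x80;              // high bit set: more bytes follow
            }
            out.write(group);
        } while (remaining != 0);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 127, 128, 624485, 0xFFFFFFFF}) {
            StringBuilder hex = new StringBuilder();
            for (byte b : encode(v)) {
                hex.append(String.format("%02x ", b & 0xff));
            }
            System.out.println(Integer.toUnsignedString(v) + " -> " + hex.toString().trim());
        }
    }
}
```

With this sketch, 624485 encodes to the three bytes e5 8e 26, while 0xFFFFFFFF needs five bytes, ff ff ff ff 0f, and the last byte carries only 4 bits of the value.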
The extra bits aren't so much "lost" as "used to indicate whether or not it's the end of the data".
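You can't hope to encode arbitrary 32-bit values so that some of them take fewer than 4 bytes without some of them taking more than 4 bytes. A full 32-bit value needs five 7-bit groups, and the fifth byte only has to supply the top 4 bits of the value, so the other 3 payload bits of that byte carry no extra information.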
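On the reading side, the same arithmetic shows where the 3 bits go. The following is a minimal sketch of an unsigned LEB128 reader that returns an int; it is not AOSP's implementation, and the names Uleb128DecodeSketch and decode are hypothetical. The assumption here is that at most 5 bytes are consumed and that anything shifted past bit 31 is silently dropped by Java's int arithmetic.

```java
public class Uleb128DecodeSketch {
    // Hypothetical helper: decodes an unsigned LEB128 value from the start
    // of `in` into a 32-bit int, consuming at most 5 bytes.
    static int decode(byte[] in) {
        int result = 0;
        int shift = 0;
        for (int i = 0; i < in.length && i < 5; i++) {
            int b = in[i] & 0xff;
            // On the fifth byte shift is 28, so only the low 4 payload bits
            // land inside the int; the upper 3 payload bits are shifted out.
            result |= (b & 0x7f) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                return result;          // high bit clear: last byte
            }
        }
        throw new IllegalArgumentException("unterminated or overlong LEB128 value");
    }

    public static void main(String[] args) {
        byte[] max = {(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0x0f};
        System.out.println(Integer.toUnsignedString(decode(max)));
    }
}
```

In this sketch, a fifth byte of 0x0f and a fifth byte of 0x7f decode to the same int, which is exactly the sense in which 3 of the 35 payload bits have nowhere to go in a 32-bit result.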