axy\codecs\base64vlq
Codec for the VLQ (variable-length quantity) Base64 algorithm (PHP).
Documentation
VLQ + Base64
Base64 allows us to represent a sequence of numbers as a text string
that can be stored and transmitted in text formats (JSON, XML, etc.).
VLQ allows us to represent an integer as a sequence of numbers with a small bit width.
For example, 6 bits is the limit for Base64.
The resulting numbers are called "VLQ digits".
A small input number is represented by fewer VLQ digits than a large one.
Thus VLQ is most effective when the input sequence consists mainly of small numbers.
VLQ+Base64 allows us to efficiently represent a sequence of integers (dominated by small values) in a text format.
For example, it is used in JavaScript/CSS source maps.
The Algorithm
For example, we have a block of numbers: [12345, -12345, 0].
(1). VLQ only works with unsigned integers.
Transfer the sign bit to the end of the integer.
12345 in binary is 11000000111001.
Append 0 (positive) to the end: 110000001110010.
For -12345, take the positive form and append 1 (negative) to the end: 110000001110011.
The result is [110000001110010, 110000001110011, 0].
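The sign transfer in step (1) can be sketched in a few lines of PHP. The function names `signToLowBit` and `signFromLowBit` are hypothetical and are not part of the library; this is only an illustration of the step, assuming ordinary PHP integers.

```php
<?php
// Hypothetical helper (not part of the library): moves the sign into the lowest bit.
function signToLowBit(int $n): int
{
    // Shift the absolute value left by one; the freed bit holds 1 for negative numbers.
    return $n < 0 ? ((-$n) << 1) | 1 : $n << 1;
}

// Hypothetical inverse: restores the signed integer from the unsigned form.
function signFromLowBit(int $u): int
{
    $abs = $u >> 1;
    return ($u & 1) ? -$abs : $abs;
}

foreach ([12345, -12345, 0] as $n) {
    echo decbin(signToLowBit($n)), PHP_EOL;
}
// Prints:
// 110000001110010
// 110000001110011
// 0
```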
(2). Transform to the VLQ sequence.
For Base64 we need a block of 6-bit numbers.
The most significant bit is reserved as the "continuation" bit, which leaves 5 bits of data per digit.
Split the numbers into groups of 5 bits: [11000 00011 10010, 11000 00011 10011, 00000].
Output the groups starting from the least significant bits.
If a group is not the last one in the current number, set its continuation bit.
Result: [110010 100011 011000 110011 100011 011000 000000].
Or in decimal: [50, 35, 24, 51, 35, 24, 0].
These are VLQ digits.
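Step (2) can be illustrated with a small standalone function. `toVlqDigits` is a hypothetical name, and `$bits` is assumed here to be the full digit width including the continuation bit; the sketch is not the library's implementation.

```php
<?php
// Hypothetical helper (not part of the library): splits one unsigned integer
// into VLQ digits, least significant group first.
// $bits is the digit width including the continuation bit (6 for Base64).
function toVlqDigits(int $value, int $bits = 6): array
{
    $dataBits = $bits - 1;              // payload bits per digit
    $mask = (1 << $dataBits) - 1;       // 0b11111 for 6-bit digits
    $continuation = 1 << $dataBits;     // 0b100000 for 6-bit digits
    $digits = [];
    do {
        $digit = $value & $mask;
        $value >>= $dataBits;
        if ($value > 0) {
            $digit |= $continuation;    // more groups follow
        }
        $digits[] = $digit;
    } while ($value > 0);
    return $digits;
}

$digits = [];
foreach ([0b110000001110010, 0b110000001110011, 0] as $unsigned) {
    $digits = array_merge($digits, toVlqDigits($unsigned));
}
echo implode(', ', $digits), PHP_EOL; // 50, 35, 24, 51, 35, 24, 0
```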
(3). Replace the numbers with the letters of the Base64 alphabet.
The standard alphabet is A..Za..z0..9+/.
The result is yjYzjYA.
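Step (3) is a plain lookup: each VLQ digit is used as an index into the alphabet. The sketch below uses a hypothetical function `digitsToLetters` and spells out the standard alphabet as a string; it only illustrates the mapping.

```php
<?php
// Standard Base64 alphabet: the position of a letter is the digit it represents.
$alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper (not part of the library): maps VLQ digits to letters.
function digitsToLetters(array $digits, string $alphabet): string
{
    $out = '';
    foreach ($digits as $digit) {
        $out .= $alphabet[$digit];
    }
    return $out;
}

echo digitsToLetters([50, 35, 24, 51, 35, 24, 0], $alphabet), PHP_EOL; // yjYzjYA
```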
How to use the library
use axy\codecs\base64vlq\Encoder;
$encoder = new Encoder();
$encoder->encode([12345, -12345, 0]); // yjYzjYA
$encoder->decode('yjYzjYA'); // [12345, -12345, 0]
$encoder->decode('Variable+Length+QuantitY'); // [-10, 13, -13349, -13 ... -12797139]
Or the default encoder can be obtained as follows:
$encoder = Encoder::getStandardInstance();
In this case, the object creation and preliminary calculations are performed only once.
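To show what decoding involves, here is a standalone sketch that reverses the three steps for the standard options. The names `DECODE_ALPHABET` and `decodeBase64Vlq` are hypothetical, and the sketch does not reproduce the library's own code or its exception classes.

```php
<?php
const DECODE_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical standalone decoder for the standard options
// (6-bit digits, sign stored in the lowest bit).
function decodeBase64Vlq(string $text): array
{
    if ($text === '') {
        return [];
    }
    $numbers = [];
    $value = 0;
    $shift = 0;
    foreach (str_split($text) as $char) {
        $digit = strpos(DECODE_ALPHABET, $char);
        if ($digit === false) {
            throw new InvalidArgumentException("Character is not in the alphabet: $char");
        }
        $value |= ($digit & 31) << $shift;  // low 5 bits carry data
        if ($digit & 32) {                  // continuation bit set: more digits follow
            $shift += 5;
            continue;
        }
        $abs = $value >> 1;                 // lowest bit is the sign
        $numbers[] = ($value & 1) ? -$abs : $abs;
        $value = 0;
        $shift = 0;
    }
    if ($shift !== 0) {
        throw new InvalidArgumentException('The last digit has the continuation bit set');
    }
    return $numbers;
}

print_r(decodeBase64Vlq('yjYzjYA')); // 12345, -12345, 0
```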
Custom options
The standard encoder uses the standard options:
- Transfer the sign bit.
- 6-bit VLQ digits.
- Standard alphabet: A..Za..z0..9+/.
You can apply your own settings:
Encoder::__construct([array|string $alphabet, int $bits, bool $signed = true])
/* Custom alphabet, 3 bits, no sign transfer */
$encoder = new Encoder('My Alphabet', 3, false);
$encoder->encode([12345, 6789]); // phalllApplhhhy
A custom alphabet can be specified as a string (My Alphabet) or as an array ([1 => 'A', 10 => 'B', 15 => 'C', 20 => 'D']).
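The effect of the custom options can be reproduced with a standalone sketch. The function `encodeWithOptions` is hypothetical and takes only string alphabets. It assumes that the bit count is the full digit width including the continuation bit, which is consistent with the example output above.

```php
<?php
// Hypothetical standalone encoder (not the library's code).
// Assumption: $bits is the digit width including the continuation bit.
function encodeWithOptions(array $numbers, string $alphabet, int $bits, bool $signed): string
{
    $dataBits = $bits - 1;
    $mask = (1 << $dataBits) - 1;
    $out = '';
    foreach ($numbers as $n) {
        if ($signed) {
            // Move the sign into the lowest bit.
            $n = $n < 0 ? ((-$n) << 1) | 1 : $n << 1;
        }
        do {
            $digit = $n & $mask;
            $n >>= $dataBits;
            if ($n > 0) {
                $digit |= 1 << $dataBits; // continuation bit
            }
            $out .= $alphabet[$digit];
        } while ($n > 0);
    }
    return $out;
}

// 3-bit digits: 2 data bits plus the continuation bit, so only the first
// 8 characters of the alphabet ("My Alpha") are ever used.
echo encodeWithOptions([12345, 6789], 'My Alphabet', 3, false), PHP_EOL; // phalllApplhhhy
```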
Exceptions
The error classes are located in the namespace axy\codecs\base64vlq\Encoder\Errors.
- Error
- VLQ
- Base64
- InvalidBase64
- InvalidBase64Input
InvalidVLQSequence
$encoder->decode('Az'); // VLQ sequence is invalid: [0,51]
z is 51 (the continuation bit = 1).
The last digit must have 0 in the continuation bit.
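The rule behind this error can be expressed as a one-line check on the digit sequence. `isCompleteVlqSequence` is a hypothetical helper for illustration only; the library reports the problem through its own exception instead.

```php
<?php
// Hypothetical check (not part of the library): a digit sequence is complete
// only if its last digit has the continuation bit cleared.
function isCompleteVlqSequence(array $digits, int $bits = 6): bool
{
    if ($digits === []) {
        return true;
    }
    $continuation = 1 << ($bits - 1); // 32 for 6-bit digits
    return (end($digits) & $continuation) === 0;
}

var_dump(isCompleteVlqSequence([0, 51]));      // bool(false): 51 = 110011
var_dump(isCompleteVlqSequence([50, 35, 24])); // bool(true):  24 = 011000
```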
InvalidBase64
$encoder->decode('A*A'); // Base-64 string is invalid: "A*A"
* is not in the standard Base64 alphabet.
InvalidBase64Input
For the standard encoder this exception should not occur.
It can occur with incorrect custom options:
$encoder = new Encoder('qwe', 10);
$encoder->encode([10, 20, 30]); // Number 20 is not found in Base64 alphabet
10 bits give 1024 possible digit values, but the alphabet contains only 3 letters.
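A mismatch like this can be spotted before encoding by comparing the alphabet size with the number of possible digit values. The helper `alphabetCoversDigits` is hypothetical and handles only string alphabets; it assumes, as above, that the bit count is the full digit width.

```php
<?php
// Hypothetical pre-check (not part of the library): does the alphabet have
// a letter for every possible digit value of the given width?
function alphabetCoversDigits(string $alphabet, int $bits): bool
{
    return strlen($alphabet) >= (1 << $bits);
}

var_dump(alphabetCoversDigits('qwe', 10));        // bool(false): 3 letters, 1024 values
var_dump(alphabetCoversDigits('My Alphabet', 3)); // bool(true): 11 letters, 8 values
```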