
Commit 0b3ec75

add non-pow-2-len-charset stuff to README

1 parent a9857d2

File tree

1 file changed: +17 -6 lines changed


README.md

Lines changed: 17 additions & 6 deletions
@@ -29,6 +29,12 @@ $ echo HELLO WORLD | aces "DORK BUM"
 RRD RBO RKD M DRBU MBRRRKD RDOR
 ```
 
+You can also use emojis:
+```shell
+$ echo -n uwonsmth | aces 🥇🥈🥉
+🥇🥉🥇🥇🥉🥈🥇🥈🥈🥉🥈🥉🥈🥈🥈🥉🥇🥈🥇🥈🥇🥇🥇🥈🥉🥈🥈🥈🥉🥉🥉🥇🥈🥈🥉🥉🥉🥉🥇🥇🥈
+```
+
 With Aces, you can see the actual 0s and 1s of files:
 ```shell
 aces 01 < $(which echo)
@@ -125,11 +131,11 @@ echo -n -e \\x09\\x92 | base64 # base64 also adds a "=" character called "padding"
 
 ### Aces
 
-Now we generalize this to all character sets.
+Now we generalize this to all character sets of any length.
 
-Generalizing the character set is easy, we just switch out the characters of the array storing the character set.
+Generalizing the characters is easy: we just switch out the characters of the array storing the character set.
 
-Changing the length of the character set is slightly harder. For every character set length, we need to figure out how many bits the chunked data should have.
+Changing the length of the character set is harder. For every character set length, we need to figure out how many bits the chunked data should have.
 
 In the Base64 example, the chunk length (let's call it that) was 6. The character set length was 64.
 
@@ -154,10 +160,15 @@ Every bit can either be 1 or 0, so the total possible values of a certain number
 
 The total number of possible values is the length of the character set (of course, since we need the indices to cover all the characters of the set).
 
-So, to find the number of bits the chunked data should have, we just do `log2(character set length)`. Then, we divide the bytes into chunks of that many bits (which was pretty hard to implement: knowing when to read more bytes, crossing over into the next byte to fetch more bits, etc, etc.), use those bits as indices for the user-supplied character set, and print the result. Easy! (Nope, this is the work of several showers and a lot of late night pondering :)
-
-
+So, to find the number of bits the chunked data should have, we just do `log2(character set length)`. Then, we divide the bytes into chunks of that many bits (which was pretty hard to implement: knowing when to read more bytes, crossing over into the next byte to fetch more bits, etc, etc.), use those bits as indices for the user-supplied character set, and print the result.
 
+Unfortunately, this algorithm only works for character sets with a length that is a power of 2. For character sets with a length that is not a power of 2, we need to do something else.
 
 
+Sets that are not a power of 2 in length use an algorithm that may not have the same output as other encoders with the
+same character set. For example, using the base58 character set does not mean that the output will be the same as a base58-specific encoder.
+This is because most encoders interpret data as a number and use a base conversion algorithm to convert it to the
+character set. For non-power-of-2 charsets, this requires all data to be read before encoding, which is not possible
+with streams. To enable stream encoding for non-power-of-2 charsets, Aces converts the base of a default of 8 bytes of data at a time, which is not the same as converting the base of the entire data.
 
+Easy! (Nope, this is the work of several showers and a lot of late night pondering :)
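
The power-of-2 chunking described in the diff above can be sketched in Python. This is a minimal illustration, not Aces' actual code (which is Go and streams its input); the name `encode_pow2` is mine, and the sketch naively builds the whole bit string in memory instead of crossing byte boundaries on the fly:

```python
import math

# Hypothetical helper (not Aces' real implementation): encode data with a
# character set whose length is a power of 2.
def encode_pow2(data: bytes, charset: str) -> str:
    chunk_len = int(math.log2(len(charset)))   # bits per output character
    bits = "".join(f"{b:08b}" for b in data)   # naive: whole input as one bit string
    bits += "0" * (-len(bits) % chunk_len)     # zero-pad the final chunk
    return "".join(charset[int(bits[i:i + chunk_len], 2)]
                   for i in range(0, len(bits), chunk_len))

# With the 64-character Base64 set, chunk_len is 6, so the output matches
# base64 (minus the "=" padding):
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
print(encode_pow2(b"M", B64))   # TQ  (base64 of "M" is "TQ==")
print(encode_pow2(b"a", "01"))  # 01100001, like `aces 01`
```

The `aces 01` behavior falls out for free: a 2-character set gives a chunk length of 1 bit, so every bit of the input becomes one output character.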
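
The block-wise base conversion for non-power-of-2 sets can be sketched the same way. Again a hypothetical helper under assumptions: `encode_blocks` is my name, and the digit-count convention (enough digits to represent any value of the block) is my guess at one reasonable choice, not necessarily what Aces outputs:

```python
import math

# Hypothetical sketch (not Aces' real code): encode with a character set of
# arbitrary length by base-converting fixed-size blocks (default 8 bytes)
# instead of the entire input, which enables streaming.
def encode_blocks(data: bytes, charset: str, block_size: int = 8) -> str:
    base = len(charset)
    out = []
    for i in range(0, len(data), block_size):
        chunk = data[i:i + block_size]
        n = int.from_bytes(chunk, "big")       # interpret the block as a number
        # Enough base-`base` digits to represent any value of this block.
        ndigits = math.ceil(len(chunk) * 8 / math.log2(base))
        digits = []
        for _ in range(ndigits):
            n, r = divmod(n, base)
            digits.append(charset[r])
        out.append("".join(reversed(digits)))  # most significant digit first
    return "".join(out)

# One 8-byte block in base 3 needs ceil(64 / log2(3)) = 41 digits, which
# matches the 41 medals in the emoji example.
print(len(encode_blocks(b"uwonsmth", "ABC")))  # 41
```

Because each block is converted independently, concatenating the per-block results is not the same as converting the whole input as one big number, which is exactly why the output can differ from a base58-specific encoder.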
