basic algorithm description
parent 48b015d660
commit b5b59b8031
56
README.md
@@ -21,4 +21,58 @@ $ sudo make install

## Algorithm description

Standard Brainfuck text generation algorithms are usually stateless (i.e.
load a byte value, output it, clear the cell, repeat) or otherwise keep only
low-order finite state. Using [optimal constants](https://esolangs.org/wiki/Brainfuck_constants)
for such a generator renders it optimal on IID inputs. Even so, the generator
cannot efficiently exploit the redundancy in such an input that stems from the
underlying probability distribution of the source.
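
As a rough illustration of the stateless scheme, here is a minimal Python
sketch. The function name `stateless_bf` is made up for this example, and the
sketch loads each byte with plain `+` runs rather than the optimal constants
linked above, so its output is far from the shortest possible.

```python
def stateless_bf(text: str) -> str:
    """Naive stateless generator: load each byte, output it, clear the cell."""
    out = []
    for byte in text.encode("latin-1"):
        # '+' * byte loads the value into a zeroed cell, '.' prints it and
        # '[-]' resets the cell to zero so the next byte starts from scratch.
        out.append("+" * byte + ".[-]")
    return "".join(out)


if __name__ == "__main__":
    print(stateless_bf("hi"))
```

Because the cell is cleared after every byte, the cost of a character never
depends on its neighbours, which is why this kind of generator cannot benefit
from context.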

Low-order finite-state generators usually encode transitions between the cell
states that are to be output. For example, redundancy at the level of a 2-wide
sliding window can be exploited by computing a 256x256 table of optimal
transition phrases between cell values.
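
The transition table can be sketched as follows. This is a simplified baseline
under a strong assumption: each phrase is only the shorter of a `+` run or a
`-` run (with 8-bit wraparound), whereas truly optimal phrases may also use
loops and helper cells. The names `transition`, `TABLE` and `encode` are
illustrative only.

```python
def transition(a: int, b: int) -> str:
    """Shortest run of '+' or '-' that turns cell value a into b (mod 256)."""
    up = (b - a) % 256
    down = (a - b) % 256
    return "+" * up if up <= down else "-" * down


# TABLE[a][b] is the phrase that moves a cell from value a to value b.
TABLE = [[transition(a, b) for b in range(256)] for a in range(256)]


def encode(text: str) -> str:
    """Emit text by walking a single cell from each byte to the next."""
    out, cell = [], 0
    for byte in text.encode("latin-1"):
        out.append(TABLE[cell][byte] + ".")
        cell = byte  # the cell keeps its value, so the next step is a delta
    return "".join(out)


if __name__ == "__main__":
    print(encode("ab"))
```

Unlike the stateless scheme, the cost of each character here depends on the
previous one, which is exactly the 2-wide sliding window mentioned above.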

This, however, does not capture the real, variable-order redundancy in the
source. Techniques that implement data compressors in Brainfuck, load the
compressed data into memory and decompress it at runtime generally perform
poorly due to the high overhead of random memory access in Brainfuck (the
fastest algorithms are quintic-time).

blz78suf operates by finding long phrases in the input that can be encoded using
procedural logic. For example, provided that an appropriate, sufficiently
repetitive phrase has been chosen, the program

```
#include <stdio.h>

/* The repeated phrase is factored out into its own procedure. */
void the(void) { printf("the"); }
int main(void) {
    the(); printf(" quick brown fox jumps over ");
    the(); printf(" lazy dog");
}
```

could be shorter than:

```
#include <stdio.h>

int main(void) {
    printf("the quick brown fox jumps over the lazy dog");
}
```

The benefit of this approach is that repeating phrases are deduplicated, while
the more granular, lower-order redundancy is delegated to a stateless generator.

blz78suf builds on a stateless text generator borrowed from the Brainfuck Golf
challenge on the Code Golf Stack Exchange website. The same generator is used by
[copy.sh](https://copy.sh/brainfuck/text). Phrases are found by constructing a
suffix trie of the input and ranking potential replacements by their frequency
and length. Then, the individual messages are encoded and the output is
generated.
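
The phrase search can be pictured with the following Python sketch. It is not
blz78suf's actual code: the depth limit `max_len`, the minimum length `min_len`
and the score `(count - 1) * len(phrase)` are assumptions made for this
example, and the counts include overlapping occurrences.

```python
def build_suffix_trie(text: str, max_len: int = 16) -> dict:
    """Suffix trie truncated to max_len; each node stores an occurrence count."""
    root = {"count": 0, "children": {}}
    for start in range(len(text)):
        node = root
        for ch in text[start:start + max_len]:
            node = node["children"].setdefault(ch, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def rank_phrases(text: str, min_len: int = 2, max_len: int = 16) -> list:
    """Return (score, phrase, count) tuples, best candidates first."""
    ranked = []
    stack = [(build_suffix_trie(text, max_len), "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node["children"].items():
            if child["count"] < 2:
                continue  # a phrase seen once, or any extension of it, never pays off
            phrase = prefix + ch
            if len(phrase) >= min_len:
                # Assumed score: characters saved by all but one occurrence.
                score = (child["count"] - 1) * len(phrase)
                ranked.append((score, phrase, child["count"]))
            stack.append((child, phrase))
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return ranked


if __name__ == "__main__":
    for entry in rank_phrases("the quick brown fox jumps over the lazy dog")[:3]:
        print(entry)
```

On the sentence from the earlier example, the best candidate under this score
is `"the "` (with the trailing space), which occurs twice. A real ranking would
also have to account for the cost of defining and invoking a phrase in
Brainfuck.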

## Future improvements

blz78suf is a prototype and, as such, is not optimized for speed. The algorithm
could be sped up by using an efficient exclusion algorithm for suffix tries and
by using Ukkonen's algorithm to build a compressed suffix structure (a suffix
tree) in linear time. The procedural structure of the output could be optimized
by allowing phrasal chaining. Further, a more efficient low-order generator
could be used, for example one that detects patterns via delta encoding and
run-length encoding.
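
To make the last idea concrete, the sketch below shows one possible way to
expose such patterns, assuming byte-wise deltas with 8-bit wraparound followed
by run-length grouping. The function names are hypothetical and nothing here is
part of blz78suf.

```python
def delta_encode(data: bytes) -> list:
    """Differences between consecutive bytes (mod 256), starting from 0."""
    prev, out = 0, []
    for byte in data:
        out.append((byte - prev) % 256)
        prev = byte
    return out


def run_length_encode(values: list) -> list:
    """Collapse runs of equal values into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


if __name__ == "__main__":
    print(run_length_encode(delta_encode(b"abcdefg")))
```

For the input `abcdefg` the deltas are 97 followed by six 1s, so the run-length
step yields `[(97, 1), (1, 6)]`. A generator could turn a run like `(1, 6)` into
a compact repeated increment-and-print pattern instead of encoding six
unrelated characters.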