Research

BS99F07D05R08
bib1.911.9261.8871.795
book12.272.3562.2642.147
book21.962.0121.9531.840
geo4.164.2684.1293.967
news2.422.4642.3972.268
obj13.733.7653.6923.584
obj22.452.4332.4112.226
paper12.412.4392.3902.274
paper22.362.3872.3292.230
pic0.720.7530.7140.688
progc2.452.4762.4222.307
progl1.681.6971.6601.576
progp1.681.7021.6661.579
trans1.461.4881.4511.354
average bpb2.262.2982.2402.131

NanoZip is based on original lossless compression technology. Engineering efficient compression algorithms lead to research in various areas such as Burrows Wheeler Transform (BWT). The Calgary Corpus results here show how some of this work compares to the latest known research.

BS99 The best results of Balkenhol. [1]
F07 Fenwick's best results. [2]
D05 The best Deorowicz results. [3] Fenwick (2007) describes this as "the best Burrows Wheeler result to date."
R08 My experimental results. (The compressor is not a part of NanoZip.)

NanoZip uses a novel sorting algorithm requiring 4n space for computing BWT. The table beside shows approximated Manzini Corpus results based on Yuta Mori's timings [4] for the latest suffix array construction algorithms. With the exception of MSufSort3 all algorithms work with similar space requirements. The R08 timings are adjusted by the ratio of MSufSort3 timings done with the same hardware as R08.

Binaries of the sort and compression implementations will be made available here in the future.

Archon4r0Deep-ShallowMSufSort3divsufsort2R08
chr22.dna6.0307.5147.1325.3625.985
etext9922.16034.26424.10618.06413.823
gcc-3.0.tar13.85635.82214.95210.08414.533
howto5.8068.2885.6725.3204.034
jdk13c18.10632.18211.3149.0108.268
linux-2.4.5.tar18.17425.91219.89014.29018.121
rctail9632.49062.50221.06017.91415.225
rfc20.73629.66617.93615.65816.728
sprot34.dat22.83232.09623.35217.40415.735
w3c227.26454.68217.09013.48612.750
total seconds187.454322.928162.504126.592125.202

[1] B. Balkenhol, Y. M. Shtarkov, "One attempt of a compression algorithm using the BWT", Falculty of Mathematics, University of Bielefeld, 1999
[2] P. Fenwick, "Burrows Wheeler Compression: Principles and Reections." Theoretical Computer Science Vol 387 (2007) No. 3 pp 200-219
[3] S. Deorowicz "Context exhumation after the Burrows Wheeler transform", Information Processing Letters, Vol 95, No 1, pp 313-320, 2005
[4] Y. Mori http://code.google.com/p/libdivsufsort/wiki/SACA_Benchmarks