# Module “Bit::Vector”

“Bit::Vector - more than the name suggests”

Steffen Beyer

YAPC::Europe, London, UK, ICA, September 22-24 2000

sd&m software design & management GmbH & Co. KG
Thomas-Dehler-Straße 27
81737 München
Phone (0 89) 6 38 12-0
Fax (0 89) 6 38 12-150
http://www.sdm.de

## Agenda

• What does it do?
• Purpose(s)
• Summary of available methods
• Characteristics
• Alternatives
• Some Applications
• Questions & Answers, Suggestions

## What does it do?

• The Bit::Vector module implements bit arrays of arbitrary size.
• Not very sexy, you may think. But actually bit vectors are the base of all computations performed by a computer! Your CPU calls them "processor registers".
• By the way, is everybody familiar with two's complement binary representation and arithmetic?

## Purpose(s)

• Efficient storage and handling of bit arrays
• Extend your CPU to any desired number of bits
• Efficient set operations
• Efficient big integer arithmetic

## Summary of available methods

(See file "BitVector.txt")

• Especially interesting methods:
  – "Interval_Substitute()" (is to bit vectors what "splice" is to Perl arrays)
  – "Interval_Scan_...()" (finds contiguous blocks of set bits)
  – "Chunk_...()" (allows access to packets of bits at a time, of selectable size)
  – "...Reverse()" (same to bit vectors as Perl's "reverse" for strings)
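The scanning and chunk methods listed above are easiest to grasp on a small vector. The following is a minimal sketch, assuming the Bit::Vector module from CPAN is installed; the 16-bit size and the bit pattern are made up for illustration, and `Interval_Scan_inc`, `Chunk_Read` and `Reverse` are used as I understand them from the module documentation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bit::Vector;    # CPAN module, assumed to be installed

# Illustrative 16-bit vector with bits 3..6 and bit 10 set.
my $vec = Bit::Vector->new(16);
$vec->Interval_Fill(3, 6);
$vec->Bit_On(10);

# Walk all contiguous runs of set bits, lowest index first.
my @runs;
my $start = 0;
while ($start < $vec->Size()) {
    # Returns an empty list when no further set bit exists.
    my ($min, $max) = $vec->Interval_Scan_inc($start) or last;
    push @runs, "$min-$max";
    $start = $max + 2;    # bit $max + 1 is known to be clear
}
print "runs: @runs\n";

# Read a packet of 4 bits starting at bit offset 4 (bits 4..7).
my $chunk = $vec->Chunk_Read(4, 4);
print "chunk: $chunk\n";

# Mirror the bit order into a second vector of the same size.
my $rev = Bit::Vector->new(16);
$rev->Reverse($vec);
print "reversed: ", $rev->to_Enum(), "\n";
```

The loop restarts the scan two positions past the end of each run, since the bit directly after a run is clear by definition; the size check keeps the start index inside the vector.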

## Characteristics (1/3)

• Internally written in C (thus fast)
• Relies on CPU's machine word operations for maximum speed
• Auto-adapts to size of machine word at runtime
• Uses efficient algorithms (mostly "divide-and-conquer"), time complexity of many functions O(1), O(n), O(n ld n)
• C library at the core can also be used stand-alone (without Perl)
• Free Software (GPL+Artistic), C library also LGPL

## Characteristics (2/3)

Efficient Algorithms

• Example: Exponentiation ($x^k$)

  E.g. $27^{13}$ (base 10), $k = 13$, $n = \mathrm{int}(\mathrm{ld}\ k) = 3$ (ld = logarithm to base 2):

  $$
  \begin{aligned}
  27^{13} &= 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \cdot 27 \\
          &= 11011^{1101} \quad \text{(base 2)} \\
          &= (11011^{8})^{1} \cdot (11011^{4})^{1} \cdot (11011^{2})^{0} \cdot (11011^{1})^{1}
  \end{aligned}
  $$

  Worst case: $2n$ multiplications = $O(n)$ = $O(\mathrm{ld}\ k)$ instead of $k - 1$ = $O(k)$ – here: only 5 instead of 12
• Example: Conversion to decimal representation
  Divides bit vector modulo largest power of 10 fitting into a machine word, then uses machine word math operations to break remainder down further
• Example: Bit counting (number of set bits)
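The two worked examples on this slide can be sketched in plain Perl. This is an illustration of the general techniques (square-and-multiply, and decimal conversion in machine-word-sized chunks), not the module's C code. The function names `power_by_squaring` and `to_decimal_chunked` are hypothetical, the chunk size of 10^9 assumes a 32-bit machine word, and the core module Math::BigInt is used only so that results cannot overflow.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;    # core module, used only to avoid overflow

# Square-and-multiply: returns ($x ** $k, number of multiplications).
sub power_by_squaring {
    my ($x, $k) = @_;
    die "exponent must be a positive integer\n" unless $k >= 1;
    my $square = Math::BigInt->new($x);    # holds x^(2^i) in round i
    my $result;                            # product of the selected squares
    my $mults = 0;
    while ($k > 0) {
        if ($k & 1) {                      # bit i of the exponent is set
            if (defined $result) { $result = $result * $square; $mults++ }
            else                 { $result = $square->copy }
        }
        $k >>= 1;
        if ($k > 0) { $square = $square * $square; $mults++ }
    }
    return ($result, $mults);
}

# Decimal conversion: split off 9 digits at a time by dividing by 10**9,
# then format each small remainder with ordinary machine arithmetic.
sub to_decimal_chunked {
    my ($n) = @_;                          # non-negative Math::BigInt
    my $chunk = Math::BigInt->new(1_000_000_000);
    my @parts;
    $n = $n->copy;
    while ($n >= $chunk) {
        my ($quot, $rem) = $n->bdiv($chunk);    # list context: (quotient, remainder)
        unshift @parts, sprintf('%09d', $rem->numify);
        $n = $quot;
    }
    unshift @parts, $n->numify;            # leading chunk, no zero padding
    return join '', @parts;
}

my ($power, $mults) = power_by_squaring(27, 13);
print to_decimal_chunked($power), " computed with $mults multiplications\n";
```

For 27^13 the sketch performs three squarings and two combining multiplications, which matches the "5 instead of 12" figure on the slide.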

## Characteristics (3/3)

• Object-oriented interface, e.g. \$vec1->intersection(\$vec2,\$vec3);
• Optionally(*) provides overloaded operators
  – one set of operators for set operations, e.g. \$set1 = \$set2 & \$set3;
  – one set of operators for big integer math, e.g. \$bigsum += \$bigint;

(*): will be optional in version 6.0 (for improved loading speed of "plain" module), is always loaded now
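A short sketch of both interface styles follows, assuming Bit::Vector and its companion Bit::Vector::Overload (the separate overload module in version 6.0 and later) are installed from CPAN. The values are made up. The method is written `Intersection`, which is the spelling I know from the module documentation. The arithmetic part uses the explicit `add` method rather than `+=`, so the example does not depend on how the overloaded arithmetic operators are configured.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bit::Vector;              # CPAN module, assumed to be installed
use Bit::Vector::Overload;    # separate module from version 6.0 on

# Two sets over the universe 0..15 (illustrative values).
my $set2 = Bit::Vector->new_Enum(16, '2,3,5,7');
my $set3 = Bit::Vector->new_Enum(16, '3,5,8');

# Method interface: the result is stored in a pre-allocated vector.
my $vec1 = Bit::Vector->new(16);
$vec1->Intersection($set2, $set3);

# Overloaded operator: "&" returns a new vector.
my $set1 = $set2 & $set3;

# Big integer math on 128-bit vectors, via the explicit method.
my $bigint1 = Bit::Vector->new_Dec(128, '100000000000000000000');
my $bigint2 = Bit::Vector->new_Dec(128, '23');
my $bigsum  = Bit::Vector->new(128);
$bigsum->add($bigint1, $bigint2, 0);    # third argument is the carry-in

print "method:   ", $vec1->to_Enum(), "\n";
print "operator: ", $set1->to_Enum(), "\n";
print "sum:      ", $bigsum->to_Dec(), "\n";
```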

## Alternatives (1/2)

• vec()
  – confusing
  – insufficiently powerful for many applications
• PDL
  – complicated
  – designed primarily for astronomical data analysis and heavy duty number crunching (written in C, internally)
• Math::PARI
  – very powerful
  – requires separate C library "PARI"
• Math::BigInt (is in the Core of Perl 5.6)
  – slow (written entirely in Perl, stores digits in Perl arrays)
• Math::BigInteger
  – unmaintained, doesn't compile (uses XS and a C library)

## Alternatives (2/2)

• Set::Bag - implements multisets
• Set::IntSpan - optimized for .newsrc file type sets (also supported by Bit::Vector, but needs more memory)
• Set::Object - implements sets of arbitrary objects (can be simulated with Bit::Vector using lookup table, set operations will then be faster)
• Set::Scalar - similar to Set::Object (?), but also allows recursion (set of sets)
• Set::Window - optimized for intervals of integers (needs much less memory than Bit::Vector, but only of limited use since the whole interval is either in or out)

## Simulating Set::Object using lookup table

• See file "SetObject.pl"
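The file "SetObject.pl" is not reproduced here. The following is an independent minimal sketch of the idea, assuming Bit::Vector is installed: a lookup table maps each object to a bit index, so that a set of objects becomes a bit vector. The objects, the names `%index_of`, `make_set` and `members`, and the use of `Scalar::Util::refaddr` as lookup key are all illustrative choices, not taken from the original file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);    # core module
use Bit::Vector;                 # CPAN module, assumed to be installed

# The universe: every object that may ever be a set member.
my @universe = ({ name => 'apple' }, { name => 'pear' }, { name => 'plum' });

# Lookup table: object address -> bit index.
my %index_of;
$index_of{ refaddr($universe[$_]) } = $_ for 0 .. $#universe;

# Build a bit vector with one bit switched on per given object.
sub make_set {
    my @objects = @_;
    my $set = Bit::Vector->new(scalar @universe);
    $set->Bit_On($index_of{ refaddr($_) }) for @objects;
    return $set;
}

# Translate the set bits back into the objects they stand for.
sub members {
    my ($set) = @_;
    return map { $universe[$_] } $set->Index_List_Read();
}

my $left  = make_set(@universe[0, 1]);    # { apple, pear }
my $right = make_set(@universe[1, 2]);    # { pear, plum }

my $both = Bit::Vector->new(scalar @universe);
$both->Intersection($left, $right);

print join(', ', map { $_->{name} } members($both)), "\n";
```

Once the table is built, set operations run on machine words, which is where the speed advantage mentioned on the previous slide would come from; the price is that the universe must be known in advance.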

## Some Applications

• Set::IntRange - sets of integers (universe = some interval)
• Math::MatrixBool - useful for graph algorithms (e.g. shortest paths / Kleene's Algorithm)
• Slice (multiple document version generator)
• Parse table generators for compiler-compilers à la "yacc" (calculating first, follow & lookahead character sets)
• Cryptography
• Easy manipulation of data (files), any number of bits at a time

## Application "Slice"

• See
  – homepage screenshot "Slice.bmp"
  – file "file.in"
  – file "Slice.txt"
  – file "file.html.de.OK"
  – file "file.html.en.OK"
  – URL http://www.engelschall.com/sw/slice/

## Application "Date::Calc" v5.0 (coming soon)

• Stores years in bit vectors (one year = one bit vector, one day = one bit)
• Bit is "on" if corresponding day is a holiday
• Performs calculations taking holidays into account
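The storage scheme described above can be sketched as follows, assuming Bit::Vector is installed. This is not Date::Calc code: the zero-based day-of-year indexing, the chosen holidays and the helper `count_non_holidays` are assumptions made for illustration, and weekends are ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bit::Vector;    # CPAN module, assumed to be installed

# One bit per day of the year 2000 (leap year, 366 days); index 0 = Jan 1.
my $year2000 = Bit::Vector->new(366);
$year2000->Bit_On(0);      # Jan 1
$year2000->Bit_On(359);    # Dec 25
$year2000->Bit_On(360);    # Dec 26

# Count the days in [$from, $to] (day indices) that are not holidays.
sub count_non_holidays {
    my ($year, $from, $to) = @_;
    return scalar grep { !$year->bit_test($_) } $from .. $to;
}

# Dec 21 (index 355) through Dec 31 (index 365): 11 days, 2 of them holidays.
print count_non_holidays($year2000, 355, 365), "\n";
```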