It is my great pleasure to recommend this excellent book written by my friend and
colleague, Professor Peter Fenwick. During the eleven years I have known him, we
have had many a discussion, often touching on topics covered here. Though this
is the closest we have come to a collaboration, I have little doubt that, had we met
earlier in our careers, we would have collaborated extensively.
A major contribution of this book is to bring a historical perspective to many
topics that are so widely accepted that it might not be obvious there were choices to
be made. The binary representation of numbers was so obvious even in the 1940s that
Burks, Goldstine and von Neumann are said to have “adopted it seemingly without
discussion”. But Burks et al. considered floating point representation, then argued
against supporting it. Long ago I heard it claimed that von Neumann believed any
mathematician “worth his salt” should be able to specify floating point computations
using only integers. In any case, floating point only came into its own in the 1980s,
with the broad acceptance of the IEEE standards. Professor Fenwick shows great
insight into why it took decades to get right something as basic as the representation
A second important contribution is discussion of the introduction of redundancy
to increase reliability in the presence of errors: check sums and variable-length
(universal) codes. While simple check sums are frequently discussed, I know of no
comparable source for a general discussion of Universal codes, an important but
somewhat obscure subject.
I agree with Professor Fenwick's quote, that “everybody thinks they know” about
these topics, but there are big holes, even today. Surely most of us have superficial
knowledge that fails us when we really need to work through the details. This book
covers a huge range of material, thoroughly and concisely. I have taught a good
bit of the material, but I learned much, even in areas where I claim some expertise.
The book displays a deep understanding of the many and varied requirements for
digital representation of information, from the obvious integers and floating point,
to Zeckendorf representations and Gray codes; from 2's complement to logarithmic
arithmetic; from Elias and Levenstein codes to Rice and Golomb codes and on to
Ternary Comma and Fibonacci codes.
In addition to the plethora of ways to represent numbers, it also covers representation of characters and strings. While the book will serve very well as a reference,
it is also fascinating reading. Many pages are devoted to obscure topics, interesting
largely because of their place in history, but outside the domain of a classic textbook
on computer organization or architecture. These are perhaps the most important
sections, precisely because they had to be understood and discarded to get us where
we are now.
This book definitely does not qualify for the subtitle, “Data Representation for
Dummies”. While it quickly surveys common forms of representation, the pace and
breadth will bewilder the true novice. On occasions, it uses terms unfamiliar (at
least to an American), requiring another source. Appropriately, Professor Fenwick
acknowledges the role of Wikipedia, which covers rather more topics than his book,
but certainly not as coherently.
The author has a wry, if somewhat subtle, sense of humour which often surfaces
unexpectedly: it's a bit of a stretch, but of course the description and figure regarding
Gray codes include a “grey area”!
Discussion of the roles and interaction of precision, accuracy and range is superb.
Floating point representation is highly precise, so why is it dangerous for use in financial calculations? Professor Fenwick points out something that had not occurred
to me: a “quite ordinary calculator” is capable of more precise arithmetic than a
32-bit [IEEE single-precision] floating point computation. That explains why the
calculator “app” on my iPad has both less range, and less precision, than the HP
calculator I bought 35 years ago!
A topic rarely covered so clearly is “unwarranted precision”, the process of using
a precise mathematical operation to apparently increase accuracy (significant digits)
of a number. Professor Fenwick points out confusion over precision created by the
fact that the speed of light is so close to 300,000,000 metres per second—and the
fact that scientific notation provides information about the accuracy of a value
(pp. 106-107). I especially liked his discussion of the sins of the popular press,
for example, by apparently increasing precision in the process of converting units:
an altitude “10,000 feet”—accurate to, say, ±100 metres—becomes the apparently
more precise, but inaccurate, “3 048 metres”. It is unfortunate that the general level
of this book is beyond comprehension for most journalists!
In short, this is a fascinating book that will appeal to many because of its authoritative exploration of how we represent information. But it will also serve as
a reference for those requiring—or simply enjoying—the ability to choose efficient
representations that lead to accurate results. It's a good read, and a great book to
James R. Goodman
United States of America Fellow IEEE,
2013 Eckert-Mauchly Award
This book arose from lectures on data representation given to first-year Computer Science students at the University of Auckland. But then it grew as I realised that ever-more material seemed relevant, useful, or just interesting. To a large extent it reflects my own journey through computing from about 1964 to 2004, starting from logic design, through computer hardware, computer arithmetic and data communications into, finally, data compression. Thus the computers that I reference are largely those with which I have at least passing experience. (There are of course many others that I have not encountered, but few of these are mentioned.) And the footnotes and asides often come from personal experience; many are distant recollections which I cannot now attribute.
A comment made by one person who read this book was “This is an area that everybody thinks they know, but really nobody really knows very well”. While most elementary Computer Science books certainly describe some data representation (usually restricted to current “best practice”), and other books give great detail of specialised topics such as floating point, there seems to be a great gap in the middle. It is this gap, giving reasonable coverage of most data types from first principles, that I hope this book fills.
It deals mostly with data at the architectural level, with no mention of the trees, lists, etc. as normally covered in Data Structures courses. The main exception here is the description of text strings – characters are of little interest in isolation; strings are the usual entity to be manipulated and are often regarded as a data primitive. It also includes a comprehensive coverage of variable-length integer representations and of checksums, both topics which seem to have little overall coverage in the general literature.
The University of Auckland,
New Zealand (retired)
email : firstname.lastname@example.org
The book was started while I was employed at the University of Auckland, but
with no explicit support.
I acknowledge the assistance from Brian Hicks and Murray Johns who,
many years ago, introduced me to computers, and some of whose insights are
still present in this book. Bob Doran, Amos Omondi and Brian Carpenter
read early drafts and suggested valuable extra topics. Assistance was also
received from Prof F.P. Brooks, Dr R.F. Rice and Jørgen Ibsen. Special thanks
go to Jim Goodman who provided many useful comments while preparing
the Foreword. And last but not least Brenda, who has endured many years
(probably far too many!) of “The Book”.
Conflicts of Interest
There are no conflicts of interest.
List of Contributors
The University of Auckland (Retired)