While looking up information on how to cite Kindle books in research papers, I noticed that one key bit of info seems to be missing (or at least I couldn’t find it anywhere): the actual length of a “location,” which is the unit of measurement Amazon uses in its ebooks.

A commenter on this MobileRead thread says one location equals 128 bytes of data, which works out to 8 locations for every kilobyte of file size. However, this doesn’t match my own tests. In two sample texts I prepared, the one with the smaller file size but longer word count actually had the higher number of locations, and neither one had anywhere near the number of locations that the 8:1 ratio predicts.
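For anyone who wants to check the commenter's claim against their own files, the arithmetic is easy to sketch in Python. The 128-byte figure is the commenter's, not something Amazon has confirmed, and the function name and the 500 KB example are made up for illustration.

```python
# The commenter's figure; an unconfirmed assumption, not an Amazon spec.
BYTES_PER_LOCATION = 128

def locations_from_file_size(size_in_bytes, bytes_per_location=BYTES_PER_LOCATION):
    """Predicted location count if one location were a fixed number of bytes."""
    return size_in_bytes / bytes_per_location

print(locations_from_file_size(1024))        # one kilobyte -> 8.0
print(locations_from_file_size(500 * 1024))  # hypothetical 500 KB file -> 4000.0
```

If the location count your Kindle reports is far below this prediction, as it was in my tests, the byte-based rule is probably not the whole story.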

My best guesstimate, after playing around with both word and character counts, is that one location is equivalent to approximately 122-125 characters including spaces, or 22-23 words. In my tests, the character count was a better predictor than word count, although I suspect Amazon also counts other spacing that I’m not taking into consideration, especially in widely spaced sections like a copyright page, which always had more locations assigned than I could predict.
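Here is a rough Python sketch of that guesstimate, for anyone who wants to try it on their own text. It only counts characters (including spaces) and whitespace-separated words, so it ignores whatever extra spacing or markup Amazon may be counting. The function name and the default ranges are my own assumptions, taken from the figures above.

```python
def estimate_locations(text, chars_per_location=(122, 125), words_per_location=(22, 23)):
    """Return (low, high) location estimates from character and word counts.

    The per-location ranges are guesses from informal tests, not Amazon's rule.
    """
    chars = len(text)          # characters, including spaces
    words = len(text.split())  # whitespace-separated words
    low_c, high_c = chars_per_location
    low_w, high_w = words_per_location
    return {
        # Dividing by the larger figure gives the lower estimate, and vice versa.
        "by_chars": (chars / high_c, chars / low_c),
        "by_words": (words / high_w, words / low_w),
    }

sample = "It was a dark and stormy night. " * 400  # stand-in for a real manuscript
print(estimate_locations(sample))
```

Since the character count was the better predictor in my tests, the "by_chars" range is the one I would trust more.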

More thoughts on locations for those who are really interested…

Why does Amazon use locations in the first place? The simple answer is because page numbers are meant for printed pages, and ebooks don’t have printed pages.

You need a static, physical object to use page numbers. Since the text in an ebook can be displayed in a near-infinite array of font sizes, screen widths, and font faces (assuming it’s not a PDF), page numbers would almost never be the same from one version to the next, and so would be useless.

It seems to me it would make more sense to cite works using line numbers or paragraph numbers, or even word count, because then the reference point is attached to the essential text and not to whatever format it’s in, but Amazon has gone a different route. The reason for this, as far as I can tell, is so that elements outside the main text, such as cover images and front matter, can be referenced: locations 2-7 might be the copyright page or table of contents.

You might wonder (as I did) why Amazon doesn’t just use percentages, which would make it easier to compare an ebook edition with a print edition. The trouble with percentages is that they relate only to the current text, so they can’t tell you anything about how long a book is relative to another book. The 30% mark in a 50,000 word text is nothing like the 30% mark in a 110,000 word text. By comparison, location 827 is the same distance in for both ebooks (roughly 18,000 to 19,000 words in, going by my estimate of 22-23 words per location), and you know immediately that a book with 2000 locations is twice as long as a book with 1000 locations.
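The difference between the two schemes is easy to see with a little arithmetic. This Python sketch assumes my rough figure of 22-23 words per location, which is an estimate and not an official number; the function names are invented for the example.

```python
WORDS_PER_LOCATION = (22, 23)  # my rough estimate, not an Amazon figure

def words_at_percent(percent, total_words):
    """Words read so far at a given percentage of a book."""
    return percent * total_words / 100

def words_at_location(location, words_per_location=WORDS_PER_LOCATION):
    """(low, high) estimate of words read so far at a given location."""
    low, high = words_per_location
    return (location * low, location * high)

# The same percentage lands in very different places in different books...
print(words_at_percent(30, 50_000))   # 15000.0
print(words_at_percent(30, 110_000))  # 33000.0

# ...while the same location is the same distance in, whatever the book's length.
print(words_at_location(827))         # (18194, 19021)
```

So 30% can mean 15,000 words or 33,000 words depending on the book, while location 827 means about the same amount of text in both.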

