I am frequently annoyed at the things that I can’t remember. And when I’m trying to remember the details of something, I often turn to my text messages—thanks to big improvements recently, it is now quite fast to search my whole iMessage history on my phone, provided that I can remember some verbatim part of the message I’m looking for. And often, once I’m in the past, I want to look around: text messages from ages ago provide surprisingly interesting insights into the past.
But iMessage isn’t set up well for this casual browsing: when you try to scroll away from a search result, the loading is very slow. And the interface provides no way to jump to a specific date. I’d really like to be able to “flip through” my messages and stop at a random place for a view into that moment in time. Apple doesn’t provide a way to do that, so, I thought, why not enable it myself? It’d be great to enable this “flipping through messages” in the most literal way possible: by creating a physical book of my biggest conversation.
Is it possible?
In order to do anything at all with the messages, I needed to get them out of my phone and onto my computer. I’d looked many times for a way to do this with Signal, so I wasn’t sure what I’d find, but I was pleased that it seemed relatively straightforward to pull messages off an iPhone (even easier if your messages are already on a Mac). According to the very helpful iPhone Wiki, all I had to do was grab sms.db from a backup of my phone, and I’d have a SQLite database that I could do whatever I liked with.
Querying my texts with SQL
This simplicity seemed a bit too good to be true; for some reason I expected some proprietary format that would be a pain to reverse-engineer. So I had to see it for myself. I took a standard backup on my Mac in Finder (that was a trip: the “plugged-in iPhone” UI has barely changed since I used iTunes to sync music to my iPod touch in seventh grade). While the backup format is really not complicated, it was intimidating browsing the backup folder at first, because an ls in the root directory yields a bunch of directories, each named after a single hex byte:
/.../00008120-001854410CEB401E >>> ls
00 0e 1c 2a 38 46 54 62 70 7e 8c 9a a8 b6 c4 d2 e0 ee fc
01 0f 1d 2b 39 47 55 63 71 7f 8d 9b a9 b7 c5 d3 e1 ef fd
02 10 1e 2c 3a 48 56 64 72 80 8e 9c aa b8 c6 d4 e2 f0 fe
03 11 1f 2d 3b 49 57 65 73 81 8f 9d ab b9 c7 d5 e3 f1 ff
04 12 20 2e 3c 4a 58 66 74 82 90 9e ac ba c8 d6 e4 f2 Info.plist
05 13 21 2f 3d 4b 59 67 75 83 91 9f ad bb c9 d7 e5 f3 Manifest.db
06 14 22 30 3e 4c 5a 68 76 84 92 a0 ae bc ca d8 e6 f4 Manifest.db-shm
07 15 23 31 3f 4d 5b 69 77 85 93 a1 af bd cb d9 e7 f5 Manifest.db-wal
08 16 24 32 40 4e 5c 6a 78 86 94 a2 b0 be cc da e8 f6 Manifest.plist
09 17 25 33 41 4f 5d 6b 79 87 95 a3 b1 bf cd db e9 f7 Status.plist
0a 18 26 34 42 50 5e 6c 7a 88 96 a4 b2 c0 ce dc ea f8
0b 19 27 35 43 51 5f 6d 7b 89 97 a5 b3 c1 cf dd eb f9
0c 1a 28 36 44 52 60 6e 7c 8a 98 a6 b4 c2 d0 de ec fa
0d 1b 29 37 45 53 61 6f 7d 8b 99 a7 b5 c3 d1 df ed fb
Entering one of these directories yields a bunch of files starting with the hex byte after which the directory was named:
/.../00008120-001854410CEB401E >>> cd 3d
/.../00008120-001854410CEB401E/3d >>> ls
3d0292d3fe90e1e22c247403c0e9105ea0f9ff44 3d8830b71e98aae80b6eaf8bdd5500d79ce74946
3d02fe309afa7de839822d6f1b8433aa90090d17 3d88cdc16ff2b5231e5ea4b52271ee195a6f4b96
3d072c4fca5db4a5678fa10b137435f757e98492 3d8a425d70f4049417e855d273c44d8199de30c9
3d0739c90579fa907246d5c21bd8d8ebaa2d9d6b 3d8a43a1921f504bb4393250f75b24bfc2c5cedb
3d0798b3cc4d2f5ad347ffb8bc5a0f9d8c82cfb9 3d8a7c0460aadabf1b7fc9adea9e6a2a6e7bc73b
3d07a0adc5c5c22dc525ccd3a93fb05a50ef1ac5 3d8b6ad12c7617b3d783790a457b0aa19b193b68
3d0880f091c51ddc145e17c78d8e6f9a3e7e20c8 3d8b82abe05a9d697102d8b665c9d499e07492ea
3d093e92cf03abf3650411e09a647630a1e0c478 3d8ba897240ad32580bf8dfd00db8f181658cdfd
3d095e908ff898be3b3ffd64a75db959a58ac70a 3d8bc227d67ec4944df8e75291102367034d7214
3d09d5dcd5a9bdad67a80cd83201a9e1fb75aada 3d8c722f1d92f7cd6f90c936c14f60f51aad128b
3d0abb83123be82abf43ce20118e72fea06023c5 3d8ca6eeabeb1c01fae05bb20f08dedf734cfd04
3d0b246304c42d2ab1eb1892d629fcdfde689cb7 3d8d0c6b1bf7946c6bef91d60cccb32207b7bc01
3d0bb5f49e6f0e31348ef8feb9a38d4ce71f5ec7 3d8fd2fbcaf3079a683a8e486ecde8875f0a591d
3d0c1283936c45fec533a507b78558b5aa3159fa 3d8ff93bd94b3ea14edc77d1e677cf4ee4306e4e
3d0cb8e28462780bb9af1440e297ecd8224c70ff 3d90ea8bfbf62feda080cd0ccbd12fa5c8673993
3d0ce10de5f69606c52882215b99ebab259dc194 3d932638fe8ed669725b7a143c6a8b02b8959923
3d0d7e5fb2ce288813306e4d4636395e047a3d28 3d93c92679aa9d398331e27fdeed64b5094e68d1
...
Looking at these with a nice file explorer that looks at magic bytes to determine filetypes (I use Thunar) helps make some sense of it, since it can show that these cryptic names really are just regular old images and other files. But really even that is unnecessary, since the iPhone Wiki told us that the filename for the sms.db file that we’re looking for is 3d0d7e5fb2ce288813306e4d4636395e047a3d28. Copying this to my home directory:
$ cp 3d0d7e5fb2ce288813306e4d4636395e047a3d28 ~/imessages.db
And opening it up with the sqlite3 CLI, we can actually see some tables!
~ >>> sqlite3 imessages.db
SQLite version 3.44.2 2023-11-24 11:41:44
Enter ".help" for usage hints.
sqlite> .tables
_SqliteDatabaseProperties message
attachment message_attachment_join
chat message_processing_task
chat_handle_join recoverable_message_part
chat_message_join sync_deleted_attachments
chat_recoverable_message_join sync_deleted_chats
deleted_messages sync_deleted_messages
handle unsynced_removed_recoverable_messages
kvtable
sqlite>
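That cryptic filename does not appear to be arbitrary. As far as I can tell, each file in the backup is named after the SHA-1 hash of its domain and path, and stored in a directory named for the first two hex characters of that hash. Here is a minimal sketch of that scheme. The domain and path for sms.db ("HomeDomain" and "Library/SMS/sms.db") are assumptions that are not stated above, and the function names are made up for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def backup_file_id(domain: str, relative_path: str) -> str:
    """SHA-1 of "<domain>-<relative path>" as a 40-character hex string."""
    key = f"{domain}-{relative_path}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()


def backup_file_path(domain: str, relative_path: str) -> PurePosixPath:
    """Location inside the backup folder: <first two hex chars>/<full hash>."""
    file_id = backup_file_id(domain, relative_path)
    return PurePosixPath(file_id[:2]) / file_id


# Assumed domain and path of the messages database inside the backup.
SMS_DB = ("HomeDomain", "Library/SMS/sms.db")

if __name__ == "__main__":
    print(backup_file_path(*SMS_DB))
```

If the assumptions hold, this prints a path inside the 3d directory that ends in the same filename the wiki gave us.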
The schema requires a couple of joins to extract an actual conversation, but without too much trouble we can start to pull out messages (in this case from CVS spamming me):
sqlite> select
message.ROWID, message.date, message.text, message.is_from_me from message
inner join chat_message_join on message_id=message.ROWID
inner join chat on chat.ROWID=chat_message_join.chat_id
where chat.chat_identifier='28732'
order by date asc;
278125|694030292385607040||0
278327|694647875648848000||0
...
314056|726702453329793024||0
314412|727316171079934976|CVS ExtraCare: 20% off one full-price item, just because. Tap the link to send to card: c.cvs.com/B0kjBMbNM|0
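The same join is easy to run from a script. The sketch below uses Python’s built-in sqlite3 module against a tiny in-memory stand-in for the real database, with only the columns the query touches. It also converts the date column, assuming (this is not confirmed above) that the values are nanoseconds since 2001-01-01 UTC. The helper names and the demo rows are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: message.date counts nanoseconds from this epoch.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

QUERY = """
    SELECT message.ROWID, message.date, message.text, message.is_from_me
    FROM message
    INNER JOIN chat_message_join ON chat_message_join.message_id = message.ROWID
    INNER JOIN chat ON chat.ROWID = chat_message_join.chat_id
    WHERE chat.chat_identifier = ?
    ORDER BY message.date ASC
"""


def apple_ns_to_datetime(ns: int) -> datetime:
    """Convert nanoseconds since the assumed 2001 epoch to an aware datetime."""
    return APPLE_EPOCH + timedelta(microseconds=ns // 1000)


def fetch_conversation(conn: sqlite3.Connection, chat_identifier: str):
    """Return (rowid, datetime, text, is_from_me) tuples for one chat, oldest first."""
    rows = conn.execute(QUERY, (chat_identifier,)).fetchall()
    return [
        (rowid, apple_ns_to_datetime(date), text, bool(is_from_me))
        for rowid, date, text, is_from_me in rows
    ]


def make_demo_db() -> sqlite3.Connection:
    """Build a minimal in-memory imitation of the three tables the query needs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE message (ROWID INTEGER PRIMARY KEY, date INTEGER,
                              text TEXT, is_from_me INTEGER);
        CREATE TABLE chat (ROWID INTEGER PRIMARY KEY, chat_identifier TEXT);
        CREATE TABLE chat_message_join (chat_id INTEGER, message_id INTEGER);
        INSERT INTO chat VALUES (1, '28732');
        INSERT INTO message VALUES (278125, 694030292385607040, NULL, 0);
        INSERT INTO message VALUES (314412, 727316171079934976, 'demo text', 0);
        INSERT INTO chat_message_join VALUES (1, 278125), (1, 314412);
    """)
    return conn


if __name__ == "__main__":
    for row in fetch_conversation(make_demo_db(), "28732"):
        print(row)
```

To run it against the real file, replace make_demo_db() with sqlite3.connect on the copied database.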
We got one, but a lot of blank ones too: many of the messages are missing! It turns out that for some messages, the message data is stored as an encoded NSMutableAttributedString binary blob in the message.attributedBody column instead of in message.text. With a bit of wrangling to get the binary data out of the SQLite CLI, we can look at one of these missing messages and see that the data is indeed there:
~ >>> sqlite3 imessages.db "select hex(attributedBody) from message where ROWID=278125;" \
| cut -d\' -f2 \
| xxd -r -p \
| xxd -g1
00000000: 04 0b 73 74 72 65 61 6d 74 79 70 65 64 81 e8 03 ..streamtyped...
00000010: 84 01 40 84 84 84 19 4e 53 4d 75 74 61 62 6c 65 ..@....NSMutable
00000020: 41 74 74 72 69 62 75 74 65 64 53 74 72 69 6e 67 AttributedString
00000030: 00 84 84 12 4e 53 41 74 74 72 69 62 75 74 65 64 ....NSAttributed
00000040: 53 74 72 69 6e 67 00 84 84 08 4e 53 4f 62 6a 65 String....NSObje
00000050: 63 74 00 85 92 84 84 84 0f 4e 53 4d 75 74 61 62 ct.......NSMutab
00000060: 6c 65 53 74 72 69 6e 67 01 84 84 08 4e 53 53 74 leString....NSSt
00000070: 72 69 6e 67 01 95 84 01 2b 81 f3 00 43 56 53 20 ring....+...CVS
00000080: 45 78 74 72 61 43 61 72 65 3a 20 24 32 20 6f 66 ExtraCare: $2 of
00000090: 66 20 79 6f 75 72 20 70 75 72 63 68 61 73 65 2c f your purchase,
000000a0: 20 6a 75 73 74 20 66 6f 72 20 79 6f 75 21 20 49 just for you! I
000000b0: 6e 20 73 74 6f 72 65 20 6f 72 20 6f 6e 6c 69 6e n store or onlin
000000c0: 65 2e 20 54 61 70 20 74 68 65 20 6c 69 6e 6b 20 e. Tap the link
000000d0: 74 6f 20 73 65 6e 64 20 64 65 61 6c 20 74 6f 20 to send deal to
000000e0: 63 61 72 64 3a 20 63 2e 63 76 73 2e 63 6f 6d 2f card: c.cvs.com/
Luckily, we don’t need to implement the parsing for this binary format ourselves. There’s a great imessage-database crate that does exactly this: it ingests an iMessage database and outputs the data in nice Rust data structures. Out of the box, it comes with a binary (imessage-exporter) to generate text or HTML versions of your conversations, which is really quite similar to my goal.
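To get a feel for what that parsing involves, here is a rough heuristic that recovers the text from blobs shaped like the hex dump above. It is not a real parser for the format and it is not how the crate works. It relies only on what the dump shows: the class name NSString, a few header bytes ending in "+" (0x2b), then a length, then UTF-8 text. Reading 0x81 as "a two-byte little-endian length follows" and anything else as a one-byte length is an assumption inferred from the `81 f3 00` bytes in the dump. The function name is hypothetical.

```python
from typing import Optional


def extract_text(blob: bytes) -> Optional[str]:
    """Heuristically pull the message text out of an attributedBody blob.

    Returns None when the expected markers are not found.
    """
    marker = b"NSString"
    start = blob.find(marker)
    if start == -1:
        return None
    rest = blob[start + len(marker):]

    # In the dump, the bytes after the class name end with '+' (0x2b),
    # and the length of the text comes right after it.
    plus = rest.find(b"+")
    if plus == -1:
        return None
    rest = rest[plus + 1:]
    if not rest:
        return None

    if rest[0] == 0x81:
        # Assumed: 0x81 introduces a two-byte little-endian length.
        length = int.from_bytes(rest[1:3], "little")
        body = rest[3:3 + length]
    else:
        # Assumed: short strings store their length in a single byte.
        length = rest[0]
        body = rest[1:1 + length]
    return body.decode("utf-8", errors="replace")
```

Anything beyond plain text (attachments, mentions, edits) lives in the rest of the blob, which is a good reason to lean on the library rather than extend a hack like this.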
With just a couple of tweaks to the SQL statement the library uses to fetch messages, I’m able to narrow down the query to just a single conversation. But for this project I want to make a nicely formatted physical book that I can hold in my hand and flip through—the HTML and text formats that the project ships with won’t quite work for this.
Generating LaTeX
I am a huge fan of LaTeX due to the beautiful documents it can be convinced to produce, and since leaving school have been itching to generate some more pretty PDFs. And since LaTeX’s text-based source code makes it perfect for templating and autogeneration, it seems like a great choice. I’ll build my book by spitting out LaTeX code for every text message in the conversation.
Thanks to the imessage-database library it’s pretty easy to iterate through all the messages in the conversation, so I start by generating LaTeX code for each message. My first approach at this LaTeX generation is quite simple: align left if the message is from me and right otherwise, insert some text indicating an attachment where images are sent, and skip things like reactions and replies that I don’t want to bother rendering. This initial approach works well, and after splitting the text up into chapters based on date and a bit of visual tweaking, I’m satisfied.
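The per-message step can be sketched in a few lines. This is an illustration in Python rather than the project’s actual Rust code. The flushleft and flushright environments, the attachment placeholder text, and the function names are all assumptions made for the example. The one part that really matters is escaping LaTeX’s special characters, since message text is full of them.

```python
from typing import Optional

# Characters that LaTeX treats specially, mapped to safe replacements.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex(text: str) -> str:
    """Escape special characters in a single pass so nothing is escaped twice."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def render_message(text: Optional[str], is_from_me: bool,
                   has_attachment: bool = False) -> str:
    """Render one message: left-aligned if it is from me, right-aligned otherwise."""
    env = "flushleft" if is_from_me else "flushright"
    body = escape_latex(text) if text else ""
    if has_attachment:
        # Placeholder text standing in for an image or other attachment.
        body = (body + " " if body else "") + r"\textit{[attachment]}"
    return f"\\begin{{{env}}}\n{body}\n\\end{{{env}}}\n"


if __name__ == "__main__":
    print(render_message("20% off & more", is_from_me=False))
```

Escaping character by character, instead of chaining replace calls, avoids re-escaping the braces that the backslash replacement introduces.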
But there’s one major problem: LaTeX doesn’t handle arbitrary Unicode out of the box. Of course, this means that as soon as I extend the rendering window enough to include an emoji, the LaTeX compiler explodes. Simply stripping out emojis from the source text works, but is hardly a tolerable solution; after all, emojis are integral to modern communication.
After a bit of research, it looks like XeLaTeX is the key: it adds support for Unicode fonts to LaTeX. Switching to XeLaTeX proves quite straightforward, and by defining an \emojifont command that switches to an emoji font and wrapping every emoji in {\emojifont X} in my generated LaTeX source, the output renders successfully with emojis inline. But I don’t want to pay for every page of my book to be printed in color when I print it. Luckily, Google’s Noto Emoji font has a great set of simple black-and-white emojis that are perfect for this purpose. I’m quite happy with the way these emojis look in print.
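The wrapping step might look something like the sketch below, again in Python for illustration. The Unicode ranges are a rough, hand-picked approximation of "emoji" (an assumption, not a complete definition), so some symbols will be missed or wrongly caught. The sketch also assumes the preamble already defines \emojifont, for example with fontspec’s \newfontfamily.

```python
import re

# Rough approximation of emoji code points. These ranges are a heuristic.
EMOJI_RUN = re.compile(
    "["
    "\U0001F000-\U0001FAFF"  # most pictographic emoji blocks
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)


def wrap_emoji(text: str) -> str:
    """Wrap each run of emoji in {\\emojifont ...} so it renders in the emoji font."""
    # A function is used for the replacement so the backslash is taken literally.
    return EMOJI_RUN.sub(lambda m: "{\\emojifont " + m.group(0) + "}", text)


if __name__ == "__main__":
    print(wrap_emoji("see you soon \U0001F600\U0001F389"))
```

Wrapping whole runs rather than single characters keeps joined sequences (such as emoji glued together with a zero-width joiner) inside one group.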
After a couple extra niceties like a header that tracks the current date (with a LaTeX command that sets \markright with every message), I’m ready to put it all together.
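Here is one way the date bookkeeping could be generated, sketched in Python. It emits a \markright before every message so the page header follows the current date, and it starts a new chapter whenever the month changes. The monthly chapter granularity, the date formats, and the function name are assumptions. The text above only says that chapters are based on date.

```python
from datetime import date
from typing import Iterable, Tuple


def render_with_headers(messages: Iterable[Tuple[date, str]]) -> str:
    """Interleave chapter breaks and \\markright updates with rendered messages.

    `messages` must be sorted by date; each item is (day, already-rendered LaTeX).
    """
    lines = []
    current_month = None
    for day, body in messages:
        month = (day.year, day.month)
        if month != current_month:
            # Assumed granularity: one chapter per calendar month.
            lines.append(f"\\chapter{{{day.year}-{day.month:02d}}}")
            current_month = month
        # Update the running header so it shows this message's date.
        lines.append(f"\\markright{{{day.isoformat()}}}")
        lines.append(body)
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [(date(2023, 1, 31), "first"), (date(2023, 2, 1), "second")]
    print(render_with_headers(demo))
```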
When I finally compile all three years of messages that I want to be able to flip through, I’m surprised to find that the compiler dumps out well over a thousand pages of messages when I put them into a standard 6" x 9" page size. Since it’s exactly three years of messages anyway, though, there’s an easy solution: I split the opus into three volumes to get the size of each one down to something printable.
Ordering
When I decided to try to do this, I really wanted to end up with a physical book in my hand. So I had to figure out how to get these books printed. And to my surprise, printing a paperback book is quite cheap. After reviewing a bunch of options, Barnes and Noble Press seems like the best one. It’s a fair bit more expensive than some of the others, like Lulu and Amazon KDP, but most options are targeted at people who are trying to sell their books. B&N Press is too, but their story for personal books seems better than the others, as you don’t need to “publish” your book to get it printed. And the price is still quite reasonable: I was able to print all three volumes, around 1300 pages total, for $30 including shipping.
Before I can order books from my LaTeX-generated PDFs, the website tells me that the last step is to create covers. Upon uploading the body pages to B&N Press, the site generates the dimensions required for the cover. Given these, I threw together a cover for each of the three volumes in Inkscape, which the website accepted without complaint.
The B&N Press website is not perfect: it is generally very slow, and while I was trying to place my order, the checkout page was broken and wouldn’t load for over 24 hours. But after that was fixed, ordering worked.
And sure enough, after a couple weeks’ wait, I had three actual books in hand. I flip through them regularly, and it is so much easier to revisit old conversations this way than trying to do so on my phone.
Create your own
The source code is in rough shape, and I haven’t packaged it as a cargo binary, but there’s not much of it. If you want to take a look or try for yourself, it’s available at https://github.com/bkettle/message-book.