CHAPTER 6 – SQLite – Simpler Queries and Transactions

By creating only the tables, our email indexer still does nothing useful, so the next step is to add the emails into our database. We do that in a new script called "insert.php". Here is part of its code: <?php $db = new SQLiteDatabase("./crm.db", 0666, &$error) or die("Failed: $error"); ... if ($argc < 2) { echo "Usage:ntphp insert.php <filename>nn"; return; } First, we open the database and check if the number of parameters to this command-line script is correct. The first (and only) parameter passed to this script is the mailbox (in UNIX, the MBOX format) we're going to store and later index. $body = file_get_contents($argv[1]); $mails = preg_split('/^From /m', $body); unset($body); We load the mailbox into memory and split it into separate emails with a reg- ular expression. You might wonder what happens if a line in an email starts with From:; in this case, the UNIX MBOX format requires this From: to be escaped with the > character. // $db->query("BEGIN"); foreach ($mails as $id => $mail) { $safe_mail = sqlite_escape_string($mail); $insert_query = " INSERT INTO document(title, intro, body) VALUES ('Title', 'This is an intro.', '{$safe_mail}') "; echo "Indexing mail #$id.n"; $db->query($insert_query); } // $db->query("COMMIT"); ?> Here, we loop over the mails, making sure we escape all possible dangerous characters with the sqlite_escape_string() functions, and insert the data into the database with the query() method. Table 6.7 sqlite Quoting Function Function Name Description sqlite_escape_string(...) Escapes a string for use as parameter to a query By default, SQLite commits all queries directly to disk, which makes the inserting of many queries rather slow. Another problem that might arise is that other processes can insert data into the database during the process of importing our emails. To fix those two problems, you can simply use a transac- tion to perform the entire importing. To start a transaction, you can execute a query containing "BEGIN TRANSACTION" or simply "BEGIN". At the end of the trans- action, you can use the "COMMIT" query to commit all queries in the transaction to disk. In the full example (including the tricks we discuss later in this sec- tion), the time for importing 638 emails dropped from 60m29s to 1m59s, which is quite a speed boost.

Post Comment
Login to post comments