CHAPTER 6 – SQLite – User-Defined Functions (UDFs)
Because SQLite is Lite, it does not implement all the default SQL functions, but SQLite does provide you with the possibility to write your own functions that you then can use from your SQL queries.
Table 6.8 sqlite UDF Functions Function Name Description sqlite_create_function(...) Binds an SQL function to a user defined function $sqlite->createFunction(...) in your PHP script. Parameters: · DB handle (procedural only) · SQL function name (string) · PHP function name (string) · Number of arguments to the function (integer, optional) We're adding this function registration call after the argument check in insert.php: ... $db->createFunction("php_index", "index_document", 4); ... Of course, we create this new PHP function index_document. We place this func- tion, with another helper function at the start of our script: function normalize($body) { $body = strtolower($body); $body = preg_replace( '/[.;,:!?¿¡[]@()]/', ' ', $body); $body = preg_replace('/[^a-z0-9 -]/', '_', $body); return $body; } This helper function strips non-wanted characters and lowercase charac- ters, and changes punctuation marks to spaces. It is used to normalize the words we put into our search index. After the helper function, our main func- tion begins as follows: function index_document($id, $title, $intro, $body) { global $db; Because this function is called through SQLite, we need to import our database handle into the function's scope; we do that with the global keyword: $id = $db->singleQuery("SELECT max(id) from document"); Because of a bug in the SQLite library, we have to figure out the latest auto-increment value ourselves because we cannot trust the value passed through our callback function by SQLite. Using the PHP function sqlite_last_insert_row_id() (or the OO variant lastInsertRowId()) did not work here, either. $body = substr($body, 0, 32000); $body = normalize($body); Here, we reduce the body to only 32KB with the reason that emails larger than this usually have an attachment, and that's not important to put into our index. After that, the text is normalized so that we can make a nice search index out of it: $words = preg_split( '@([W]+)@', $body, -1, PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY ); This regular expression splits the body into words and calculates their position in the message (you can find more about regular expressions in Chap- ter 9, "Mainstream Extensions"). foreach ($words as $word) { $safe_word = sqlite_escape_string($word[0]); if ((strpos($safe_word, '_') === false) && (strlen($safe_word) < 24)) { Here, we start looping over all the words that the regular expression cre- ated. We escape the word, and enter only the index section of this function if there is no underscore present in the word, and when it is smaller than 24 characters. $result = @$db->query( "INSERT INTO dictionary(word) ". "VALUES('$safe_word');"); if ($result != SQLITE_OK) { /* already exists, need to fetch the * ID then */ $word_id = $db->singleQuery( "SELECT id FROM dictionary ". "WHERE word = '$safe_word'"); } else { $word_id = $db->lastInsertRowID(); } Here, we insert our word into the dictionary table, relying on the unique key of the word to prevent duplicate entries. In case the word already exists in the dic- tionary, the query will fail and we run a SELECT query to obtain the ID of the word with the singleQuery() method; otherwise, we request the ID with which the new word was inserted into the database. The singleQuery() method runs the query, and returns the first column of the first record returned by the query. $db->query( "INSERT INTO ". "lookup(document_id, word_id, position) ". "VALUES($id, $word_id, {$word[1]})"); } } } When we know the ID of the word, we insert it with the document_id and the position into the lookup table (see Table 6.9). Table 6.9 sqlite_last_insert_row_id and sqlite_single_query Function Name Description sqlite_last_insert_row_id(...) Returns the ID of the last inserted data in an $sqlite->lastInsertRowId() auto increment column. The procedural version requires the database handler as its only parameter. sqlite_single_query(...) Executes a query and returns the first column $sqlite->singleQuery(...) of the first record. Parameters: · The database handle (function only) · The query to execute (string)