CHAPTER 15 – An Introduction to Writing PHP Extensions – QUICKSTART
Instead of slowly explaining some of the building blocks of the scripting engine, this section dives into coding an extension, so do not worry if you don't see the whole picture right away. Imagine you are writing a web site but need a function that repeats a string n times. Writing this in PHP is simple:

    function self_concat($string, $n)
    {
        $result = "";
        for ($i = 0; $i < $n; $i++) {
            $result .= $string;
        }
        return $result;
    }

    self_concat("One", 3) returns "OneOneOne".
    self_concat("One", 1) returns "One".

Imagine that for some odd reason, you need to call this function often, with very long strings and large values of n. This means that you'd have a huge amount of concatenation and memory reallocation going on in your script, which could significantly slow things down. It would be much faster to have a function that allocates a string large enough to hold the result and then repeats $string n times, without reallocating memory on every loop iteration.
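The allocate-once idea can be sketched in plain C before any PHP API enters the picture. The following is only an illustration, assuming the standard C library's malloc() rather than PHP's own memory functions; the helper name repeat_bytes() and its out_len parameter are hypothetical and do not appear in PHP.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PHP): repeat the len-byte buffer str
 * n times into one freshly allocated, NUL-terminated buffer.
 * Stores the result length in *out_len. Returns NULL on size overflow
 * or allocation failure; the caller releases the result with free().
 */
char *repeat_bytes(const char *str, size_t len, size_t n, size_t *out_len)
{
    char *result;
    size_t total, i;

    assert(str != NULL && out_len != NULL);

    /* Reject sizes where len * n + 1 would overflow size_t. */
    if (len != 0 && n > (SIZE_MAX - 1) / len) {
        return NULL;
    }
    total = len * n;

    result = malloc(total + 1);   /* one allocation for the whole result */
    if (result == NULL) {
        return NULL;
    }

    for (i = 0; i < n; i++) {
        memcpy(result + i * len, str, len);   /* copy only, no reallocation */
    }
    result[total] = '\0';

    *out_len = total;
    return result;
}
```

Calling repeat_bytes("One", 3, 3, &out_len) yields the buffer "OneOneOne" with out_len set to 9. The memory is requested once, so the loop does nothing but copy bytes.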
The first step in creating an extension for your function is to write the function definition file for the functions you want your extension to have. In this case, the file will have only one line with the prototype of the function self_concat():

    string self_concat(string str, int n)

The general format of the function definition file is one function per line. You can specify optional parameters and a wide variety of PHP types, including bool, float, int, array, and others.
Save the file as myfunctions.def in the ext/ directory under PHP's source tree.
Now it's time to run it through the extension skeleton creator. The script is called ext_skel and sits in the ext/ directory of the PHP source tree (more information can be found in the README.EXT_SKEL file under the main PHP source directory). Assuming you saved your function definitions in a file called myfunctions.def and you would like the extension to be called myfunctions, you would use the following line to create your skeleton extension:

    ./ext_skel --extname=myfunctions --proto=myfunctions.def

This creates a directory myfunctions/ under the ext/ directory. The first thing you'd probably want to do is get the skeleton to compile so that you're ready for actually writing and testing your C code. There are two ways to compile the extension:

- As a loadable module or DSO (dynamically shared object)
- Built statically into PHP

This chapter uses the second method because it's slightly easier to begin with. If you're interested in building your extension as a loadable module, you should read the README.SELF-CONTAINED_EXTENSIONS file in the PHP source tree's root directory.
To get the extension to compile, you need to edit its config.m4 file, which can be found in ext/myfunctions/.
As your extension does not wrap any external C libraries, you will want to add support for the --enable-myfunctions configure switch to PHP's build system (the --with-extension switch is used for extensions that need to allow the user to specify a path to the relevant C library). You can enable the switch by uncommenting (removing the leading dnl from) the following two auto-generated lines:

    PHP_ARG_ENABLE(myfunctions, whether to enable myfunctions support,
    [  --enable-myfunctions    Include myfunctions support])
Now all that's left to do is to run ./buildconf in the root of the PHP source tree, which will create a new configure script. You can check that your new configure option made it into configure by finding it in the output of ./configure --help. Now, reconfigure PHP with all of your favorite switches and include the --enable-myfunctions switch. Last but not least, rebuild PHP by running make.
ext_skel should have added two PHP functions to your skeleton extension: self_concat(), which is the function you want to implement, and confirm_myfunctions_compiled(), which can be called to check that you properly enabled the myfunctions extension in your build of PHP. After you finish developing your PHP extension, remove the latter function.

    <?php
    print confirm_myfunctions_compiled("myextension");
    ?>

Running this script would result in something similar to the following being printed:

    "Congratulations! You have successfully modified ext/myfunctions config.m4. Module myfunctions is now compiled into PHP."

In addition, the ext_skel script creates a myfunctions.php script that you can also run to verify that your extension was successfully built into PHP. It shows you a list of functions that your extension supports.
Now that you've managed to build PHP with your extension, it's time to actually start hacking at the self_concat() function. The following is the skeleton that the ext_skel script created:

    /* {{{ proto string self_concat(string str, int n) */
    PHP_FUNCTION(self_concat)
    {
        char *str = NULL;
        int argc = ZEND_NUM_ARGS();
        int str_len;
        long n;

        if (zend_parse_parameters(argc TSRMLS_CC, "sl", &str, &str_len, &n) == FAILURE)
            return;

        php_error(E_WARNING, "self_concat: not yet implemented");
    }
    /* }}} */
The auto-generated PHP function includes comments around the function declaration, which are used for self-documentation and code-folding in editors such as vi and Emacs. The function itself is defined by using the PHP_FUNCTION() macro, which creates a function prototype suitable for the Zend Engine. The function body is divided into two semantic parts: the first retrieves your function arguments, and the second contains the logic itself.
To retrieve the parameters passed to your function, you'll want to use the zend_parse_parameters() API function, which has the following prototype:

    int zend_parse_parameters(int num_args TSRMLS_DC, char *type_spec, ...);

The first argument is the number of arguments that were passed to your function. You will usually pass it ZEND_NUM_ARGS(), which is a macro that equals the number of parameters passed to your PHP function. The second argument is for thread-safety purposes, and you should always pass it the TSRMLS_CC macro, which is explained later. The third argument is a string specifying what types of parameters you are expecting, followed by a list of variables that should be updated with the parameters' values. Because of PHP's loose and dynamic typing, the parameters are converted to the requested types, when it makes sense, if they are different. For example, if the user sends an integer and you request a floating-point number, zend_parse_parameters() automatically converts the integer to the corresponding floating-point number. If the actual value cannot be converted to the expected type (for example, integer to array), a warning is triggered.
Table 15.1 lists types you can specify. For completeness, some types that we haven't discussed yet are included.

Table 15.1 Type Specifiers

    Type Specifier   Corresponding C Type   Description
    l                long                   Signed integer.
    d                double                 Floating-point number.
    s                char *, int            Binary string including length.
    b                zend_bool              Boolean value (1 or 0).
    r                zval *                 Resource (file pointer, database connection, and so on).
    a                zval *                 Associative array.
    o                zval *                 Object of any type.
    O                zval *                 Object of a specific type. This requires you to also pass the class type you want to retrieve.
    z                zval *                 The zval without any manipulation.
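Each specifier in Table 15.1 determines how many C arguments must follow the type specifier string. As a rough illustration, the standalone sketch below counts them. The function count_spec_args() is hypothetical and not part of the PHP API, and it recognizes only the characters listed in Table 15.1, treating anything else as an error.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the PHP API): count how many C
 * arguments must follow a type specifier string built only from the
 * characters in Table 15.1. Returns -1 for any other character.
 */
int count_spec_args(const char *type_spec)
{
    int count = 0;
    const char *p;

    assert(type_spec != NULL);

    for (p = type_spec; *p != '\0'; p++) {
        switch (*p) {
        case 's':   /* char * plus its int length */
        case 'O':   /* zval * plus the class type to retrieve */
            count += 2;
            break;
        case 'l': case 'd': case 'b':
        case 'r': case 'a': case 'o': case 'z':
            count += 1;   /* a single destination variable */
            break;
        default:
            return -1;    /* not listed in Table 15.1 */
        }
    }
    return count;
}
```

For the specifier string "sl", the sketch reports three arguments: two for the string and its length, and one for the long.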
To understand the last few options, you need to know that a zval is the Zend Engine's value container. Whether the value is a Boolean, a string, or any other type, its information is contained in the zval union. We will not access zvals directly in this chapter, except through some accessor macros, but the following is more or less what a zval value looks like in C, so that you can get a better idea of what's going on:

    typedef union _zval {
        long lval;
        double dval;
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;
        zend_object_value obj;
    } zval;

In our examples, we use zend_parse_parameters() with basic types, receiving their values as native C types and not as zval containers.
For zend_parse_parameters() to be able to fill in the variables that receive the function parameters, you need to pass them by reference (that is, pass their addresses). Take a closer look at self_concat():

    if (zend_parse_parameters(argc TSRMLS_CC, "sl", &str, &str_len, &n) == FAILURE)
        return;

Notice that the generated code checks for the return value FAILURE (SUCCESS in case of success) to see whether the call failed. If it did, the function just returns because, as previously mentioned, zend_parse_parameters() takes care of triggering warnings.
Because your function wants to retrieve a string str and an integer n, it specifies "sl" as its type specifier string. s requires two arguments, so we send references to both a char * and an int (str and str_len) to the zend_parse_parameters() function. Whenever possible, use the string's length str_len in your source code to make sure your functions are binary safe. Don't use functions such as strlen() and strcpy() unless you don't mind if your functions don't work for binary strings. Binary strings are strings that can contain null bytes. Binary formats include image files, compressed files, executable files, and more. l requires just one argument, so we pass it the reference of n.
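To make the binary-safety advice concrete, here is a small standalone sketch that copies a buffer by its explicit length instead of relying on strlen(). It uses only the standard C library; the helper name dup_binary_safe() is hypothetical and is not a PHP function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PHP): duplicate a string using its
 * explicit length, so bytes after an embedded null byte are preserved.
 * Returns NULL on allocation failure; the caller releases the copy
 * with free().
 */
char *dup_binary_safe(const char *str, size_t len)
{
    char *copy;

    assert(str != NULL);

    copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, str, len);   /* copies all len bytes, null bytes included */
    copy[len] = '\0';         /* terminator added for convenience only */
    return copy;
}
```

For the five-byte buffer "ab\0cd", strlen() reports 2 because it stops at the first null byte, whereas dup_binary_safe() called with a length of 5 preserves the trailing "cd". A strcpy()-based copy would silently lose those bytes.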
For clarity's sake, the skeleton script creates C variable names that are identical to the argument names in your function prototype. There's no need to do so, but it is recommended practice.
Back to conversion rules. All three of the following calls to self_concat() result in the same values being stored in str, str_len, and n:

    self_concat("321", 5);
    self_concat(321, "5");
    self_concat("321", "5");

str points to the string "321", str_len equals 3, and n equals 5.
Before we write the code that creates the concatenated string and returns it to PHP, we need to cover two important issues: memory management and the API for returning values from internal PHP functions.
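The effect of the conversion rules shown with the three self_concat() calls can be imitated in standalone C. The sketch below is only an analogy built on strtol() and snprintf() from the standard library. It is not the engine's conversion code, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for receiving the string "5" where a long is
 * requested: parse a decimal string into a long.
 */
long string_to_long(const char *s)
{
    assert(s != NULL);
    return strtol(s, NULL, 10);
}

/*
 * Hypothetical stand-in for receiving the integer 321 where a string is
 * requested: write the decimal form of value into buf.
 * Returns the string length, or -1 if buf is too small.
 */
int long_to_string(long value, char *buf, size_t buf_size)
{
    int len;

    assert(buf != NULL && buf_size > 0);

    len = snprintf(buf, buf_size, "%ld", value);
    if (len < 0 || (size_t)len >= buf_size) {
        return -1;
    }
    return len;
}
```

With these helpers, long_to_string(321, buf, sizeof buf) fills buf with "321" and returns 3, and string_to_long("5") returns 5. These are the same values that end up in str, str_len, and n in the calls above.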