CHAPTER 9 – REGULAR EXPRESSIONS – Functions
Three groups of PCRE-related functions are available: matching functions, replacement functions, and splitting functions. preg_match(), discussed previously, belongs to the first group. The second group contains functions that replace substrings that match a specific pattern. The last group contains functions that split strings based on regular expression matches.
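As a quick orientation, the following minimal sketch calls one function from each group on the same string. The sample string and the patterns are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical sample data, chosen only for illustration.
$subject = 'apples, pears;  plums';

// Matching: find the first word that starts with "p".
$found = preg_match('/\bp\w+/', $subject, $m);

// Replacement: collapse each separator (comma or semicolon plus
// surrounding whitespace) into a single comma.
$replaced = preg_replace('/\s*[,;]\s*/', ',', $subject);

// Splitting: break the subject apart at the same separators.
$parts = preg_split('/\s*[,;]\s*/', $subject);

print_r($m);          // the first match, "pears", is in $m[0]
echo $replaced, "\n"; // apples,pears,plums
print_r($parts);      // apples, pears, plums as three elements
```

The same separator pattern serves both preg_replace() and preg_split(); only the action taken on each match differs.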
Matching Functions

preg_match() matches one pattern against the subject string. It returns 1 if the subject matched the pattern and 0 if it did not (or false if an error occurred). It can also fill an array, passed as the optional third argument, with the contents of the different sub-pattern matches.
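The return value and the sub-pattern array described above can be sketched as follows. The subject string and the date pattern are hypothetical and serve only as an illustration.

```php
<?php
// Hypothetical subject string, used only for illustration.
$subject = 'Order shipped on 2004-10-27';

// Three parenthesized sub-patterns: year, month, and day.
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

$result = preg_match($pattern, $subject, $matches);

if ($result === 1) {
    // $matches[0] holds the full match; $matches[1] to $matches[3]
    // hold the sub-patterns in the order of their opening parentheses.
    echo "Full match: {$matches[0]}\n";
    echo "Year: {$matches[1]}, month: {$matches[2]}, day: {$matches[3]}\n";
} elseif ($result === 0) {
    echo "No match\n";
} else {
    // preg_match() returned false: the pattern could not be applied.
    echo "Error while matching\n";
}
```

Comparing with === distinguishes "no match" (0) from an error (false), which a plain if ($result) test would treat the same way.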
The function preg_match_all() is similar, except that it matches the pattern against the subject repeatedly and returns the number of full matches. Finding all the matches is useful when extracting information from documents. Take, for example, the situation in which you want to extract email addresses from a web site:

    <?php
    $raw_document = file_get_contents('http://www.w3.org/TR/CSS21');
    $doc = html_entity_decode($raw_document);
    // Match "<" + an email address + ">". The ".?" allows the single
    // character (a space in this document) that precedes the "@".
    $count = preg_match_all(
        '/<(?P<email>([a-z.]+).?@[a-z0-9]+\.[a-z]{1,6})>/Ui',
        $doc,
        $matches
    );
    print_r($matches);
    ?>

outputs

    Array
    (
        [0] => Array
            (
                [0] => <bert @w3.org>
                [1] => <tantekc @microsoft.com>
                [2] => <ian @hixie.ch>
                [3] => <howcome @opera.com>
            )
        [email] => Array
            (
                [0] => bert @w3.org
                [1] => tantekc @microsoft.com
                [2] => ian @hixie.ch
                [3] => howcome @opera.com
            )
        [1] => Array
            (
                [0] => bert @w3.org
                [1] => tantekc @microsoft.com
                [2] => ian @hixie.ch
                [3] => howcome @opera.com
            )
        [2] => Array
            (
                [0] => bert
                [1] => tantekc
                [2] => ian
                [3] => howcome
            )
    )

This example reads the contents of the CSS 2.1 specification into a string and decodes the HTML entities in it. The script then calls preg_match_all() on the document with a pattern that matches < + an email address + >, and stores the email addresses in the $matches array. The output shows that, by default, preg_match_all() does not group all the sub-patterns belonging to one match into one element of $matches. Instead, each element of $matches collects the results for one sub-pattern across all the matches: $matches[0] holds every full match, $matches['email'] and $matches[1] both hold the named sub-pattern, and $matches[2] holds the part of each address before the @. Passing the PREG_SET_ORDER flag as the fourth argument reverses this grouping.

preg_grep() performs similarly to the UNIX egrep command. It compares a pattern against the elements of an array containing the subjects, and returns an array containing the elements that matched the pattern. See the next example, which returns all valid IP addresses from the array $addresses:

    <?php
    $addresses = array('212.187.38.47', '188.141.21.91', '2.9.256.7', '<<empty>>');
    // Each octet is 0-99, 100-199, 200-249, or 250-255. The trailing $
    // rejects subjects with extra characters after the fourth octet.
    $pattern = '@^((\d?\d|1\d\d|2[0-4]\d|25[0-5])\.){3}' .
               '(\d?\d|1\d\d|2[0-4]\d|25[0-5])$@';
    $addresses = preg_grep($pattern, $addresses);
    print_r($addresses);
    ?>
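Two details of preg_grep() are easy to overlook: the returned array keeps the keys of the input array, and the optional PREG_GREP_INVERT flag returns the elements that do not match. The following minimal sketch shows both; the input array and the pattern are hypothetical.

```php
<?php
// Hypothetical input, chosen only for illustration.
$lines = array('alpha', '42', 'beta', '7', 'gamma');

// Keep the elements that consist of digits only.
$numbers = preg_grep('/^\d+$/', $lines);

// PREG_GREP_INVERT returns the elements that do NOT match the pattern.
$words = preg_grep('/^\d+$/', $lines, PREG_GREP_INVERT);

print_r($numbers);             // original keys 1 and 3 are preserved
print_r(array_values($words)); // array_values() reindexes from 0
```

Because the keys are preserved, the result can contain gaps in its numbering; wrap it in array_values() when a consecutively indexed list is needed.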