There are several ways to extract data from a web page. Two common ones are CSS selectors and XPath, both of which select elements from the DOM.
Since I cannot see the full HTML of the page, it is hard for me to say which method suits you best. There is another option that is often frowned upon, but it may work in this case.
You can use a regex (regular expression) to find the text. I'm not the best at regular expressions, but here is some sample code showing how this might work:
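For comparison, here is a minimal XPath sketch using PHP's built-in DOMDocument and DOMXPath classes. The sample markup and the function name `find_status_with_xpath` are made up for illustration; the real page structure is unknown, so the query would likely need adjusting.

```php
<?php
// Hypothetical sample markup; the real page will differ.
$html = '<html><body><p>Some User</p><p>User status: Online.</p></body></html>';

// Hypothetical helper: returns the text after "User status: ", or null if not found.
function find_status_with_xpath(string $html): ?string
{
    $prefix = 'User status: ';
    $doc = new DOMDocument();
    // Suppress the warnings that non-well-formed, real-world HTML tends to trigger.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    // Select every <p> whose text starts with the prefix.
    $nodes = $xpath->query('//p[starts-with(normalize-space(.), "' . $prefix . '")]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // Take the first match and cut the prefix off.
    $text = trim($nodes->item(0)->textContent);
    return substr($text, strlen($prefix));
}

echo find_status_with_xpath($html), PHP_EOL;
```

Unlike a regex, this does not depend on how the tags are written in the source, only on the parsed structure.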
<?php
$subject = "<html><body><p>Some User</p><p>User status: Online.</p></body></html>";
$pattern = '/User status: (.*)\<\/p\>/';
preg_match($pattern, $subject, $matches);
print_r($matches);
?>
Output Example:
Array
(
    [0] => User status: Online.</p>
    [1] => Online.
)
Basically, the regular expression above searches for the string "User status: " and then captures all the characters (.*) up to the closing paragraph tag (escaped).
Here is a pattern that returns only "Online" without the period. I was not sure that every status ends with a period, but this is what it would look like:
'/User status: (.*)\.\<\/p\>/'
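Since it is unclear whether every status ends with a period, one possible variant makes the period optional and uses a lazy quantifier so the match stops at the first closing paragraph tag. This is only a sketch: the function name `extract_status` and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: returns the status text without a trailing period, or null.
function extract_status(string $html): ?string
{
    // (.*?) is lazy, so it stops at the first </p>;
    // \.? consumes one trailing period if there is one.
    if (preg_match('/User status: (.*?)\.?<\/p>/', $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(extract_status('<p>User status: Online.</p>'));          // string(6) "Online"
var_dump(extract_status('<p>User status: Away</p><p>Other</p>')); // string(4) "Away"
```

The lazy quantifier matters when more paragraphs follow on the same line; the greedy (.*) would run on to the last closing tag.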