How to use RegEx to strip specific leading and trailing punctuation in PHP

Exception or error:

We’re scrubbing a huge amount of data and are finding many examples of otherwise clean data that are left with irrelevant punctuation at the beginning and end of the final string. Single and double quotes are fine, but leading/trailing dashes, commas, etc. need to be removed.

I’ve studied the answer at "How can I remove all leading and trailing punctuation?" but have been unable to find a way to accomplish the same in PHP.

- some text.                dash and period should be removed
"Some Other Text".          period should be removed
it's a matter of opinion    apostrophe should be kept
/ some more text?           Slash should be removed and question mark kept

In short,

  • Certain punctuation occurring BEFORE the first AlphaNumeric character must be removed
  • Certain punctuation occurring AFTER the last AlphaNumeric character must be removed

How can I accomplish this with PHP? The few examples I’ve found surpass my RegEx/JS abilities.

How to solve:

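You can modify the character class in each pattern to include whichever characters you want stripped. The first pattern is anchored to the start of the string and the second to the end.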

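$array = array(
    '- some text.',
    '"Some Other Text".',
    'it\'s a matter of opinion',
    '/ some more text?'
);

foreach ($array as $key => $string) {
    // Strip whitespace, periods, commas, dashes and slashes from both ends.
    // \s is included so the space left behind by "- " or "/ " is removed too.
    // Quotes and question marks are not in the character class, so they stay.
    $array[$key] = preg_replace(array(
        '/^[\s.,\-\/]+/',
        '/[\s.,\-\/]+$/'
    ), '', $string);
}

print_r($array);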
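
The two anchored patterns can also be combined into a single pattern with alternation. The following is a minimal sketch, not the answerer's code: the function name strip_edge_punctuation is hypothetical, and the character class (whitespace, period, comma, dash, slash) is an assumption about which characters count as "irrelevant" here.

```php
<?php
// Hypothetical helper: strips the listed characters from both ends of a string.
// The character class is an assumption; add or remove characters as needed.
function strip_edge_punctuation(string $text): string
{
    // ^[...]+ matches a leading run, [...]+$ matches a trailing run.
    // preg_replace() replaces every match, so both ends are handled in one call.
    return preg_replace('/^[\s.,\-\/]+|[\s.,\-\/]+$/', '', $text);
}

$samples = array(
    '- some text.',
    '"Some Other Text".',
    'it\'s a matter of opinion',
    '/ some more text?',
);

foreach ($samples as $sample) {
    echo strip_edge_punctuation($sample), PHP_EOL;
}
```

Because the quote and the question mark are not in the character class, the second and fourth strings keep them, which matches the examples in the question.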

Answer:

This is an answer without regex.

You can use the function trim (or a combination of ltrim/rtrim) and pass it all the characters you want to remove as the second argument. For your example:

$str = trim($str, " \t\n\r\0\x0B-.,/");

(Since I assume you also want to remove whitespace and newlines at the beginning and end, I kept the characters from the default mask.)
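
To see how trim behaves on the strings from the question, here is a small sketch. The mask is an assumption: it is trim's default whitespace list plus dash, period, comma and slash. Note that inside a trim mask, two consecutive dots ("..") define a character range, so only a single dot is listed.

```php
<?php
// Assumed mask: trim()'s default whitespace characters plus - . , /
$mask = " \t\n\r\0\x0B-.,/";

$samples = array(
    '- some text.',
    '"Some Other Text".',
    'it\'s a matter of opinion',
    '/ some more text?',
);

$cleaned = array();
foreach ($samples as $sample) {
    // trim() removes any run of mask characters from both ends of the string.
    $cleaned[] = trim($sample, $mask);
}

print_r($cleaned);
```

Quotes, apostrophes and question marks are left alone simply because they are not in the mask.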

See also rtrim and ltrim if you don’t want to remove the same charlist at the beginning and the end of your strings.
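
As a rough illustration of using different masks at each end, the sketch below strips a leading slash but keeps a trailing one. The helper name trim_edges and both masks are hypothetical choices for this example, not part of the original answer.

```php
<?php
// Hypothetical helper: applies one mask to the start and another to the end.
function trim_edges(string $text, string $leadingMask, string $trailingMask): string
{
    return rtrim(ltrim($text, $leadingMask), $trailingMask);
}

// Assumed masks: the slash is only stripped at the beginning.
$leadingMask  = " \t\n\r\0\x0B-.,/";
$trailingMask = " \t\n\r\0\x0B-.,";

echo trim_edges('/ some more text?,', $leadingMask, $trailingMask), PHP_EOL;
echo trim_edges('- docs/', $leadingMask, $trailingMask), PHP_EOL;
```

The first call drops the leading "/ " and the trailing comma but keeps the question mark; the second keeps the trailing slash because it is not in the trailing mask.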
