I am trying to write a function that finds the most used words on a web page.

So far I can fetch an HTML page, and the function counts the words and filters them by most used, but I can't get it to leave out words that are shorter than 3 characters.

$contents = file_get_contents('');
$search = array(
        '@<script[^>]*?>.*?</script>@si',   // Strip out javascript
        '@<head>.*?</head>@siU',            // Lose the head section
        '@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
        '@<![\s\S]*?--[ \t\n\r]*>@',        // Strip multi-line comments including CDATA
);

$contents = preg_replace($search, '', $contents);
// Split the remaining text into words and count each one (word => count)
$result = array_count_values(
            str_word_count(strip_tags($contents), 1)
);

How can I add this filtering to the function?

Just build a new result array with the shorter words filtered out:

$result2 = array();
foreach($result as $k => $v) {
  // $k is the word, $v is its count
  if(strlen($k) > 2) {
    $result2[$k] = $v;
  }
}


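You can use array_filter. Note that it checks the array values by default, so apply it to the word list returned by str_word_count, before the words are counted:

// $array holds the words from str_word_count(), not the counted result
$array = array_filter($array, function($value){
    return strlen($value) >= 3;
});

Everything that doesn't pass the check gets filtered out.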
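
If you would rather filter the counted result directly, where the words are the array keys, array_filter accepts ARRAY_FILTER_USE_KEY as a third argument (PHP 5.6 and later). Below is a rough end-to-end sketch under a few assumptions of mine: the inline $contents string is made-up sample markup standing in for the fetched page, and the strtolower() and arsort() calls are additions that are not in the question.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$contents = '<html><head><title>Demo</title></head><body>'
          . '<p>The cat sat on the mat. The cat is happy.</p>'
          . '<script>var a = 1;</script></body></html>';

$search = array(
    '@<script[^>]*?>.*?</script>@si',   // Strip out javascript
    '@<head>.*?</head>@siU',            // Lose the head section
    '@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
    '@<![\s\S]*?--[ \t\n\r]*>@',        // Strip multi-line comments including CDATA
);
$contents = preg_replace($search, '', $contents);

// Assumption: lowercase first so "The" and "the" are counted together.
$words  = str_word_count(strtolower(strip_tags($contents)), 1);
$result = array_count_values($words);   // word => count

// Keep only entries whose key (the word) has at least 3 characters.
$result = array_filter($result, function ($word) {
    return strlen($word) >= 3;
}, ARRAY_FILTER_USE_KEY);

// Sort by count, highest first, keeping the word keys.
arsort($result);
print_r($result);
```

With the sample text, "on" and "is" are dropped and "the" ends up first with a count of 3. For pages with multibyte text, strlen() counts bytes rather than characters, so mb_strlen() may be a better fit there.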

