Checking whether a string contains a substring is a common requirement for programmers. Here I’m describing 3 functions to check if a string contains specific words.
The top 3 functions available in PHP to check for word existence are:
- strpos – Find the position of the first occurrence of a substring in a string.
- strstr – Find the first occurrence of a string
- preg_match – Perform a regular expression match
Each of them has variations for case-insensitive search, last occurrence, etc. We will use ‘strpos’ here, as it is faster and is enough to check whether an occurrence exists.
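To make the differences concrete, here is a small sketch of what each function, plus two of the variations (‘stripos’ and ‘strrpos’), returns. The sample sentence and variable names are only illustrative.

```php
<?php
$sentence = 'How are you?';

// strpos: integer offset of the first match, or false when there is none.
$first = strpos($sentence, 'are');          // 4

// stripos: same as strpos, but case-insensitive.
$firstNoCase = stripos($sentence, 'ARE');   // 4

// strrpos: offset of the last match.
$last = strrpos($sentence, 'o');            // 9 (the 'o' in 'you')

// strstr: the rest of the string starting at the first match, or false.
$rest = strstr($sentence, 'are');           // 'are you?'
$missing = strstr($sentence, 'were');       // false

// preg_match: 1 on a match, 0 on no match, false on error.
$matched = preg_match('/are/', $sentence);  // 1

var_dump($first, $firstNoCase, $last, $rest, $missing, $matched);
```

Only ‘strstr’ builds and returns a new string; ‘strpos’ hands back just an offset, which is all an existence check needs.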
Check if string contains specific words
Suppose we have the requirement below:
```php
$sentence = 'How are you?';
$word = 'are';

// Pseudocode: PHP has no "contains" operator
if ($sentence contains $word)
    echo 'true';
```
Then the correct way to write the statement will be:
```php
if (strpos($sentence, $word) !== false) {
    echo 'true';
}

// Or wrap the check in a function for better code readability.
// Returns true if $needle is a substring of $haystack.
function contains($needle, $haystack)
{
    return strpos($haystack, $needle) !== false;
}
```
Note that the use of !== false is deliberate. If the needle ($word) you are searching for is at the beginning of the haystack ($sentence), ‘strpos’ returns position 0. Since 0 is a valid offset and 0 is ‘falsy’, simpler constructs like !strpos($sentence, $word) would wrongly report that the word is missing.
If you want to check that a string does not contain a word, keep ‘false’ and switch to the complementary operator ‘===’ rather than changing ‘false’ to ‘true’: strpos($sentence, $word) === false.
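The position-0 pitfall is easy to reproduce. The following is a minimal sketch with a made-up sentence that starts with the word being searched for; the variable names are illustrative.

```php
<?php
$sentence = 'are you there?';
$word = 'are';

// The match sits at the very start, so strpos returns int 0, not false.
$pos = strpos($sentence, $word);

// Loose check: 0 is cast to false, so the match is wrongly reported as missing.
$looseCheck = (bool) $pos;

// Strict check: only a real boolean false means "not found".
$strictCheck = ($pos !== false);

// Negative check for a word that is not present.
$notContains = (strpos($sentence, 'xyz') === false);

var_dump($pos, $looseCheck, $strictCheck, $notContains);
```

Here the loose check evaluates to false even though the word is present, while the strict comparison gives the expected true.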
Be aware that the check above also returns true for the string ‘Do you care?’, because ‘are’ appears inside ‘care’. If you need to match whole words only, you can either tighten the ‘strpos’ condition or use ‘preg_match’.
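```php
// $needle is the word to search for and $haystack is the string.
// Padding both with spaces only matches words delimited by spaces,
// so a word followed by punctuation (e.g. 'you' in 'How are you?') is missed.
function containsWord($needle, $haystack)
{
    $haystack = ' ' . $haystack . ' ';
    $needle = ' ' . $needle . ' ';
    return strpos($haystack, $needle) !== false;
}
```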
```php
// Case-insensitive whole-word match using regex word boundaries (\b).
function containsWord($needle, $haystack)
{
    return !!preg_match('#\b' . preg_quote($needle, '#') . '\b#i', $haystack);
}
```
The ‘preg_match’ version above has its own limitation: \b marks a boundary between a word character and anything that isn’t a-z, A-Z, 0-9, or _. That means digits and underscores are counted as word characters, and scenarios like these will return false:
- The ‘are’ at the beginning of ‘area’
- The ‘are’ at the end of ‘hare’
- The ‘are’ in the middle of ‘fares’
- The ‘are’ in ‘What _are_ you thinking?’
- The ‘are’ in ‘lol u dunno wut those are4?’
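These cases can be checked with a short sketch. The function below repeats the ‘preg_match’ logic shown earlier under a different, hypothetical name (containsWholeWord) so it can sit next to the other helpers without a name clash; the sample sentences come from the list above.

```php
<?php
// Hypothetical helper: same regex as the preg_match-based containsWord().
function containsWholeWord(string $needle, string $haystack): bool
{
    return preg_match('#\b' . preg_quote($needle, '#') . '\b#i', $haystack) === 1;
}

$samples = [
    'How are you?',                 // standalone word: matches
    'Do you care?',                 // 'are' inside 'care': no boundary before it
    'area',                         // no boundary after 'are'
    'What _are_ you thinking?',     // underscores are word characters
    'lol u dunno wut those are4?',  // the digit 4 is a word character
];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = containsWholeWord('are', $sample);
}

var_dump($results);
```

Only the first sample yields true. Whether the underscore and digit cases should count as matches depends on your data, so treat \b as a reasonable default rather than a complete definition of a ‘word’.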
‘preg_match’ is slower than ‘strpos’ and is not recommended if you just need to check whether a string contains specific words.