[eluser]TheFuzzy0ne[/eluser]
Thanks for your replies everyone.
If I split by spaces, and the text contains more than one sequential space, this happens:
Code:
$str = "this is  a  test"; // note the double spaces after "is" and "a"
$arr = explode(' ', $str);
print_r($arr);
Array
(
    [0] => this
    [1] => is
    [2] =>
    [3] => a
    [4] =>
    [5] => test
)
Sorry for not making myself clear, guys, but I'm not looking for different ways to split. I'm basically looking for a method that won't let any characters into the database index that shouldn't be there. In other words, I can't decide what should and shouldn't be indexed.
I'm going to settle on this regex unless anyone has a better suggestion.
The regex extracts any run of letters a-z, digits 0-9, or apostrophes, and only keeps words and numbers that are more than three characters long. The only catch I can see is that it will also extract ''', which isn't a word... I'm not sure whether I should even include apostrophes.
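For illustration only, here is a rough sketch of an extraction along the lines described above. The regex from the post isn't shown, so the pattern below is an assumption built from the description (letters, digits, and apostrophes, four or more characters), and `extract_index_words()` is a hypothetical helper name. One possible way around the apostrophe catch is to discard any match that contains nothing but apostrophes.

```php
<?php
// Hypothetical helper: NOT the regex from the post, just one pattern
// that fits the description (a-z, 0-9, apostrophes, 4+ characters).
function extract_index_words($text)
{
    // Collect every run of 4 or more allowed characters (case-insensitive).
    preg_match_all("/[a-z0-9']{4,}/i", $text, $matches);

    // Drop runs that consist only of apostrophes, e.g. "''''".
    $words = array_filter($matches[0], function ($word) {
        return trim($word, "'") !== '';
    });

    // Re-index so the keys run 0, 1, 2, ...
    return array_values($words);
}

print_r(extract_index_words("this is  a  test, don't index '''' please"));
```

With this approach, repeated spaces and punctuation never reach the index because only matching runs are collected, rather than splitting on a delimiter. Whether apostrophes belong in the character class at all is still a judgment call; removing `'` from the pattern would split "don't" into pieces too short to index.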