Search text & highlight words in a range
#1

[eluser]adamfairholm[/eluser]
I'm writing a little search that just does a basic LIKE query on some text fields in a database.

I'm currently using the CI highlight_phrase and word_limiter to show where someone's query phrase comes up in the first 200 words or so of a text area, but of course the phrase isn't always in the first 200 words.

Does anyone have a method they've used to limit words starting with the first instance of a phrase? I've poked around various sources but haven't been able to find any ideas.
#2

[eluser]zpjorge[/eluser]
just write one dude, it will take you 10 minutes if you know what you are doing.
#3

[eluser]adamfairholm[/eluser]
If every post looking for advice got an answer of "just write one dude", then this would be a pretty small forum.

I'm asking because I'm not sure about the right way to start or what the most efficient approach is. Has anyone done this before and found a good solution?
#4

[eluser]Michael Wales[/eluser]
Let me see if I am understanding you correctly - you want to limit the returned string to 200 words immediately following the first instance of the search phrase? I've modified the word_limiter() helper function to allow this, place it in your application/helpers/ directory.

Edit: Incorrect example - see sophistry's comments below.
#5

[eluser]sophistry[/eluser]
i think zpj was just encouraging you to attempt to solve your problem and then ask the forum: "why doesn't this thing i did work the way i thought it would"

the CI forum is a generally helpful place, and sometimes it helps to have someone tell you - "stop trying to hold my hand, just walk on your own"

anyhow, there is a function in PHP called strstr (or stristr, its case-insensitive version) that returns the text from the first instance of a phrase through to the end of the string, phrase included.

this might help you get started... and now, please, when you solve your issue, post your code to this forum thread because i want to see it. ;-)
Code:
<?php

  $string = 'Hello World! This is a phrase i want to find.';
  $phrase = 'Phrase i want to find';
  // case insensitive matching
  $phrase_to_end = stristr($string, $phrase);
  
  if($phrase_to_end === FALSE)
  {
    echo $phrase . ' not found in string';
  }
  else
  {
    // do word limiter and highlight word here
    // using $phrase_to_end as your input text
    echo $phrase . ' WAS found in string';
    echo '<br>';
    echo $phrase_to_end;
    echo '<br>';
    echo $string;
  }
?>
#6

[eluser]sophistry[/eluser]
oops, michael w. beat me to it.

however, my advice is don't modify the word_limiter() function by adding the $offset parameter - i should know; i wrote word_limiter(). ;-) the last page of that thread shows the function with my extensive commenting about the regex but someone took it out when they put it into the CI text helper.

$offset in MW's code above is a character position, but where it is placed in the regex is a word position.
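
To make the character-versus-word distinction concrete, here is a small sketch in plain PHP. The sample text and variable names are made up for illustration, and this is not MW's removed code. It only shows that stripos() reports a character offset, which would have to be converted before it could be used as a word count inside a regex quantifier.

```php
<?php
// Illustrative sample text and phrase (assumptions, not from the thread).
$text   = 'The quick brown fox jumps over the lazy dog';
$phrase = 'FOX';

// stripos() returns a CHARACTER offset (0-based), or FALSE if not found.
$char_offset = stripos($text, $phrase);

// The WORD offset is the number of words in the text before that character.
$word_offset = ($char_offset === FALSE)
    ? FALSE
    : str_word_count(substr($text, 0, $char_offset));

// stristr() gives the same result as cutting the string at the character offset.
$from_phrase = stristr($text, $phrase);

echo "character offset: $char_offset\n";
echo "word offset: $word_offset\n";
echo "from phrase: $from_phrase\n";
```

Here the phrase starts at character 16 but is only the fourth word (three words precede it), so feeding the character offset into a word-count quantifier would skip far too much text.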

you could test the code and report your results to MW and ask him to fix it or use stristr().

cheers.

glad to have you back michael wales!
#7

[eluser]Michael Wales[/eluser]
Thanks for the correction sophistry - I wasn't sure if it was right or not. I suck at regex and don't have a dev environment to test on. :D
#8

[eluser]sophistry[/eluser]
i love regex! as soon as i started thinking about it as a super compact pattern-detection programming language and approached it like i would any other programming challenge it became really fun.
#9

[eluser]adamfairholm[/eluser]
Thanks Michael and sophistry - I've got something going now that does the trick.

For some reason I've never come across stristr or strpos in my PHP travels, so that got me started on finding a solution.

This is what I ended up with that works for me:

Code:
function highlight_offset($str, $needle, $limit = 100, $tag_open = '<strong>', $tag_close = '</strong>', $end_char = '…', $start_char = '…')
{
  if(trim($str) == '')
  {
    return $str;
  }

  // Everything from the first case-insensitive match of $needle to the end of $str
  $sub_string = stristr($str, $needle);

  // If stristr returns FALSE, then just make the substring the original string
  if($sub_string === FALSE)
  {
    $sub_string = $str;
    $start_char = '';
  }

  // Keep at most $limit words of the substring
  preg_match('/^\s*+(?:\S++\s*+){1,'.(int) $limit.'}/', $sub_string, $matches);

  // No trailing ellipsis when nothing was cut off the end of the substring
  if (strlen($sub_string) == strlen($matches[0]))
  {
    $end_char = '';
  }

  $sub_string = rtrim($matches[0]).$end_char;

  // Wrap every case-insensitive occurrence of $needle in the highlight tags
  $output_text = preg_replace('/('.preg_quote($needle, '/').')/i', $tag_open."\\1".$tag_close, $sub_string);

  return $start_char.$output_text;
}

I'm using this on a simple search; it combines the highlighting and the offsetting. The search could also find a result in another field in the table, so if it's not finding the result in this particular field, it doesn't put the ellipsis at the beginning of the string.

I'm sure there are other ways to do this, but this gives me the results I'd like. Thanks for the help.
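
For anyone puzzled by the regex in that function, here is a minimal sketch of the word-limiting pattern on its own. The sample text, the limit of 3, and the variable names are assumptions chosen for illustration; the pattern itself is the one used in the function above.

```php
<?php
// Sample text and limit are illustrative assumptions.
$text  = "  one two   three four five";
$limit = 3;

// ^\s*+          leading whitespace, if any
// (?:\S++\s*+)   one word: a run of non-whitespace plus the whitespace after it
// {1,$limit}     repeated at least once and at most $limit times
// The doubled quantifiers (*+ and ++) are possessive: they never backtrack.
preg_match('/^\s*+(?:\S++\s*+){1,' . (int) $limit . '}/', $text, $matches);

// Trim the whitespace that was captured after the last kept word.
$first_words = rtrim($matches[0]);

// If the match is shorter than the input, something was cut off the end.
$was_cut = strlen($matches[0]) < strlen($text);

echo $first_words . ($was_cut ? '…' : '') . "\n";
```

The match keeps the original spacing between words, so only the trailing whitespace needs trimming before the ellipsis is appended.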
#10

[eluser]adamfairholm[/eluser]
Also, I'm with Michael on regex. I need to sit down with a good guide and start from the basics, otherwise it will continue to confuse and terrify me.



