WAY2WEB: Web Design & business...


Link Extractor

When building certain web applications, you need to search through a page's entire source code to find all of the outgoing links. This can be a tricky task to wrap your head around: where do you start?

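Well, it isn't so hard, but to help you along, I've written the following PHP script. It collects every piece of text that sits between a start marker (here href=") and an end marker (the closing quote). Take a look and see what you think: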

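<?php
// Returns every substring of $s found between the delimiters $s1 and $s2.
// The delimiters are matched case-insensitively; the results keep their original case.
function hyperlinkextract($s1, $s2, $s) {
  $myarray = array();
  $s1 = strtolower($s1);
  $s2 = strtolower($s2);
  $L1 = strlen($s1);
  $L2 = strlen($s2);
  // An empty delimiter would never advance through the string, so bail out early.
  if ($L1 == 0 || $L2 == 0) {
    return $myarray;
  }
  // Lowercased copy used only for searching.
  $scheck = strtolower($s);
  $pos2 = false;

  do {
    // Find the next start delimiter.
    $pos1 = strpos($scheck, $s1);
    if ($pos1 !== false) {
      // Find the end delimiter, counted from just after the start delimiter.
      $pos2 = strpos(substr($scheck, $pos1 + $L1), $s2);
      if ($pos2 !== false) {
        $myarray[] = substr($s, $pos1 + $L1, $pos2);
        // Drop everything up to and including the end delimiter, then search again.
        $s = substr($s, $pos1 + $L1 + $pos2 + $L2);
        $scheck = strtolower($s);
      }
    }
  } while (($pos1 !== false) and ($pos2 !== false));

  return $myarray;
}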

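$content = file_get_contents('http://www.way2web.net/');
if ($content === false) {
  die("Could not fetch the page.\n");
}
$myarray = hyperlinkextract("href=\"", "\"", $content);

// Process all the links
foreach ($myarray as $val) {
  // Escape each value, since it comes from a page we don't control.
  echo "<br />" . htmlspecialchars($val) . "\n";
}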
?>
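
One thing to keep in mind: called as above, the function only finds href values wrapped in double quotes, and it returns every link on the page, not only the outgoing ones. Here is a rough sketch of one way around both points. It is not part of the script above: it swaps the string scanning for PHP's DOM extension (DOMDocument), and the names extractLinksDom and filterOutgoing, along with the sample HTML, are made up for this example. It also treats "outgoing" as "absolute link whose host differs from yours", which is an assumption you may want to adjust.

```php
<?php
// Hypothetical helper: collect the href of every <a> tag using PHP's DOM extension.
function extractLinksDom(string $html): array {
  $links = array();
  if ($html === '') {
    return $links; // loadHTML() does not accept an empty string
  }
  $doc = new DOMDocument();
  // Real pages are rarely valid HTML, so collect parser warnings instead of printing them.
  $previous = libxml_use_internal_errors(true);
  $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
      $links[] = $anchor->getAttribute('href');
    }
  }
  return $links;
}

// Hypothetical helper: keep only absolute links whose host differs from $ownHost.
// Relative links have no host component and are treated as internal.
function filterOutgoing(array $links, string $ownHost): array {
  $outgoing = array();
  foreach ($links as $link) {
    $host = parse_url($link, PHP_URL_HOST);
    if (is_string($host) && strcasecmp($host, $ownHost) !== 0) {
      $outgoing[] = $link;
    }
  }
  return $outgoing;
}

// Made-up sample markup: one internal link, one single-quoted external link, one anchor without href.
$html = '<p><a href="http://www.way2web.net/">Home</a> '
      . "<a href='http://example.com/'>Elsewhere</a> "
      . '<a name="top">No href</a></p>';

$all = extractLinksDom($html);
$outgoing = filterOutgoing($all, 'www.way2web.net');

foreach ($outgoing as $link) {
  echo "<br />" . htmlspecialchars($link) . "\n";
}
?>
```

The string-scanning function is still handy when you want text between any two markers, not just links. The DOM version is only worth the extra lines when the markup is messy or the quoting style varies.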

Feel free to use this for whatever you want. A link back to this site would be nice, but I won't impose that limitation on you. Just enjoy this little function.