Extending own libraries with good OO
#1

(This post was last modified: 12-10-2015, 03:45 PM by RobertSF.)

I'm working on a web scraper that turns web pages with rows of data (like a page of search results) into CSV files to use in Excel. Of course, there must be custom code for every site the scraper works on. Here is how I've arranged it, and it works fine, but I'd like feedback on the OO implementation. Not sure it follows best practices.

Here's the controller code. Give it the database row id of a search, and it will retrieve the search profile (search name, search url, etc.), load the correct library file, execute the scrape() function in that library, and then download the file.
PHP Code:
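// controller Searches.php
public function execute($search_id)
{
    // Fetch the search profile (name, url, site_class, ...) for this row id
    $search = $this->searches_model->get_search($search_id);
    if ($this->input->server('REQUEST_METHOD') == 'POST')
    {
        // Load the site-specific library named in the profile
        $this->load->library($search['site_class']);
        // Braces keep the dynamic property lookup unambiguous across PHP versions
        $output = $this->{$search['site_class']}->scrape($search);
        force_download($search['name'] . '.csv', $output);
    }
}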

Then there is the abstract parent class.
PHP Code:
// application/libraries/Site.php
abstract class Site {
    static private $curl_options = array(
        //bunch of CURLOPT options
    );
    abstract protected function scrape($search);
    public function get_page($url)
    {
    }
    public function clean_field($field)
    {
    }
}

And the site-specific child class.
PHP Code:
// application/libraries/Site_craigslist.php
include ('Site.php');
class Site_craigslist extends Site {
    const SITE = 'http://sfbay.craigslist.com';
    const SITE_CODE = 'CL';
    public function scrape($search)
    {
    }
}

In the controller code, $search['site_class'] contains the string 'site_craigslist', so Site_craigslist.php is loaded, and it in turn loads Site.php, the parent class. It works, but I'm not instantiating anything, and I have to use self:: to access the parent class methods. I've read you generally shouldn't use classes statically, so I suspect the above is not best practice. Any input or suggestions for a better structure and approach?
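For comparison, here is a minimal sketch of the same two classes used through an instance. CodeIgniter's $this->load->library() creates an object and attaches it to the controller, so inherited methods should be reachable with $this-> rather than self::. The stubbed get_page() body, the clean_field() body, the 'url' key, and the '/search/sss' path are assumptions made only so the sketch runs without network access; they are not the original implementation.

```php
<?php
// Hypothetical stand-ins for the Site and Site_craigslist classes above.
abstract class Site
{
    // Instance property instead of a static one; real CURLOPT_* values would go here.
    protected $curl_options = array();

    abstract public function scrape(array $search);

    // Stub: returns canned HTML so the sketch needs no network.
    // The real method would presumably fetch $url with cURL.
    public function get_page($url)
    {
        return "<li> First ad </li>\n<li> Second ad </li>";
    }

    // Assumed cleanup: drop tags and surrounding whitespace.
    public function clean_field($field)
    {
        return trim(strip_tags($field));
    }
}

class Site_craigslist extends Site
{
    const SITE = 'http://sfbay.craigslist.com';

    public function scrape(array $search)
    {
        // Inherited methods are called on the instance; self:: is only used for the constant.
        $html = $this->get_page(self::SITE . $search['url']);
        $rows = array();
        foreach (explode("\n", $html) as $line) {
            $rows[] = $this->clean_field($line);
        }
        return implode("\n", $rows);
    }
}

$site = new Site_craigslist();
echo $site->scrape(array('url' => '/search/sss')), "\n";
```

With this shape the abstract scrape() is declared public, since the controller calls it from outside the class.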
Hey, don't work without a PHP debugger. Several free IDEs have this feature built in. Two are NetBeans and CodeLobster. Without a debugger, it's like you're driving with a blindfold on -- you are going to crash!
#2

Yes, that's true. But my question wasn't about web scraping. It was about how to structure our own libraries, and it would apply even if this were a different application. Any ideas? :)
#3

(This post was last modified: 12-11-2015, 06:34 PM by solidcodes.)

Oops, sorry, wrong answer.
How about just following SOLID?
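One way SOLID might look here, as a rough sketch only: page fetching and site-specific parsing are split into separate classes, and the scraper receives its fetcher through the constructor. Every name below (PageFetcher, FakeFetcher, Scraper, CraigslistScraper) and the canned HTML are hypothetical, not part of the code posted above.

```php
<?php
// Single responsibility: fetching pages is its own concern.
interface PageFetcher
{
    public function fetch($url);
}

// Callers depend on this abstraction, not on a concrete site class.
interface Scraper
{
    public function scrape(array $search);
}

// Test double returning canned HTML; a real fetcher would wrap cURL.
class FakeFetcher implements PageFetcher
{
    public function fetch($url)
    {
        return "<p>Bike|100</p>\n<p>Desk|45</p>";
    }
}

class CraigslistScraper implements Scraper
{
    private $fetcher;

    // Dependency injection: the fetcher is passed in, so it can be swapped.
    public function __construct(PageFetcher $fetcher)
    {
        $this->fetcher = $fetcher;
    }

    public function scrape(array $search)
    {
        // Site-specific parsing only: strip tags, turn "|" into CSV commas.
        $csv = array();
        foreach (explode("\n", $this->fetcher->fetch($search['url'])) as $line) {
            $csv[] = str_replace('|', ',', strip_tags($line));
        }
        return implode("\n", $csv);
    }
}

$scraper = new CraigslistScraper(new FakeFetcher());
echo $scraper->scrape(array('url' => 'http://example.com/search')), "\n";
```

Adding another site would then mean writing one more class that implements Scraper, without touching the fetcher or the controller logic.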
No SEO spam
#4

Yes, good idea. I was just reading https://en.wikipedia.org/wiki/SOLID_(obj...ed_design)