イノベーション エンジニアブログ


株式会社イノベーションのエンジニアたちの技術系ブログです。ITトレンド・List Finderの開発をベースに、業務外での技術研究などもブログとして発信していってます!


このエントリーをはてなブックマークに追加

PHP Under the Hood

Nice to meet you! First-year engineer Ah-Yung here! For my first blog post I wanted to introduce some of the peculiarities in variable declaration and memory management within PHP. As a new-grad I have a background in C++, so the differences between a compiled language and an interpreted language intrigued me.

Variable Declarations

Variable declaration has always been a strict process. Especially with the constant need to ensure that the type and size were correct in relation to the variable at hand. However, with PHP much of the type declaration and memory allocation is done under the hood, Here is a basic example of a PHP variable declaration:

$myNum = 1.23;

In PHP variables are maintained and represented by an arbitrary value known as Zend value, usually shortened to zval. Each variable declared corresponds to a zval in the form of a struct:

Zval Struct

typedef struct _zval_struct {
    zvalue_value value;        /* variable value */
    zend_uint refcount__gc;    /* reference counter */
    zend_uchar type;           /* value type */
    zend_uchar is_ref__gc;     /* reference flag */
} zval;

Each zval struct contains the value of the variable, the type of the variable, the reference counter and the reference flag. The first two components are fairly straightforward; however, the reference counter and the reference flag are more specific to PHP.
The reference counter indicates the number of ways that the value of the variable is referenced to. Rather than make a new copy of the variable each time, PHP uses a symbol table to keep track of how many different ways that the variable is referred to. The reference flag is a way for PHP to keep track of whether the variable is a reference or not.

To get a basic idea of what a zval struct looks like in relation to a specific variable, debug_zval_dump($myVariable) can be used to show the contents of a variable’s zval struct. In continuation with the previous example using debug_zval_dump with $myNum would produce the following:

Example Code:

$myNum = 1.23;
debug_zval_dump($myNum);

Output:

double(1.23) refcount(2)

The output reveals that the type of the variable is a double, the value being 1.23, and the reference count as two. However, logically speaking the reference count at this point should be 1. PHP uses a system under the hood called copy-on-write. When a variable is used in a function, even one such as debug_zval_dump, PHP automatically creates a reference to the variable, yet waits to create a copy until the variable is changed.

The zval struct holds a union zval_value which is used to determine the type of variable declared.

Zval Struct

typedef union _zvalue_value {
    long lval;                 /* long value */
    double dval;               /* double value */
    struct {
        char *val;
        int len;               /* this will always be set for strings */
    } str;                     /* string (always has length) */
    HashTable *ht;             /* an array */
    zend_object_value obj;     /* stores an object store handle, and handlers */
} zvalue_value;

Another perk of PHP is that rather than with C++ where the developer needs to think about what level of precision they want based on single, double or float data types, PHP defaults the precision of double variables as 64-bit IEEE 754 precision.

Garbage Collection

While on the topic of data types I wanted to touch on the concept of arrays in PHP, especially in regard to memory management. When declaring dynamic arrays in a language such as C++ the developer needs to make sure to deallocate values from the arrays after they are used otherwise the allocated values will lead to a memory leak. However, PHP has a handy garbage collector that for the majority of the time takes care of memory management.

There are many different functions utilized by the garage collector such as in relation to variables, sessions, etc. As of now I will introduced two main ways that the garbage collector makes sure that unused variables are disposed of properly in order to prevent memory leaks.

  • Garbage collection based on reference count, once the variable’s reference count reaches 0 then the automatic garbage collection will make sure that there are no wild values left floating around in memory.

  • In regard to arrays, depending on the complexity of the array the garage collector may be able to collect the unused values with the above reference count. If not, then it will use cycle collection with the aid of a buffer in order to check the reference count of each unique value.

In a more complex scenario, and where the developer wants complete control over memory management it is possible to toggle the automatic garbage collection, however is not recommended unless memory leaks are very strictly maintained.

Conclusion

When I use a dynamically typed language like PHP I think it is really interesting to look at what PHP is actually doing under the hood in order to make development easier. If you have a moment, how about taking a step back and observing how a language like PHP actually moves behind the scenes!

Done