PHP: Arrays or Linked Lists?

Arrays vs. Linked List

If we talk about arrays and linked lists we know the pros and cons about both of them. No matter which programming language we use arrays benefit from direct access to its items, while linked lists are more memory efficient for particular tasks.

Array & Linked List
Array & Linked List

The items of a linked list keep a reference to their successor, so we can easily walk through the entire list. However we don’t have direct access to its elements. Thus we can’t go directly to its middle element! Even more – in particular implementations of a linked list we don’t know its length. But in some cases linked lists are far more effective than arrays. For instance reversing an array of non-numeric values require constant additional memory, but also requires n/2 exchanges. The same taks using linked lists is not only performed in linear time, but doesn’t require any additional memory. The only thing we need to do is to reverse the links – no movement of values and the items remain at the same place in the memory.

Merging of two arrays often require more space (proportional of the space of the two arrays) or many exchanges in case we try to do it in place. The same task on linked lists is far more effective with only changing pointers and without moving the values.

Arrays or Linked Lists are More Memory Efficient

Many developers consider linked lists as something used only in college, but actually they can be very useful in practice as well. However how practically useful they are? Let’s see the following PHP experiment.

Here we have one class called “Item”, which is designed to keep only one integer value as its key and to point to its successor. Practically this class is designed to be used by a singly linked list, but let say we put some of these objects into an array and the same amount of the “Item” objects into a linked lists so what are the results?

First let’s see the code!

class Item
{
    protected $_key = '';
    protected $_next = null;
 
    public function __construct($key)
    {
        $this->_key = $key;
    }
 
    public function setNext(&$next) { $this->_next = $next; }
    public function &getNext() { return $this->_next; }
 
    public function setKey($key) { $this->_key = $key; }
    public function getKey() { return $this->_key; }
 
    public function __toString()
    {
        return $this->_key . "\n";
    }
}

This is the “Item” class and here we have the Linked_List class. As you can see this is the very basic implementation of a linked list with only one “insert” method and the magic __toString() in order to print the entire list. The insert method pushes an item at the end of the list thus the insertion is O(1).

class Linked_List 
{
    protected $_head = null;
    protected $_tail = null;
 
    public function insert($item)
    {
        if ($this->_head == null) {
            $this->_head = $item;
            $this->_tail = $item;
            return;
        }
 
        $this->_tail->setNext($item);
        $this->_tail = $item;
    }
 
    public function __toString()
    {
        $current = $this->_head;
        $output = '';
 
        while ($current) {
            $output .= $current->getKey() . "\n";
            $current = $current->getNext();
        }
 
        return $output;
    }
}

Now let’s see the creation of an array with N objects of class “Item”.

$n = 10000;
$a = array();
for ($i = 0; $i < $n; $i++) {
    $a[$i] = new Item($i);
}

The same thing but using the Linked_List class follows on the lines below.

$n = 10000;
$a = new Linked_List();
for ($i = 0; $i < $n; $i++) {
    $a->insert(new Item($i));
}

And the Winner is …

More memory efficient is … the linked list! On the next chart we can see the results. It’s clear that for 10K objects the array uses nearly 1MB more memory than the linked list!

Array vs. Linked List Chart
 

So what do you think now? Will you use linked list in your code or not?

Final Words

Although the linked list seems to be more memory efficient we don’t have direct acess to it’s items. In the same time often we don’t need direct access, we just need to walk through the array, which doesn’t benefit from the direct access. In PHP this is usally done with some loop construction as “foreach”. So why we have such results in the experiment above. First our linked list is really very basic. It doesn’t have any functionality, which in fact shouldn’t affect memory usage much more. The array in the other hand keeps indexes for each of its items so this results in additional space. This explains a bit the victory of the linked list in the memory efficiency test.

In the other hand PHP can’t have the full benefit of using linked lists, trees and other data structures since it keeps them in memory only for the request. In this case C, C++, Java loads a data structure in memory till the software runs so unfortunately coding complex data structures in PHP doesn’t look as a great option. Indeed here we have an entire “Item” class only to keep an integer. Instead we can use an array of integers!

5 thoughts on “PHP: Arrays or Linked Lists?

  1. You also have to take into consideration the overhead involved in programming a linked list. Even to a developed programmer, the time it takes to program a linked list sometimes makes the ease of an array more attractive, particularly for simpler tasks.

  2. @AfterMath – absolutely agree with that! Arrays are far more simple to use. However in some libraries there are built-in linked list implementations so if that’s more memory efficient why not use it?

  3. I’m not sure you’ve really done arrays justice. In C, arrays are more memory-efficient than linked lists by the size of the pointers (which linked lists need to store and arrays don’t), at the cost of being harder to resize. However, PHP “arrays” are not merely arrays – they’re hashtables crossed with doubly-linked lists.

    On the other hand, your linked list is contaminated with hashtables as well – every single one of your items is a hashtable. Intuitively, that should lose the efficiency race – so why did it win?

    It took me until this point to see it, but you’ve actually filled your array with hashtable nodes, thus burdening your array with all of the inefficiencies of your linked list implementation on top of its own implementation. Not doing arrays justice indeed!

    I thought that smooth line was suspect, so I ran my own benchmarks:
    http://www.freeimagehosting.net/7kvmr

    The line for linked lists cuts off about midway because… my Apache Server crashed at that point. At that point, the linked list is showing memory usage that’s worse by a factor of three.

    A true linked list (with no hashtable contamination) would be able to outperform PHP’s array, but a true array would still beat it by a factor of about two. Linked lists really come into their own when you want to insert or remove elements in the middle of the sequence – for arrays, or just about any other data type, this is a linear-time operation, but linked lists can do it in constant time. PHP arrays, being a hybridized linked list, should theoretically be able to pull that off, but as far as I know there is no PHP function that exposes that functionality. A similar situation arises when you want to insert or remove elements at the beginning of the sequence – while there are several data types that can do that in constant time (for example, deques), PHP arrays are not one of them, at least when used conventionally; things may be a bit different when the array is in “associative mode”.

  4. For reference, here’s the code that I used for my benchmark:

    Array benchmark:

    <?php
    $max = 100000;
    $base_memory = memory_get_usage();
    $data = array();
    for($i = 1; $i &lt;= $max; $i++) {
    	$data[] = $i;
    	if($i % 100 == 0)	// Having this code in the loop appears to have no significant effect on memory usage
    		echo $i."t".(memory_get_usage() - $base_memory)."n";
    }

    Linked list benchmark:

    <?php
    class Item {
    	public $data;
    	public $prev;
    	public function __construct($data, $prev) {
    		$this-&gt;data = $data;
    		$this-&gt;prev = $prev;
    	}
    }
     
    $max = 100000;
    $base_memory = memory_get_usage();
    $head = new Item(1, null);
    $tail = $head;
    for($i = 2; $i &lt;= $max; $i++) {
    	$tail = new Item($i, $tail);
    	if($i % 100 == 0)
    		echo $i."t".(memory_get_usage() - $base_memory)."n";
    }
  5. @Brilliand I don’t understand what you mean his items are a kind of hash tables. Care to explain?

Leave a Reply

Your email address will not be published. Required fields are marked *