5 PHP String Functions You Need to Know

Strings in PHP

The Task

First of all, what would we like to achieve? The task is to convert a string, in most cases a single word, by capitalizing its first letter. In my case I have the names of the world’s countries all in lower case, while I need them with the first letter of each word capitalized. For example “united states” must become “United States”, not “United states” or “UNITED STATES”. So here began the journey into PHP string functions, especially those for capitalization!

1. ucwords

The first thing you find in the PHP Manual is the ucwords function. It changes the first letter of each word to a capital letter, but it does not do the job. Why? Well, let me show you an example.

$str1 = 'foo bar';
$str2 = 'Foo bar';
$str3 = 'FOO BAR';
$str4 = 'фуу бар';
 
echo ucwords($str1); // Foo Bar
echo ucwords($str2); // Foo Bar
echo ucwords($str3); // FOO BAR
echo ucwords($str4); // фуу бар

Here we have four strings: a lower cased, a mixed cased, an upper cased and … a lower cased Cyrillic string. The main reason why ucwords doesn’t fit here is the Cyrillic string. ucwords works on single bytes, while every Cyrillic letter in UTF-8 takes two bytes, so with a non-Latin string you can forget about capitalization. The other conversions are also interesting. Take a look at the third string! It remains “FOO BAR” instead of becoming “Foo Bar”, which simply means that this function only looks at, and if needed changes, the first letter of each word and leaves the rest untouched.
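To make the byte-versus-character point concrete, here is a minimal sketch. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded; the variable name is only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is available.
$word = 'фуу';

echo strlen($word), "\n";             // 6: number of bytes
echo mb_strlen($word, 'UTF-8'), "\n"; // 3: number of characters
echo bin2hex($word[0]), "\n";         // d1: only the first byte, not a whole letter
```

A byte-oriented function sees six bytes here, none of which is an ASCII letter, so it has nothing to capitalize.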

So here we have two questions. How can we overcome the Cyrillic problem, and how can we “normalize” the UPPER CASE string?

2. ucfirst

This is another useful function in PHP. ucfirst, as you may guess from its name, converts a string by changing only its first letter. So “Foo bar” remains “Foo bar”, while ucwords turned it into “Foo Bar”. Let’s see what this function does:

$str1 = 'foo bar';
$str2 = 'Foo bar';
$str3 = 'FOO BAR';
$str4 = 'фуу бар';
 
echo ucfirst($str1); // Foo bar
echo ucfirst($str2); // Foo bar
echo ucfirst($str3); // FOO BAR
echo ucfirst($str4); // фуу бар

Here even the first string gets only one capital letter: “foo bar” becomes “Foo bar”. And yet again the Cyrillic string is unchanged. It simply doesn’t help us here!

3. mb_convert_case

As a PHP developer you know what the “mb_” prefix means: multibyte. This is quite useful. These functions take the encoding of the string into account, so perhaps we can overcome the Cyrillic problem. But before proceeding to the tests, let’s take a look at the parameters of this function.

The first thing to note here is that mb_convert_case doesn’t contain the case, upper or lower, in its name. The first parameter is the string itself, and a second parameter sets the case mode. Note that the mode we need is not called something like “capitalize”, but MB_CASE_TITLE (as you know, in English the words of a title are capitalized):

echo mb_convert_case($str, MB_CASE_TITLE, ...

And a third one which specifies the encoding:

echo mb_convert_case($str, MB_CASE_TITLE, 'utf-8');

Now let’s see what we can achieve with it:

$str1 = 'foo bar';
$str2 = 'Foo bar';
$str3 = 'FOO BAR';
$str4 = 'фуу бар';
 
echo mb_convert_case($str1, MB_CASE_TITLE, 'utf-8'); // Foo Bar
echo mb_convert_case($str2, MB_CASE_TITLE, 'utf-8'); // Foo Bar
echo mb_convert_case($str3, MB_CASE_TITLE, 'utf-8'); // Foo Bar
echo mb_convert_case($str4, MB_CASE_TITLE, 'utf-8'); // Фуу Бар

As you can see, the Cyrillic problem no longer exists, and mb_convert_case is intelligent enough to change “FOO BAR” into “Foo Bar”. As I said, this is English style titling. This is the solution when you deal with capitalization in different alphabets and encodings.
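Applied to the original task, the conversion might look like the sketch below. The country list is hypothetical sample data, and the code assumes UTF-8 input with the mbstring extension loaded.

```php
<?php
// Hypothetical sample data; a real list would come from your own source.
$countries = ['united states', 'UNITED KINGDOM', 'българия', 'южна корея'];

// Title-case every name, whatever case it arrived in.
$titled = array_map(
    function ($name) {
        return mb_convert_case($name, MB_CASE_TITLE, 'UTF-8');
    },
    $countries
);

print_r($titled); // United States, United Kingdom, България, Южна Корея
```

Keep in mind that MB_CASE_TITLE capitalizes every word, so a name such as “isle of man” would come out as “Isle Of Man”.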

However, there is another approach to the all UPPER CASE conversion problem. A possible solution is to convert the string to lower case first.

4. strtolower

strtolower is a very useful PHP string function, and perhaps every PHP developer has used it at least once. But yet again it does not do the job, again because of the encoding problem.

echo strtolower('ФУУ БАР'); // not lower cased: unchanged, or garbled bytes under some locales in older PHP

As you can see, the Cyrillic string cannot be lower cased this way! Let’s search the “mb_” universe again.

5. mb_strtolower

This is the function. Again you have to specify the encoding:

echo mb_strtolower('ФУУ БАР', 'utf-8'); // фуу бар
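Lower casing first and then capitalizing can be combined into a multibyte counterpart of ucfirst. The helper below is only a sketch: capitalizeFirst is a hypothetical name, not a built-in PHP function, and UTF-8 is an assumed default encoding.

```php
<?php
// capitalizeFirst() is a hypothetical helper, not part of PHP itself.
function capitalizeFirst($str, $encoding = 'UTF-8')
{
    // Normalize the whole string to lower case first.
    $lower = mb_strtolower($str, $encoding);

    // Split off the first character (not the first byte).
    $first = mb_substr($lower, 0, 1, $encoding);
    $rest  = mb_substr($lower, 1, null, $encoding);

    return mb_strtoupper($first, $encoding) . $rest;
}

echo capitalizeFirst('ФУУ БАР'), "\n"; // Фуу бар
echo capitalizeFirst('FOO BAR'), "\n"; // Foo bar
```

Unlike MB_CASE_TITLE, this capitalizes only the first letter of the whole string, which may suit sentences better than names.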

Conclusion

It doesn’t matter whether you’re a native English speaker or not. Most web sites are multilingual, and you cannot be sure what happens when you convert strings in alphabets other than Latin. So be careful even when everything seems to be OK in tests with Latin strings.

Secure Forms with Zend Framework

Maybe the correct title is not “with Zend Framework” but “with PHP”, because the general approach I used is pure PHP with no Zend Framework dependency. However, let me mention that ZF allows you to build forms with Zend_Form, which gives you an abstraction over HTML forms with many goodies like validation, filtering and protection.

Zend_Form and Zend_Form_Element_Hash

Although the technique I’m using does the same thing, note that ZF has Zend_Form_Element_Hash, which generates and validates a hash for the form, thus protecting you from CSRF attacks. I didn’t use it because the form I’m protecting is not generated with Zend_Form, so I cannot benefit from everything ZF gives me. However, you can easily reproduce the basic strategy with any form and any framework, as long as it’s written in PHP.

What’s the solution?

It’s pretty simple and it has been described many times around the web: generate a random hash. A possible solution is to use uniqid in combination with mt_rand and md5, which gives you a hash that is reasonably hard to guess.
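A minimal sketch of the token generation could look like this. The first line follows the uniqid, mt_rand and md5 combination mentioned above; the second is an alternative of my own (not from the original approach) that assumes PHP 7 or later, where random_bytes draws from a cryptographically secure source.

```php
<?php
// Roughly the combination described above: uniqid + mt_rand + md5.
$legacyToken = md5(uniqid((string) mt_rand(), true));

// Alternative (an assumption, requires PHP 7+): cryptographically secure bytes.
$strongToken = bin2hex(random_bytes(16));

echo strlen($legacyToken), "\n"; // 32 hex characters
echo strlen($strongToken), "\n"; // 32 hex characters
```

Both tokens have the same shape, so either can be stored in the session and placed in the hidden field.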

Step two is to store the generated hash in the session and also pass it in a hidden field of the form. Of course, the most asked question now is: but that’s visible in the page source, so won’t everybody have a valid hash?

Here’s the trick. OK, everybody can see a hash, but on submit the hash is validated against the SESSION variable, and as you know the session is specific to one browser (client) and the web server. Although an attacker may obtain a valid hash for his own session, a forged request must carry the hash stored in the victim’s session, which the attacker’s page on another domain cannot read. That makes the task rather difficult.
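The check itself can be sketched in isolation. isValidToken is a hypothetical helper; its $session and $post arguments stand in for $_SESSION and $_POST so the logic can be read without a running session. hash_equals is a built-in function available since PHP 5.6.

```php
<?php
// isValidToken() is a hypothetical helper, not part of PHP or Zend Framework.
// $session and $post stand in for $_SESSION and $_POST.
function isValidToken(array $session, array $post)
{
    // Both sides must carry a string token, otherwise reject.
    if (!isset($session['token'], $post['token'])
        || !is_string($session['token'])
        || !is_string($post['token'])) {
        return false;
    }

    // hash_equals() compares the two strings in constant time.
    return hash_equals($session['token'], $post['token']);
}

$session = ['token' => 'abc123'];
var_dump(isValidToken($session, ['token' => 'abc123'])); // bool(true)
var_dump(isValidToken($session, ['token' => 'forged'])); // bool(false)
var_dump(isValidToken($session, []));                    // bool(false)
```

A request that carries no token, or a token from a different session, is rejected in the same way.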

An Example

Let me show a brief example; it may help make things clearer.

1. First step – start the session

<?php
session_start();
?>

2. Second step – validate the form against the $_SESSION and generate a valid token

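<?php
// Accept the POST only if it carries the token stored in this visitor's session
if (isset($_POST['name'], $_POST['token'], $_SESSION['token'])
    && is_string($_POST['name'])
    && is_string($_POST['token'])
    && hash_equals($_SESSION['token'], $_POST['token'])) {
    // Escape user input before printing it back
    echo htmlspecialchars($_POST['name'], ENT_QUOTES, 'UTF-8');
} else {
    echo 'dont hack';
}

// Generate a fresh token for the form rendered below
$_SESSION['token'] = md5(uniqid('test', true));
?>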

3. Third step – make a form

<form method="POST" action="">
<input type="hidden" value="<?php echo $_SESSION['token'] ?>" name="token" />
<input type="text" name="name" value="stoimen" />
<input type="submit" name="submit" />
</form>

Demo here.

To test this further, you may try to make the same form somewhere else on the web and point its action to http://www.stoimen.com/projects/php.secure.forms/! Without the session validation you would certainly be able to post to the attacked server.

P.S. Now I have to admit that this has nothing to do with Zend Framework; however, it’s good practice and thus may be used with any framework.