Beware of ISP Data Caches – The Evils of Session Collision and Data Mix-up

If you’ve ever spent two or more hours trying to figure out why a perfectly working system suddenly begins to misbehave for only a select few people, then you will easily relate to the rest of this post.

After receiving two calls and an email about the same issue – one I had never heard of or experienced before – I decided this was it: This Means WAR (me on one side and the problematic system on the other).


Looking at the carefully stacked setup – a lean, optimized web server running Apache with the worker MPM, behind a protective DNS-layer caching system with DDoS protection – I knew this would be another onion-peeling exercise; hopefully there would be no tears this time.

An application I had built and maintained for a client using the popular CakePHP framework and the technologies listed above had suddenly started sharing customers’ personal details between seemingly random users, entirely at its own leisure.
Two customers had called and emailed my client claiming they “no longer felt safe” on the platform. To stall the crisis and gather more material for the investigation (debugging), I asked my client to request a screenshot from the affected users.

I had other clients’ projects running concurrently, and the clock was ticking on all of them. What do you do when problems attack you from multiple directions?

To cut to the chase, I had no option left but to put out the fire in the house before trying to build any new storeys.
I stopped work on the other projects and then started with the first layer of the problem – localhost.
After running several tests against the app on localhost, I was confident there was no problem locally; this had to be happening only on the live server.

So I quickly moved to the first layer of caching between the server and the customers – the DNS cache. After turning this off I felt relieved and told myself the problem should now be gone. I sat back and relaxed, only to receive another call moments later with the same complaint.

So that did not work, and I was back at square one. I had never experienced the problem myself, and it seemed strange, almost unfounded, that this could really be happening.

The next attempt to resolve this could not have come at a better time. My high-speed internet service provider was unreachable and I had to fall back to a GSM “broadband” modem. Right then the phone rang: my client was on a live chat session with a customer who was experiencing the problem as we spoke.

Quickly, I sprang into action:

1. Check all the server logs.

2. Check the database server for running queries.

3. Find out if there’s any system performance degradation or failure.

All of these led nowhere: just a few slow queries with no relationship to the main problem. Then I refreshed a page on the app that a user had earlier sent in showing the mixed-up data. And voila! I was seeing another user’s information.

This was really creepy… It was true; it appeared my application was going bonkers.

Not the kind to give in to defeat easily, I decided to try something out of the blue. I checked my IP address (41.190.2….), then checked the IP addresses of the users who had been complaining. The result was astounding, and it is what fuelled my will to write this post detailing my experience: we were all, at that moment, using the same ISP, within the same internet number range 41.190.0.0 – 41.190.31.255.

Suddenly my AHA moment had come. I could almost have screamed: damn you EMTS!!! So this is your idea of incredibly fast internet speed with easy blaze! Caching page results for websites that clearly specify no-cache in their headers, then serving the most recent of those pages to any user on your network who requests them. This is evil: I could end up looking at the private pages of a stranger’s Facebook profile, assuming we both accessed Facebook without HTTPS.

Surely, EMTS was saving bandwidth using this method, but they were also corrupting and mixing up people’s data along the way.

Hopefully they will resolve this soon. But rather than wait for a solution from them (if they ever realize it is a problem) and cause my clients more heartache, I will quickly implement a random-string generator to add a random string to the beginning of each user’s request.

If my hypothesis is right, this should stop their caching server from treating different users’ requests as the same cached result, so that each uniquely requested page is served only to the user whose request produced it.
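Here is a minimal sketch of the idea (not the exact plugin; it assumes CakePHP 2.x, and the token name is purely illustrative): every response re-asserts no-cache headers, and the views get a random token they can append to generated URLs as a throwaway query parameter.

<?php
// app/Controller/AppController.php (sketch, assuming CakePHP 2.x)
class AppController extends Controller {
    public function beforeFilter() {
        parent::beforeFilter();
        // Re-assert no-cache headers on every response.
        $this->response->disableCache();
        // A random token the views can append to links, e.g. ?nocache=<token>,
        // so no two requests look identical to an upstream cache.
        $this->set('cacheBuster', bin2hex(openssl_random_pseudo_bytes(8)));
    }
}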

Better Performance – Speeding Up Your CakePHP Website

It seems that whenever I mention CakePHP to the developer community, those who have not used it think of it as a slow framework. Indeed it isn’t the fastest according to the results of many benchmarks – out of the box that is – but what it might lack in performance it certainly makes up for in Rapid Application Development.

By applying a few simple modifications, and even some more complex enhancements, CakePHP can be sped up quite a bit. By the time you work your way through even half of these changes, the performance of your CakePHP site will be comparable to many other popular PHP frameworks, with the advantage that your development speed will never falter!

There are two types of modifications that I will be describing in the following article. The first is code changes, i.e. changes that will work for anyone, even in a shared hosting environment. The second type is geared towards users who have their own dedicated or virtual server to which they can add and remove software as required.

Do not fear, though: if you can only follow the first set, you will not be disappointed.

Upgrade CakePHP Versions

It’s important to note that people’s misconceptions around CakePHP’s speed probably date all the way back to versions 1.1 or 1.2. In version 1.3, the core development team performed a serious overhaul of CakePHP’s underlying architecture. In fact, I performed several benchmark tests comparing CakePHP versions 1.2 against 1.3 against 2.0. The results were astonishing. By simply updating from 1.2 to 1.3, I saw an immediate decrease of over 70% in the average time it took to load a page.

If you’ve been delaying upgrading your version of CakePHP and you are still using version 1.2 or less, the first thing you should do is upgrade to at least version 1.3; moving to the 2.3.x line would be even better.

The CakePHP Cookbook provides some very detailed migration guides.

Disable Debug Mode

Don’t laugh! This might seem obvious, but in a mad scramble to move code into production it can be very easy to forget to turn off debugging. A typical setup that I follow is to create multiple app/Config/core.php files: one for my local environment, one for dev, one for staging, and one for production. These files contain the different settings based on the target environment.

The key statement to change is Configure::write('debug', 2), which should become Configure::write('debug', 0) in production.

The change hides all error messages and stops the model caches from being refreshed on every request. This is extremely important because, with debugging enabled, each page load causes all of your model schemas to be regenerated instead of being read from the cache built on the first page load.
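As an illustration of how one of those environment-specific files might pick the right value (the hostname check below is purely an assumption; use whatever signal identifies your environments):

<?php
// inside app/Config/core.php (sketch; the hostname is an assumption)
if (gethostname() === 'production-web-01') {
    Configure::write('debug', 0);   // production: hide errors, keep model caches
} else {
    Configure::write('debug', 2);   // development: show errors and SQL output
}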

Disable Recursive Find Statements

When you perform a query using the find() method, recursion is set to 1 by default. This tells CakePHP to fetch any first-level related models as well. For example, if you have a User model that has many Comments, each time you perform a query on the users table the related comments will be fetched too.

The processing time spent performing the query, returning the data, and building the associative array can be significant, and I’ve actually seen CakePHP sites crash in production because of this!

My preferred approach to making sure the default is no recursion at all is to override the setting in app/Model/AppModel.php by adding the following code:

<?php
class AppModel extends Model {
    public $recursive = -1;
}
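When a particular query genuinely needs associated data, ask for it explicitly rather than relying on global recursion. For example (a sketch, assuming the Containable behavior is attached to the model):

<?php
// fetch users together with their comments, for this one query only
$this->User->find('all', array(
    'contain' => array('Comment')
));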

Cache Query Results

This is truly my favorite optimization. I’d like to think I uncovered it myself, but I’m sure that would be debated as I’ve seen other articles discuss a similar solution.

In many web applications, there are probably a lot of queries going to the database that do not really need to run on every request. By overriding the default find() function inside app/Model/AppModel.php, I’ve made it easy to cache the full associative array results of queries. This means that not only do I avoid hitting the database, I also avoid the processing time of CakePHP converting the results into an array. The code required is as follows:

<?php
class AppModel extends Model {

    public $recursive = -1;

    public function find($conditions = null, $fields = array(), $order = null, $recursive = null) {
        $doQuery = true;
        // check if we want the cache
        if (!empty($fields['cache'])) {
            // fall back to the 'default' cache config unless one is specified
            $cacheConfig = 'default';
            if (!empty($fields['cacheConfig'])) {
                $cacheConfig = $fields['cacheConfig'];
            }
            $cacheName = $this->name . '-' . $fields['cache'];
            // check if the cache already holds this result
            $data = Cache::read($cacheName, $cacheConfig);
            if ($data == false) {
                $data = parent::find($conditions, $fields, $order, $recursive);
                Cache::write($cacheName, $data, $cacheConfig);
            }
            $doQuery = false;
        }
        if ($doQuery) {
            $data = parent::find($conditions, $fields, $order, $recursive);
        }
        return $data;
    }
}

Subtle changes need to be made to the queries we wish to cache. A basic query which looks like this:

<?php
$this->User->find('list');

requires updating to include caching information:

<?php
$this->User->find('list',
    array('cache' => 'userList', 'cacheConfig' => 'short')
);

Two additional values are added: the cache name and the cache config that should be used.

Two final changes must be made to app/Config/core.php. Caching must be turned on and the cacheConfig value that is used must be defined. First, uncomment the following line:

<?php
Configure::write('Cache.check', true);

And then, add a new cache config as follows (updating the parameters as required for name and expiry):

<?php
Cache::config('short', array(
    'engine' => 'File',
    'duration' => '+5 minutes',
    'probability' => 100,
    'path' => CACHE,
    'prefix' => 'cache_short_'
));

All of the options above can be updated to extend the duration, change the name, or even define where the cache should be stored.

For a more detailed explanation on how to add, update, and purge the cache, continue reading about the specific caching optimization on my blog.

Don’t consider this the end of your CakePHP caching, though. You can cache controller actions, views, and even helper functions, too. Explore the Cache Component in the CakePHP book for more information.

Install Memory Based Caching

By default, PHP sessions are stored on disk (typically in a temp folder). This means that each time you access the session, PHP needs to open the session’s file and decode the information it contains. The problem is that disk I/O can be quite expensive; accessing items in memory, as opposed to on disk, is immensely faster.

There are two nice approaches to reducing session-related disk I/O. The first is to configure a RAM disk on your server. Once configured, the drive is mounted like any other drive, and the session.save_path value in php.ini is updated to point to the new RAM disk. The second is to install software like Memcached, an open source caching system that allows objects to be stored in memory.

If you’re wondering which approach is best for you, decide by answering the following question: will more than one server need to access this memory simultaneously? If yes, choose Memcached, since it can be installed on a separate system that other servers can reach. If, on the other hand, you just want to speed up a single web server, the RAM disk solution is quick and requires no additional software.
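For reference, here is a minimal sketch of the RAM disk route (the mount point and size are assumptions; pick values that suit your server):

# create and mount a tmpfs RAM disk for PHP sessions
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=256M tmpfs /mnt/ramdisk

# then point PHP at it in php.ini
session.save_path = "/mnt/ramdisk"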

Depending on your operating system, installing Memcached can be as simple as typing sudo aptitude install memcached.
Once installed, you can configure PHP to store sessions in memory, as opposed to on disk, by updating your php.ini:

session.save_handler = memcache
session.save_path = 'tcp://127.0.0.1:11211'

If Memcached is installed on a different port or host, then modify your entries accordingly.

After you have finished installing it on your server, you will also need to install the PHP memcache module. Once again depending on your operating system, one of these commands will work for you:

pecl install memcache

or:

sudo aptitude install php5-memcache
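To confirm that the extension and the daemon can actually talk to each other, a quick throwaway script helps (the host and port assume the defaults used above):

<?php
// sanity check: can PHP reach the Memcached daemon?
$memcache = new Memcache();
if ($memcache->connect('127.0.0.1', 11211)) {
    echo 'Connected to Memcached';
} else {
    echo 'Could not connect - check the host, port, and that memcached is running';
}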

Removing Apache and Installing Nginx

Apache is still the favorite according to recent statistics, but Nginx adoption is picking up a lot of steam among the most heavily trafficked websites on the Internet today. In fact, Nginx is becoming an extremely popular replacement for Apache.

Apache is like Microsoft Word, it has a million options but you only need six. Nginx does those six things, and it does five of them 50 times faster than Apache. — Chris Lea

Nginx differs from Apache in that Apache is a process-based server, whereas Nginx is event-driven. As your web server’s load grows, Apache quickly becomes a memory hog: to handle new requests, its worker processes spin up new threads, which increases memory use and adds the wait time of creating those threads. Nginx, meanwhile, runs asynchronously and uses only a handful of lightweight worker processes.

Nginx also is extremely fast at serving static files, so if you are not using a content delivery network then you’ll definitely want to consider using Nginx for this as well.

In the end, if you are short on memory, Nginx will consume as little as 20-30 MB where Apache might consume upwards of 200 MB for the same load. Memory might be cheap for your PC, but not when it comes to paying for servers in the cloud!

For a more in-depth breakdown between Apache and Nginx visit the Apache vs Nginx WikiVS page.

HowToForge has an excellent article for configuring Nginx on your server. I suggest following their step-by-step guide to installing and configuring Nginx with php-fpm.

Configure Nginx to use Memcached

Once you’ve installed Nginx and Memcached, making them leverage each other will extend your performance even further. Even though CakePHP applications are dynamic, it’s likely that 80-90% of the content is still relatively static – meaning it only changes at specific intervals.

By making the following edits to your Nginx config file, you let Memcached serve the requests that have already been processed and stored in memory. This means that only a few requests will actually invoke PHP, which significantly increases your website’s speed.

server {
    listen 80;
    server_name endyourif.com www.endyourif.com;
    access_log /var/log/nginx/endyourif-access.log;
    error_log /var/log/nginx/endyourif-error.log;
    root /www/endyourif.com/;
    index index.php index.html index.htm;

    # serve static files
    location / {
        # this serves static files that exists without
        # running other rewrite tests
        if (-f $request_filename) {
            expires 30d;
            break;
        }
        # this sends all-non-existing file or directory requests
        # to index.php
        if (!-e $request_filename) {
            rewrite ^(.+)$ /index.php?q=$1 last;
        }
    }

    location ~ \.php$ {
        set $memcached_key '$request_uri';
        memcached_pass 127.0.0.1:11211;
        default_type       text/html;
        error_page 404 405 502 = @no_cache;
    }

    location @no_cache {
        try_files $uri =404;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index index.php;
        include fastcgi_params;
    }
}

The above configuration specifies that static files should be served immediately. PHP requests are directed to Memcached, which serves the page if it has already been cached. If it isn’t in the cache, the @no_cache location is reached, which acts like your previous PHP setup and serves the page via php-fpm.
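Note that Nginx only reads from Memcached in this setup; something still has to put rendered pages into Memcached under the $request_uri key. A minimal sketch of one way to do that (assuming CakePHP 2.x and the pecl memcache extension; this is an illustration, not the only approach):

<?php
// app/Controller/AppController.php (sketch)
public function afterFilter() {
    parent::afterFilter();
    // Only cache plain GET requests, and only pages that are identical for every visitor.
    if ($this->request->is('get')) {
        $memcache = new Memcache();
        if ($memcache->connect('127.0.0.1', 11211)) {
            // the key must match Nginx's $memcached_key, i.e. the request URI
            $memcache->set($_SERVER['REQUEST_URI'], (string)$this->response->body(), 0, 300);
        }
    }
}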

Don’t forget to update your app/Config/core.php so CakePHP’s own caches use Memcache as well. The following:

<?php
$engine = 'File';
if (extension_loaded('apc') && function_exists('apc_dec') && (php_sapi_name() !== 'cli' || ini_get('apc.enable_cli'))) {
    $engine = 'Apc';
}

becomes:

<?php
$engine = 'Memcache';

Our code for caching queries from earlier also requires updating:

<?php
Cache::config('short', array(
    'engine' => 'Memcache',
    'duration' => '+5 minutes',
    'probability' => 100,
    'path' => CACHE,
    'prefix' => 'cache_short_'
));

Remove MySQL and Install Percona

The final server modification I recommend is to install Percona's build of MySQL. The Percona team has spent many years understanding and fine-tuning the database’s architecture for optimal performance. It is best to follow the installation instructions from Percona, as they describe the several different installation options.

Summary

There are quite a lot of techniques to ensure CakePHP will run lightning fast. Some changes may be more difficult to make than others, but do not become discouraged. No matter what your framework or language, optimization comes at a cost, and CakePHP’s ability to let you rapidly develop extremely complex web applications will strongly outweigh the effort involved in optimizing things once your app is complete.

via PHP Master

Switching From Apache MPM Prefork to Worker

My very first experience of setting up a live cloud server was one I had looked forward to with optimism.

In the past I was comfortable with the shared hosting and semi-dedicated solutions that provided the basic tools for managing a website. But then I needed shell access, and hosting my applications on the dedicated solutions provided by the shared hosting companies was costing a lot more than I had bargained for.

A little overcommitted?

After getting several warning emails from my shared hosting company at the time for exceeding resource usage, there was just one option left – move.

Almost everyone is used to the localhost setup of a WAMP or LAMP stack. However, these implementations are built to run on a single computer without the rigors that a constantly active webserver experiences.

In the case of a live web server on the internet, you need to tweak and tune things to ensure your server doesn’t get overrun on this highway. Crawlers, spiders, and spam bots all add to the traffic on your server, and you need a plan for managing them.

Setting up your LAMP stack is somewhat straight forward (on Debian, Ubuntu type: sudo apt-get install apache2 php5 libapache2-mod-php5).

I won’t go into the details of setting that up here. What I want to share, on the other hand, is the effect of going along with the default configuration of the LAMP stack, particularly the Apache server it comes bundled with.

By default, most new installations of Apache use the prefork multi-processing module (mpm-prefork), which handles each connection with a separate process.

It is sort of like having multiple instances of Apache running at the same time on the server. After several bouts of server thrashing (endless memory swapping), I was left with two choices (other than increasing the RAM yet again): either switch to Nginx, which by default is built to talk to PHP over FastCGI, or configure Apache to run PHP via FastCGI with mpm-worker.

Due to time constraints I had to make the switch quickly, and I was sure I didn’t have enough time to debug a faulty new Nginx install. So I opted for the less intensive and more familiar option of changing Apache’s MPM from prefork to worker.

Highlighted below are the simple steps I took to make this work in less than thirty minutes.

Note that the commands are for Debian distributions of Linux.

  1. Install the FPM CGI binary for PHP: apt-get install php5-fpm
  2. This runs as a daemon, and you need to configure it to listen on a socket instead of a local port.
    • Open the file located at /etc/php5/fpm/pool.d/www.conf
    • Comment out the line listen = 127.0.0.1:9000 by adding a semicolon to the beginning of that line so it becomes ;listen = 127.0.0.1:9000
    • Now add a line after it with the following: listen = /var/run/php5-fpm.sock
      which makes PHP-FPM listen on a Unix socket
    • Restart the service by entering: service php5-fpm restart
  3. Now install the FastCGI module for Apache so the worker MPM can take over from prefork: apt-get install libapache2-mod-fastcgi (and apache2-mpm-worker, if it isn’t pulled in automatically).
  4. Once the installation is complete, Apache will be running the worker MPM with the FastCGI module handling PHP, instead of prefork with mod_php.
  5. Finally, you need to configure your sites to use the new FastCGI module. It is preferable to make the changes in your main Apache configuration file located at /etc/apache2/apache2.conf
    • Use the config below, making sure the alias path exists or creating a path that can be accessed on the server’s file system
    • <IfModule mod_fastcgi.c>
             AddHandler php5-fcgi .php
             Action php5-fcgi /php5-fcgi
             Alias /php5-fcgi /usr/lib/cgi-bin/php5-fcgi
             FastCgiExternalServer /usr/lib/cgi-bin/php5-fcgi -socket /var/run/php5-fpm.sock -pass-header Authorization
       </IfModule>
    • If you are using virtual hosts, it is still preferable to make the changes in the main Apache config file at /etc/apache2/apache2.conf instead of adding them to the config file of each site you have enabled.
  6. Now all that is left is for you to restart your web server by typing the command: service apache2 restart (then verify the switch, as shown below).
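A quick way to confirm the switch worked (the exact output wording varies between Apache versions):

# should report "Server MPM: Worker" rather than Prefork
apache2ctl -V | grep -i mpm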

And that’s all there is to it. Your server should now respond faster and suffer fewer memory blowouts than when it was configured to use prefork.

Best of luck in your adventure towards better performance …


How to Resolve Errors in CakePHP After Changing Database Structure – Missing Fields Error

Being an avid CakePHP user, I assumed I could make certain changes and get the expected results in my application whenever I wanted.

Looking back now, I’ve learned something new again. I recently renamed some fields in one of the tables in my database. I also made the necessary changes (refactoring) to that table’s model file in order to avert any unexpected results in my application.


So I tested the application locally to ensure there were no glitches and then committed and pushed my changes to the live repository.

But to my utmost dismay – it broke! Part of my application was no longer functioning. The next thing that came to mind was to delete the files in tmp/cache/models to make sure the table structure wasn’t being served from cache.

Yet, no joy. Now I was considering digging into the CakePHP core library to find out why it was still using an old cached copy of the database schema.

Looking through the Model.php file, I noticed the caching was still based on files in the tmp directory – not just the ones I had deleted.

Finally, I decided to take one more swipe at the tmp directory, this time deleting the files in both tmp/cache/models and tmp/cache/persistent.

At last! All the errors were gone. Now I’ll always remember to delete the files here manually, or write an add-on to do just that, after modifying my database structure.
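If you’d rather not hunt through tmp by hand, here is a minimal sketch of clearing those caches from code (assuming CakePHP 2.x and its default _cake_model_ and _cake_core_ cache configs):

<?php
// e.g. run from a small maintenance shell or admin-only action
Cache::clear(false, '_cake_model_');  // cached table schemas (tmp/cache/models)
Cache::clear(false, '_cake_core_');   // persistent core caches (tmp/cache/persistent)
ClassRegistry::flush();               // drop any already-built model instances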