If you’ve ever spent 2 or more hours trying to figure out why a perfectly working system suddenly begins to misbehave for only a select few people then you will easily relate to the rest of this post which I’m about to share with you.
After receiving two calls following an email with the same issue which I had never heard of or experienced, I decided this was it – This Means WAR (me on the one side and the problematic system on the other).
Looking through the carefully stacked mini optimized web server running Apache, with MPM-Worker, having a protective DNS layer caching system with DDos protection, I knew this would be another onion peeling exercise – hopefully there would be no tears in this case.
An application I had built and maintained for a client using the popular CakePHP framework and the technologies listed earlier had suddenly started sharing customers’ personal details between selected users at its own leisure.
Two customers had called and emailed my client claiming they “no longer felt safe” with this platform. In order to stall the crisis and provide more resources for the investigation (debugging), I asked my client to request for a screen shot from the affected users.
I was having other clients projects concurrently running and time was ticking on all of them. What do you do when you have problems attacking you from multiple directions?
To cut to the chase, I had no option left but to put out the fire in the house before trying to build up other storeys.
I stopped work on the other projects and then started with the first layer of the problem – localhost.
After conducting several tests on the app on localhost I knew there was no problem locally, and this had to be a remote live occurrence.
So quickly I switched to the first level of caching before the customers – the DNS cache. After turning this off I felt relieved and said to myself, now this problem should be gone. So I sat back and relaxed, only to receive another call moments later with the same complaint.
So this did not work and I was back to the problem again. Personally, I had not experienced the problem and it seemed strange and unfounded to me that this could really be happening.
The next attempt to resolve this could not have been at a better time. My high speed internet service provider was out of reach and I had to fall back to the GSM “broadband” modem. Right on the spot the phone rang and my client was explaining to me that they were on a live chat session with a customer who was experiencing the problem as we spoke.
Quickly, I sprung to action.
1. Check all the server logs
2. Check the database server for running queries.
3. Find out if there’s any system performance degradation or failure.
All of these led me to nowhere, just a few slow queries without any relationship to the main problem. Then I refreshed a page on the app which a user had earlier sent in with the mixed up data. And voila! I’m seeing another user’s information.
This was really creepy… It true, it appears my application is going bonkers.
Not the kind who easily gives in to defeat, I decided to do something out of the blues. I checked my IP address (41.190.2….), then checked the IP address of the users who had been complaining. The result was an astounding message to me which would fuel be my will to write this post detailing my experience. We were all using the same ISP at that moment within the same internet number range 220.127.116.11 – 18.104.22.168.
Suddenly my AHA moment had come. I almost would have screamed to myself – damn you EMTS!!! So this is your idea of incredibly fast internet speed with easy blaze! Caching page results for websites which clearly specify no-cache in their headers, then serving the most recent of those pages to every and any user on your network that requests for them. This is evil as I could end up looking at the private pages of a Facebook profile of someone on the street assuming we both access Facebook without HTTPS.
Surely, EMTS was saving bandwidth using this method, but they were also corrupting and mixing up people’s data along the way.
Hopefully they would resolve this soon. But rather than wait for a solution from them (if ever they realize that it’s a problem) and cause my clients more heartaches, I will have to quickly implement a random generator plugin to add random strings at the beginning of each users request.
If my hypothesis is right, this should prevent their caching server from caching every result as the same, while serving only unique requested pages to the correct user from whom the requests originated.