Sunday, October 28, 2012

Regarding favicon.ico and related 404s.

Recently, requests for /favicon.ico were showing up as many 404s in the web server's access logs. We found that ga.js was requesting this URL and not finding the icon in the site's document root, where it ideally should be.

Performance tip - 
Make sure there’s a favicon.ico file in the root of the site, i.e. /favicon.ico. Browsers request it automatically, even if the icon isn’t mentioned in the HTML at all. If you don’t have a /favicon.ico, this results in a lot of 404s, wasting your server’s bandwidth.
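To see how much of your log is made up of these misses, you can count the favicon 404s directly. A minimal sketch in Python, assuming combined/common log format (the sample lines and the helper name are made up for illustration):

```python
import re

def count_favicon_404s(log_lines):
    """Count 404 responses for /favicon.ico in common-log-format lines."""
    pattern = re.compile(r'"GET /favicon\.ico[^"]*" 404 ')
    return sum(1 for line in log_lines if pattern.search(line))

sample = [
    '1.2.3.4 - - [28/Oct/2012:10:00:00 +0000] "GET /favicon.ico HTTP/1.1" 404 209',
    '1.2.3.4 - - [28/Oct/2012:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '5.6.7.8 - - [28/Oct/2012:10:00:02 +0000] "GET /favicon.ico HTTP/1.1" 404 209',
]
print(count_favicon_404s(sample))  # prints 2
```

Run this over the real access log (one line per entry) to decide whether the missing icon is worth fixing.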

Refer -
http://programmers.stackexchange.com/questions/46716/what-should-every-programmer-know-about-web-development

Thursday, October 25, 2012

Persistent Connections and their Timeouts for Browsers IE and FF



Internet Explorer
By default, versions 6 and 7 use at most 2 persistent connections per host;
version 8 uses 6.

Persistent connections time out after 60 seconds of inactivity; this is changeable via the Windows Registry.

Mozilla Firefox
The number of simultaneous connections can be customized (per-server, per-proxy, total).
E.g. for FF 13.0.1 -> network.http.max-persistent-connections-per-server = 8

Persistent connections time out after 115 seconds (~1.92 minutes) of inactivity; this is changeable via network.http.keep-alive.timeout in about:config.
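The reuse behavior these timeouts govern can be observed directly. A small sketch using only the Python standard library: it starts a throwaway local HTTP/1.1 server and shows that two requests from one client share the same TCP socket (server and handler names are made up for this demo):

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 => connections persist by default

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
conn.getresponse().read()
first_socket = conn.sock            # remember the underlying TCP socket
conn.request("GET", "/")
conn.getresponse().read()
same_socket = conn.sock is first_socket
print(same_socket)                  # True: both requests shared one connection
conn.close()
server.shutdown()
```

Browsers do the same thing, just with the per-host connection limits and idle timeouts described above.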

Thursday, October 18, 2012

Ways to reduce MySQL Master-Slave Replication Lag

Batch jobs are often written to do the following; in MySQL, some of these can cause master-slave replication lag -

1. Need to update a lot of rows
2. Need to select rows for modification
3. Need to insert and select
4. Need to execute batch jobs on master
Some of the ways to overcome this are detailed below.

1. Need to update a lot of rows

Solution -
We know MySQL replication is single threaded. We should avoid any long-running write query that clogs the replication stream. One way to do this - chop the query into smaller pieces.

Example -
If we need to update 10,000 rows, then instead of doing this in the SAME query we can do it in, say, 10 small and short queries of 1,000 rows each.
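The chopping idea can be sketched as follows; sqlite3 stands in for MySQL here, and the table/column names are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, spam INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO posts (id) VALUES (?)",
                 [(i,) for i in range(1, 10001)])

CHUNK = 1000  # rows per statement; each short write replicates quickly

max_id = conn.execute("SELECT MAX(id) FROM posts").fetchone()[0]
for low in range(1, max_id + 1, CHUNK):
    # One small UPDATE per id range instead of a single 10,000-row statement.
    conn.execute("UPDATE posts SET spam = 1 WHERE id BETWEEN ? AND ?",
                 (low, low + CHUNK - 1))
    conn.commit()  # commit per chunk so each replicated event stays small

updated = conn.execute("SELECT COUNT(*) FROM posts WHERE spam = 1").fetchone()[0]
print(updated)  # 10000
```

Each chunk finishes quickly, so the single replication thread on the slave never gets stuck behind one giant write.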

2. Need to select rows for modification

If considerable effort is needed to select the rows for modification, spread it out into separate SELECT and UPDATE queries. The slave then only needs to run the UPDATE, and is not made to do more work than it needs to.

Example - turn

UPDATE posts SET spam=1 WHERE body LIKE "%cheap rolex%";

into

SELECT id FROM posts WHERE body LIKE "%cheap rolex%";
UPDATE posts SET spam=1 WHERE id IN (list of ids);
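A sketch of this select-then-update split, again with sqlite3 standing in for MySQL and made-up sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts "
             "(id INTEGER PRIMARY KEY, body TEXT, spam INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO posts (body) VALUES (?)",
                 [("buy a cheap rolex now",),
                  ("regular post",),
                  ("cheap rolex here",)])

# The expensive part (full-text LIKE scan) runs once, outside replication.
ids = [row[0] for row in
       conn.execute("SELECT id FROM posts WHERE body LIKE '%cheap rolex%'")]

# The cheap part is what replicates: an UPDATE by primary key only.
placeholders = ",".join("?" * len(ids))
conn.execute(f"UPDATE posts SET spam = 1 WHERE id IN ({placeholders})", ids)
conn.commit()

print(sorted(ids))  # ids of the two matching rows
```

The slave replays only the primary-key UPDATE, not the expensive LIKE scan.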

3. Need to insert and select

Solution -

The best solution is to avoid sending INSERT … SELECT through replication, for reasons like long query time, wasted re-execution on the slave, etc.

The alternative - piping the data through the application - is the best solution in many cases and is quite easy: it is trivial to write a function that takes a SELECT query and a target table, stores the result set into that table, and can be reused wherever this functionality is needed. The slave then only replays the resulting plain INSERTs.
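Such a helper can be sketched in a few lines; sqlite3 stands in for MySQL, and the function and table names are invented for this example:

```python
import sqlite3

def copy_select_into(conn, select_sql, target_table, batch_size=1000):
    """Run the SELECT on the application side and re-insert its rows into
    target_table, so the slave only replays plain batched INSERTs."""
    cur = conn.execute(select_sql)
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        placeholders = ",".join("?" * len(rows[0]))
        conn.executemany(
            f"INSERT INTO {target_table} VALUES ({placeholders})", rows)
        conn.commit()  # commit per batch to keep replicated events small

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER, spam INTEGER)")
conn.execute("CREATE TABLE spam_posts (id INTEGER, spam INTEGER)")
conn.executemany("INSERT INTO posts VALUES (?, ?)", [(1, 1), (2, 0), (3, 1)])

copy_select_into(conn, "SELECT id, spam FROM posts WHERE spam = 1", "spam_posts")
copied = conn.execute("SELECT COUNT(*) FROM spam_posts").fetchone()[0]
print(copied)  # 2
```

The SELECT cost is paid once on the master (or a read replica), while replication carries only the cheap INSERTs.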

4. Need to execute batch jobs on master

We should not overload replication. Quite typically we see replication lagging while batch jobs are running: they can load the master significantly during their run time, making it impossible for the slave to push the same load through a single thread.

Solution -

i) Space out and slow down the batch jobs (e.g. by adding sleep calls) to ensure there is enough breathing room for the replication thread.

ii) Use controlled execution of the batch job: check the slave lag every so often and pause if it becomes too large. This is a bit more complicated, but it saves us from hand-tuning the sleep behavior to keep progress fast enough while still keeping replication from lagging.
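The controlled-execution loop can be sketched as below. In real code, replication_lag() would query SHOW SLAVE STATUS and read Seconds_Behind_Master; here it is a stub returning canned values, and the threshold and chunk names are made up:

```python
import time

MAX_LAG_SECONDS = 30  # assumed threshold; tune per environment

def replication_lag():
    """Stand-in for SHOW SLAVE STATUS / Seconds_Behind_Master;
    pops simulated readings instead of querying a slave."""
    return replication_lag.values.pop(0) if replication_lag.values else 0

replication_lag.values = [5, 60, 10]  # simulated lag readings

def run_batch(chunks):
    done = []
    for chunk in chunks:
        # Pause before each chunk while the slave is too far behind.
        while replication_lag() > MAX_LAG_SECONDS:
            time.sleep(0.01)   # back off until the slave catches up
        done.append(chunk)     # process one small chunk of work here
    return done

result = run_batch(["chunk-1", "chunk-2", "chunk-3"])
print(result)
```

In this run the second reading (60 s) exceeds the threshold, so the job backs off once before finishing all three chunks.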


Refer -
http://www.mysqlperformanceblog.com/2008/09/22/fighting-mysql-replication-lag/
http://www.mysqlperformanceblog.com/2007/10/12/managing-slave-lag-with-mysql-replication/

Monday, October 1, 2012

File Fragmentation & MySQL Performance


Problem Description
---------------------------------------------------
For several months we had been observing a specific SELECT query that was tuned but running slowly against a couple of tables in MySQL 5.1.62 (InnoDB engine) in our production database. The queries remained slow after the database upgrade to MySQL 5.5.24.

One of the tables has 1.4 million rows and is 184 MB in size. The SELECT query was tuned, used a ref join, and scanned only ~150 rows, yet it took 220 milliseconds. Since we ran multiple such SELECT queries in a loop, the time for the transaction inflated to more than 30 seconds.

We ran the OPTIMIZE command on the table and the .ibd file did shrink by 20-25 MB, but the queries continued to be slow. We then dumped and restored the table, and to our surprise the queries ran very fast. The performance gain was 10 times.


The root cause was File fragmentation.

Lessons learnt -
1) A core difference between a dump/restore and OPTIMIZE TABLE.
2) The filefrag command.

1) DUMP and RESTORE causes the table to be rebuilt entirely from scratch, in the 5.5 row format. It also got rid of any fragmentation that may have existed on the filesystem (since we weren't doing a simple cp or mv of the files, as effectively happens with ALTER/OPTIMIZE TABLE), as well as on any logical volume, and finally on the actual storage device.

2) filefrag (Linux)

Following was the output generated for two tables:

> filefrag foo.ibd
foo.ibd: 83 extents found, perfection would be 2 extents

> filefrag poo.ibd
poo.ibd: 131 extents found, perfection would be 4 extents