Saturday, December 3, 2011

Some links I am tracking

Some links I am tracking:


http://www.webperformancetoday.com
http://kirk.blog-city.com/antipattern_nostress_testing.htm
http://www.javapassion.com/portal/java-performance-with-passion/java-performance-with-passion
http://www.simtree.net/more.html
http://www.javaperformancetuning.com/
http://www.perftestplus.com/presentations.htm

http://www.wilsonmar.com/
https://www.google.com/reader/bundle/user%2F04786489666004867675%2Fbundle%2Fperformance-feeds

Thursday, December 1, 2011

About fault isolative architectural structures

There are four general principles for designing and implementing swim lanes in a system’s architecture:
  1. There must not be any shared hardware or software between lanes other than possibly highly available network gear such as paired load balancers or border routers. (Nothing is shared)
  2. No synchronous calls can take place between swim lanes. If cross swim lane communication is required, e.g. grabbing search results to display on the same page as login, it must be done asynchronously. (Nothing crosses a swim lane boundary; transactions occur along swim lanes)
  3. Limit asynchronous communication. While permitted, more calls lead to a greater chance of failure propagation.
  4. Use timeouts with asynchronous calls. There is no need to tie up the user’s browser or your server waiting for the eventual TCP timeout.
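
Rules 2 and 4 could be sketched in Java roughly as below. This is only an illustration, not code from the book: `slowSearch` is a hypothetical stand-in for a call into another swim lane, the fallback text and the timeout values are made up, and `completeOnTimeout` needs Java 9 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class SwimLaneCall {
    // Hypothetical cross-lane call: pretends to fetch search results from another swim lane.
    public static String slowSearch(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "search results";
    }

    // Runs the call asynchronously and substitutes a fallback value
    // if no answer arrives within timeoutMillis.
    public static String searchWithTimeout(long delayMillis, long timeoutMillis) {
        return CompletableFuture
                .supplyAsync(() -> slowSearch(delayMillis))
                .completeOnTimeout("search unavailable", timeoutMillis, TimeUnit.MILLISECONDS)
                .join();
    }

    public static void main(String[] args) {
        System.out.println(searchWithTimeout(10, 500));   // answer arrives in time
        System.out.println(searchWithTimeout(2000, 100)); // too slow, fallback is used
    }
}
```

The caller waits at most for the timeout it chose, so a slow lane cannot tie up the page that only wanted to show its results.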
A cheat sheet:
Swim lane the money maker
Swim lane the problem areas
Swim lane natural barriers

--- From the book Scalability Rules.

Saturday, November 19, 2011

What comes to my mind when I think of Performance, Scalability, Availability ?

When I think of performance, this is what comes to my mind:

1) Parallel vs Serial
2) Cache vs disk access
3) Fewer clicks vs more clicks
4) Synchronous vs Asynchronous design
5) Compression
6) Archive
7) Concurrency
8) Contention
9) Stateful vs Stateless system design
10) Design to scale up vs scale out

Friday, November 18, 2011

Categories of objects in JVM heap

Objects created and stored on the heap generally fall into three categories:

1) Short-lived objects - the life cycle of these is bound to an HTTP request

2) Medium-lived objects - usually represent cache entries with a short TTL (time to live)

3) Long-lived objects - represent cache entries with a long TTL, settings and infrastructure objects (plugin framework, rendering engine, etc.), with cache entries taking most of the space.

Wednesday, November 2, 2011

Considerations for deciding between native and Web mobile apps

Should we use a mobile web app or a native app, and for what type of application? This question popped into my head whilst reviewing a mobile solution that we plan to launch.

After some reading I learnt that there are various factors one needs to bear in mind before deciding either way:

1) Market/User segmentation
2) Development and maintenance cost
3) Distribution and channel support
4) User experience and expectations
5) Off-line operation
6) Time to market
7) Development skills and capabilities
8) Data security
9) Technical product management
10) Usage analytics

Personally I believe that in the short term native apps may steal the show, but they may fade in the long run, as it is difficult to manage app versions for various devices and developing them requires various skill sets.

I am leaning more towards web apps built with cross-platform frameworks, as the number of other media touch points (tablets, web-connected TV, etc.) will increase. HTML5 would be a parallel track, so that whichever path takes center stage in future, I am at least using that technology.

Disable hot deployment in production

A colleague of mine shared this blog link, which may help if you are using the WebLogic application server. It talks about disabling hot deployment in WebLogic, especially for production environments.

http://jojovedder.blogspot.com/2009/05/slow-weblogic-response-jsp-and-servlet.html

Saturday, October 22, 2011

Memcached - Points to ponder

A few points to watch out for in Memcached implementation and testing:

  1. Decide between early loading vs lazy loading, or a combination
  2. Size memory and CPU (Memcached is not CPU intensive)
  3. Set the LRU policy (this will help memory sizing)
  4. Latency
  5. What to cache and what to skip?
  6. Identify no. of Memcached instances and banks
  7. Select appropriate log levels
  8. Separate & manage Memcached logs
  9. Monitor Memcached servers (CPU, memory, hits and misses of database and cache, etc.)
  10. Ensure code coverage to leverage caching
  11. Test for thread contention
  12. Identify when to use get vs multi-get
  13. Memcached allows us to scale out application server instances. With multiple instances we need to review any synchronized blocks used in our code, since they only coordinate threads within a single JVM.
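
As a rough illustration of points 1 and 9, here is a minimal Java sketch of lazy loading next to early loading, with hit and miss counters. It makes several assumptions: a `ConcurrentHashMap` stands in for the Memcached client (a real setup would call a client library instead), and `LazyCache`, `warm` and the `database` function are hypothetical names invented for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class LazyCache {
    // Stand-in for a Memcached client; not a real Memcached API.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Hypothetical slow data source, e.g. a database lookup.
    private final Function<String, String> database;
    private int hits = 0;
    private int misses = 0;

    public LazyCache(Function<String, String> database) {
        this.database = database;
    }

    // Lazy loading: check the cache first, go to the database only on a miss.
    public synchronized String get(String key) {
        String value = cache.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = database.apply(key);
        cache.put(key, value);
        return value;
    }

    // Early loading: fill the cache up front for keys known to be hot.
    public synchronized void warm(List<String> keys) {
        for (String key : keys) {
            cache.put(key, database.apply(key));
        }
    }

    public synchronized int hits() { return hits; }
    public synchronized int misses() { return misses; }

    public static void main(String[] args) {
        LazyCache c = new LazyCache(k -> "value-" + k);
        c.warm(List.of("home"));
        c.get("home");    // hit, loaded early
        c.get("profile"); // miss, loaded lazily
        c.get("profile"); // hit
        System.out.println("hits=" + c.hits() + " misses=" + c.misses());
    }
}
```

Watching the two counters under a realistic load is one way to judge whether the choice of what to cache is paying off.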

Sunday, October 16, 2011

Conway's Law.

This week I came across Conway's Law which says "Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."

I think it's important to note that in a project comprising different teams, the teams should NOT work in silos.

For example:
A project may consist of the following teams - user interface team (creates screens), development team (code), reporting team (develops reports), quality control team and the deployment/server monitoring team.

Now it may so happen that the reporting team creates a report in which the entire logic is written in stored procedures, because the team is well versed in writing procedures and its interaction with the development team is limited.

In this scenario, the performance team and monitoring team may find a procedure taking up database resources and may identify it as the root cause of a server slowdown. But in reality, the real root cause lies in what Conway's Law says.

If the development team and reporting team had worked together, then a summary table could probably have been populated during the transaction itself, and the reports could have worked off that table.

Sunday, September 18, 2011

Log slow queries, prioritize & tune them

In my current assignment we collect queries that take more than a second to execute in our production environment. We then parse these logs using a Perl script to populate the unique queries into a database table. A charting tool is used to generate the top 10 slow schemas and their top 10 slow queries to keep us focused.
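
The "unique queries" step can be sketched as follows. This is not our Perl script, just a small Java illustration of the idea under my own assumptions: literal values are replaced with `?` so that queries differing only in their parameters are counted together. `SlowQueryGrouper` and its methods are hypothetical names, and the regular expressions are deliberately simple.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SlowQueryGrouper {
    // Replaces quoted strings and standalone numbers with '?' and
    // normalizes whitespace and case, giving one pattern per query shape.
    public static String normalize(String sql) {
        return sql.trim()
                .replaceAll("'[^']*'", "?")
                .replaceAll("\\b\\d+\\b", "?")
                .replaceAll("\\s+", " ")
                .toLowerCase();
    }

    // Counts how many logged slow queries fall under each pattern.
    public static Map<String, Integer> countByPattern(List<String> queries) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String q : queries) {
            counts.merge(normalize(q), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> slowLog = List.of(
                "SELECT * FROM orders WHERE id = 42",
                "select *  from orders where id = 7",
                "SELECT name FROM users WHERE city = 'Pune'");
        countByPattern(slowLog).forEach((pattern, n) -> System.out.println(n + "  " + pattern));
    }
}
```

Sorting the resulting counts in descending order gives the kind of top 10 list mentioned above.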

As a next step we tune the queries in the following order of priority, starting with those that:

1. repeatedly run slow
2. examine a lot of records
3. fetch a lot of records
4. take more time to execute
5. hold locks for a long time

We spend some time every day on the reports, and many times we learn new patterns which, if addressed, help to get rid of many slow queries.

For example, we had a release of a common component that several applications use. We found that the slow query count suddenly increased one fine day. The common component was actually firing these queries, impacting the performance of the applications using it. We of course tuned all of them and witnessed the skyscrapers become huts in 2 days :)

String concatenation

There are various ways of implementing string concatenation in Java. Here are three that one often hears about, along with the performance implications of using each.

String – A String object is immutable, i.e. the value stored in it cannot be changed, so concatenating two String objects creates a new object internally. Use String objects in places where very little concatenation is needed.

StringBuffer – Is synchronized, which means its methods are thread safe, so a single buffer can be shared between threads.

StringBuilder – Added in Java 5. It has the same API as StringBuffer except that it is NOT synchronized, which means that if multiple threads access the same instance at the same time, there could be trouble (not thread safe). For single-threaded programs, the most common case, avoiding the overhead of synchronization makes StringBuilder slightly faster.

Note: StringBuffer and StringBuilder are mutable.
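
A small sketch of the three options side by side. The class and method names are made up for illustration. All three return the same text; the difference is in how many objects are created and whether the buffer is synchronized.

```java
public class ConcatDemo {
    // String with '+': every iteration builds a brand new String object.
    public static String joinWithString(String[] parts) {
        String result = "";
        for (String p : parts) {
            result = result + p;
        }
        return result;
    }

    // StringBuilder: appends modify one mutable buffer, no synchronization.
    public static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    // StringBuffer: same calls as StringBuilder, but each method is synchronized.
    public static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithString(parts));
        System.out.println(joinWithBuilder(parts));
        System.out.println(joinWithBuffer(parts));
    }
}
```

For a handful of concatenations the difference hardly matters; it starts to show when the loop runs many times.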

Review default configuration settings

I learnt that it's important to review the default server configuration settings. Some examples are:

1. Server log level, disabling console logging, etc.
2. Clients/threads in the web, application and database tiers
3. JVM settings in a JEE application server
4. Max file descriptors in an OS like Linux

How do I decide which HTML object I need?

After reviewing many web pages I have come up with some guidelines for selecting an HTML object, keeping its impact on performance in mind.

1. Criteria - If records are fewer than 5
Object type to use - Radio button/Check box
Example - Male/ Female

2. Criteria - Records rarely increase & is easy to scroll & choose
Object type to use - Drop down
Example - City, State, Country

3. Criteria - Records increase but still in range & limited filters to search
Object type to use - AJAX look up (AJAX call can be triggered after certain no. of characters are typed)
Example - Clinic names, Site names.

4. Criteria - Records are in thousands and can increase further, many filters to search
Object type to use - Use Lookups (popup screen)
Example - Search an item by item id, item name, store name, item type, etc

Web Reports - Some points to check for.

Some of the things one can check in web reports:

1. Mandate filters - To narrow the number of records fetched

2. Put appropriate constraints – A date field can always be constrained, although choosing its period, and any constraint on other fields, would require help from a functional domain expert

3. A constraint would typically require functional knowledge

4. Set default for certain filters e.g. Status filter defaulted to Open

5. Paginate depending on the context - need not be same for all screens

6. To reduce navigation show the latest/relevant records first

7. Leverage various Pagination design patterns

Points to ponder before writing a Stored Procedure

I have been in several debates and have read several articles on whether to use stored procedures. I came up with a set of points one should ponder before writing a stored procedure.

Basically a stored procedure is -
- A set of SQL statements stored on the database server, typically pre-compiled.
- Code that takes in certain arguments and is processed with those arguments at execution time

Points to ponder before we use a stored procedure

1. Ease of scaling the database tier compared to the application tier
2. Ease of writing and maintaining
3. Ease of debugging
4. Need for specialized skill set
5. Performance gain
6. Reduction in traffic between application and database tier
7. Reusability & transparency
8. Security

Developer - Query works fast in my environment but is slow in production?

The database's perspective matters more than the developer's. :)

Generate a query execution plan to see which plan the database optimizer has chosen.

An execution plan (Explain plan) helps to understand:

1. Number of rows examined
2. Use of indexes
3. Missing indexes
4. Type of join
5. Information on sorting, etc

Avoid Full Table Scans

A full table scan is where the database reads every row of the table instead of using an index.
Some of the reasons why full table scans are performed:

1. no WHERE clause
2. no index on any field in WHERE clause
3. poor selectivity on an indexed field
4. too many records meet WHERE conditions
5. using SELECT * FROM
6. function used on the indexed column in the query

Tip - Generate Explain plans to understand how the database optimizer treats the query

Software Library Management - Do’s & Don’ts

In software projects we make use of various types of software libraries. These libraries can be third party, application server related or project specific. Governance of library management in such projects is very important. I have compiled a small list of do's and don'ts of software library management.

1. Avoid duplicate jar files
2. Avoid multiple versions of the same jar file
3. Remove unwanted jars
4. Use the latest stable versions of library files
5. Use the server's default library, NOT an application-level copy, for common shared libraries
6. Set up a change management process to govern the libraries and their usage.

In one of my projects I am going through 250+ jar files to work out which files to keep, update or delete. It's difficult, as I do not know which jar files are used by which deployed application. After some googling I came across a very handy utility named JarAnalyzer, which helps to understand jar dependencies. The tool needs to be fed the folder that contains all the jar files, and the output is either an XML or a dot file. The dot file can be passed through open source graph visualization software to get a graphical representation of the dependencies.

URLs for reference:
http://code.google.com/p/jaranalyzer/
http://www.kirkk.com/main/Main/JarAnalyzer#xml
http://www.graphviz.org/