I came across a scenario where, under load, several application servers built up stuck threads and finally crashed. Thread dumps revealed that many threads were stuck on the following stack trace:
"ajp-10.10.48.19-8009-238" daemon prio=10 tid=0x00000000537f7000 nid=0x51c waiting for monitor entry [0x000000004547e000..0x000000004547fc00]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.io.BufferedInputStream.read(BufferedInputStream.java:308)
- waiting to lock <0x00002aaaf2868f98> (a java.io.BufferedInputStream)
at sun.security.provider.SeedGenerator$URLSeedGenerator.getSeedByte(SeedGenerator.java:453)
at sun.security.provider.SeedGenerator.getSeedBytes(SeedGenerator.java:123)
at sun.security.provider.SeedGenerator.generateSeed(SeedGenerator.java:118)
at sun.security.provider.SecureRandom.engineGenerateSeed(SecureRandom.java:114)
at java.security.SecureRandom.generateSeed(SecureRandom.java:495)
We had symptoms similar to NIA. Under load, the API that generates secure random numbers would take a long time. Seeding the generator relies on entropy gathered from events such as keyboard and mouse activity, and these are largely missing on servers. Hence the API takes time to produce a distinct random number. The API completes quickly when there is no load. But when many threads request a random number at once, they queue up on the same shared input stream (the BLOCKED state in the dump above) and the API slows down.
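To see this effect outside the application server, a small probe can call the same entry point from several threads and report the slowest call. This is only a sketch: the class and method names (SeedContentionProbe, slowestCallMillis) are made up for illustration, it assumes Java 8 or later for the lambda syntax, and the numbers it prints depend entirely on the kernel, JDK and entropy available on the machine.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the original application.
final class SeedContentionProbe {

    // Runs `task` once on each of `threads` threads and returns the slowest
    // single-call latency in milliseconds.
    static long slowestCallMillis(int threads, Runnable task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                jobs.add(() -> {
                    long start = System.nanoTime();
                    task.run();
                    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
                });
            }
            long slowest = 0;
            // invokeAll waits until every job has finished.
            for (Future<Long> result : pool.invokeAll(jobs)) {
                slowest = Math.max(slowest, result.get());
            }
            return slowest;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        // Same public entry point as in the stack trace: SecureRandom.generateSeed.
        long ms = slowestCallMillis(threads, () -> new SecureRandom().generateSeed(8));
        System.out.println(threads + " threads, slowest generateSeed call: " + ms + " ms");
    }
}
```

Running it with a thread count of 1 and then with a larger count gives a rough feel for how much of the delay comes from contention rather than from a single slow call.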
To stop the API from waiting for a perfect random number and to make it complete quickly, we set the following Java system property in the Java options.
java.security.egd=file:/dev/./urandom
The 90th percentile transaction response time went down from 15.1 sec to 1.6 sec by just setting this property for a load of 50 concurrent users.
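A quick way to confirm that the property actually reached the JVM is to print it from inside the process. The sketch below is an illustration only: the class name EntropySourceCheck is made up, and the fallback to the securerandom.source entry of the java.security file reflects my understanding of how the Sun provider picks its seed source, so treat that ordering as an assumption to verify on your JDK.

```java
import java.security.Security;

// Hypothetical helper for illustration; not part of the original application.
final class EntropySourceCheck {

    // Returns the -Djava.security.egd value if set, otherwise the
    // securerandom.source entry from the java.security file, otherwise "".
    static String effectiveSource() {
        String egd = System.getProperty("java.security.egd", "");
        if (!egd.isEmpty()) {
            return egd;
        }
        String fromSecurityFile = Security.getProperty("securerandom.source");
        return fromSecurityFile == null ? "" : fromSecurityFile;
    }

    public static void main(String[] args) {
        System.out.println("effective seed source: " + effectiveSource());
    }
}
```

Run it with the same -D option as the server; if it does not print the value you passed, the option is not being applied to that JVM.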
Here is the complete set of Java parameters:
-Djboss.jvmroute.name=f2m56g2 -Dprogram.name=run.sh -server -javaagent:/AppData/AppDynamics/AppServerAgent/javaagent.jar=uniqueID=f2m56 -Xms4096m -Xmx4096m -XX:NewRatio=3 -Xss256k -XX:+UseParallelGC -XX:ParallelGCThreads=4 -XX:+UseAdaptiveSizePolicy -Dsun.rmi.dgc.client.gcInterval=3600000 -Djboss.partition.name=f2m56 -Dsun.rmi.dgc.server.gcInterval=3600000 -XX:+DisableExplicitGC -XX:PermSize=384m -XX:MaxPermSize=384m -XX:+UseTLAB -Xloggc:ace2_gc.log -Dorg.jboss.logging.Log4jService.catchSystemOut=false -Djava.util.Arrays.useLegacyMergeSort=true -Djava.security.egd=file:/dev/./urandom -Djava.net.preferIPv4Stack=true -Djava.library.path=/jboss-4.2.3.GA/bin/native -Djava.endorsed.dirs=/jboss-4.2.3.GA/lib/endorsed -classpath /jboss-4.2.3.GA/bin/run.jar:/usr/java/jdk1.7.0_55/lib/tools.jar org.jboss.Main -Dorg.jboss.logging.Log4jService.catchSystemOut=false -Dcfg.system.property=/project/GEMSSmbImpl/config/env/common,/project/GEMSSmbImpl/config/env/GEMSCommonEnv.xml,/project/GEMSSmbImpl/config/env/GEMSOnlineEnv.xml,/project/GEMSSmbImpl/config/env/GEMSOnlineCfg.xml,/project/GEMSSmbImpl/config/Gems.properties
This is our environment:
OS version: Red Hat Enterprise Linux Server release 6.4 (Santiago)
JBoss version: jboss-4.2.3.GA
JDK version: jdk1.7.0_55
At a Linux prompt you can list the device /dev/urandom (for example with ls -l /dev/urandom) to confirm it exists. If your Linux version is different, you may have to find the equivalent urandom path.
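The same check can be done from Java, which also confirms that the user running the JVM is allowed to read the device. This is a minimal sketch under the assumption that the device lives at /dev/urandom; the class name RandomDeviceCheck and its readBytes method are made up for illustration, and the try-with-resources syntax needs Java 7 or later.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the original application.
final class RandomDeviceCheck {

    // Reads exactly `count` bytes from the given device path, or throws
    // IOException if the path is missing or not readable by this user.
    static byte[] readBytes(String device, int count) throws IOException {
        Path path = Paths.get(device);
        if (!Files.isReadable(path)) {
            throw new IOException(device + " is missing or not readable");
        }
        byte[] buffer = new byte[count];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(path))) {
            in.readFully(buffer);
        }
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default path; pass another path as the first argument if needed.
        String device = args.length > 0 ? args[0] : "/dev/urandom";
        byte[] sample = readBytes(device, 16);
        System.out.println("read " + sample.length + " bytes from " + device);
    }
}
```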
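NOTE
One funny thing. We had to use file:/dev/./urandom, with the extra "./" in the path. It would not work if we just set the property to /dev/random. JDKs of this generation also treat the exact value file:/dev/urandom specially and still end up reading /dev/random, which is why the "./" form is the one commonly used.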
Java was waiting on input to the secure random number generator code. On Linux, /dev/random is a “blocking” number generator, meaning that if it doesn’t have enough random data to provide, it will simply wait until it does. You can find more background information on /dev/random here - http://en.wikipedia.org/wiki//dev/random. Keyboard and mouse input as well as disk activity can generate the randomness, or entropy, that is needed, but perhaps not fast enough for particular applications. A lack of random data would force the JVM to wait, for eternity if necessary.
Reference links -
http://tech.top21.de/techblog/20100317-why-java-s-random-number-generation-might-block-my-application-under-linux.html;jsessionid=B76AA3FB494E7E50B732CEB498B61668
http://www.ghidinelli.com/2011/01/11/cold-fusion-slow-start-messagebrokerservlet