Wednesday, January 13, 2016

The redo and undo logs

Redo Logs -

Instead of writing to one file (the permanent table records file), you write to two different files: the redo log file and the permanent table records file. The difference is that writes to the database table file(s) are random in nature, while writes to the redo log file are sequential, which is usually much faster. You perform the much faster writes to the redo log as needed and then perform the slower writes to the table files periodically, when there is time. Thus, the system actually operates faster writing to both files rather than to only one. If the server crashes before the changes reach the table files, the redo log is applied automatically when the MySQL server starts up again. Other database platforms use very similar processes.
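The idea can be sketched with a toy key-value store. This is only an illustration of the pattern, not how InnoDB is implemented: the class name TinyStore, the file names, and the JSON record format are all made up for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy store: a sequential redo log plus a table file written lazily."""

    def __init__(self, directory):
        self.redo_path = os.path.join(directory, "redo.log")
        self.table_path = os.path.join(directory, "table.json")
        self.table = {}
        if os.path.exists(self.table_path):
            with open(self.table_path) as f:
                self.table = json.load(f)
        self.recover()

    def recover(self):
        # On startup, replay redo records that never reached the table file.
        if not os.path.exists(self.redo_path):
            return
        with open(self.redo_path) as f:
            for line in f:
                record = json.loads(line)
                self.table[record["key"]] = record["value"]

    def put(self, key, value):
        # Fast path: append one record to the end of the redo log and sync it.
        with open(self.redo_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.table[key] = value  # in memory only; the table file is now stale

    def checkpoint(self):
        # Slow path, done "when there is time": rewrite the table file, then
        # empty the redo log because its records are no longer needed.
        with open(self.table_path, "w") as f:
            json.dump(self.table, f)
            f.flush()
            os.fsync(f.fileno())
        open(self.redo_path, "w").close()


directory = tempfile.mkdtemp()
store = TinyStore(directory)
store.put("a", 1)
store.checkpoint()
store.put("b", 2)                 # reaches only the redo log before the "crash"
recovered = TinyStore(directory)  # a fresh instance simulates a restart
print(recovered.table)
```

The second instance sees both keys even though "b" was never written to the table file, because the constructor replays the redo log.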

Undo Logs -

In addition to the redo log there must also be undo logs. When a database user starts a transaction and executes some commands, the database does not know whether the user will end the transaction with a COMMIT or a ROLLBACK command. Ending with a COMMIT means all the changes made in the course of the transaction have to be preserved (fulfilling the Durable aspect of ACID). If the transaction gets interrupted for some reason, such as the MySQL daemon crashing, the client disconnecting before sending a COMMIT, or the user issuing a ROLLBACK command, then all changes made by the transaction need to be undone.
If the server crashed, the redo log files are applied first, on startup. This puts the database in a consistent state. The database server then needs to roll back the transactions that were not committed but had already made changes to the database. Undo logs are used for this. As an example, if you are running a transaction that adds a million rows and the server crashes after eight hundred thousand inserts have been performed, the server will first use the redo log to get the database into a consistent state and will then roll back the eight hundred thousand inserts using the undo logs. For InnoDB, this undo information is stored in the ibdata file(s).
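The two recovery steps, redo first and then undo, can be sketched as follows. This is a simplified model with assumed structures: the tuple layouts, the recover function, and the use of None for "row did not exist before" are inventions for the example and do not reflect InnoDB's actual log formats.

```python
def recover(pages, redo_log, undo_log, committed):
    """Replay redo first, then roll back uncommitted transactions with undo."""
    # Step 1: redo. Reapply every logged change, committed or not.
    for txn, key, new_value in redo_log:
        pages[key] = new_value
    # Step 2: undo. Walk the undo records newest-first and restore the old
    # value for every change whose transaction never committed.
    for txn, key, old_value in reversed(undo_log):
        if txn not in committed:
            if old_value is None:
                pages.pop(key, None)  # row did not exist before: drop the insert
            else:
                pages[key] = old_value
    return pages


# Transaction 1 committed one insert; transaction 2 crashed after two inserts.
redo_log = [(1, "row1", "a"), (2, "row2", "b"), (2, "row3", "c")]
undo_log = [(1, "row1", None), (2, "row2", None), (2, "row3", None)]
result = recover({}, redo_log, undo_log, committed={1})
print(result)
```

Only the committed insert survives; the two inserts from the interrupted transaction are removed again by the undo pass.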

Refer - 

High VM Memory analysis and lessons learnt

This week I investigated high VM memory utilization on JBoss servers in production.
Using the Linux top command, I found that the JBoss Java process was hogging 6.8 GB (resident memory). We have set the JVM heap to 6144 MB.
Let's look at the sizing formula -

Max memory = [-Xmx] + [-XX:MaxPermSize] + number_of_threads * [-Xss]

6144 MB + 384 MB + 25 MB (say 100 threads * 256 KB = 25 MB) = 6553 MB (about 6.4 GB, close to the observed 6.8 GB)
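The same arithmetic as a small helper, in case you want to plug in other settings. The function name is made up for this post, and the thread count of 100 is the same guess as above, not a measured value.

```python
def estimate_jvm_memory_mb(xmx_mb, max_perm_mb, threads, xss_kb):
    """Rough estimate from the formula: heap + PermGen + thread stacks."""
    thread_stacks_mb = threads * xss_kb / 1024  # KB to MB
    return xmx_mb + max_perm_mb + thread_stacks_mb


estimate = estimate_jvm_memory_mb(xmx_mb=6144, max_perm_mb=384, threads=100, xss_kb=256)
print(f"{estimate:.0f} MB = {estimate / 1024:.2f} GB")  # prints "6553 MB = 6.40 GB"
```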
Read this article -

https://dzone.com/articles/why-does-my-java-process which says: "But besides the memory consumed by your application, the JVM itself also needs some elbow room."


This means the actual memory the JBoss process uses is not equal to the configured JVM heap size; it is higher, at least in the case of Java 1.7.

I learnt one more thing during the investigation. There were several other Java processes related to the machine agent. We use an APM tool, and this tool has a machine agent that collects VM resource metrics for correlation. Recently the script that restarts the JBoss servers was changed from killall -9 java to kill -9. Because of this, for some reason, a machine agent process was left hanging each day. Each process was hogging 50-60 MB, which becomes significant over days given that the OS also requires memory. I wonder what would come out as the root cause of the application servers swapping. It would sound something like this: "The issue of slowness/servers crashing/swapping was due to an APM tool agent" :)

Thursday, January 7, 2016

Quality of Service and Throttling (Work in progress)

QoS is not just about isolation; it’s about giving customers/apps what they need.
After setting a QoS, it's possible that the user experience is still bad. In that case we throttle the incoming traffic so that the experience is maintained for those users who are, say, already logged in.
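One possible way to read the throttling idea is as a cap on new sessions while existing sessions are always let through. The sketch below is only that interpretation; the class name LoginThrottle, the session-count limit, and the method names are assumptions for illustration, not a description of any particular product.

```python
class LoginThrottle:
    """Admit requests from existing sessions; cap how many new sessions may start."""

    def __init__(self, max_sessions):
        self.max_sessions = max_sessions
        self.sessions = set()

    def admit(self, session_id):
        if session_id in self.sessions:
            return True   # already logged in: never throttled
        if len(self.sessions) < self.max_sessions:
            self.sessions.add(session_id)
            return True   # capacity left: accept the new login
        return False      # at capacity: turn the newcomer away

    def logout(self, session_id):
        self.sessions.discard(session_id)


throttle = LoginThrottle(max_sessions=2)
decisions = [throttle.admit(user) for user in ("alice", "bob", "carol", "alice")]
print(decisions)
```

Here "carol" is rejected while "alice" keeps being served; once someone logs out, a new session can be admitted again.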

Sunday, January 3, 2016

What to ELK (work in progress)

User Activity Data
How many reports?
How many logins?
How many transactions done?

Host Activity Data
What is the CPU utilization?
What is the memory utilization?

Application Activity Data

Which reports are being accessed most?


Server logs - Exceptions logging/Excessive logging

Top slow and high count URLs
Response-code-wise name and count of URLs
Response-time-wise (bucketed) name and count of URLs
IP-wise count of URLs
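To make these URL breakdowns concrete, here is what the counts look like when computed by hand over a few access-log lines. In ELK these would be aggregations over indexed log events; the plain-Python version below only illustrates the grouping. The log format, the sample lines, and the bucket thresholds are all invented for the example.

```python
from collections import Counter

# Hypothetical access-log lines: client IP, URL, response code, response time in ms.
SAMPLE_LOG = """\
10.0.0.1 /reports/sales 200 120
10.0.0.2 /login 401 35
10.0.0.1 /reports/sales 200 2400
10.0.0.3 /reports/stock 500 900
10.0.0.2 /login 200 40
"""


def bucket(ms):
    # Illustrative response-time buckets; the thresholds are arbitrary.
    if ms < 100:
        return "<100ms"
    if ms < 1000:
        return "100ms-1s"
    return ">=1s"


by_url, by_code, by_bucket, by_ip = Counter(), Counter(), Counter(), Counter()
slowest = {}  # worst response time seen per URL, in ms
for line in SAMPLE_LOG.splitlines():
    ip, url, code, ms = line.split()
    ms = int(ms)
    by_url[url] += 1
    by_code[(code, url)] += 1
    by_bucket[(bucket(ms), url)] += 1
    by_ip[ip] += 1
    slowest[url] = max(slowest.get(url, 0), ms)

print(by_url.most_common(1))
print(max(slowest, key=slowest.get))
```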


Security Infrastructure Data
How many failed logins?
How many logins?

Database-
Slow Query Logs

Real User Monitoring-
Page Render
Page Load
Page Weight
TTFB (time to first byte)
Page requests
First Interactive Time
Speed Index.