This entry goes out to my Oracle techie friends who have been in the Linux camp for some time now and are suddenly finding themselves needing to know more about Solaris… hmmmm… I wonder if this has anything to do with Solaris now being an available option with Exadata? Or maybe the recent announcement that the SPARC T3 multiplier for T3-x servers is now 0.25. Judging by my inbox recently, I expect the renewed interest in Solaris to continue.
I have focused on Oracle database performance on Solaris for 14 years now. In the last few years, I began to work on Exadata and found myself needing to learn the “Linux” way of performance analysis. Here I will cover some basic tools needed for Oracle performance analysis on Solaris, as well as some special performance topics. I am not a deep-dive kernel type of guy, so don’t expect esoteric DTrace scripts to impress your friends. I am not going to cover how to patch and maintain Solaris; that is out of scope. With this in mind, let’s get started.
prstat(1M) … top on steroids!
Probably the first tool that the typical Linux performance guy reaches for is top. It is part of every Linux distribution that I know of but is sadly missing from Solaris… But Solaris has something much better: “prstat(1M)”. I know the name is boring, but it simply means “process status”. This is the first place to get an idea of how processes are performing on a system, and quite likely the most useful tool in general.
# prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  5623 oracle     32G   32G cpu63    0    0   0:02:23 1.0% oracle/1
  5625 oracle     32G   32G cpu60    0    0   0:02:22 1.0% oracle/1
  5627 oracle     32G   32G sleep    0    0   0:02:18 1.0% oracle/1
  5629 oracle     32G   32G cpu38    0    0   0:02:16 1.0% oracle/1
  5609 oracle     43M   38M sleep    0    4   0:01:21 0.6% rwdoit/1
  5605 oracle     43M   38M sleep    0    4   0:01:18 0.6% rwdoit/1
  5607 oracle     43M   38M sleep    0    4   0:01:18 0.6% rwdoit/1
  5601 oracle     43M   38M sleep    0    4   0:01:17 0.6% rwdoit/1
  ...
  ...
Total: 106 processes, 447 lwps, load averages: 5.03, 7.11, 19.48

Top would show you....

# top
load averages: 5.06, 6.84, 18.81    14:50:13
109 processes: 100 sleeping, 1 stopped, 8 on cpu
CPU states: 92.7% idle, 4.0% user, 3.3% kernel, 0.0% iowait, 0.0% swap
Memory: 128G real, 60G free, 50G swap in use, 135G swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME CPU-a- COMMAND
  5623 oracle     1   0    0    0K    0K cpu57    3:08 66.41% oracle
  5625 oracle     1   0    0    0K    0K cpu33    3:06 66.02% oracle
  5627 oracle     1   0    0    0K    0K cpu14    3:02 64.45% oracle
  5629 oracle     1   0    0    0K    0K cpu41    2:59 63.28% oracle
  5609 oracle     1   0    4    0K    0K cpu31    1:46 37.70% rwdoit
  5605 oracle     1   0    4    0K    0K cpu55    1:43 36.33% rwdoit
  5607 oracle     1   0    4    0K    0K sleep    1:43 36.33% rwdoit
  5601 oracle     1   0    4    0K    0K cpu32    1:42 35.94% rwdoit
What is happening? “top” shows details from a process point of view, whereas by default prstat(1M) shows the system aggregates. To make prstat(1M) look more like top, you have to turn on micro-state accounting output with the “-m” option. With the “-m” option, prstat(1M) shows the CPU utilization of each process like top does, but with a LOT more detail. You get CPU time broken out by user (USR) and system (SYS). You can find the percentage of time spent in traps (TRP), in sleep (SLP), and waiting for a CPU (LAT). Finally, you can see the number of voluntary (VCX) and involuntary (ICX) context switches, along with the number of threads per process (NLWP).
# prstat -m
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
  5623 oracle    48  18 0.0 0.0 0.0 0.0  29 4.3 38K   8 .3M   0 oracle/1
  5625 oracle    48  18 0.0 0.0 0.0 0.0  30 4.3 38K   9 .3M   0 oracle/1
  5627 oracle    43  21 0.0 0.0 0.0 0.0  30 5.5 32K   7 .3M   0 oracle/1
  5629 oracle    42  21 0.0 0.0 0.0 0.0  31 5.6 33K   6 .3M   0 oracle/1
  5609 oracle    17  21 0.0 0.0 0.0 0.0  57 5.7 33K   5 77K   0 rwdoit/1
  5605 oracle    20  17 0.0 0.0 0.0 0.0  59 4.5 38K   5 89K   0 rwdoit/1
  5607 oracle    17  19 0.0 0.0 0.0 0.0  58 5.4 32K   4 75K   0 rwdoit/1
  5601 oracle    20  16 0.0 0.0 0.0 0.0  59 4.5 38K   5 90K   0 rwdoit/1
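If you want to script against this output, here is a minimal sketch of one way to do it. The helper name “high_latency” and the default threshold of 5 are my own assumptions for illustration, not anything that ships with Solaris. It reads “prstat -m” style output on stdin and prints the PID, LAT value, and process name for rows where LAT is at or above the threshold. The field position (LAT as the 10th column) is taken from the sample output above.

```shell
#!/bin/sh
# Assumption: on Solaris the legacy /usr/bin/awk may lack -v, so allow an
# override such as AWK=nawk. On Linux plain awk is fine.
AWK=${AWK:-awk}

# high_latency: hypothetical helper, not part of Solaris.
# Reads `prstat -m` output on stdin; $1 is the LAT threshold (default 5,
# an arbitrary value chosen for illustration).
high_latency() {
    "$AWK" -v min="${1:-5}" '
        # Only data rows start with a numeric PID; this skips the header
        # and the trailing "Total:" line.
        $1 ~ /^[0-9]+$/ && ($10 + 0) >= (min + 0) { print $1, $10, $NF }
    '
}

# Example on a Solaris host: one 5-second sample of the top 20 processes.
#   prstat -m -n 20 5 1 | high_latency 5
```

Keeping the filter separate from the prstat(1M) call means you can also run it against output you saved earlier.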
There are a lot of options available to prstat(1M) so do take a look at the prstat(1M) man page. Also look at the “scalingbits” blog for an excellent discussion on prstat(1M) monitoring. Stephan goes into much more detail about the utility and how to monitor by “zones” or “projects”… very useful stuff.
lsof = pfiles(1)… the proc(1) commands
pfiles(1) mirrors what “lsof” does, but much more information is available on a per-process basis. A man of “proc(1)” shows the available process-related commands: “pstack(1)”, “pldd(1)”, and “pflags(1)”, to name a few. These utilities, referred to as process introspection commands, are detailed nicely on the solarisinternals.com website.
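For scripts that have to run in both camps, a small wrapper can hide the difference. This is only a sketch: “open_files_cmd” is a hypothetical name of mine, and it assumes lsof is installed on the non-Solaris side. It prints the command to run rather than running it, so you can check what it picked.

```shell
#!/bin/sh
# open_files_cmd: hypothetical helper that prints the command a script could
# run to list the open files of a process.
#   $1 = PID
#   $2 = OS name (optional; defaults to the output of `uname -s`)
open_files_cmd() {
    pid=$1
    os=${2:-$(uname -s)}
    case "$os" in
        SunOS) echo "pfiles $pid" ;;   # Solaris: one of the proc(1) tools
        *)     echo "lsof -p $pid" ;;  # Linux and others: assumes lsof is installed
    esac
}

# The other proc(1) tools take a PID the same way, for example:
#   pstack 5623   # stack trace for each thread in the process
#   pldd 5623     # shared libraries loaded by the process
#   pflags 5623   # /proc flags and signal state for each thread
```

To actually run the chosen command, something like `$(open_files_cmd 5623)` on the command line would do.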
vmstat(1M), mpstat(1M), iostat(1M), sar(1)… basically the same!
It is good to know that some things are not all that different. The above tools have minor differences but generally look the same. Some options are different or expanded. Take iostat, for example.
iostat(1M) for Solaris has a “-z” option that leaves out the devices that are not doing any IO and would show only zeros. The biggest issue with most of these tools comes into play when options are missing or the formatting is different. This breaks scripts that have been developed to aid analysis. This is not too hard to fix… just something that is going to have to be worked out.
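As one example of working around a missing option, here is a rough sketch of a filter that approximates what “-z” does for output that lacks it. The name “drop_idle” is hypothetical, and the rule it applies (drop a row only when every numeric field is zero) is my own simplification, not how iostat(1M) itself is implemented.

```shell
#!/bin/sh
# Assumption: on Solaris, AWK=nawk may be a safer choice than the legacy awk.
AWK=${AWK:-awk}

# drop_idle: hypothetical filter, not part of any OS.
# A row is dropped when it has at least one numeric field and all of its
# numeric fields are zero. Header lines and any row without numeric fields
# pass through unchanged.
drop_idle() {
    "$AWK" '
    {
        nums = 0; nonzero = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+(\.[0-9]+)?$/) {
                nums++
                if ($i + 0 != 0) nonzero = 1
            }
        }
        if (nums == 0 || nonzero) print
    }'
}

# Example usage:
#   iostat -x 5 | drop_idle    # where the option is not available
#   iostat -xnz 5              # Solaris: let iostat(1M) do the filtering
```

A text filter like this is fragile compared to the native option, so treat it as a stopgap while scripts are being ported.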
References
The best references for Solaris performance and analysis would be the “Solaris Internals” and the “Solaris Performance and Tools” books. These books describe the architecture of Solaris and show how to analyze and diagnose performance issues… and you can get them on the Kindle.
The books also have an accompanying website “solarisinternals.com” to continue the discussion.
That is all for now…
Looks like Oracle has shut down solarisinternals.com for good… It is not accessible…
You might find this series of videos useful as well: http://smartos.org/2011/05/04/video-the-gregg-performance-series/
Brendan’s analysis is brilliant and goes deep into the kernel… but it ignores algorithmic changes in the database. In the case I wrote about, no amount of kernel diving could have solved it. Both types of analysis are useful when done together.