Posts Tagged 'NAS'

Kernel NFS fights back… Oracle throughput matches Direct NFS with latest Solaris improvements

After my recent series of postings, I was made aware of David Lutz’s blog on NFS client performance with Solaris.  It turns out that you can vastly improve the performance of NFS clients using a new parameter to adjust the number of client connections.

root@saemrmb9> grep rpcmod /etc/system
set rpcmod:clnt_max_conns=8

This parameter was introduced in a patch for various flavors of Solaris.  For details on the various flavors, see David Lutz’s recent blog entry on improving NFS client performance.  Soon, it should be the default in Solaris making out-of-box client performance scream.

DSS query throughput with Kernel NFS

I re-ran the DSS query referenced in my last entry and now kNFS matches the throughput of dNFS with 10gigE.

Kernel NFS throughput with Solaris 10 Update 8 (set rpcmod:clnt_max_conns=8)

This is great news for customers not yet on Oracle 11g.  With this latest fix to Solaris, you can match the throughput of Direct NFS on older versions of Oracle.  In a future post, I will explore the CPU impact of dNFS and kNFS with OLTP style transactions.


Direct NFS vs Kernel NFS bake-off with Oracle 11g and Solaris… and the winner is

NOTE::  Please see my next entry on Kernel NFS performance and the improvements that come with the latest Solaris.


After experimenting with dNFS it was time to do a comparison with the “old” way.  I was a little surprised by the results, but I guess that really explains why Oracle decided to embed the NFS client into the database 🙂

bake-off with OLTP style transactions

This experiment was designed to load up a machine, a T5240, with OLTP style transactions until no more CPU available.  The dataset was big enough to push about 36,000 IOPS read and 1,500 IOPS write during peak throughput.  As you can see, dNFS performed well which allowed the system to scale until DB server CPU was fully utilized.   On the other hand, Kernel NFS throttles after 32 users and is unable to use the available CPU to scale transactional throughput.

lower cpu overhead yields better throughput

A common measure for benchmarks is to figure out how many transactions per CPU are possible.  Below, I plotted the CPU content needed for a particular transaction rate.  This chart shows the total measured CPU (user+system) to for a given TPS rate.


As expected, the transaction rate per CPU is greater when using dNFS vs kNFS.  Please do note, that this is a T5240 machine that has 128 threads or virtual CPUs.  I don’t want to go into semantics of sockets, cores, pipelines, and threads but thought it was at least worth noting.  Oracle sees a thread of a T5240 as a CPU, so that is what I used for this comparison.

silly little torture test

When doing the OLTP style tests with a normal sized SGA, I was not able to fully utilize the 10gigE interface or the Sun 7410 storage.   So, I decided to do a silly little micro benchmark with a real small SGA.  This benchmark just does simple read-only queries that essentially result in a bunch of random 8k IO.  I have included the output from the Fishworks analytics below for both kNFS and dNFS.

Random IOPS with kNFS and Sun Open Storage

Random IOPS with dNFS and Sun 7410 open storage

I was able to hit ~90K IOPS with 729MB/sec of throughput with just one 10gigE interface connected to Sun 7140 unified storage.  This is an excellent result with Oracle 11gR2 and dNFS for a random test IO test… but there is still more bandwidth available.  So, I decided to do a quick DSS style query to see if I could break the 1GB/sec barrier.

SQL> select /*+ parallel(item,32) full(item) */ count(*) from item;
Elapsed: 00:00:06.36

SQL> select /*+ parallel(item,32) full(item) */ count(*) from item;

Elapsed: 00:00:16.18

kNFS table scan

dNFS table scan

Excellent, with a simple scan I was able to do 1.14GB/sec with dNFS more than doubling the throughput of kNFS.

configuration notes and basic tuning

I was running on a T5240 with Solaris 10 Update 8.

$ cat /etc/release
Solaris 10 10/09 s10s_u8wos_08a SPARC
Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
Assembled 16 September 2009

This machine has the a built-in 10gigE interface which uses multiple threads to increase throughput.  Out of the box, there is very little to tuned as long as you are on Solaris 10 Update 8.  I experimented with various settings, but found that only basic tcp settings were required.

ndd -set /dev/tcp tcp_recv_hiwat 400000
ndd -set /dev/tcp tcp_xmit_hiwat 400000
ndd -set /dev/tcp tcp_max_buf 2097152
ndd -set /dev/tcp tcp_cwnd_max 2097152

Finally, on the storage front, I was using the Sun Storage 7140 Unified storage server as the NFS server for this test.  This server was born out of the Fishworks project and is an excellent platform for deploying NFS based databases…. watch out NetApp.

what does it all mean?

dNFS wins hands down.  Standard kernel NFS only essentially allows one client per “mount” point.  So eventually, we see data queued to a mount point.  This essentially clips the throughput far too soon.   Direct NFS solves this problem by having each Oracle shadow process mount the device directly.  Also with dNFS, all the desired tuning and mount point options are not necessary.  Oracle knows what options are most efficient for transferring blocks of data and configures the connection properly.

When I began down this path of discovery, I was only using NFS attached storage because nothing else was available in our lab… and IO was not initially a huge part of the project at hand.  Being a performance guy who benchmarks systems to squeeze out the last percentage point of performance, I was skeptical about NAS devices.  Traditionally, NAS was limited by slow networks and clumsy SW stacks.   But times change.   Fast 10gigE networks and Fishworks storage combined with clever SW like Direct NFS really showed this old dog a new trick.

Direct NFS access to Sun Storage 7410 with Oracle 11g and Solaris… configuration and verifcation

During the course of experimentation with 11gR2, I was given some space on a Sun Storage 7410 NAS.   In the past NAS meant using NFS with obscure mount options that seemed to vary from platform to platform.  So, at first I went scrambling for the “best practices” to use with Oracle NAS on Solaris.

There is a nice Metalink article Note:359515.1 with the latest information for all platforms.  This Metalink note does include the “tcp” option which is not necessary on Solaris.  So it boiled down to the following mount options for using Oracle data files on NAS devices with Solaris.

forcedirectio, vers=3,suid

But wait, what about the new 11g feature to use direct NFS “dNFS”?    More searching…

configuring dNFS on Solaris

This is a fairly simple process.  Although Oracle dNFS configuration is fairly well documented for Linux, I will post my interpretation and commentary to help other Solaris users that might want to configure dNFS.

First, create mount the NFS share just as you would have in the past.  Oracle still needs to see the file system from the OS point of view.  You don’t have to use the mount options as in the past, but you might want them anyway for OS tools may access the mount.  You would most likely place these options in the “/etc/vfstab” file, but I will just show the mount command.

mount -o rw,bg,hard,nointr,rsize=32768,\
wsize=32768,noac,forcedirectio,vers=3,suid \

toromondo.west:/export/glennf /ar1

Second, you have to link the direct NFS libraries in place of ODM.  This is a little clunky, but not terrible.

cp libodm11.so_stub
ln -s

Third, create the “$ORACLE_HOME/dbs/oranfstab” file.  This file defines the various details Oracle needs to directly access the NFS share.  You can configure multiple paths, so that Oracle can multiplex access to the NFS share.  This is for redundancy and load balancing.  There is another Metalink article ID:822481.1 that details how to configure dNFS with multiple paths across the same subnet and force the OS to not route packets.  This is a great feature, which I will try once I get some more network plumbing.  For now, I just did the most simple configuration as shown below.

cat $ORACLE_HOME/dbs/oranfstab
server: toromondo.west
path: toromondo.west
export: /export/glennf mount:/ar1

Finally, you will be able to see if this takes effect by looking at the “alert.log” file.  When Oracle starts up it places debug information in the alert.log file so we can see if Oracle is using Direct NFS or not.

grep NFS alert_*.log
Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 2.0
Direct NFS: attempting to mount /export/glennf on filer toromondo.west defined in oranfstab
Direct NFS: channel config is:
Direct NFS: mount complete dir /export/glennf on toromondo.west mntport 38844 nfsport 2049
Direct NFS: channel id [0] path [toromondo.west] to filer [toromondo.west] via local [] is UP
Direct NFS: channel id [1] path [toromondo.west] to filer [toromondo.west] via local [] is UP

That’s all there is to it.  Hopefully, you will find this useful.

Oaktable Member