How to share directories using Linux NFS

You want to share directories and files between computers on your network but don’t want to use SAMBA or transfer them via FTP. Also, you can mount directories with NFS, so they appear as local files/folders. NFS also has the advantage of not needing special programs to move files back and forth.

Install the nfs-utils package on both the server and the client. It is not strictly necessary for basic NFS on the client, but it provides some useful features such as locking.

Modify EXPORTS

Once NFS is installed on the server you need to tell the daemon which directories to export to NFS.

The /etc/exports file controls access to the NFS shares and is in the following format:

directory machineA(option,option) machineB(option,option) …

Where:

* directory = directory to export (e.g. /mnt/hdb1)
* machine(A|B)… = machine(s) allowed to mount this export (see below). You can specify any number of machines, each separated by a single space.
* option = options for the exporting (see below)

File: /etc/exports

By IP address

/opt/media 192.168.0.100(async,no_subtree_check,rw) 192.168.0.101(async,no_subtree_check,rw)

By DNS name

/opt/media spunkster(async,no_subtree_check,rw) nivvy(async,no_subtree_check,rw)

Or by IP range

/opt/media 192.168.0.0/255.255.255.0(async,no_subtree_check,rw)

(If you get an “Error exporting NFS directories” while using the “IP” example above switch to the “IP range” style.)
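As a rough sketch of the format above, the following shell function assembles one /etc/exports entry from a directory, an option list, and any number of clients. The function name exports_line is hypothetical, and the sketch assumes every client gets the same options.

```shell
# exports_line DIR OPTIONS CLIENT...
# Prints one /etc/exports entry; each client gets the same option list.
# (Hypothetical helper, not part of nfs-utils.)
exports_line() {
    dir=$1
    opts=$2
    shift 2
    line=$dir
    for client in "$@"; do
        # Append "client(options)" separated by a single space
        line="$line $client($opts)"
    done
    printf '%s\n' "$line"
}
```

Called as exports_line /opt/media async,no_subtree_check,rw 192.168.0.100 192.168.0.101, it prints the "By IP address" line shown above, which you could then review before adding it to /etc/exports.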

After this is complete, start the NFS daemon and add it to the default runlevel by issuing: /etc/init.d/portmap start && /etc/init.d/nfs start && rc-update add nfs default

If you make later changes to your exports file it is recommended that you run the following command and restart the NFS daemon: exportfs -ra && /etc/init.d/nfs reload

Options

The options tailor the access that connecting machines have to the directory. Options are set on a per-client-machine basis and are a simple comma-separated list enclosed in parentheses after the machine they modify.

ro
(default) The client machine will have READ-ONLY access to the directory.
rw
The client machine will have READ/WRITE access to the directory.
no_root_squash
By default, any file request made by user root on the client machine is treated as if it is made by user nobody on the server. (Exactly which UID the request is mapped to depends on the UID of user “nobody” on the server, not the client.) If no_root_squash is selected, then root on the client machine will have the same level of access to the files on the system as root on the server. This can have serious security implications, although it may be necessary if you want to perform any administrative work on the client machine that involves the exported directories. You should not specify this option without a good reason.
no_subtree_check
If only part of a volume is exported, a routine called subtree checking verifies that a file that is requested from the client is in the appropriate part of the volume. If the entire volume is exported, disabling this check will speed up transfers.
sync
By default, all but the most recent version (version 1.11) of the exportfs command will use async behavior, telling a client machine that a file write is complete – that is, has been written to stable storage – when NFS has finished handing the write over to the filesystem. This behavior may cause data corruption if the server reboots, and the sync option prevents this. See Section 5.9 of the NFS FAQ for a complete discussion of sync and async behavior.
async
The opposite of sync. If neither option is set, the system defaults to sync. Using async speeds up transfers, at the cost of the data-corruption risk described under sync.

(descriptions taken from NFS FAQ listed below)

insecure
Tells the NFS server to accept requests that originate from unprivileged ports (1024 and above). This may be needed to allow mounting the NFS share from Mac OS X or through the nfs:/ kioslave in KDE.

Mounting exported directories

After the server is set up and the NFS daemon is running, you can move on to mount the exported directories on your client.

First, make sure the portmap service is running. Start the portmap daemon by issuing: /etc/init.d/portmap start

Adding the portmap daemon to the boot process is not usually needed because Gentoo’s init scripts figure that out by themselves! However to make sure that the init scripts indeed added portmap to the default runlevel type: rc-update show

If ‘default’ isn’t present on the portmap line, then it won’t start at the next bootup. Correct this by issuing: rc-update add portmap default

To do a test mount, create a mount directory and mount the remote drive by issuing: mount x.x.x.x:/directory /mount_directory

Where:

* x.x.x.x = IP address or DNS name of the NFS server
* /directory = Directory exported and to be mounted
* /mount_directory = Local mount point for the exported NFS directory

The mount may take a couple of minutes to complete, so be patient.

hosts.allow

If you try to mount your NFS partition and get something similar to this:

# mount /mnt/nivvy
NFS Portmap: RPC: Program not registered

then it’s being blocked. To unblock it, edit the following:

On the NFS server, add all IPs you want accessing your NFS shares (again)
File: /etc/hosts.allow

# Portmapper is used for all RPC services; protect your NFS!
# (IP addresses rather than hostnames *MUST* be used here)
portmap: 192.168.0.20
lockd: 192.168.0.20
rquotad: 192.168.0.20
mountd: 192.168.0.20
statd: 192.168.0.20

or by IP Range

# Portmapper is used for all RPC services; protect your NFS!
# (IP addresses rather than hostnames *MUST* be used here)
portmap: 192.168.0.0/255.255.255.0
lockd: 192.168.0.0/255.255.255.0
rquotad: 192.168.0.0/255.255.255.0
mountd: 192.168.0.0/255.255.255.0
statd: 192.168.0.0/255.255.255.0

Followed by the command: /etc/init.d/portmap restart

Automatic mounting via FSTAB

To make the mounting occur on startup, add the following line to your FSTAB:

x.x.x.x:/directory /mount_directory nfs rw 0 0

Where the variables are defined as above.
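A minimal sketch of adding that line without creating duplicates is shown below. The function name add_nfs_mount is hypothetical; it assumes fields in the file are separated by single spaces (an existing entry that uses tabs would not be detected) and it always uses the "nfs rw 0 0" fields from the example above.

```shell
# add_nfs_mount FSTAB SERVER EXPORT MOUNTPOINT
# Appends an NFS entry to FSTAB unless the same source and mount point
# are already listed. (Hypothetical helper for illustration.)
add_nfs_mount() {
    fstab=$1
    entry="$2:$3 $4 nfs rw 0 0"
    # -F: match the text literally, -q: only report via exit status
    if grep -qF -- "$2:$3 $4 " "$fstab" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$fstab"
}
```

Try it against a copy of your fstab first, for example add_nfs_mount /tmp/fstab.test x.x.x.x /directory /mount_directory, and only point it at /etc/fstab once the result looks right.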

Add the nfsmount daemon to the default runlevel: rc-update add nfsmount default

Security Implications

If you specify ‘no_root_squash’ in the server’s /etc/exports file, anyone who gains root permissions on the client automatically has root permissions on the server within that exported directory – good for sharing Portage directories, bad if nasty people want to compile and/or run evil software on your boxes.

IP addresses are not always static, so when using numeric addresses (as opposed to DHCP names), anyone who gains that IP has access to what you’ve exported. Keep this in mind with confidential information.
Note: The paragraph below no longer seems to be true. The latest versions of NFS support the sec=krb5 export option, which authenticates via Kerberos 5 instead of UIDs and GIDs.

NFS also uses numeric user and group IDs, so, even if you keep the passwd files identical on your systems somehow, someone else with the right IP address can create a user on their own system that can access anybody’s files. Unless you are certain it is impossible to forge an IP address on your network, you cannot depend on the normal user/group access control. For this reason, NFS is not recommended for sharing user-private data (home directories, for example).

Setting Up Firewall (Server Side)

Setting up a firewall to cover NFS ports is quite tricky because there are ports that are assigned randomly as the NFS daemon is restarted. To see what ports you need to open, type in:

# rpcinfo -p

Try restarting the NFS daemon:

# /etc/init.d/nfs restart

Then type in rpcinfo -p again. You’ll see that some ports have changed. You will probably notice that some of these ports are static: port 111 (tcp and udp) is for portmap, and port 2049 (tcp and udp) is for nfs. The rest, which are equally important, are random. To fix this, edit the /etc/conf.d/nfs file so that it looks something like this:

# Number of servers to be started up by default
RPCNFSDCOUNT=8
# Options to pass to rpc.mountd
# ex. RPCMOUNTDOPTS="-p 32767"
RPCMOUNTDOPTS="-p 32767"
# Options to pass to rpc.statd
# ex. RPCSTATDOPTS="-p 32765 -o 32766"
RPCSTATDOPTS="-p 32765 -o 32766"
# OPTIONS to pass to rpc.rquotad
# ex. RPCRQUOTADOPTS="-p 32764"
RPCRQUOTADOPTS="-p 32764"

Edit: the configuration above has not worked for me. Instead I used:

# Number of servers to be started up by default
RPCNFSDCOUNT=8
# Options to pass to rpc.mountd
# ex. RPCMOUNTDOPTS="-p 32767"
RPCMOUNTDOPTS="-p 4002"
# Options to pass to rpc.statd
# ex. RPCSTATDOPTS="-p 32765 -o 32766"
RPCSTATDOPTS="-p 4000"

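This way, you’ll fix the status, mountd, and quotad ports to 32764-32767 (or to 4000 and 4002 if you used the alternative values, in which case adjust the firewall rules further down to match). The only task left is to fix the lock manager ports (nlockmgr).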

Fixing the nlockmgr ports depends on the version of your kernel and whether or not you build NFS into the kernel or as a module.

Deduce whether or not you have NFS built in to the kernel (Y), as a module (M) or not at all (N):

zgrep CONFIG_NFSD /proc/config.gz

If Y:

mount /boot -o remount,rw

If GRUB:

edit the file /boot/grub/grub.conf in your favorite editor.

If LILO:

edit the file /etc/lilo.conf in your favorite editor, and then run

lilo

Within the editor, append one of these lines to your kernel options, depending on your kernel version:

lockd.nlm_udpport=4001 lockd.nlm_tcpport=4001 # for 2.6.x kernels
lockd.udpport=4001 lockd.tcpport=4001 # for 2.4.x kernels

And reboot your machine.

If M: Open /etc/modules.d/nfsd in your favorite editor. Append this line:

options lockd nlm_udpport=4001 nlm_tcpport=4001

Run

modules-update

That way, you fix the nlockmgr ports to 4001 tcp/udp.

Warning for genkernel Users: If you have compiled nfs as module the above won’t work, because genkernel does not put the module options into the initrd. Unfortunately the initrd loads the nfs module. You have two options:

* Compile nfs statically and use the in-kernel method (easiest)
* Remove nfs from the MODULES_FS line of the file /usr/share/genkernel/{arch}/modules_load before starting genkernel

Adding the Firewall Rules

It’s probably best that you reboot your computer to ensure that all of the appropriate daemons and modules are reloaded, then double check that the ports in use are what you expect by running rpcinfo -p. If that’s all set, then add the firewall rules.
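To make that double check less error-prone, the sketch below condenses rpcinfo -p output into unique "service protocol port" lines. The function name rpc_ports is hypothetical, and it assumes the usual five-column layout (program, version, protocol, port, service); lines that do not start with a program number, such as the header, are skipped.

```shell
# rpc_ports: reads `rpcinfo -p` output on stdin and prints each unique
# "service proto port" combination once. (Hypothetical helper.)
rpc_ports() {
    # Keep only data rows: first column numeric, at least five columns
    awk '$1 ~ /^[0-9]+$/ && NF >= 5 { print $5, $3, $4 }' | LC_ALL=C sort -u
}
```

Run it as rpcinfo -p | rpc_ports on the server and compare the result with the ports you configured, for example that nlockmgr appears on port 4001 for both tcp and udp.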

1. Save your current firewall rules: iptables-save > /etc/iptables.bak
2. Open /etc/iptables.bak in your favorite text editor
3. Add the following rule(s) in appropriate order (according to your existing rules).
Firewall Rule: nfs

-A INPUT -p tcp -m state --state NEW -m tcp --dport 111 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m udp --dport 111 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 2049 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m udp --dport 2049 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 4001 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m udp --dport 4001 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 32764:32767 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m udp --dport 32764:32767 -j ACCEPT

4. Restore the rules so they become part of your current configuration: iptables-restore < /etc/iptables.bak

For a shorter version of the first few ports (if you want your iptables list to look smaller), you can use -m multiport instead, as follows:

-A INPUT -p tcp -m state --state NEW -m multiport --dports 111,2049,4001,32764:32767 -j ACCEPT
-A INPUT -p udp -m state --state NEW -m multiport --dports 111,2049,4001,32764:32767 -j ACCEPT
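If your fixed ports differ from the ones used here, a small generator can help keep the tcp and udp rules in sync. This is only a sketch: the function name nfs_rules is hypothetical, it prints rules in the iptables-save format for pasting into the backup file rather than applying anything, and it uses --dports, the option name documented for the multiport match.

```shell
# nfs_rules PORTLIST
# Prints one multiport ACCEPT rule for tcp and one for udp.
# PORTLIST is a comma-separated list; ranges are written as first:last.
# (Hypothetical helper; it only prints text and changes no firewall state.)
nfs_rules() {
    ports=$1
    for proto in tcp udp; do
        printf '%s\n' "-A INPUT -p $proto -m state --state NEW -m multiport --dports $ports -j ACCEPT"
    done
}
```

For example, nfs_rules 111,2049,4001,32764:32767 prints the pair of rules for the port set used in this article.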

Setting Up Firewall (Client Side)

Setting up a firewall on the client side is much, much simpler. The only relevant port is 111 tcp/udp. This is the port for portmap, the only service the client requires.

Note:

root_squash: Prevents root users connected remotely from having root privileges and assigns them the user ID for the user nfsnobody. This effectively “squashes” the power of the remote root user to the lowest local user, preventing unauthorized alteration of files on the remote server. Alternatively, the no_root_squash option turns off root squashing. To squash every remote user, including root, use the all_squash option. To specify the user and group IDs to use with remote users from a particular host, use the anonuid and anongid options, respectively. In this case, a special user account can be created for remote NFS users to share, specified as (anonuid=UID,anongid=GID), where UID is the user ID number and GID is the group ID number.

Posted by ZACH - October 2, 2011 at 10:00 pm

Categories: Linux Administration

New data storage solution from DELL : Data Center Bridging (DCB)

What is Data Center Bridging ?

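DCB (Data Center Bridging) is the new technology standard for data center networking, specifically for Ethernet. It supports traffic at 10 Gb and above.
Converged or trunking switchport mode is essential. DCB builds on robust IEEE Ethernet standards such as PFC (Priority-based Flow Control), ETS (Enhanced Transmission Selection), CN (Congestion Notification), and DCBX (Data Center Bridging Exchange).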

DCB Advantages

  • Supports mixed speeds ranging from 10 GbE to 40 GbE/100 GbE
  • Network and application traffic share a single Ethernet infrastructure (10 Gb speed)
  • The PFC standard helps minimise Ethernet TCP retransmit issues

Other benefits of DCB are

Effective utilization of bandwidth

Through the use of ETS, networked devices such as hosts or storage targets can be guaranteed a minimum percentage of bandwidth, while at the same time the ability to access the full bandwidth when it is not in use by other applications. PFC manages the multiple flows of network data to ensure frames are not dropped for lossless priorities.

Reduced power and cooling

The airflow of each rack is also immediately impacted through the use of fewer cables. One or two 10 Gb fiber connections per server take up significantly less volume than four or more 1 Gb copper cables per server. With more cables multiplied across the servers in a rack, airflow becomes severely inhibited.

Reduced cost

As the cost of 10 Gb Converged Network Adapters (CNAs) continues to fall, the economic benefits of converging multiple traffic flows onto 10 Gb will continue to grow. With the same ideas from the areas above, it will be far more cost-effective to purchase fewer ports of 10 Gb DCB Ethernet than to purchase many ports of 1 Gb non-DCB Ethernet.

Posted by ZACH - September 22, 2011 at 8:09 pm

Categories: Storage Area Network

Dell Equallogic FS7500 NAS ISCSI PS Series Unified Solution

Equallogic FS7500 ISCSI NAS


Dell introduced a new high-performance unified storage solution called the FS7500. The FS7500 works with all existing Equallogic PS Series arrays (PS6010/6510, PS6000 and PS4000) and needs the latest 5.1 firmware. The main feature the FS7500 adds is Network Attached Storage (NAS). NAS services are provided by the SMB/CIFS and NFS protocols. The Equallogic FS7500 supports RAID 5, RAID 6, RAID 10 and RAID 50.

FS7500 Advantages

  • Scale a single file share to 500 TB
  • NFS support (NFS v3)
  • CIFS support (SMB 1.0)
  • Option to add more FS7500s to the group
  • Each share can support both CIFS and NFS access
  • The FS7500 has two redundant active/active controllers
  • The NAS system can be managed from the same Equallogic Group Manager
  • The FS7500 has 4 x 1 GbE ports, each with a maximum line speed of approximately 120 MB/sec

You can easily create CIFS share name and share directory / NFS export name and directory.

To access a CIFS share from a Windows system, follow these steps:

1. Click Start > Run.
2. Specify the NAS Service IP address in the Open field and click OK.
3. Right-click the share and select Map Network Drive.
4. In the Map Network Drive dialog box:

• Enter \\service_ip_address\share_name.
• Click Connect using a different user name.

5. In the Connect As dialog box, enter a valid user name and password, then click OK. Note that
you can enter CIFSstorage\administrator for a user name and the CIFS password that you set
previously.

The user can now log in to the CIFS share and perform read and write operations. The default permission is to disallow guest access. You can modify the share to allow guest access.

The FS7500 has three sets of network interfaces: internal network (4 ports), SAN (4 ports), and client (4 ports).

Network connection requirements and recommendations for each FS7500 Controller are below:

• A switched 1GE network is recommended.
• You need either 5 times as many network cables as you have controllers (minimum network
configuration) or 13 times as many network cables as you have controllers. (recommended
network configuration) That is, you need 5 or 13 cables for each FS7500 Controller.
• Connect the IPMI port to the internal network.
• Connect the two internal network ports on each network interface card (NIC) to different
switches.
• Connect the two SAN network ports on each NIC to different switches.
• Connect two client network ports on the bottom NIC to one switch, and connect the other two
client network ports to a different switch.
• The following recommendations apply to the SAN network:
  • Flow Control should be enabled on switches and network interfaces.
  • Unicast storm control should be disabled on switches.
  • Jumbo Frames should be enabled.
  • VLANs may be used, but are not required.

Posted by ZACH - September 21, 2011 at 10:37 pm

Categories: Storage Area Network

Open iSCSI configuration for Linux : Equallogic Array

Below is the recommended Open-iSCSI installation and configuration for an Equallogic storage array.

Installation Instructions for Linux Iscsi

The initiator consists of kernel modules that come with the appropriate Red Hat Enterprise Linux
installation. To use and manage the initiator, you need to install the iSCSI utilities.
For RHEL v5.x you can install the iSCSI initiator through Add/Remove Programs or at the command line.
#up2date iscsi-initiator-utils
Newer versions of RHEL v5 use the ‘yum’ utility instead:
#yum install iscsi-initiator-utils
Once installed, run:
# service iscsi start
To verify that the iSCSI service will be started at boot time, use the chkconfig command as follows:
# chkconfig --list iscsi
iscsi 0:off 1:off 2:off 3:off 4:off 5:off 6:off
By default, the newly added iSCSI initiator is not enabled at boot, which is why every runlevel
listed shows the service as off. To enable it at boot, use the chkconfig command again:
# chkconfig --add iscsi
# chkconfig iscsi on
The first command checks that the scripts needed to start and stop the service are present,
and the second enables the service for the appropriate runlevels.
Then check that the changes took effect:
# chkconfig --list iscsi
iscsi 0:off 1:off 2:on 3:on 4:on 5:on 6:off
You also need to do the same for the multipath daemon.
# chkconfig --list multipathd
multipathd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Discovering Targets:

Once you have the iSCSI service running you will use the ‘iscsiadm’ userspace utility to discover, login and logout of targets.

To get a list of available targets type:

#iscsiadm -m discovery -t st -p <ip_address>:3260

Example:

# iscsiadm -m discovery -t st -p 172.23.10.240:3260
172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test

The example shows that the ‘rhel5-test’ volume has been found.
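When a group exposes many volumes, it can help to reduce the discovery output to just the target names. The sketch below assumes each output line has the "portal,tag target-name" shape shown in the example; the function name discovered_targets is hypothetical.

```shell
# discovered_targets: reads `iscsiadm -m discovery` output on stdin and
# prints each target name (IQN) once. (Hypothetical helper.)
discovered_targets() {
    # Field 1 is "ip:port,tag", field 2 is the IQN
    awk 'NF >= 2 { print $2 }' | LC_ALL=C sort -u
}
```

You would pipe the discovery command from the example above into discovered_targets. Duplicate lines, which appear once per interface when several iSCSI interfaces are defined, collapse to a single entry.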
Logging in:

Here are two ways to connect to iSCSI targets:

• Log into all targets.

#iscsiadm -m node -l

• Log into an individual target.

#iscsiadm -m node -T <target_iqn> -l -p <ip_address>:3260

Example:
#iscsiadm -m node -l -T iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test -p 172.23.10.240:3260

This is useful when using with array snapshots.
Logging out:

• Logging off an individual target.
#iscsiadm -m node -u -T iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test -p <ip_address>:3260
Logging in and out of individual targets is especially useful when using array snapshots.
• Logging out all targets.
#iscsiadm -m node -u

Checking Session Status:
To see the connection status run:
#iscsiadm -m session
tcp: [3] 172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
To see the session status and what SCSI devices are being used run:
#iscsiadm -m session -P3 | less
Sample output:
iscsiadm version 2.0-742
************************************
Session (sid 1) using module tcp:
************************************
TargetName: iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
Portal Group Tag: 1
Network Portal: 172.23.10.240:3260
iSCSI Connection State: LOGGED IN
Internal iscsid Session State: NO CHANGE
************************
Negotiated iSCSI params:
************************
HeaderDigest: None
DataDigest: None
MaxRecvDataSegmentLength: 65536
MaxXmitDataSegmentLength: 65536
FirstBurstLength: 65536
MaxBurstLength: 262144
ImmediateData: Yes
InitialR2T: No
MaxOutstandingR2T: 1
************************
Attached SCSI devices:
************************
Host Number: 2 State: running
scsi2 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb State: running

Mapping EQL volume name to /dev/sd(X) device name
To see what SCSI devices are being used run:
#iscsiadm -m session -P3 | less
Sample output:
iscsiadm version 2.0-742
************************************
Session (sid 1) using module tcp:
************************************
TargetName: iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
Portal Group Tag: 1
Network Portal: 172.23.10.240:3260
iSCSI Connection State: LOGGED IN
.
.
.
.
************************
Attached SCSI devices:
************************
Host Number: 2 State: running
scsi2 Channel 00 Id 0 Lun: 0
Attached scsi disk sdb State: running
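
Reading the mapping out of that report by eye gets tedious with many volumes. The following is a sketch that pairs each target name with its attached disk, assuming the report layout shown in the sample output ("TargetName:" lines followed later by "Attached scsi disk" lines). The function name session_devices is hypothetical, and the "Target:" alternative is included only on the assumption that other open-iscsi versions label the line that way.

```shell
# session_devices: reads `iscsiadm -m session -P3` output on stdin and
# prints "target-name /dev/sdX" for every attached disk.
# (Hypothetical helper; depends on the report layout shown above.)
session_devices() {
    awk '
        # Remember the most recent target name seen
        $1 == "TargetName:" || $1 == "Target:" { target = $2 }
        # "Attached scsi disk sdb State: running" -> field 4 is the disk
        /Attached scsi disk/ { print target, "/dev/" $4 }
    '
}
```

Piping the iscsiadm -m session -P3 report into session_devices would, for the sample output above, pair the rhel5-test target with /dev/sdb.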

Using CHAP authentication:
To enable CHAP for ALL targets, edit the /etc/iscsi/iscsid.conf file. Find the CHAP settings section,
uncomment the following lines, and add the CHAP usernames and passwords:
# *************
# CHAP Settings
# *************
# To enable CHAP authentication set node.session.auth.authmethod
# to CHAP. The default is None.
node.session.auth.authmethod = CHAP
# To set a CHAP username and password for initiator
# authentication by the target(s), uncomment the following lines:
node.session.auth.username = username
node.session.auth.password = password
# To enable CHAP authentication for a discovery session to the target
# set discovery.sendtargets.auth.authmethod to CHAP. The default is None.
discovery.sendtargets.auth.authmethod = CHAP
# To set a discovery session CHAP username and password for the initiator
# authentication by the target(s), uncomment the following lines:
discovery.sendtargets.auth.username = username
discovery.sendtargets.auth.password = password
To enable CHAP for a particular target use the iscsiadm command to update the settings for that target.
#iscsiadm -m node -T "<target_iqn>" -p <ip_address>:3260 --op=update --name node.session.auth.authmethod --value=CHAP
#iscsiadm -m node -T "<target_iqn>" -p <ip_address>:3260 --op=update --name node.session.auth.username --value=<chap_username>
#iscsiadm -m node -T "<target_iqn>" -p <ip_address>:3260 --op=update --name node.session.auth.password --value=<chap_password>
Then log in to the target:
#iscsiadm -m node -T "<target_iqn>" -p <ip_address>:3260 -l
For example:
#iscsiadm -m node -T iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test -p 172.23.10.240:3260 -l

Configuring Multipath Connections for Linux with an Equallogic Array

To create the multiple logins needed for Linux dev-mapper to work you need to create an ‘interface’ file for
each GbE interface you wish to use to connect to the array.
Use the following commands to create the interface files for MPIO.
(Select the appropriate Ethernet interfaces you’re using.)
#iscsiadm -m iface -I eth0 -o new
New interface eth0 added
Repeat for the other interface, i.e. eth1
#iscsiadm -m iface -I eth1 -o new
New interface eth1 added
Now update the interface name for each port:
#iscsiadm -m iface -I eth0 --op=update -n iface.net_ifacename -v eth0
eth0 updated
#iscsiadm -m iface -I eth1 --op=update -n iface.net_ifacename -v eth1
eth1 updated
Here’s an example of what the /var/lib/iscsi/ifaces/eth0 looks like:
iface.iscsi_ifacename = eth0
iface.net_ifacename = eth0
iface.hwaddress = default
iface.transport_name = tcp
If you have already discovered your volumes, you now need to re-discover the target(s).
#iscsiadm -m discovery -t st -p <ip_address>:3260
172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
You should see the volume info for each interface you specified. In this example, two interfaces were defined
so we see the volume listed twice. You now need to log into the volume.

#iscsiadm -m node -l -T iqn.2001-05.com.equallogic:0-8a0906-8951f2302-815273634274741f-rhel5-test -p 172.23.10.240:3260
#iscsiadm -m session
tcp: [3] 172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
tcp: [4] 172.23.10.240:3260,1 iqn.2001-05.com.equallogic:0-8a0906-83bcb3401-16e0002fd0a46f3d-rhel5-test
This shows that both adapters have connected to the array.
Verify that the multipathing is correctly configured.
#multipath -v2
#multipath -ll
rhel5-test (36090a02830f251891f74744263735281) dm-1 EQLOGIC,100E-00
[size=100G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 9:0:0:0 sdd 8:48 [active][ready]
\_ 8:0:0:0 sde 8:64 [active][ready]
svr-vol (36090a01840b31c74e173a4873200a02f) dm-0 EQLOGIC,100E-00
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 6:0:0:0 sdb 8:16 [active][ready]
\_ 7:0:0:0 sdc 8:32 [active][ready]
In this example you see that there are two paths to each volume. They are set one on top of the other. If they
are separated, then MPIO is not working correctly. Try restarting the multipath service and check again. If not,
review your multipath configuration file for any errors.
#service multipathd restart
What you do NOT want to see in the multipath -ll output is
[features=1 queue_if_no_path]; it should be [features=0] as shown above.
Otherwise you will likely run into the following issue: when a link
fails, ALL I/O will be paused.
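A quick way to check for this, sketched under the assumption that the feature appears literally as queue_if_no_path in the multipath -ll output (the function name no_queueing is hypothetical):

```shell
# no_queueing: reads `multipath -ll` output on stdin.
# Succeeds (exit status 0) only if no map reports queue_if_no_path.
# (Hypothetical helper for illustration.)
no_queueing() {
    ! grep -q 'queue_if_no_path'
}
```

For example, multipath -ll | no_queueing || echo "queue_if_no_path is set" prints the warning only when at least one map queues I/O.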

Persistent Device naming:
Devices using the software initiators do not have a persistent naming scheme, and do not guarantee that a
device (i.e. /dev/sdc) will always have the same device node. Adding or removing a disk can change the device
order on the next boot up. Persistent Naming describes mechanisms where the system identifies devices
without relying on the /dev node, and provides a reference point for it that does not change at reboot.
First you have to comment out the ‘Blacklist all devices’ section in /etc/multipath.conf file
Note:
If the example file, multipath.conf, is not in /etc, copy it from /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.synthetic
#cp /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.synthetic /etc/multipath.conf
# By default all devices are blacklisted. Modify this to enable multipathing
# on the default devices. Most will want to exclude /dev/sda and /dev/sdb;
# exceptions would include those booting from SAN and wanting MPIO support.
Note: Please review the blacklist settings to make sure they’re applicable. Some
systems may require that some devices, including boot disks, remain
blacklisted. Your OS vendor may be able to provide more specific guidance
So at minimum it should look like this:
blacklist {
devnode "^sd[a]$"
}
(This excludes the first SCSI/SATA disk.)
Or
blacklist {
devnode "^sd[ab]$"
}
(This excludes the first two disks if you are mirroring your boot drives.)
To cover other objects typically not needed by MPIO, add the following:
blacklist {
# SAMPLE wwid for a local SATA HD
wwid SATA_WDC_WD2500YS-18_WD-WCANY4730307
devnode "^sd[a]$"
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z][[0-9]*]"
devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
}
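Before restarting the daemon, it can be useful to sanity-check which device names such patterns would catch. The sketch below tests a name against two of the devnode patterns using grep -E. It assumes multipath interprets these patterns the same way as POSIX extended regular expressions, which is an assumption rather than something this article confirms, and the function name is_blacklisted is hypothetical.

```shell
# is_blacklisted NAME
# Succeeds if NAME matches the "^sd[a]$" pattern or the pattern covering
# ram, raw, loop, fd, md, dm-, sr, scd and st devices.
# (Hypothetical helper; assumes extended-regex semantics.)
is_blacklisted() {
    printf '%s\n' "$1" |
        grep -Eq '^sd[a]$|^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*'
}
```

With these two patterns, sda, loop0 and dm-1 count as blacklisted, while sdb and sdc do not and therefore remain candidates for multipathing.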

Then restart the multipathd daemon
#service multipathd restart
Now check that dev-mapper has configured the volume.
#multipath -v2
#multipath -ll
mpath0 (36090a01840b3bc833d6fa4d02f00e016) dm-2 EQLOGIC,100E-00
[size=8.0G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
\_ 2:0:0:0 sdb 8:16 [active][ready]
The number in parentheses (36090a01840b3bc833d6fa4d02f00e016 here) is the UUID of the volume. That never changes. You can use that UUID to create a
persistent, friendlier name. For example, you can give it the same name as the volume on the EQL array.
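To collect those UUIDs without copying them by hand, something like the following sketch could be used. It assumes the map line has the "name (uuid) dm-N vendor,product" layout shown above; the function name multipath_wwids is hypothetical.

```shell
# multipath_wwids: reads `multipath -ll` output on stdin and prints
# "map-name uuid" for every map line. (Hypothetical helper.)
multipath_wwids() {
    awk '$2 ~ /^\([0-9a-f]+\)$/ {
        gsub(/[()]/, "", $2)   # strip the surrounding parentheses
        print $1, $2
    }'
}
```

Running multipath -ll | multipath_wwids against the output above would print mpath0 followed by its UUID, ready to paste after a wwid keyword in /etc/multipath.conf.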

Setting the default values for all Equallogic devices:
This will configure the default parameters for all EQL devices. You can change the parameters afterwards on a
volume-by-volume basis in the ‘multipaths’ section. I.e. you could have a different rr_min_io setting for
SQL volumes vs. NFS share volumes. Either method works fine. This just makes it easier as you add new
volumes, they will automatically get these default settings. You can still set them all individually, if you wish.
Find the “devices” section near the end of the file.
devices {
device {
vendor "EQLOGIC"
product "100E-00"
path_grouping_policy multibus
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
path_checker readsector0
failback immediate
path_selector "round-robin 0"
# See tuning section for more on the rr_min_io setting
rr_min_io 10
rr_weight priorities
}
}
Then restart the multipathd daemon
#service multipathd restart
Now check that dev-mapper has configured the volume.
#multipath -v2
#multipath -ll
mpath0 (36090a01840b3bc833d6fa4d02f00e016) dm-2 EQLOGIC,100E-00
[size=8.0G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
\_ 2:0:0:0 sdb 8:16 [active][ready]

The number in parentheses is the UUID of the volume. That never changes. You can use that UUID to create a
persistent, friendlier name. For example, you can give it the same name as the volume on the EQL array.
Edit the /etc/multipath.conf file again, uncomment the following section, and change the defaults to match your UUID and set a friendly alias name:
#multipaths {
# multipath {
# wwid 3600508b4000156d700012000000b0000
# alias yellow
# path_grouping_policy multibus
# path_checker readsector0
# path_selector "round-robin 0"
# failback immediate
# rr_weight priorities
# no_path_retry fail
# rr_min_io 100
# }
# multipath {
# wwid 1DEC_____321816758474
# alias red
# }
#}

Change the number after ‘wwid’ to the UUID for your volume. Change the ‘alias’ to something more friendly
or use the volume name from the array. Change the ‘rr_min_io’ to 10. ** See performance tuning section for
more info on setting this value.
Here’s an example showing how to do more than one volume.
multipaths {
multipath {
wwid 36090a02830f251891f74744263735281
alias rhel5-test
path_grouping_policy multibus
path_checker readsector0
path_selector "round-robin 0"
failback immediate
rr_weight priorities
no_path_retry fail
rr_min_io 10
}
multipath {
wwid 36090a01840b31c74e173a4873200a02f
alias svr-vol
}
}
Using the ‘multipaths’ section you can override the defaults. The most common setting to change on a per-volume
basis is ‘rr_min_io’.
Note: rr_min_io sets how many IOs go down a path before switching to another path. A lower number tends
to work better in SQL environments. Larger numbers (e.g. 200 to 512) work better for more sequential loads.
After you change the minimum IO setting, rerun #multipath -v2 to make it effective.
See the performance tuning section for more info on setting this parameter.
Save the file, then run:

#multipath -v2
#multipath -ll
rhel5-test (36090a02830f251891f74744263735281) dm-1 EQLOGIC,100E-00
[size=100G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][active]
\_ 9:0:0:0 sdc 8:48 [active][ready]
svr-vol (36090a01840b31c74e173a4873200a02f) dm-0 EQLOGIC,100E-00
[size=10G][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
\_ 6:0:0:0 sdb 8:16 [active][ready]
#ls -l /dev/mapper
total 0
crw-rw---- 1 root root 10, 63 2007-11-16 17:15 control
brw-rw---- 1 root disk 254, 1 2007-11-19 15:59 rhel5-test
brw-rw---- 1 root disk 254, 0 2007-11-19 15:58 svr-vol
You should now have a persistent name to access that volume. /dev/mapper/rhel5-test
Note: It’s not required, but if you partition the device, devmapper will create a -part1 (or p1) MPIO device for the partition.
You will have to use that device or you will get a “device busy” error.
E.g. /dev/mapper/rhel5-test-part1 or /dev/mapper/rhel5-testp1 represents the first partition
on that volume. Use that device for all filesystem creation and mount commands.
A second partition would be rhel5-test-part2 or rhel5-testp2.
Now create a filesystem on that device. In this example we’re using the EXT3 filesystem. You are free to use any supported filesystem.
Example:
#mke2fs -j -v /dev/mapper/rhel5-test
This creates an EXT3 filesystem on that device.

Mounting iSCSI Filesystems at Boot:
In order to mount a filesystem that exists on an iSCSI Volume connected through the Linux iSCSI Software
initiator, you need to add a line to the /etc/fstab file. The format of this line is the same as any other
device and filesystem with the exception being that you need to specify the _netdev mount option, and you
want to have the last two numbers set to 0 (first is a dump parameter and the second is the fsck pass).
The _netdev option delays the mounting of the filesystem on the device listed until after the network has
been started and also ensures that the filesystem is unmounted before stopping the network subsystem at
shutdown.
An example of an /etc/fstab line for a filesystem to be mounted at boot that exists on an iSCSI Volume is
as follows:
#cat /etc/fstab
LABEL=/1 / ext3 defaults 1 1
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
LABEL=SWAP-sda5 swap swap defaults 0 0
#
## Equallogic iSCSI volumes
#
/dev/mapper/rhel5-test /mnt/rhel5-test ext3 _netdev,defaults 0 0
Once you’ve added the /etc/fstab entry you can manually mount the volume with:
#mount /mnt/rhel5-test
Or if you just want to manually mount the volume, e.g. for a snapshot, use:
#mount <device> <mount point>
*Note: The mount point must already exist.
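The fstab format described above can be sketched as a small helper that prints the line for you. The function name iscsi_fstab_line is made up for this example, and ext3 is only the default because that is what this walkthrough uses.

```shell
# Print an /etc/fstab line for a filesystem on an iSCSI volume.
# Arguments: device, mount point, optional filesystem type (default ext3).
# _netdev delays the mount until the network is up; the trailing "0 0"
# are the dump flag and the fsck pass, both disabled as the text suggests.
# iscsi_fstab_line is a hypothetical helper name.
iscsi_fstab_line() {
    device="$1"
    mount_point="$2"
    fstype="${3:-ext3}"
    printf '%s %s %s _netdev,defaults 0 0\n' "$device" "$mount_point" "$fstype"
}
```

Run it once to check the output, create the mount point with mkdir -p, and only then append the line to /etc/fstab.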

Flowcontrol settings on Equallogic Array

Be sure that the network interfaces are configured to use Flow Control and Jumbo Frames if supported. To do
this, use the ethtool utility on Red Hat.
To check for Flow Control (RX/TX Pause) on your interface, use:
# ethtool -a <interface>
Example:
# ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: on
RX: off
TX: off
To set Flow Control to on with ethtool use:
#ethtool -A <interface> autoneg off [rx|tx] on
Example:
# ethtool -A eth0 autoneg off rx on tx on
# ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: on
TX: on
A setting made with ethtool does not persist across reboots. See the NIC manufacturer’s documentation for steps to
configure Flow Control persistently, or add the ethtool command to the end of the
/etc/rc.d/rc.local file to set Flow Control for rx and tx to on. The majority of GbE NICs will detect and
enable flow control properly.
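If you go the rc.local route, you may only want to force the setting when the NIC did not enable it by itself. A rough sketch, assuming the ethtool -a output looks like the examples above; the function name pause_enabled is made up for this example.

```shell
# Read the output of "ethtool -a <interface>" on stdin.
# Exit status 0 if both RX and TX pause are "on", 1 otherwise.
# pause_enabled is a hypothetical helper name.
pause_enabled() {
    awk '
        $1 == "RX:" { rx = $2 }
        $1 == "TX:" { tx = $2 }
        END { exit !(rx == "on" && tx == "on") }
    '
}

# Possible use in /etc/rc.d/rc.local (eth0 is just the example interface):
# ethtool -a eth0 | pause_enabled || ethtool -A eth0 autoneg off rx on tx on
```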

Enabling Jumbo Frames:
Note: Not all switches support both Jumbo Frames and Flowcontrol simultaneously. Please check with your
switch vendor first before enabling jumbo frames. We also suggest first running without jumbo frames to set
a baseline, and then see what if any improvement you get with jumbo frames enabled.
For Jumbo Frames, you can use the ifconfig utility to make the change on a running interface. Unfortunately,
this change will revert back to the default on a system reboot.
First, show the current setting for the interface in question using the ifconfig command using the interface
name as an argument:
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:0E:0C:70:C9:63
inet addr:172.19.51.160 Bcast:172.19.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
The setting we are interested in is the MTU value. As we can see from the above output, this is currently set to
1500 which is not a Jumbo packet size. To change this, we need to set the mtu again using the ifconfig
command as follows:
# ifconfig eth0 mtu 9000
# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:0E:0C:70:C9:63
inet addr:172.19.51.160 Bcast:172.19.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
To make this setting persistent, add the MTU="<value>" parameter to the end of the ifcfg
startup scripts for your SAN interfaces. These are found in the /etc/sysconfig/network-scripts
directory. The naming format of the cfg files for your interfaces is ifcfg-<interface>.
Here is a sample of one of these files after adding the MTU line:
# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
BROADCAST=172.19.255.255
IPADDR=172.19.51.160
NETMASK=255.255.0.0
NETWORK=172.19.0.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
PEERDNS=yes
GATEWAY=172.19.0.1
MTU="9000"

Once the above network changes have been made, you should reboot the host to verify that they have been made correctly and that the settings are persistent.
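One way to check that jumbo frames really work end to end (NIC, switch and array) is to ping with the "don't fragment" flag and the largest payload that fits in one frame. The arithmetic is MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header. This is only a sketch: the function name is made up, and the -M do option assumes the Linux iputils version of ping.

```shell
# Largest ICMP payload that fits in one unfragmented IPv4 packet
# for a given MTU: MTU - 20 (IPv4 header) - 8 (ICMP header).
# icmp_payload_for_mtu is a hypothetical helper name.
icmp_payload_for_mtu() {
    mtu="$1"
    echo $(( mtu - 20 - 8 ))
}

# Possible use (replace <array-ip> with an address on your SAN):
# ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" <array-ip>
```

If the ping fails at this size but works at the size for MTU 1500, something in the path is not passing jumbo frames.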

Performance tuning options:
• #/etc/iscsi/iscsid.conf
# cat /etc/iscsi/iscsid.conf | grep -v "#"
node.startup = automatic
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 20
node.session.initial_login_retry_max = 4
node.session.cmds_max = 1024 <--- Default is 128
node.session.queue_depth = 128 <--- Default is 32
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072 <--- try 64K-512K
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.session.iscsi.FastAbort = No <--- Default is "Yes"
Note: This is the default template, if you have already discovered targets you will need to either
update the individual configuration files (/var/lib/iscsi/nodes/xxx) or rediscover your targets.
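Instead of editing the node files by hand, the records can usually be updated with iscsiadm in update mode. The sketch below only prints the commands so you can review them first; the function name iscsi_update_cmds is made up, and it assumes that leaving out the target and portal applies the change to all discovered node records, so check that against your iscsiadm man page.

```shell
# Print (do not run) one "iscsiadm ... -o update" command per
# name/value pair given on the command line.
# iscsi_update_cmds is a hypothetical helper name.
iscsi_update_cmds() {
    while [ "$#" -ge 2 ]; do
        printf 'iscsiadm -m node -o update -n %s -v %s\n' "$1" "$2"
        shift 2
    done
}

# Example: review the output, then pipe it to sh as root if it looks right.
# iscsi_update_cmds node.session.cmds_max 1024 node.session.queue_depth 128
```

You still have to log out of and back into the targets before the sessions pick up the new values.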

• #/etc/multipath.conf
rr_min_io
In /etc/multipath.conf, the rr_min_io parameter sets how many IOs go down a path before
switching to another path. A lower number, 10-20, tends to work better in SQL environments. Larger
numbers (e.g. 100 to 512) work better for more sequential loads. Larger values, 200+, require that you
increase the max commands and queue depth parameters. See the “iscsid.conf” info above.
When you change the minimum IO setting you must either restart the multipath service or run
#multipath -v2. Finding the optimal setting for your environment will require some
experimentation.

• Linux Kernel Settings
# Increase network buffer sizes /* Default values */
net.core.rmem_max = 16777216 /* 131071 */
net.core.wmem_max = 16777216 /* 131071 */
net.ipv4.tcp_rmem = 8192 87380 16777216 /* 4096 87380 4194304 */
net.ipv4.tcp_wmem = 4096 65536 16777216 /* 4096 16384 4194304 */
net.core.wmem_default = 262144 /* 129024 */
net.core.rmem_default = 262144 /* 129024 */
Edit /etc/sysctl.conf, then update the system using #sysctl -p
# Increase network buffer sizes
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.wmem_default = 262144
net.core.rmem_default = 262144
You can copy and paste the above into the /etc/sysctl.conf file
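Before and after running sysctl -p it helps to see what the kernel is actually using. Each sysctl key maps to a file under /proc/sys with the dots turned into slashes, so a quick check can be sketched like this (the function names are made up for this example):

```shell
# Map a sysctl key such as net.core.rmem_max to its /proc/sys path.
# sysctl_path and show_current are hypothetical helper names.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

# Print "key = value" for every key that exists on this kernel.
show_current() {
    for key in "$@"; do
        path=$(sysctl_path "$key")
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(tr '\t' ' ' < "$path")"
        fi
    done
}

# Example:
# show_current net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem
```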

Linux Read Ahead Value

By default Linux requests the next 256 sectors when doing a read. In a very sequential environment,
increasing this value can improve read performance.
You can set the read-ahead on an sd device by using the “blockdev” command. This tells the SCSI layer
to read X sectors ahead. This is only valuable with sequential I/O-type applications, and can cause
performance problems with high random I/O. Under sequential I/O, the performance gain was
observed to be in the 10%-20% range.
Syntax: blockdev --setra X <device>
e.g.
#blockdev --setra 4096 /dev/sda
or
#blockdev --setra 4096 /dev/mapper/mpath1
(Note: 4096 is just an example value. You will have to do testing to determine the optimal value for
your system.) The OS will read ahead X sectors, which should raise throughput for sequential reads.
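To make the blockdev change effective every time you boot, add a line for each device to
/etc/rc.d/rc.local:
/sbin/blockdev --setra 4096 <device>
/sbin/blockdev --setra 4096 <device>
etc…
The same read-ahead is also exposed through sysfs, in kilobytes rather than sectors:
/sys/bus/scsi/drivers/sd/[DEVICEID]/block/queue/read_ahead_kb
Note that /etc/sysctl.conf only covers settings under /proc/sys, so a sysfs value like this one has to be written at boot from a script such as /etc/rc.d/rc.local instead.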
Note: You can also use the persistent device names vs. the /dev/sd? device names.

To check the existing read ahead setting use:
#blockdev --getra <device>
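The two interfaces use different units, which is easy to trip over: blockdev counts 512-byte sectors, while read_ahead_kb counts kilobytes. A small conversion sketch (the function names are made up for this example):

```shell
# Convert a blockdev read-ahead value (512-byte sectors) to kilobytes,
# and back. ra_sectors_to_kb and ra_kb_to_sectors are hypothetical names.
ra_sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

ra_kb_to_sectors() {
    echo $(( $1 * 1024 / 512 ))
}
```

So the default of 256 sectors is 128 KB, and the example value of 4096 sectors is 2048 KB.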

• Changing IO sizes
#echo X > /sys/block/sdX/queue/max_sectors_kb
Recommended settings are from 64 to 512.
Note: For MPIO configurations, each disk in the MPIO device will have to be updated individually. This
setting is not persistent across reboots.
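Since each path has to be updated individually, a loop over the slaves of the dm device saves some typing. This is a sketch that assumes your kernel lists the underlying sd devices under /sys/block/dm-N/slaves; the function name is made up, and the third argument exists only so the function can be tried against a scratch directory instead of the real /sys.

```shell
# Write a max_sectors_kb value to every path (slave) of one dm device.
# Arguments: dm device name (e.g. dm-1), value in KB, optional sysfs root.
# set_slave_max_sectors_kb is a hypothetical helper name.
set_slave_max_sectors_kb() {
    dm="$1"
    kb="$2"
    sysroot="${3:-/sys}"
    for slave in "$sysroot"/block/"$dm"/slaves/*; do
        [ -e "$slave" ] || continue
        name=$(basename "$slave")
        echo "$kb" > "$sysroot/block/$name/queue/max_sectors_kb"
    done
}

# Example (as root; find the dm-N name in the "multipath -ll" output):
# set_slave_max_sectors_kb dm-1 256
```

Add the call to /etc/rc.d/rc.local if you want the value back after a reboot.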

Changing Linux I/O scheduler
Linux offers different kernel I/O schedulers. In Red Hat the default is “CFQ” (Completely Fair Queuing).
However, the Open-iSCSI group reports that sometimes using the “NOOP” scheduler works better in iSCSI
server environments.
This website provides information on selecting different schedulers.

http://www.redhat.com/magazine/008jun05/features/schedulers/

Here’s a small excerpt:
The Linux kernel, the core of the operating system, is responsible for controlling disk access by using kernel I/O
scheduling. The I/O schedulers provided in Red Hat Enterprise Linux 4, embedded in the 2.6 kernel, have
advanced the I/O capabilities of Linux significantly. With Red Hat Enterprise Linux 4, applications can now
optimize the kernel I/O at boot time, by selecting one of four different I/O schedulers to accommodate
different I/O usage patterns:
* Completely Fair Queuing: elevator=cfq (default)
* Deadline: elevator=deadline ** this is intended for Real Time applications **
* NOOP: elevator=noop
* Anticipatory: elevator=as ** this is intended for desktop environments **
You can change the scheduler on the fly to see whether the NOOP scheduler is better for your environment.
This setting is not persistent.
#echo noop > /sys/block/${DEVICE}/queue/scheduler
Change ‘noop’ to ‘cfq’ to return it to the default setting.
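To confirm which scheduler is active, read the same sysfs file back: it lists all available schedulers and puts the active one in square brackets. A small sketch to pull out just the active name (the function name is made up for this example):

```shell
# Read the contents of /sys/block/<device>/queue/scheduler on stdin
# and print only the active scheduler, i.e. the name in [brackets].
# active_scheduler is a hypothetical helper name.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example:
# active_scheduler < /sys/block/sdb/queue/scheduler
```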

Minimum RHEL version for Open-iSCSI for Equallogic
• Do not use the version that comes on the RHEL v5.0 (first GA release) install CDs. That version,
iscsi-initiator-utils-6.2.0.742-0.5.el5, does not work with our array: you can find
targets but not connect to them. You need version iscsi-initiator-utils-6.2.0.742-0.6.el5 or greater.
• RHEL v5.0 (First GA release) requires at least two iSCSI HBAs to do multipathing. The iSCSI initiator
code is not capable of doing MPIO with GbE NICs.
• You will need RHEL v5.2 or greater to take advantage of multipathing with GbE NICs. This requires
version iscsi-initiator-utils-6.2.0.868-0.7.el5 or greater.

Posted by ZACH - September 17, 2011 at 7:55 pm

Categories: Storage Area Network

Enable jumbo frames linux on Equallogic Array

Jumbo frames are ethernet packets with the MTU set to 9000 bytes (actually, any frame higher than 1500 is considered jumbo, but 9000 is the generally accepted size). Like flow control, this needs to be set at the NIC and agreed to by the switch. So your switch must be properly configured, or else packets could be lost.

To view your current setting for jumbo frames (MTU size), run “ifconfig ethX” and look for the “MTU:” field. The default is 1500; the standard size for jumbo frames is 9000. To make jumbo frames persistent across reboots, go to /etc/sysconfig/network (SuSE) or /etc/sysconfig/network-scripts (Red Hat), and edit the appropriate script.

For example, on Red Hat Linux, edit the file below and add “MTU=9000” (without quotes):

/etc/sysconfig/network-scripts/ifcfg-eth0

Run the command “ethtool -a ethX” to see what the flow settings currently are. To configure them as EqualLogic wants them, run “ethtool -A ethX rx on tx on autoneg off”. That will force receiving and transmitting flow control to be on, and prevent autonegotiation from accidentally turning them off again. Since “ethtool” operates directly on the Ethernet interface, the change takes effect immediately.

You will not see the changes on the iSCSI connection until you log out of and back into the iSCSI targets, or until the array load balances them. Rebooting is the easiest way to apply the MTU to all TCP sessions.

Posted by ZACH - at 11:20 am

Categories: Storage Area Network