
Using generic drives in a Netapp FAS250

We use a retired NetApp as a backup using SnapMirror. We don’t have support on the device anymore, and one of the disks failed. I took the drive out and discovered it was a SEAGATE ST3146807FC. I found this site that said it was possible to flash the firmware on a generic drive for use in a Network Appliance filer.

Since our FAS250 does not have a shelf of disks like the one vardomskiy uses in his example, I had to find a way to connect the SCA40 drive to a host controller.

I found an Emulex LightPulse 850 PCI card with a DB9 connector on eBay for $10. I found this adapter to convert from the SCA40 connector on the drive to a standard copper fibre channel connector (DB9). I then needed to purchase a NetApp shelf-to-shelf cable to connect the adapter to the LP850 card. The final part of this setup is a fibre channel copper terminator (I’m still not sure whether it’s a resistor or a straight loopback; I need to take it apart).

Now, my choice of the LP850 presented more than a few problems: the LP850 is not supported by the current lpfc driver. I had to download an older version of the driver here. This driver was made for the 2.2 and 2.4 series of kernels, so I had to install an ancient version of the operating system to get this card to work.


[root@fcal root]# tar xf lpfc-i386.tar
[root@fcal root]# cd SourceBuild/
[root@fcal SourceBuild]# make
Build Environment root: /lib/modules/2.4.20-8/build
cc -D__GENKSYMS__ -D__KERNEL__=1 -D__SMP__=1 -DMODULE -DMODVERSIONS -include /lib/modules/2.4.20-8/build/include/linux/modversions.h  -I./include -I/lib/modules/2.4.20-8/build/drivers/scsi -I/lib/modules/2.4.20-8/build/include/scsi -I/lib/modules/2.4.20-8/build/include -DLP6000 -D_LINUX -I./include -I/lib/modules/2.4.20-8/build/drivers/scsi -I/lib/modules/2.4.20-8/build/include/scsi -I/lib/modules/2.4.20-8/build/include  -E fcLINUXfcp.c > lpfc.ver1
In file included from fcLINUXfcp.c:164:
/lib/modules/2.4.20-8/build/include/linux/module.h:15:1: warning: "_set_ver" redefined
In file included from /lib/modules/2.4.20-8/build/include/linux/modversions.h:4,
                 from <command line>:1:
/lib/modules/2.4.20-8/build/include/linux/modsetver.h:9:1: warning: this is the location of the previous definition
cat lpfc.ver1 | /sbin/genksyms -k 2.2.5 > lpfc.ver
cc -Wall -O2 -fomit-frame-pointer -D__KERNEL__=1 -D__SMP__=1 -DMODULE -DMODVERSIONS -include /lib/modules/2.4.20-8/build/include/linux/modversions.h  -I./include -I/lib/modules/2.4.20-8/build/drivers/scsi -I/lib/modules/2.4.20-8/build/include/scsi -I/lib/modules/2.4.20-8/build/include -DLP6000 -D_LINUX -I./include -I/lib/modules/2.4.20-8/build/drivers/scsi -I/lib/modules/2.4.20-8/build/include/scsi -I/lib/modules/2.4.20-8/build/include  -c fcLINUXfcp.c
fcLINUXfcp.c: In function `lpfc_do_dpc':
fcLINUXfcp.c:1715: structure has no member named `sigmask_lock'
make: *** [build] Error 1

I’m not sure what is going on here, but there is indeed nothing called sigmask_lock in this kernel’s task_struct. The lines in fcLINUXfcp.c that trigger the error take a spinlock, with interrupts disabled, around the call to flush_signals. Since this machine is only going to be used for relabeling a drive, I didn’t think it was a big deal to comment them out…


--- SourceBuild/fcLINUXfcp.c.uphill  2009-05-14 18:01:31.000000000 -0400
+++ SourceBuild/fcLINUXfcp.c  2009-05-14 18:01:50.000000000 -0400
@@ -1712,9 +1712,9 @@
        if( signal_pending(current) ) {

          iflg = 0;
-         spin_lock_irqsave(&current->sigmask_lock, iflg);
+         //spin_lock_irqsave(&current->sigmask_lock, iflg);
          flush_signals(current);
-         spin_unlock_irqrestore(&current->sigmask_lock, iflg);
+         //spin_unlock_irqrestore(&current->sigmask_lock, iflg);

          /* Only allow our driver unload to kill the KP */
          if( ldp->dpc_notify != NULL )

After making this change (you can apply the above as a patch…), the code compiles cleanly.


[root@fcal SourceBuild]# make
...
cp lpfcdriver lpfcdriver.o
ld -r -o lpfcdd.2.4.20-8.o lpfcdriver.o fcLINUXfcp.o lpfc.conf.o
ld -r -o lpfndd.2.4.20-8.o fcLINUXlan.o

The driver created is lpfcdd.2.4.20-8.o. Inserting it resulted in an error that I didn’t bother rectifying (lazy, sorry; if anyone knows how to fix it, let me know).


[root@fcal SourceBuild]# insmod lpfcdd.2.4.20-8.o
lpfcdd.2.4.20-8.o: The module you are trying to load (lpfcdd.2.4.20-8.o) is compiled with a gcc
version 2 compiler, while the kernel you are running is compiled with
a gcc version 3 compiler. This is known to not work.

The “fix” I opted for was to force the loading of the module.


[root@fcal SourceBuild]# insmod -f lpfcdd.2.4.20-8.o
Warning: The module you are trying to load (lpfcdd.2.4.20-8.o) is compiled with a gcc
version 2 compiler, while the kernel you are running is compiled with
a gcc version 3 compiler. This is known to not work.
Warning: loading lpfcdd.2.4.20-8.o will taint the kernel: no license
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Warning: loading lpfcdd.2.4.20-8.o will taint the kernel: forced load
Module lpfcdd.2.4.20-8 loaded, with warnings
[root@fcal SourceBuild]# tail /var/log/messages
May 14 18:07:41 fcal kernel: Emulex LightPulse FC SCSI/IP 4.20p
May 14 18:07:42 fcal kernel: !lpfc0:031:Link Up Event received  Data: 1 1 1 2
May 14 18:07:45 fcal kernel: scsi1 : Emulex LPFC (LP850) SCSI on PCI bus 01 device 40 irq 3
May 14 18:07:45 fcal kernel:   Vendor: SEAGATE   Model: ST3146807FC       Rev: 0006
May 14 18:07:45 fcal kernel:   Type:   Direct-Access                      ANSI SCSI revision: 03
[root@fcal SourceBuild]#
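
To check what the drive reports without reading the log by eye, the inquiry line can be pulled out with a small script. This is only a sketch: the sample lines are copied from the output above, and the `find_drives` helper and its regular expression are my own illustration, not part of any tool used in this post.

```python
import re

# Sample lines copied from the /var/log/messages output above.
LOG = """\
May 14 18:07:45 fcal kernel: scsi1 : Emulex LPFC (LP850) SCSI on PCI bus 01 device 40 irq 3
May 14 18:07:45 fcal kernel:   Vendor: SEAGATE   Model: ST3146807FC       Rev: 0006
May 14 18:07:45 fcal kernel:   Type:   Direct-Access                      ANSI SCSI revision: 03
"""

# Matches the "Vendor: ... Model: ... Rev: ..." line the 2.4 SCSI layer prints.
INQUIRY = re.compile(
    r"Vendor:\s*(?P<vendor>\S+)\s+Model:\s*(?P<model>.+?)\s+Rev:\s*(?P<rev>\S+)"
)


def find_drives(log_text):
    """Return a list of (vendor, model, rev) tuples found in kernel log text."""
    return [(m["vendor"], m["model"], m["rev"]) for m in INQUIRY.finditer(log_text)]


for vendor, model, rev in find_drives(LOG):
    print(f"{vendor} {model} (firmware {rev})")
```

Before flashing, the vendor should still read SEAGATE; after a successful flash the same check should show NETAPP and the NetApp product ID instead.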

Now that the drive is recognized, we can continue with vardomskiy’s method using fwdl.

Using sysconfig -v I was able to verify that the NetApp names the ST3146807FC as X274_SCHT6146F10. Our filer is running Data ONTAP 7.3.1 and the latest firmware for that drive is NA16, so looking in the /etc/disk_fw directory we find X274_SCHT6146F10.NA16.LOD.
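
The firmware file name appears to be built from the NetApp product ID and the firmware revision. The sketch below assumes that pattern, which I have inferred only from the file names mentioned in this post, not from any NetApp documentation; both helper functions are hypothetical.

```python
def firmware_filename(product_id, revision):
    """Build the assumed <product id>.<revision>.LOD file name."""
    return f"{product_id}.{revision}.LOD"


def parse_firmware_filename(name):
    """Split an assumed <product id>.<revision>.LOD name into its two parts."""
    parts = name.split(".")
    if len(parts) != 3 or parts[2] != "LOD":
        raise ValueError(f"unexpected firmware file name: {name!r}")
    return parts[0], parts[1]


# Values taken from the sysconfig -v output described above.
print(firmware_filename("X274_SCHT6146F10", "NA16"))
print(parse_firmware_filename("X274_SCHT6146F10.NA16.LOD"))
```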

Compiling fwdl was trivial. So all that is left to do is use fwdl to update the firmware on the drive.


[root@fcal fwdl-1.2.3]# make
g++ -O2 -o fwdl -Dlinux -DDEBUG fwdl.C fwdl-linux.c
[root@fcal fwdl-1.2.3]#
But we don’t know the device ID of the drive yet. I used SeaTools from Seagate for this part.


[root@fcal root]# tar xf seatools_cli.tar
[root@fcal root]# ./st -l
Host adapter information:
SCSI host adapter emulation for IDE ATAPI devices
Emulex LPFC (LP850) SCSI on PCI bus 01 device 40 irq 3
 
Drive information:
 
/dev/sga SEAGATE  ST3146807FC      0006 286749487 blocks
 
[root@fcal root]#

Ok, now we can actually do the firmware upgrade.


[root@fcal fwdl-1.2.3]# ./fwdl /dev/sga X274_SCHT6146F10.NA16.LOD
Gathering inquiry data from the drive...done
Device Type:    Disk
Removable:    0
ISO Version:    3
Response Data Format:  12
Additional Length:  8b
Option Bits:    a
Vendor ID:    SEAGATE
Product ID:    ST3146807FC    
Revision Level:    0006
Vendor Specific:    3HY6LGTW
 
About to update drive firmware. This could render the drive unusable.
 
Are you certain you want to continue? [yN] y
About to update firmware...1...2...3...4...5...updating firmware...
done.
[root@fcal fwdl-1.2.3]# cd ..
[root@fcal root]# ./st -l
Host adapter information:
SCSI host adapter emulation for IDE ATAPI devices
Emulex LPFC (LP850) SCSI on PCI bus 01 device 40 irq 3
 
Drive information:
 
/dev/sga NETAPP   X274_SCHT6146F10 NA16 Cannot read capacity (Sense data = 03/31/00)
 
[root@fcal root]#
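
The sense data in that last line is worth decoding. Assuming st prints it as sense key / ASC / ASCQ in hex (my reading of the output, not something I have confirmed in the SeaTools documentation), 03/31/00 is a MEDIUM ERROR with the additional code MEDIUM FORMAT CORRUPTED. A minimal lookup sketch, with only a few entries from the SCSI tables:

```python
# A small subset of the standard SCSI sense keys.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}

# A few additional sense code / qualifier pairs relevant to formatting.
ASC_ASCQ = {
    (0x04, 0x04): "LOGICAL UNIT NOT READY, FORMAT IN PROGRESS",
    (0x31, 0x00): "MEDIUM FORMAT CORRUPTED",
    (0x31, 0x01): "FORMAT COMMAND FAILED",
}


def decode_sense(triple):
    """Decode a 'key/asc/ascq' hex string into two human-readable strings."""
    key, asc, ascq = (int(part, 16) for part in triple.split("/"))
    return (
        SENSE_KEYS.get(key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "not in this table"),
    )


print(decode_sense("03/31/00"))
```

Read that way, the error after flashing is consistent with the drive needing a reformat before its capacity can be read.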

Almost done. The NetApp uses 520-byte sectors, so we need to reformat the drive now. I used sg3_utils to do this.


[root@fcal root]# cd sg3_utils-1.27/src
[root@fcal src]# ./sg_format --format --size=520 --verbose /dev/sga
    inquiry cdb: 12 00 00 00 24 00
    NETAPP    X274_SCHT6146F10  NA16   peripheral_type: disk [0x0]
      PROTECT=0
    mode sense (10) cdb: 5a 00 01 00 00 00 00 00 fc 00
    mode sense (10): requested 252 bytes but got 28 bytes
Mode Sense (block descriptor) data, prior to changes:
  Number of blocks=280790184 [0x10bc84a8]
  Block size=520 [0x208]
 
A FORMAT will commence in 10 seconds
    ALL data on /dev/sga will be DESTROYED
        Press control-C to abort
A FORMAT will commence in 5 seconds
    ALL data on /dev/sga will be DESTROYED
        Press control-C to abort
    format cdb: 04 18 00 00 00 00
 
Format has started
Format in progress, 0% done
Format in progress, 0% done
Format in progress, 0% done
Format in progress, 0% done
...
FORMAT Complete
[root@fcal root]# dmesg |tail -3
Attached scsi disk sda at scsi1, channel 0, id 12, lun 0
SCSI device sda: 286749488 512-byte hdwr sectors (146816 MB)
sda: unknown partition table
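
The two block counts in the output above can be compared with a little arithmetic. The numbers come straight from the sg_format and dmesg output; the reading that each 520-byte sector carries 512 bytes of data plus 8 extra bytes is my assumption about how the filer uses the larger sectors.

```python
BLOCKS_512 = 286749488  # block count reported by dmesg (512-byte sectors)
BLOCKS_520 = 280790184  # block count reported by sg_format (520-byte sectors)

raw_bytes_512 = BLOCKS_512 * 512
raw_bytes_520 = BLOCKS_520 * 520

# Assumption: only 512 of every 520 bytes hold data.
data_bytes_520 = BLOCKS_520 * 512

print(f"512-byte layout: {raw_bytes_512} bytes ({round(raw_bytes_512 / 10**6)} MB)")
print(f"520-byte layout: {raw_bytes_520} bytes raw")
print(f"520-byte layout: {data_bytes_520} bytes of data")
```

The first figure rounds to the 146816 MB that dmesg prints, and the 520-byte layout gives up a little raw capacity on top of the 8 bytes per sector.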

Now that the drive is formatted and showing up as 144GB, we need to get the NetApp to recognise the drive. Since I’m doing this piecemeal, I don’t like the initialize-all-disks option put forward by vardomskiy. Instead, I tried owning the disk and then copying another drive to it, in order to fool the NetApp into thinking it had already labelled the disk.


fs> sysconfig -r
Broken disks                                                                                                                      
                                                                                                                                  
RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)                                      
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------                                      
bad label       0b.22   0b    1   6   FC:B   -  FCAL 10000 136000/278528000  137104/280790184
fs> reboot
   Starting AUTOBOOT press any key to abort...                                                                                      
Loading: 0xffffffff80001000/25888 0xffffffff80007520/15502440 Entry at 0xffffffff80001000                                        
Starting program at 0xffffffff80001000                                                                                            
Press CTRL-C for special boot menu                                                                                                
..................................................Special boot options menu will be available.                                    
                                                                                                                                  
NetApp Release 7.3.1: Thu Jan  8 01:24:50 PST 2009                                                                                
Copyright (c) 1992-2008 NetApp.                                                                                                  
Starting boot on Fri May 15 12:58:16 GMT 2009                                                                                    
Fri May 15 12:58:20 GMT [nvram.battery.state:info]: The NVRAM battery is currently ON.                                            
Fri May 15 12:58:25 GMT [diskown.isEnabled:info]: software ownership has been enabled for this system                            
                                                                                                                                  
                                                                                                                                  
(1)  Normal boot.                                                                                                                
(2)  Boot without /etc/rc.                                                                                                        
(3)  Change password.                                                                                                            
(4)  Initialize owned disks (7 disks are owned by this filer).                                                                    
(4a) Same as option 4, but create a flexible root volume.                                                                        
(5)  Maintenance mode boot.                                                                                                      
                                                                                                                                  
Selection (1-5)? 5
*> disk_list                                                                                                                    
     DISK    CHAN  VENDOR   PRODUCT ID       REV  SERIAL#              HW (BLOCKS   BPS) DQ                                      
------------ ----- -------- ---------------- ---- -------------------- -- -------------- --                                      
0b.19        FC:B  NETAPP   X274_HJURE146F10 NA14 404F7958             ff  284820800 520  N                                      
0b.20        FC:B  NETAPP   X274_HJURE146F10 NA14 40456113             ff  284820800 520  N                                      
0b.21        FC:B  NETAPP   X274_HJURE146F10 NA14 404C9761             ff  284820800 520  N                                      
0b.22        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY6LGTW00007428DWHP ff  280790184 520  N                                      
0b.16        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY107C9000073480CQK ff  280790184 520  N                                      
0b.17        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY0YW9J00007347WSB0 ff  280790184 520  N                                      
0b.18        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY0ZK88000073478DKV ff  280790184 520  N
*> diskcopy -s 0b.16 -d 0b.22                                                                                                    
                                                                                                                                  
You are about to copy over disk 0b.22 with the contents of disk 0b.16.                                                            
Retries at the SCSI layer are: ENABLED                                                                                            
I/O size is 4096 sectors                                                                                                          
Any data on disk 0b.22 will be lost!                                                                                              
                                                                                                                                  
Are you sure you want to continue with diskcopy? y                                                                                
                                                                                                                                  
Copying from disk 0b.16 to disk 0b.22.                                                                                            
600 MB copied -  
Copy operation of 68553 MB from disk 0b.16 to disk 0b.22 has completed.                                                          
                                                                                                                                  
NOTE: disk 0b.16 must be removed from the system prior to rebooting!              
*> halt
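
I picked 0b.16 as the source because it is the same model and size as the new disk. As a sketch of that selection, the script below parses the disk_list rows shown above and lists disks whose product ID, block count and bytes per sector match the target. The matching rule is my own assumption about what makes a sensible source, and the helper names are hypothetical.

```python
from typing import NamedTuple


class Disk(NamedTuple):
    name: str
    product_id: str
    blocks: int
    bps: int


# Data rows copied from the disk_list output above.
DISK_LIST = """\
0b.19        FC:B  NETAPP   X274_HJURE146F10 NA14 404F7958             ff  284820800 520  N
0b.20        FC:B  NETAPP   X274_HJURE146F10 NA14 40456113             ff  284820800 520  N
0b.21        FC:B  NETAPP   X274_HJURE146F10 NA14 404C9761             ff  284820800 520  N
0b.22        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY6LGTW00007428DWHP ff  280790184 520  N
0b.16        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY107C9000073480CQK ff  280790184 520  N
0b.17        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY0YW9J00007347WSB0 ff  280790184 520  N
0b.18        FC:B  NETAPP   X274_SCHT6146F10 NA16 3HY0ZK88000073478DKV ff  280790184 520  N
"""


def parse_disk_list(text):
    """Parse disk_list data rows, skipping anything that is not a 10-field row."""
    disks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 10 or not fields[7].isdigit():
            continue
        disks.append(Disk(fields[0], fields[3], int(fields[7]), int(fields[8])))
    return disks


def copy_sources(disks, target_name):
    """Return names of disks matching the target's model, block count and sector size."""
    target = next(d for d in disks if d.name == target_name)
    return [d.name for d in disks if d.name != target_name and d[1:] == target[1:]]


print(copy_sources(parse_disk_list(DISK_LIST), "0b.22"))
```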

Now, take 0b.22 out of the NetApp and reboot the filer. After it has rebooted, stick the drive back in to have it marked as a spare.

                                                                                                                                                                                                                                                    
Fri May 15 10:43:40 EDT [ses.channel.rescanInitiated:info]: Initiating rescan on channel 0b.                                      
Fri May 15 10:43:49 EDT [raid.assim.disk.spare:notice]: Sparing Disk /0b.22 Shelf 1 Bay 6 [NETAPP   X274_SCHT6146F10 NA16] S/N [3e
Fri May 15 10:43:50 EDT [sfu.firmwareUpToDate:info]: Firmware is up-to-date on all disk shelves.
fs> sysconfig -r                                                                                                          
...                                                                                                                      
Spare disks                                                                                                                      
                                                                                                                                  
RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)                                      
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------                                      
Spare disks for block or zoned checksum traditional volumes or aggregates                                                        
spare           0b.22   0b    1   6   FC:B   -  FCAL 10000 136000/278528000  137104/280790184 (not zeroed)                        
fs>

Done. The drive is now available as a spare.
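
The MB and block figures in the sysconfig -r line are consistent with each other if each block is counted as 512 bytes of data and a MB is taken as 2^20 bytes. That interpretation is my assumption; it just happens to reproduce both numbers shown above.

```python
def blocks_to_mb(blocks, data_bytes_per_block=512):
    """Convert a block count to whole MB (2**20 bytes), rounding down."""
    return blocks * data_bytes_per_block // 2**20


# Block counts taken from the spare disk line above.
used_mb = blocks_to_mb(278528000)
phys_mb = blocks_to_mb(280790184)

print(f"Used: {used_mb} MB, Phys: {phys_mb} MB")
```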

5 Responses to “Using generic drives in a Netapp FAS250”

  1. vardomskiy Says:

    Hey, cool.

    Since I was basically working on a shelf full of new drives, I didn’t think of copying the disk over in maintenance mode. I guess that’s the best option if you only work on a single drive.

    After re-reading my instructions, I realized that I forgot to mention that I had a spare ESH2 that I was able to pop into the DS14mk2 instead of the FAS250 module. So in essence I was turning an FAS250 into a dumb FC shelf for the time. But if you have some other means of attaching an FC disk to a Linux system and uploading firmware/formatting the disk (a 3rd party disk enclosure), you of course could use it for the Linux stage.

    I also downplay the importance of seatools during the Linux stage, since the system I use was IDE only, and dmesg was making very clear what the device ids of the drives were. st is very nice in terms of information it provides, but I was very frustrated about the fact that it was segfaulting for me during the firmware update step.

    Lastly, it might make sense to issue ‘disk zero spares’ command if you plan to use that disk as a spare.

    Anyway, glad that half a day of my research was useful to someone.

  2. uphill Says:

    I spent more than half a day :-) Thanks, your work was indeed very useful.

  3. Dana Mandell Says:

    Nice work!
    This all looks good, and I have been successful at “flashing” IDE and FC drives for use in NetApp, but has anyone tried SATA drives? Something must be very different, since the block size is fixed at 512 bytes on SATA drives and cannot be reformatted to 520. (I noticed that even the NetApp SATA drives show up as having 512-byte blocks.)

  4. Old Moose Says:

    Get a ST31000340NS 1TB Barracuda. This drive had a firmware bug, so Seagate and NetApp had to release a firmware update. X269_SMOOS01TSSX.NA01.LOD is the NetApp version for this drive. Now go to http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?&DocId=207963&NewLang=en and get the SN06 ISO image. From this image, extract the El Torito boot part (Linux: geteltorito). In this disk image is a zip file. Extract it, copy the NetApp firmware into it, and modify the batch file flash.bat. Remove the old firmware and the .txs file to save some space.
    .
    .
    .
    :SEAFLASH2
    rem set model=ST31000340NS
    %exe% -m %family% -f X269_SMOOS01TSSX.NA01.LOD -i %model2% %options%
    if errorlevel 2 goto WRONGMODEL2
    if errorlevel 1 goto ERROR
    goto DONE
    .
    .
    .

    zip it again, put it into the diskimage and create a bootfloppy from it (dd). Attach the harddisk to a SATA-Controller, boot the PC from the floppy and flash the disk. After that you have a NetApp-Disk with Bad label…how to solve this is mentioned in the text above.

  5. Old Moose Says:

    Get a ST31000340NS 1TB Barracuda. This drive had a firmware bug, so Seagate and NetApp had to release a firmware update. X269_SMOOS01TSSX.NA01.LOD is the NetApp version for this drive. Now go to http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?&DocId=207963&NewLang=en and get the SN06 ISO image. From this image, extract the El Torito boot part (Linux: geteltorito). In this disk image is a zip file. Extract it, copy the NetApp firmware into it, and modify the batch file flash.bat. Remove the old firmware and the .txs file to save some space.
    .
    .
    .
    set options=-s -x -b -v -a 20
    .
    .
    :SEAFLASH2
    rem set model=ST31000340NS
    %exe% -m %family% -f X269_SMOOS01TSSX.NA01.LOD -i %model2% %options%
    if errorlevel 2 goto WRONGMODEL2
    if errorlevel 1 goto ERROR
    goto DONE
    .
    .
    .

    zip it again, put it into the diskimage and create a bootfloppy from it (dd). Attach the harddisk to a SATA-Controller, boot the PC from the floppy and flash the disk. After that you have a NetApp-Disk with Bad label…how to solve this is mentioned in the text above.
