All,
I have recently started using vSphere Data Protection (the 2TB VDP appliance) and thought I'd share some of the problems and fixes that I've come across. Its documentation describes variable block length de-duplication across all backup jobs over all time, which is just awesome. They claim it's capable of 98% data de-duplication on standard file shares... we shall wait and see.
Setup:
- Ensure that Single Sign On is configured AND create yourself a Backup User (giving them full administration privileges inside your VCM)
- Dozens of Live VMs running on centralised SAN storage (EMC VNX 5300 over iSCSI)
- Deploy the OVF file to dedicated storage (I used a FreeNAS host with 12x 450GB SAS disks, configured as a RAID 6 data set, shared over multi-pathed iSCSI networking)
- Storage vMotioned the Configuration File and Boot Drive of the VDP appliance back to SAN storage
- Configured the appliance with the startup wizard at https://<IP of VDP>:8543/vdp-configure, then shut the VM down using the vSphere Client (right click > power > shutdown guest)
- Before powering the VM back on, gave it 8GB of RAM and 16 processors (4 sockets x 4 cores) for a nice performance gain, then powered it on
- After waiting about 20 minutes, the vSphere Web Client had the modules installed for managing VM backups. Go to https://<IP of VCM>:9443/vsphere-client/ and log in
This got me going..
Issues:
1. Backing up my first VM failed with Error Code: E10056. Restore failed due to existing snapshot.
Fixed with a Consolidate Disks command
2. Backing up the same VM failed again with Error Code: E10055. Failed to attach disk
Fixed by shutting down the VM and opening Edit Settings > Options > Advanced > General > Configuration Parameters
Adding the following: disk.EnableUUID = false
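To confirm the parameter actually landed in the VM's configuration, one option is to read it back from the .vmx file. This is only a minimal sketch, assuming the setting is stored as a key = "value" line; the helper name vmx_get and the example path are made up for illustration and are not part of VDP or vSphere.

```shell
# vmx_get FILE KEY: print the value of KEY from a .vmx file.
# Hypothetical helper; assumes lines of the form: key = "value"
vmx_get() {
    awk -F' *= *' -v key="$2" '
        tolower($1) == tolower(key) {
            gsub(/^"|"$/, "", $2)   # strip the surrounding quotes
            print $2
        }' "$1"
}

# Example (path is a placeholder for your own datastore and VM):
# vmx_get /vmfs/volumes/<datastore>/<vm>/<vm>.vmx disk.EnableUUID
```

The key match is case-insensitive purely as a convenience, so disk.enableuuid finds the same line.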
3. Backup jobs take a long time, sometimes in excess of 24 hours!
Consider that the first job copies 100% of the data.
Subsequent jobs are deltas; however, a fair amount of reading still takes place to compare against the previous blocks, so these too can take time on a slow backup store. Things to try:
- Check the Multi-pathing options
- Give more cache to the iSCSI target
- Enable battery backed write cache on the raid set
- Migrate the VDP's Host OS disk and Configuration file away from the VDP's data disks
- Enable Jumbo Frames for your iSCSI targets (specifically MTU=9000)
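A quick way to check that jumbo frames work end to end is a don't-fragment ping sized to fill the MTU. The payload is the MTU minus 28 bytes (20 for the IPv4 header, 8 for ICMP). Below is a small sketch that builds the vmkping command line for an ESXi host; the function name and the target address are made up for illustration, and it only prints the command rather than running it.

```shell
# jumbo_ping_cmd TARGET [MTU]: print a vmkping command that tests the path MTU.
# -d sets "don't fragment", -s sets the ICMP payload size.
jumbo_ping_cmd() {
    target=$1
    mtu=${2:-9000}
    payload=$(( mtu - 28 ))   # 20-byte IPv4 header + 8-byte ICMP header
    echo "vmkping -d -s $payload $target"
}

jumbo_ping_cmd 192.0.2.10   # hypothetical iSCSI target address
```

Run the printed command on the ESXi host. If the large ping fails while a normal one works, something along the path is probably still at MTU 1500.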
4. Still taking a long time to back up... 8+ hours..
The only things I could do here were the following:
- Put the iSCSI traffic on its own VLAN; I moved vMotion traffic onto another VLAN ID too.
- Dedicated Network Switches for iSCSI
- Go for 10Gb Ethernet if you can; don't aggregate 2x 1Gb interfaces together. iSCSI multipathing can get better throughput without aggregation.
- Consider reducing the delay on TCP ACK packets (the delayed ACK setting), which can increase packet throughput further. See here: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002598
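To put the 1Gb versus 10Gb choice in perspective, here is a rough back-of-envelope estimate of how long a full first pass takes at a given link speed. It is only a sketch: the 80% link efficiency is an assumed figure, and it ignores de-duplication, disk speed and multipathing, so real jobs will differ.

```shell
# transfer_hours SIZE_GB LINK_GBIT [EFFICIENCY]: rough hours to move SIZE_GB
# (decimal gigabytes) over a link of LINK_GBIT gigabits per second.
# EFFICIENCY defaults to 0.8, which is an assumption, not a measured value.
transfer_hours() {
    LC_ALL=C awk -v gb="$1" -v gbit="$2" -v eff="${3:-0.8}" \
        'BEGIN { printf "%.1f\n", (gb * 8) / (gbit * eff) / 3600 }'
}

transfer_hours 2000 1    # 2TB over a single 1Gb link
transfer_hours 2000 10   # 2TB over a 10Gb link
```

With these assumptions a full 2TB pass comes out at roughly 5.6 hours on 1Gb and 0.6 hours on 10Gb, so anything well beyond that points at the storage or the proxy rather than the wire.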
5. Backup job fails with no error. I wanted to read the logs, so I SSH'd to the VDP appliance and followed the job log file:
ssh root@vdp.localdomain
<login>
tail -f /usr/local/avamarclient/var-proxy-1/<JOBNAME>-<JOBID>-<GUID>-vmimagew.log
<a few lines are cut out>
Running cat on the log and hunting for the errors, I found:
2013-02-23 17:48:41 avvcbimage Error <0000>: vSphere Task failed (snapshot error=96): 'The operation is not allowed in the current state.'.
2013-02-23 17:48:41 avvcbimage FATAL <16018>: The datastore information from VMX '[SAN|R6-C] SRV38/SRV38.vmx' will not permit a restore or backup.
After a quick Google search: it failed because the VM is either:
- Mid-boot (has not quite loaded all services yet)
- Missing VMware Tools
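Rather than reading the whole file with cat, the error lines can be pulled out with grep. A minimal sketch, assuming the severity always appears as Error or FATAL followed by a code in angle brackets, as in the two log lines quoted above; the function name is made up.

```shell
# vdp_log_errors FILE: print only the Error and FATAL lines of a job log.
# The pattern is inferred from the sample lines in this post, not from any spec.
vdp_log_errors() {
    grep -E ' (Error|FATAL) <[0-9]+>:' "$1"
}

# Example, using the same log path placeholders as above:
# vdp_log_errors /usr/local/avamarclient/var-proxy-1/<JOBNAME>-<JOBID>-<GUID>-vmimagew.log
```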
I will post some more events and errors shortly.. Anyone else got stuff to share on this one?