For future reference, here is how I finally proceeded, with a few comments on the various issues or pitfalls encountered:
1. Boot the machine with a Linux live system
The first step was to boot the machine containing the disk to be imaged from a Linux live system.
NOTE: My first idea was to use an Ubuntu Live USB disk, but the machine did not support booting from USB, so I found it easier to use an old Knoppix live CD.
2. Image the disk using dd and pipe the data through ssh
Then I copied the entire disk content to an image file on my local server, using dd and piping the data through ssh:
$ dd if=/dev/hdX bs=4k conv=noerror,sync | ssh -c blowfish myuser@myserver 'dd of=myfile.dd'
A few comments here: this method reads the entire disk, used and unused blocks alike, so it can take a very long time (about 5 hours for an 80 GB disk in my case). The bottleneck was not the network but the disk read speed. Before launching the copy, I advise checking the BIOS/disk/system parameters to ensure that the disk and the motherboard are working at their highest possible speed (this can be checked with hdparm -i /dev/hdX and by running a test with hdparm -Tt /dev/hdX). Also note that recent OpenSSH releases (7.6 and later) no longer support the blowfish cipher; on those, simply drop the -c blowfish option.
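To put the copy time in perspective, here is a rough back-of-the-envelope calculation. It assumes the "5 hours" figure is approximate and uses the exact disk size that parted reports for the image later in this answer:

```shell
# Figures from this answer: disk size as reported by parted, copy time of roughly 5 hours
disk_bytes=80026361856
copy_seconds=$((5 * 3600))

# Average read throughput over the whole copy (integer division)
bytes_per_second=$((disk_bytes / copy_seconds))
echo "average throughput: $((bytes_per_second / 1000)) kB/s"
```

That works out to roughly 4.4 MB/s, far below what a 100 Mbit/s network link can carry, which is consistent with the disk (or its transfer mode) being the limiting factor.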
NOTE: dd does not report progress by default, but it can be made to print its current statistics by sending the USR1 signal to the dd process from another terminal:
$ kill -USR1 PIDofdd
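As a minimal sketch of the signal trick (assuming GNU dd on Linux), the snippet below starts a throwaway dd that copies zeros to /dev/null as a stand-in for the real imaging job, asks it for statistics, and then stops it. The variable names are only illustrative:

```shell
# Demonstration only: a throwaway dd stands in for the real imaging job.
log=$(mktemp)
LC_ALL=C dd if=/dev/zero of=/dev/null bs=4k 2>"$log" &
dd_pid=$!

sleep 1
kill -USR1 "$dd_pid"     # dd prints its current statistics to stderr and keeps running
sleep 1

kill "$dd_pid"           # stop the throwaway job
wait "$dd_pid" 2>/dev/null || true
cat "$log"               # shows the records in/out and bytes copied at the time of the signal
```

For the real copy, the PID can be looked up rather than typed by hand, for example with kill -USR1 "$(pgrep -x dd)" on the live system, provided only one dd is running there.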
Note: Newer versions of dd support the status=LEVEL option (man dd)
status=LEVEL The LEVEL of information to print to stderr; 'none' suppresses everything but error messages, 'noxfer' suppresses the final transfer statistics, 'progress' shows periodic transfer statistics
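With a dd that supports it (GNU coreutils 8.24 or later, as far as I know), the option can simply be added to the reading side of the pipeline. The sketch below uses two temporary files as stand-ins for the disk and the remote image so that it can be run anywhere:

```shell
# Stand-in files so the sketch runs anywhere. For the real copy, if= would be
# /dev/hdX and the output would be piped through ssh as in step 2, e.g.:
#   dd if=/dev/hdX bs=4k conv=noerror,sync status=progress | ssh myuser@myserver 'dd of=myfile.dd'
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/urandom of="$src" bs=4k count=256 2>/dev/null

# status=progress prints periodic transfer statistics on stderr
dd if="$src" of="$dst" bs=4k conv=noerror,sync status=progress
```

Because the statistics go to stderr, they do not interfere with the data stream sent to ssh.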
3. Reclaim the unused space
At this point, the source machine is no longer needed and we will work exclusively on the destination server (running Linux as well). VirtualBox will be used to convert the raw disk image to the VHD format, but before doing so, we can zero out the unused blocks, so that VirtualBox does not allocate space for them in the final file.
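In order to do so, I mounted the partition inside the image as a loopback device, filled its free space with a file of zeros, and then deleted that file:
$ mount -o loop,rw,offset=26608813056 -t ntfs-3g /mnt/mydisk/myfile.dd /mnt/tmp_mnt
$ cat /dev/zero > /mnt/tmp_mnt/zero.file
$ rm /mnt/tmp_mnt/zero.file
The cat command stops with a "No space left on device" error once the partition is full; this is expected.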
NOTE: The offset indicating the beginning of the partition within the disk image can be obtained by using parted on the image file:
$ parted /mnt/mydisk/myfile.dd
(parted) unit
Unit?  [compact]? B
(parted) print
Model:  (file)
Disk /mnt/mydisk/myfile.dd: 80026361856B
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start         End           Size          Type      File system  Flags
 1      32256B        21936821759B  21936789504B  primary   ntfs         boot
 2      21936821760B  80023749119B  58086927360B  extended               lba
 5      26608813056B  80023749119B  53414936064B  logical   ntfs
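The offset can also be extracted non-interactively. The sketch below assumes parted's machine-readable mode (parted -sm IMAGE unit B print), which prints one colon-separated line per partition. The helper name partition_offset is made up for this example, and the sample text is my reconstruction of what that mode would print for the parted output above, not captured output:

```shell
# Print the start offset (in bytes) of partition number $1, reading the output
# of "parted -sm IMAGE unit B print" on stdin.
partition_offset() {
    awk -F: -v n="$1" '$1 == n { sub(/B$/, "", $2); print $2 }'
}

# Assumed machine-readable output, rebuilt from the partition table in this answer
sample='BYT;
/mnt/mydisk/myfile.dd:80026361856B:file:512:512:msdos::;
1:32256B:21936821759B:21936789504B:ntfs::boot;
2:21936821760B:80023749119B:58086927360B:::lba;
5:26608813056B:80023749119B:53414936064B:ntfs::;'

printf '%s\n' "$sample" | partition_offset 5
```

On the real image this would be used as parted -sm /mnt/mydisk/myfile.dd unit B print | partition_offset 5, and the result passed to the offset= mount option.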
NOTE2: The default Linux kernel NTFS driver provides read-only access, thus it is necessary to install and use the userspace ntfs-3g driver or writing to the disk will raise an error!
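The zero-fill part of this step can be wrapped in a small helper. This is only a sketch: the function name and its optional second argument (a cap in MiB, added here so the helper can be tried safely on a scratch directory) are my own additions, and dd with bs=1M is swapped in for cat:

```shell
# zero_free_space MOUNTPOINT [MAX_MIB]
# Fill the free space under MOUNTPOINT with zeros, then delete the filler file.
# MAX_MIB is a hypothetical extra argument that caps the fill, for trial runs.
zero_free_space() {
    mountpoint_dir=$1
    max_mib=${2:-}
    filler="$mountpoint_dir/zero.file"

    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$filler" bs=1M count="$max_mib" 2>/dev/null || true
    else
        # Runs until the file system is full; the resulting
        # "No space left on device" error is expected and ignored.
        dd if=/dev/zero of="$filler" bs=1M 2>/dev/null || true
    fi

    sync              # flush the zeros to the image before deleting the file
    rm -f "$filler"
}

# Trial run on a scratch directory, capped at 2 MiB
scratch=$(mktemp -d)
zero_free_space "$scratch" 2
```

On the mounted image it would be called as zero_free_space /mnt/tmp_mnt, followed by umount /mnt/tmp_mnt so that everything is written back to the image file before the conversion.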
4. Create the VHD image using VBoxManage
At this point, we can use the VirtualBox utilities to convert the raw image to a VHD file:
$ VBoxManage convertfromraw myfile.dd myfile.vhd --format VHD