Speeding up Btrfs RAID1 with LVM Cache

Logical Volume Manager 2 (lvm2) is a very powerful toolset to manage physical storage devices and logical volumes. I’ve been using it instead of disk partitions for over a decade now. LVM gives you full control over where logical volumes are placed, and a ton of other features I have not even tried out yet. It can provide software RAID and error correction, and you can move logical volumes around while they are being actively used. In short, LVM is an awesome tool that should be in every Linux admin’s toolbox.

Today I want to show how I used LVM’s cache volume feature to drastically speed up a Btrfs RAID1 situated on two slow desktop HDDs, using two cheap SSDs also attached to the same computer, while still maintaining reasonable error resilience against single failing devices.

Creating the cached LVs and Btrfs RAID1

The setup is as follows:

  • 2x 4TB HDD (slow), /dev/sda1, /dev/sdb1
  • 2x 128GB SSD (consumer-grade, SATA), /dev/sdc1, /dev/sdd1
  • All of these devices are part of the Volume Group vg0
  • Goal is to use Btrfs RAID1 mode instead of an MD RAID or lvmraid: Btrfs has built-in checksums, so it can determine which leg of the mirror holds the correct data and can therefore detect and correct problems a little bit better.
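
The setup listed above could be built roughly as follows. This is only a sketch, not necessarily the exact procedure of the full post: the LV names (data0, data1, cache0, cache1) and the sizes are assumptions made for illustration, and it assumes an lvm2 version that supports `--cachevol`. Each HDD gets its own SSD cache, so a single failing device affects only one leg of the Btrfs mirror. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM and Btrfs commands instead of executing them.
# Set DRY_RUN=0 only after reviewing every command; they need root and
# mkfs.btrfs destroys existing data on the target LVs.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# LV names and sizes below are assumptions for illustration only.
setup_cached_raid1() {
    # One data LV per HDD, pinned to that physical volume.
    run lvcreate -n data0 -L 3600G vg0 /dev/sda1
    run lvcreate -n data1 -L 3600G vg0 /dev/sdb1
    # One cache LV per SSD.
    run lvcreate -n cache0 -L 100G vg0 /dev/sdc1
    run lvcreate -n cache1 -L 100G vg0 /dev/sdd1
    # Attach each SSD LV as the cache volume of one HDD LV.
    run lvconvert --type cache --cachevol cache0 vg0/data0
    run lvconvert --type cache --cachevol cache1 vg0/data1
    # Btrfs mirrors data and metadata across the two cached LVs.
    run mkfs.btrfs -d raid1 -m raid1 /dev/vg0/data0 /dev/vg0/data1
}

setup_cached_raid1
```

Keeping the mirroring in Btrfs and the caching per leg in LVM means Btrfs still sees two independent devices and can use its checksums to pick the good copy.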

flexget on Ubuntu 10.04 LTS

If you follow the official instructions to install flexget with the existing Python 2.6 and python-virtualenv, then you might encounter the following problem:

flexget@host:~$ flexget/bin/flexget
Traceback (most recent call last):
  File "flexget/bin/flexget", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/home/flexget/flexget/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 2655, in <module>
    working_set.require(__requires__)
  File "/home/flexget/flexget/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 648, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/flexget/flexget/lib/python2.6/site-packages/distribute-0.6.10-py2.6.egg/pkg_resources.py", line 546, in resolve
    raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: jsonschema>=2.0

At least on my system, there seems to be a jsonschema < 2.0 installed in the system site packages, which gets picked up instead of the version flexget requires. You can avoid this by creating the virtual Python environment without access to the system site packages:

virtualenv --no-site-packages ~/flexget/

HOWTO: Fully encrypted vServer with Ubuntu 12.04

Update 2022-05-16: Today I learned that there are two official tutorials by Hetzner for Ubuntu 20.04. You might want to follow them instead:


In this blog post I am going to demonstrate how to easily set up a virtual server at Hetzner. This setup will work for most other vServer operators as well, but some adjustments may be required. As a prerequisite, you must be able to access the console of the server while it boots, because you need to enter the passphrase there. This is not an in-place setup, so you also need to be able to boot into some sort of “Rescue System”. In Hetzner’s “Robot” this is pretty easy.

One thing to consider regarding security: fully encrypting a vServer might seem… senseless, as the host operator can easily copy the whole memory of the running VM and extract the key this way. True. There is no way around this fact. My reason for wanting a fully encrypted system is rather that I want to be sure the data is encrypted on the storage system. I want protection against never being able to fully wipe the persistent data from disk when I cancel the VM, the VM gets moved to a new host, or a failed disk is sent in to the manufacturer. For me, this is a compromise I can accept. YMMV.

You can also try this HOWTO under VirtualBox with the System Rescue CD ISO images. Actually, that’s where I verified all steps are working.

So, let’s dive into the fun of the HOWTO. BEWARE! THIS TUTORIAL WILL WIPE ALL DATA ON YOUR VSERVER! I TAKE NO RESPONSIBILITY IF YOU LOSE DATA! IT MIGHT ALSO NOT WORK FOR YOU. USE THIS AT YOUR OWN RISK!

The following steps will partition the disk, set up LVM and LUKS, install Ubuntu 12.04 and prepare the system for reboot. Most parts can be copied line by line. Please beware that some parts of this tutorial need to be adjusted: UUIDs of partitions, hostname, username, and most importantly the network setup.


OCZ Vertex2, Linux, and ancient nForce 430 chipset

Today I finally received my brand-new OCZ Vertex2 OCZSSD2-2VTXE120G 120GB and eagerly wanted to install it in my 4-year-old HP workstation, which is currently running Ubuntu 10.10 exclusively.

After setting up the alignment according to some tutorials I found online, I started the setup process. Shortly after starting the copy step of the installation, the whole process came to a grinding halt with filesystem errors. Looking into the kernel debug messages it seemed like SATA commands were causing errors. After checking hardware, cables and switching SATA ports, I began researching the issue and soon found that the issue might be fixed in the next firmware version of the drive. So I wanted to upgrade from 1.23 to 1.24, which could only be done in Windows…

After installing a trial of Windows 7, I finally wanted to upgrade the firmware, but the drive was not detected for the update, although it was accessible. The release notes indicated that I would need to switch to AHCI mode. After several attempts, including a BIOS update, I realized that there was no way to do this with my old hardware, as my nForce 430 chipset simply doesn’t support it.

So my only remaining option was to simply try the kernel arguments I had read about as the fix for 1.24 on the drive still running 1.23.

So, if you add the following kernel option during installation and afterwards for every boot, the disk seems to work quite well (source):

libata.force=norst

Actually, this forces the ATA driver in Linux to not issue any reset commands on the bus. I really don’t understand why this improves/fixes the problem, but it seems the device has issues when being reset on my chipset. I also notice this when rebooting the PC: in 2 out of 3 attempts the disk is not recognized any more until I reboot again.
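
On an installed Ubuntu system with GRUB 2, one way to pass the option on every boot is the kernel command line in /etc/default/grub. A minimal sketch of the relevant line, assuming the existing value was "quiet splash" (that value is an assumption, so keep whatever your own file already contains and only append the option):

```shell
# /etc/default/grub (excerpt)
# "quiet splash" is the assumed previous value; libata.force=norst is appended.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash libata.force=norst"
```

After editing the file, run sudo update-grub and reboot. The output of cat /proc/cmdline should then contain libata.force=norst.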

Despite these issues, the SSD now runs with astonishing performance with the suggested 32 head / 32 sector alignment, and a 512kB partition alignment scheme. After an initial TRIM with hdparm’s wiper.sh I enabled -o discard for my ext4 partition and could also verify using hdparm that this results in the sectors being trimmed. Please note that you need to manually compile and install the latest hdparm version on Ubuntu 10.10, as the included version fails with the very long free block list and doesn’t handle splitting the sectors into multiple requests. The latest version doesn’t have this issue any more.
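
The connection between the 32 head / 32 sector geometry and the 512kB alignment is plain arithmetic: with that geometry, one cylinder is exactly 512 KiB, so partitions that start on a cylinder boundary are aligned to that size. A small sketch, assuming 512-byte logical sectors (the variable names are only illustrative):

```shell
#!/bin/sh
heads=32
sectors_per_track=32
sector_bytes=512   # assumed logical sector size

# Bytes covered by one cylinder with this fake geometry.
cylinder_bytes=$((heads * sectors_per_track * sector_bytes))

echo "$((cylinder_bytes / 1024)) KiB per cylinder"
# prints: 512 KiB per cylinder
```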

Remaining Windows Vista/7 “rearm count”

It is a well-known fact that it is possible to extend the initial grace period for activating your (hopefully legitimate!) copy of Windows from 30 days to 120 days by using slmgr. This tool is intended for preparing image-based installers for enterprise use, and it allows the initial grace period to be reset up to 3 times.

If you tend to forget the number of times you already reset the counter, you can easily check for yourself: simply run

slmgr -dlv

to get detailed licensing information, including the number of remaining re-arms and remaining grace time.

If you want to know when exactly your grace period runs out, use

slmgr -xpr

Note: This simply gives you more time, it won’t prevent you from having to buy and/or activate Windows. Re-arming is not a bug, it works as intended and is an important tool for use in corporate environments.

Novatel Merlin U740 using only Windows 7 onboard tools

I have lost the install CD of my Novatel Merlin U740, an older PCMCIA UMTS card. As a consequence I got no “Mobilink Connection Manager” after installing Windows 7 on my notebook. Fortunately I found this guide by Novatel Wireless which explains how to connect using only on-board tools in Windows Vista, by setting up a dial-up connection. It still works in Windows 7. The important part is to set the APN as part of the driver’s initialization string.

The telephone number you have to set is *99#, which should be provider-independent.

The following settings are for yesss.at only:
Username: web
Password: web

Remember to set the APN as part of the driver’s connection string in the Windows “Device Manager” as described in the PDF.

Again, for yesss.at this is: AT+CGDCONT=1,"IP","web.yesss.at"

For this to work properly, the SIM must not have a PIN set, as otherwise the SIM will be locked and the dialer cannot dial out. For me this is ok, as it is a pre-paid card which can hardly be abused if it gets stolen, but your situation might be different, so please consider the security implications. (I suspect that it should be possible to unlock the SIM card somehow using the AT+CPIN=1234 command, but I did not research how to separate several initialization strings, as it did not work immediately.)

The solution works quite well for me, even under Windows 7. The disadvantage is that there is no way to tell the signal strength and exact mode of operation (apart from the color-coded status LED on the Merlin U740).

Windows Vista Home/Business/Enterprise has a telnet client, too

For some unknown reason, Microsoft decided that only the “Ultimate” version of Windows Vista ships with the telnet client installed by default. It can, however, be easily installed on all the other versions as well.

  • Open the Control Panel
  • Select “Programs”
  • Select “Turn Windows features on or off”
  • Scroll through the list, select “Telnet client”
  • Press OK
  • Wait (for surprisingly long)

That’s it, voila, the telnet client is now installed on your Windows Vista Non-Ultimate.

Nice to know – Volume 2

udev renames your network interfaces

Sometimes udev renames your devices. This happened to me when upgrading a server: eth0 suddenly became eth1 and vice versa. Of course, this broke nearly all firewall scripts on the server… There is a nice explanation of how to get udev to name your devices the way you want.

Visual Studio 2005 Service Pack 1 on Microsoft Windows Server 2003

When installing Visual Studio 2005 Service Pack 1 under Windows Server 2003, it might fail because it cannot verify the signature. Take the time to visit the link provided in the error message, because it will take you to a hotfix that corrects the problem.

(via Mark Caroll’s Blog)

VMware Server on Ubuntu 8.04

A nice tutorial for getting the free VMware Server 1.0.5 running on Ubuntu 8.04.

Nice to know – Volume 1

As I definitely should post more on my blog, I am starting a new series: “Nice to know”. It will be a collection of interesting things I consider memorable but which don’t deserve their own blog post.

Trickle

Trickle allows you to limit bandwidth for processes that do not support bandwidth limitation out of the box. It works by preloading and simulating the socket API. You use it as a wrapper when starting the process, like trickle -d 80 someapp.

You can use it to limit rsync speed for instance (thanks to http://www.yak.net/fqa/404.html): rsync -auvPe "trickle -d 80 ssh" user@host:/src/ /dst/

VMware Tools and Kernel 2.6.24

Out of the box, VMware Tools do not install on kernel 2.6.24 (as used in Ubuntu 8.04 for instance). A possible solution is described here. It is based on using the open-source version of the VMware Tools (open-vm-tools).

Test-driven network management

Test-driven development has proven to increase the quality of software in many cases. I believe that the same principle should be applied to network management. From time to time, I am occupied with managing quite large and distributed networks, consisting of many different network segments, routers, servers, etc.

The primary tool in managing any network is monitoring software, which tells you whether everything is alright or whether you should worry. For various reasons I have become a huge fan of Nagios for monitoring the networks I am responsible for, especially because it is so easy to extend by writing your own check scripts (plugins).

While working through some issues in a network, I suddenly decided to try an approach I spontaneously called “test-driven network management”¹. The steps are easy (and are a one-to-one translation of agile software-development principles):

  1. Write a Nagios test which checks for the requested/required feature.
  2. This test will fail.
  3. Implement a solution satisfying the test.
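
Step 1 above can be sketched as a small check script. This is a minimal sketch, not a finished plugin: the function name check_feature and the probe commands are hypothetical, while the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# check_feature LABEL PROBE [ARGS...]
# Runs the probe command and maps its result to a Nagios status line and
# return code: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN (no probe given).
check_feature() {
    label=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "UNKNOWN - no probe command given for $label"
        return 3
    fi
    if "$@" >/dev/null 2>&1; then
        echo "OK - $label is available"
        return 0
    fi
    echo "CRITICAL - $label is not available"
    return 2
}

# Demonstration with a probe that always succeeds.
check_feature "demo feature" true

# In a real plugin the script would end with the probe for the requested
# feature and pass the status on to Nagios, for example (host name, port
# and the availability of nc are assumptions):
#   check_feature "SMTP relay" nc -z -w 5 mail.example.com 25
#   exit $?
```

Before the feature exists, the probe fails and Nagios reports CRITICAL (step 2). Once the solution is implemented, the same check turns OK and keeps guarding the feature afterwards.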

The same advantages of automated testing (better: unit testing) in software development also apply to the network management tasks:

  • The test documents what you want to achieve in a quite formal way.
  • You will (almost) immediately know when your solution breaks other requirements (if tests exist for them).
  • As networks tend to be even more fragile than software, you have to monitor whatever you implemented anyway 🙂

Whenever possible, I try to add a test (or tweak an existing one) for any trouble ticket or feature request I come across. In my experience, customer satisfaction tends to increase, because you start noticing problems before the customers do and you also implement measures that prevent the same problems from occurring over and over again.

¹ I am quite sure there is another technical term for it, as I am quite sure I am not inventing anything new here… If you know what others call this, please tell me in the comments.
