Tuesday, May 21, 2013

Book Review: Rebel Code

I purchased Rebel Code by Glyn Moody because I was giving a presentation at a local technical conference on the history of open source software. I chose this topic because I realized that many up-and-coming technical workers and enthusiasts either weren't alive when its milestone events occurred or weren't aware of them or their significance.

This book far exceeded my expectations. I was an early adopter of Linux and open source software in the early 1990s, so I witnessed some of the innovations and big events that took place, but I had no idea about the details. Moody's book delves deep into the evolution of the early Linux kernel: how it lacked any networking capability at all, the controversy surrounding adding a network stack to the kernel, and the other disputes that ultimately shaped Linux, its maintainer Linus Torvalds, and his lieutenants.

While the bulk of Moody's story explores the roots of Linux and its early history, it also covers other open source projects that have made a significant mark, such as GNU, Apache, Sendmail, Samba, and BIND. I learned several things about these projects and the people involved that I hadn't known before.

No telling of the open source movement's history would be complete without coverage of the companies that made open source their business, or changed their business because of open source. It's disappointing how many of them are gone now, but when this book was published (2002), most were still ticking. Gone now are organizations like Netscape Communications, Caldera, Pacific Hi-Tech, and VA Linux/VA Research, but their roles in the movement cannot be forgotten.

The only downside of this book is that Moody hasn't prepared an updated revision in the decade or so since it was published. In the late 1990s and early 2000s, much of the open source movement saw Microsoft as the enemy, the obstacle to the movement's success, and Moody covers this well. In the years since, however, I think the movement has started to recognize that Microsoft is not the roadblock it once seemed. It seems like every year for the last fifteen years, someone has declared it to be "the year of Linux on the desktop," but while Linux has gained more desktop users, it's still nowhere near that kind of conquest... And that's okay.

In summary, I highly recommend this book as a way of gaining critical insight into the landmark years of the 1990s that defined the open source movement.

Tuesday, May 14, 2013

Mounting Windows LDM partitions in Fedora Linux

I've been working on editing video from the Openwest 2013 conference, and one of the videographers gave me a hard disk with video he had captured from his tape-based camcorder.

I connected the disk to my computer via a SATA-to-USB dock and waited for the familiar notification that a new USB device had been connected, but it never happened. I took a closer look at /var/log/messages to see what was going on. I could see the hard disk was being detected and was assigned a block device (/dev/sdk), but nothing beyond this.

I ran fdisk -l /dev/sdk and discovered the disk had GPT partitioning. So I then ran parted on the device and found it had the following partitions:

Number  Start   End     Size    File system  Name                          Flags
 1      17.4kB  1066kB  1049kB               LDM metadata partition
 2      1066kB  134MB   133MB                Microsoft reserved partition  msftres
 3      134MB   750GB   750GB                LDM data partition

I had never seen anything like this before, so I did some searching online. I discovered that LDM (Logical Disk Manager) is essentially Windows' version of LVM, the mechanism behind what Windows calls dynamic disks. I found lots of forum messages from people discussing their difficulty accessing data stored on LDM partitions from Linux, but no clear solution.

So, I gave up on it for the moment.

The next day, I was telling my brother-in-law about it and did another search. This time, I came across a page describing ldmtool, a command-line tool that creates the necessary device-mapper devices so the partitions can be mounted with the ordinary mount command.

I did a yum search ldm on my Fedora box and found an available package called libldm, described as "A tool to manage Windows dynamic disks." Sounds good to me. It includes ldmtool, so I was off to the races!

After doing a sudo ldmtool create all, I could see a new device in the /dev/mapper directory.

Doing a sudo mount /dev/mapper/ldm_vol_blahblah /mnt/scratch mounted the partition in /mnt/scratch.

Not fun, but effective.

Monday, May 13, 2013

Unit testing means isolation?

I've been tasked with heading up the design and specification of a testing framework for my team at work. At first, I was looking at comprehensive testing frameworks like Tapper and Jenkins, but I realized we first need to think about the way our unit tests are run, and only then about how those tests are automated and their results tracked.

Because we're using Perl, it's a no-brainer that our tests will generate TAP (the Test Anything Protocol), so we just need to make sure whatever encompassing frontend automation we add to the mix later on understands TAP.
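
For example, a minimal test script written with Perl's standard Test::More module emits TAP without any extra work (the test content here is just filler):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Test::More tests => 2;    # declares the TAP plan line: "1..2"

    # Each assertion prints a TAP "ok" / "not ok" line.
    is( 1 + 1, 2, 'arithmetic still works' );
    ok( defined $ENV{PATH}, 'environment is sane' );

Running it prints plain TAP: the plan line (1..2) followed by lines like "ok 1 - arithmetic still works", which prove, and whatever TAP-aware frontend we pick later, can consume.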

Attributes of unit tests

Unit tests are a somewhat nebulous concept, and a lot depends on how pure you want to be. The purist's unit test has the following attributes:

  • Can be fully automated.
  • Operates in complete isolation from other systems/subsystems.
    • Mocks, stubs, fakes, and test harnesses may help accomplish this isolation when necessary.
    • Doesn't depend on the order in which other tests are run.
  • Runs in memory; doesn't touch databases or remote data stores (e.g., anything over HTTP).
  • Always returns predictable results.
  • Runs "fast."
  • Tests a single "unit" or logical concept.
  • Readable.
  • Maintainable.
  • Trustworthy.

(A tip of the hat to Roy Osherove, author of The Art of Unit Testing, because I based the above list on one that appears on his website and probably in the book as well.)

How pure?

One of our projects is an API layer that acts as middleware between remote libraries and a billing system backend. Each API method invocation results in calls to billing system methods that, in turn, touch a relational database and may also touch yet another API.

Creating unit tests for this middleware API in complete isolation would mean mocking logic provided by the billing system code. Some of that code is, well, pretty intense.

So, my question is this: can we bend the rule on total and complete isolation, provided we make sure the other rules and guidelines are adhered to?

The billing system can't really run without a relational database backend. A common suggestion in the articles I've read about unit-testing systems that depend on a database is to run the tests against a lightweight in-memory database instead, or something like a temporary SQLite instance.

Because the billing system isn't operative at all without certain data in the database, we pretty much have to have a database, and it has to contain at least a minimal dataset. My current thought is that we load that minimum dataset once, create a dump or snapshot of the database, and use that snapshot as the basis for testing.

With this approach, unit tests would go through startup and teardown similar to the following (see the Perl sketch after this list):

  • Create temporary database
  • Load snapshot/dump data into temporary database
  • Start temporary instance of billing system which associates with the temporary database
  • Run API unit tests
  • Drop temporary database
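
As a rough illustration, here's a minimal sketch of that flow as a Test::More script. The helper routines (create_temp_db, load_snapshot, start_billing_instance, drop_temp_db) and the snapshot path are hypothetical placeholders for tooling we'd still have to build, not existing APIs:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Test::More;

    # Hypothetical helpers; stand-ins for tooling we'd write ourselves.
    my $db = create_temp_db();                          # create temporary database
    load_snapshot( $db, 'snapshots/minimal.dump' );     # load snapshot/dump data
    my $billing = start_billing_instance( db => $db );  # billing system on temp DB

    # Run the API unit tests against the temporary instance.
    ok( $billing->ping, 'billing instance is up' );
    # ... the real API tests would go here ...

    done_testing();

    END {
        # Teardown runs even if a test dies partway through.
        $billing->stop    if $billing;
        drop_temp_db($db) if $db;
    }

The END block is there so the teardown happens whether the tests pass, fail, or die partway through; the temporary database gets dropped either way.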

I'm not sure yet whether this process of loading a snapshot of billing system data would qualify as "fast," but it could guarantee consistent, predictable test results. We may also be able to load the snapshot once for a whole group of tests, thereby minimizing the overhead of instantiating the test database.