GSoC Status Update - Week 3

This is a status update for my Google Summer of Code 2013 project - implementing advanced statistics importers for Amarok. Please read the first post if you would like to know more about the project.

For the previous week I had scheduled "Reimplementing FastForward importer, cleaning up data-reading module. At the end of the week I expect to have working implementation of FastForwardProvider." That's done now, and I can't say it took a lot of coding.

It took a fair amount of thinking, though. Classes that inherit StatSyncing::Provider and StatSyncing::Track have to fulfill certain conditions to work in a multi-threaded environment. In short, the former must be reentrant, and the latter thread-safe.

Surprisingly, there are various definitions of reentrancy and thread-safety. Qt's definition basically boils down to this: a class is reentrant when different threads can safely work on different objects, and thread-safe when they can work on the same object. Wikipedia's article on reentrancy disagrees. In its definition, a function is reentrant when it can be interrupted in the middle of its execution and safely called again before the interrupted invocation completes. In that sense, the function in the following example (courtesy of Wikipedia) is thread-safe, but not reentrant: an invocation interrupted in the function body would leave the mutex locked, and any subsequent invocation would hang on mutex_lock().

int function()
{
    mutex_lock();
    // ...
    // function body
    // ...
    mutex_unlock();
}
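To make Qt's pair of definitions concrete, here is a minimal sketch in standard C++ rather than Qt. The class names and the countFromManyThreads() helper are made up for illustration and have nothing to do with Amarok's code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Reentrant in Qt's sense: no shared or static state, so different threads
// may safely use *different* ReentrantCounter objects, but not the same one.
class ReentrantCounter
{
public:
    void increment() { ++m_value; }
    int value() const { return m_value; }

private:
    int m_value = 0;
};

// Thread-safe in Qt's sense: every access to the data is serialized by a
// mutex, so different threads may safely share the *same* object.
class ThreadSafeCounter
{
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock( m_mutex );
        ++m_value;
    }

    int value() const
    {
        std::lock_guard<std::mutex> lock( m_mutex );
        return m_value;
    }

private:
    mutable std::mutex m_mutex;
    int m_value = 0;
};

// Hypothetical helper: several threads increment one shared ThreadSafeCounter.
int countFromManyThreads( int threads, int incrementsPerThread )
{
    assert( threads > 0 );
    ThreadSafeCounter shared;
    std::vector<std::thread> pool;
    for( int i = 0; i < threads; ++i )
        pool.emplace_back( [&shared, incrementsPerThread] {
            for( int j = 0; j < incrementsPerThread; ++j )
                shared.increment();
        } );
    for( std::thread &t : pool )
        t.join();
    return shared.value();
}
```

Swapping ReentrantCounter into the helper would be a data race: it is only safe as long as each thread keeps its own instance.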

But leaving this small tangent aside, that wasn't the only thing I had to think through. I had a small case of premature optimization going on: I didn't want to open a new SQL connection for every request (Amarok stores its main collection's data in an SQL database). It sounds simple enough to do, but coupled with the fact that "a connection can only be used from within the thread that created it", and the multithreaded nature of providers, this was an interesting problem to get right. So I rediscovered what every other Qt developer already knew: if you need something to run in a specific thread, use the signals and slots mechanism. With Qt::BlockingQueuedConnection and emit foo() instead of a direct call to slotFoo(), we basically get the effect of executing a function synchronously in a different thread, with the added benefit of not having to bother with synchronization ourselves.

It gets a little more awkward when the caller and the callee can happen to be in the same thread: Qt::BlockingQueuedConnection blocks the caller's thread until the called slot is done, so a slot queued to that same thread would never get to run and the call would deadlock. In that case the slot has to be called directly:

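    connect( this, SIGNAL(retrieveAllData()), SLOT(slotRetrieveAllData()), Qt::BlockingQueuedConnection );
    // ...
    // Already in the thread this object lives in: call the slot directly,
    // as a blocking queued call to our own thread would deadlock.
    if( QThread::currentThread() == thread() )
        slotRetrieveAllData();
    else
        emit retrieveAllData();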
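For readers without Qt at hand, the same idea can be sketched in standard C++. This is only an analogy under my own assumptions, not how Qt implements queued connections: OwnerThread and invokeBlocking() are hypothetical names, and the job queue stands in for the event loop of the thread that owns the SQL connection.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical stand-in for an object that owns a thread-bound resource
// (such as an SQL connection) and must only touch it from its own thread.
class OwnerThread
{
public:
    OwnerThread() : m_thread( [this] { run(); } ) {}

    ~OwnerThread()
    {
        {
            std::lock_guard<std::mutex> lock( m_mutex );
            m_done = true;
        }
        m_wakeUp.notify_one();
        m_thread.join();
    }

    std::thread::id id() const { return m_thread.get_id(); }

    // Runs task in the owner thread and returns only once it has finished.
    int invokeBlocking( std::function<int()> task )
    {
        assert( task );

        // Same-thread guard: queueing the job and waiting for it from the
        // owner thread itself would deadlock, so call it directly instead.
        if( std::this_thread::get_id() == m_thread.get_id() )
            return task();

        std::packaged_task<int()> job( std::move( task ) );
        std::future<int> result = job.get_future();
        {
            std::lock_guard<std::mutex> lock( m_mutex );
            m_jobs.push( std::move( job ) );
        }
        m_wakeUp.notify_one();
        return result.get(); // blocks the caller until the job has run
    }

private:
    void run()
    {
        for( ;; )
        {
            std::packaged_task<int()> job;
            {
                std::unique_lock<std::mutex> lock( m_mutex );
                m_wakeUp.wait( lock, [this] { return m_done || !m_jobs.empty(); } );
                if( m_jobs.empty() )
                    return; // m_done is set and nothing is left to run
                job = std::move( m_jobs.front() );
                m_jobs.pop();
            }
            job(); // executed outside the lock, in the owner thread
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_wakeUp;
    std::queue<std::packaged_task<int()>> m_jobs;
    bool m_done = false;
    std::thread m_thread; // declared last so the members above exist before run() starts
};
```

Calling owner.invokeBlocking( [] { return 42; } ) from any other thread runs the lambda in the owner thread and hands back 42; calling it from inside a job takes the direct path, which is the role the if/else above plays.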

But hey, the pros outweigh the cons, and that's what matters. And after I was done with FastForwardProvider and FastForwardTrack, I decided to add some lazy initialization to the mix...

So, to wrap this up. Aside from multi-threaded fun and figuring out the right file system hierarchy, I also spent a lot of time thinking about the bigger picture, that is, how and when importers would actually load their configuration and register themselves with the right controllers. It's something I initially overlooked in my proposal, and something I'll try to address this week, possibly skewing my schedule a little bit. More on this topic coming next week.

Oh, and I'm not done with FastForwardProvider, not really. It's more of a prototype. As more providers come to life, it will see a lot of changes aimed at deduplication and at making new providers easier to create.

Thanks for reading!

GSoC Status Update - Week 2

This is a status update for my Google Summer of Code 2013 project - implementing advanced statistics importers for Amarok. Please read the first post if you would like to know more about the project.

I didn't think writing tests could be so much fun! Neither did I think they could take me so much time. What I scheduled for the past week was "implementing the initial test suite for importers (exam week)," and it was supposed to be a fast & easy task, allowing me to focus on my exams. And yet after my last exam that week I found myself spending every day intensively coding. Most of the time was spent on figuring out how to approach testing and writing the code to make the chosen approach possible; writing the tests themselves actually was the fast & easy task I thought it would be.

At the end, I ran git diff --stat for my changes: 26 files changed, 1246 insertions(+), 69 deletions(-). I was surprised.

I managed to test things the way I wanted to. I created two test suites, TestITunesImporter and TestFastForwardImporter, both inheriting from TestImportersCommon, which holds common tests and utility functions. I slightly modified both importers so that I could statically provide them with arbitrary database paths to read from and arbitrary collections to write to - one of my goals for this week was not to modify existing importers in any major way.

I created two collections: localCollection and fileCollection; the latter fulfilling the role of FileTrackProvider, giving importers information about tracks located at nonexistent paths; the former being the collection that tracks were synchronized to. To easily check, clear, and modify their contents I extended the capabilities of the Collections::CollectionTestImpl class, which stores a collection in memory. The 'init' and 'clean' test procedures take care of filling the fileCollection with necessary data, clearing localCollection, and resetting statistics after each test.

The code has no idea which importers it's dealing with, operating on DatabaseImporter, a superclass of both. The tests don't even know that much. A test sets preconditions by modifying the collections' contents, calls the blockingImport() method of TestImportersCommon, and then checks the resulting state. An elegant, implementation-agnostic way, which fulfills my second goal for the week: write tests for the current importers in a way that makes them easy to reuse for the reimplemented importers. Done and done.

So here's something I learned that week: how to run an asynchronous task in Qt and block until it's done. That's a useful thing for testing! Here's the full implementation of blockingImport():

void
TestImportersCommon::blockingImport()
{
    QScopedPointer<DatabaseImporter> importer( newInstance() );
    QEventLoop loop;

    // Any outcome ends the local event loop. The connections are queued, so quit()
    // is only delivered once loop.exec() is running, even if the signal fires early.
    connect( importer.data(), SIGNAL(importSucceeded()), &loop, SLOT(quit()), Qt::QueuedConnection );
    connect( importer.data(), SIGNAL(importFailed()), &loop, SLOT(quit()), Qt::QueuedConnection );
    connect( importer.data(), SIGNAL(importError(QString)), &loop, SLOT(quit()), Qt::QueuedConnection );
    // Failures and errors are additionally reported to the test's importFailed() slot.
    connect( importer.data(), SIGNAL(importFailed()), this, SLOT(importFailed()) );
    connect( importer.data(), SIGNAL(importError(QString)), this, SLOT(importFailed()) );

    importer->startImporting();
    loop.exec(); // blocks here until one of the signals above calls quit()
}
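The same start-then-block pattern can be sketched without Qt, with a std::promise playing the part of the event loop. This is a rough analogy under my own assumptions: startImportAsync() is a made-up asynchronous API, and this blockingImport() is a standalone function that only shares its name with the method above.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <thread>

// Hypothetical asynchronous API: returns immediately and reports the outcome
// later, from another thread, through a callback (the role played by the
// importSucceeded() and importFailed() signals).
std::thread startImportAsync( bool shouldSucceed, std::function<void(bool)> onFinished )
{
    return std::thread( [shouldSucceed, onFinished] { onFinished( shouldSucceed ); } );
}

// Blocks the caller until the asynchronous import has reported its outcome.
bool blockingImport( bool shouldSucceed )
{
    std::promise<bool> finished;
    std::future<bool> outcome = finished.get_future();
    assert( outcome.valid() );

    // The callback fulfils the promise, much like a signal calling loop.quit().
    std::thread worker = startImportAsync( shouldSucceed,
        [&finished]( bool ok ) { finished.set_value( ok ); } );

    const bool ok = outcome.get(); // waits here, like loop.exec()
    worker.join();
    return ok;
}
```

Unlike QEventLoop::exec(), waiting on a future does not process events in the meantime, so this form only suits work that completes in another thread.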

And this is as personal as it gets when it comes to interacting with importers in tests. (As you can see, I'm playing with a new plugin for code snippets. Configuring it fully is yet another thing I have planned for later.)

You can check out my progress on my public Amarok clone. The branch is named gsoc-importers.

Thanks for reading!

GSoC Status Update - Week 1

This is a status update for my Google Summer of Code 2013 project - implementing advanced statistics importers for Amarok. Please read the first post if you would like to know more about the project.

Long days and pleasant nights.

The first week of GSoC has passed. What I had scheduled for that week (and I'm still getting around to placing the schedule around here) was "Drafting up test cases for FastForward and iTunes importers."

I admit that it doesn't sound too exciting, and neither does the next one, which is all about implementing these drafted tests. The reason for that is that I'm currently in the middle of final exams; as of writing this post, three of them are behind me and three are left to go. The last exam is July 2nd, so all of the real fun will start after that.

So, let's write about what I actually did that week, which amounts to this: I went through the current importers implementation and noted what they currently do. I then compared this with what the reimplemented importers should do, and then noted on my whiteboard what kind of thing I should then be testing. I also familiarized myself with QTestLib and KDE helper methods for testing and prepared a blank test suite, so I'm basically all set to implement things. And that's about all.

I want to write a bit about why I want to implement these tests at all. Their purpose isn't to test the implementation of the current importers, because they're being replaced anyway. What I want to do is assert how the logic works right now, and then carry these assertions over to the new implementation. The bottom line is: they'll make sure that the new implementation is no worse than the old one, which is one of my project's goals. :)

Signing off before I write everything I have about tests, and then have nothing to write about next week. Thanks for reading!

Hello World! Me, the blog, and GSoC

Hey! My name is Konrad Zemek, I'm a student at AGH University of Science and Technology, Kraków, Poland. I study Computer Science, currently finishing my second year. I'm a programmer. Mainly a C++ programmer, but that's just because I write in C++ professionally - I know a few other languages and I'm fast to pick up new ones. I could write about this a lot more, but I guess it all just comes down to this: I'm a programmer, and a good one.

And I have other traits, too! I love gaming, particularly video gaming. I eat through books, mostly of the fantasy kind. I ride a bicycle almost every single day, and I'm learning to play the electric guitar, dreaming that I could someday justify buying a Stratocaster.

I'll get to placing my photo, not made hastily with a cellphone, somewhere around here - in the meantime I redirect you to my Google+ page.

The blog

I really tried not to commit too much time to deploying this blog, as I'm behind schedule already. And believe me, it was hard - I'm the type of person who spends weeks reading about color theory and typography before deploying a website. This time, at least from the visual side, everything is pretty generic - still, I see a great many hours of tweaking and customizing this blog in my future.

Just not now.

It's a bit more interesting from the technical side. I deployed the WordPress application on the Amazon cloud, AWS. Its code (the "application" part) resides in a git repository, which I push to Elastic Beanstalk (a Platform-as-a-Service) after every update; Elastic Beanstalk in turn spins up a generic instance of GNU/Linux and deploys the application there. Nothing on this instance is persistent, so no actual content can be stored there. The blog connects to an SQL database, which is provided by another part of AWS. Then there's persistent content that is neither part of the application nor belongs in the database, like uploaded images and other post attachments; these are stored in the Amazon S3 service. CloudFlare provides DNS, caching, and some anti-bot screening.

There are a lot of things on the technical side that I'm itching to talk about, but I've got a more important thing to write about, that being...

Google Summer of Code

The main purpose of this blog, or rather the reason it came into existence, is to write about Google Summer of Code, more specifically about my own GSoC project. The title of this proposal is Reimplement Amarok 1.4 (FastForward) & iTunes importers on top of Statistics Synchronization framework, and add Amarok 2.x and Rhythmbox as synchronization targets. Amarok is a legendary music player, part of the KDE software suite (I'd say it's a Linux music player, but that's not entirely true).

Every music player but the most basic collects data about the music being played. It's called personal metadata, and includes information like the number of times a given track has been played, or the user's rating of the track. For users who use those features, it's quite a big deal: it allows for playing favorite tracks, or maybe the tracks that haven't been listened to enough, all without spending time setting up custom playlists. Or perhaps you like to share your most-loved tracks through a service like Last.fm? None of that would be achievable without personal metadata.

Currently, if you're using iTunes or an old Amarok version, you are able to share this metadata with Amarok, although there's no easy way to keep it synchronized. At a basic level, my project aims to add that very capability - to easily resynchronize Amarok with iTunes, previous Amarok versions, and Rhythmbox, Ubuntu's default music player. There are also some stretch goals, like being able to do a two-way synchronization (update metadata on the other player), and I'm quite confident of reaching those, too. I will post my weekly progress updates here over the course of the next three months, and I think my planned schedule will end up in a widget somewhere on this site. There will also be more details coming. I'll get to it, I promise!

So that's me done for this post, before I run out of things to say in the next one. Thanks for reading. :)