Just got the “OK” on my thesis subject from the professor, yay! Need to get some paperwork done next, and after that my thesis will be underway officially. Bureaucracy…

I’m writing this entry from a bus, hooray HKL (Helsinki transport authority) for free WLAN!

So, yesterday I managed to copy our department’s official LaTeX template and set it up for my own needs - and write some 2 pages of actual text for the introduction! As I mentioned before, all of the text will probably end up scraped off the face of the earth later, but most importantly, some of my ideas are now actually down in writing in a format nearly comprehensible for other people. Only some 60 pages to go ;)

On the tool front, I’ve been using TeXlipse, which I grew to like when using it in a couple of earlier university projects. In addition to basic LaTeX syntax highlighting, it offers some neat little completions, such as automatically matching a \begin{something} with an \end{something}, and also automatically completing my BibTeX citations for me (see screenshot below). TeXlipse does have its quirks, but overall it’s pretty great. Its development seems to have died down, though, so trying to open an empty BibTeX file will cause you problems for the foreseeable future.

BibTeX completion in TeXlipse

TeXlipse has a simple but efficient model for generating a PDF of your LaTeX sources: you simply point out the main LaTeX file in your project, select the output format (and the command used to produce the output), and you’re off. Now, whenever you make changes to your source files and save, TeXlipse bakes you a fresh PDF in the background. Thus, my writing cycle is simply write, save and “reload” in my PDF reader.

In addition to TeXlipse, I’m using a tool called Rubber for keeping the LaTeX compilation process nice and clean when run from the command line (Rubber and LaTeX, gotta love the kinky vocabulary of scientific writing.)
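For the curious, here is a minimal sketch of how the command-line build could be wrapped in a small Python script. It assumes Rubber is installed and on the PATH, and that the main file is called thesis.tex (a made-up name). The helper functions are my own illustration, not part of Rubber.

```python
import subprocess


def rubber_command(main_file: str, clean: bool = False) -> list[str]:
    """Build the Rubber invocation for the given main LaTeX file."""
    # --pdf asks Rubber to produce a PDF; adding --clean removes the
    # files generated by an earlier build instead of compiling.
    if clean:
        return ["rubber", "--pdf", "--clean", main_file]
    return ["rubber", "--pdf", main_file]


def build(main_file: str = "thesis.tex") -> None:
    """Compile the document, raising an error if Rubber fails."""
    # "thesis.tex" is a hypothetical file name used for illustration.
    subprocess.run(rubber_command(main_file), check=True)
```

Calling build() compiles the document, and running the command returned by rubber_command("thesis.tex", clean=True) tidies up the generated files afterwards.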

All of my LaTeX source files are under version control, naturally. My tool of choice is Git, which I’ve been playing around with in all of my recent projects. My usage of Git is far from complex, and I’m definitely not pushing the tool’s limits: I’m basically just enjoying the possibility of committing locally when I’m offline.

Now, off to get some sleep. More coherent thoughts tomorrow, hopefully.

Most of the articles I’ve come across while browsing through the ACM Digital Library and IEEE Xplore have been very good, quality-wise. No wonder, since most of the articles listed on these sites have gone through a peer-review process and have appeared either at conferences or in scientific journals worldwide.

That’s not to say that all of them have been good. For example, I’m just glancing through a paper entitled Software Quality and Agile Methods, which - judging by its name - should fit the subject of my thesis quite nicely. The paper was published in the proceedings of the 28th Annual International Computer Software and Applications Conference (COMPSAC’04), but in reality, it reads as if it were written by a group of students for their Software Engineering 101 course (no offense to the authors, perhaps this has actually been the case).

The paper is quite superficial in that it spends the first 3 or 4 pages giving an overview of agile methods in general and then comparing them with the waterfall model, without making any great strides at that. Apart from a few conclusions the authors make, the research is more of a rehashing of known facts. It also makes some seemingly unverified statements, such as the authors’ problem statement:

Can agile methods ensure quality even though they develop software faster [...] ?

While there are certainly results that indicate that agile methods might result in a faster turnaround time for a piece of software, this can’t be generalized just like that. And for me, agile methodologies encompass more important features than finishing the project quickly. For example, producing software that the customer really needs (and not just what she thinks she needs at first.)

The authors continue with

Testing and simulation are dynamic techniques. Sometimes static techniques are used to support dynamic techniques and vice versa. The waterfall model uses both static and dynamic techniques. However, agile methods mostly use dynamic techniques.

Profound thoughts ;)

On a more serious note, all I’m saying is that it’s worthwhile to really dissect and assess your source material, instead of just taking it as the [your-deity-of-choice]’s honest truth.

There is only one thing more painful than learning from
experience and that is not learning from experience.

– Archibald MacLeish

Haven’t blogged on my progress in a few days; been busy reading through the list of interesting papers I’ve gathered during the past few weeks. Some low-level stuff, such as the principles behind SAFE, a static analysis tool for Java programs, and then some higher level topics, such as Agile software development and quality, and whether the two go hand in hand.

A lot of new concepts to wrap my head around. Doing some research after working on a semi-legacy system (aren’t they all…) for the last year is absolutely refreshing. I’m hoping to start the actual writing process (and refresh my LaTeX skills) next week: most of the initial text will probably end up in the trash can, but I really need to get into the whole business of writing after a longish break.

I’ll probably write my thesis in English, which is generally frowned upon at my department: you need to have a good reason to deviate from using Finnish. But since the paper is related to a work project that’s in turn related to something with international ties, I believe I have a just cause for using English. And it’s always good to have a larger potential audience for whatever I come up with. Plus, I hate writing technical stuff in Finnish, since you’re always faced with a cutting-edge technical term that has no Finnish counterpart, and you end up either with a self-translated substitute or with the original term inside an otherwise Finnish sentence: I don’t like either of the two options.

Btw, the title of this entry is a subliminal message for you all to go and read a book of the same name by Stephen King: a piece on writing and being a writer, but also one hell of an autobiography (I’m not a writer, nor do I claim to be, but I thoroughly enjoyed the book, so don’t be frightened by the name!)

Off to a slow start after a busy weekend.

I’m yet to write any actual content for my thesis, but I’m actively using a wiki for keeping track of the ideas I have, articles I’ve read and so forth. My choice of wiki engine is TWiki, which fits my needs quite nicely.

Pros:

  • It looks good and is quite polished for an open source tool
  • Easy on the server requirements, as no database is required: everything’s stored in flat files
  • You can create multiple webs (kinda like spaces for any Confluence users out there); so, I can have multiple wikis in one instance, basically. Currently, I have one for my thesis, one for personal stuff, one that I’ve shared with a bunch of friends etc - access rights can be set per web
  • It is truly a wiki: for example, much of the configuration is done by changing values in special wiki pages

Cons:

  • It is truly a wiki: setting access rights is done by adding usernames to a variable in a special purpose wiki page. Quite error-prone and not exactly as easy as clicking on checkboxes per-user / similar
  • The wiki syntax is somewhat unintuitive at times, and could be less verbose: for example, [[http://google.com][this is what a link looks like in wiki markup]]
  • It’s written in Perl; won’t be touching the source even if provided with a longish stick

Once I get underway writing the actual thesis, I’ll definitely be using - and writing in - LaTeX. In my previous school projects, I’ve used TeXlipse, which is a rather nifty LaTeX plugin for Eclipse. But now, having used a wiki for much of the initial work, I’d love to keep writing using a wiki, with appropriate LaTeX syntax support thrown in, and using LaTeX to manage references to source articles. I’ll have to investigate my options further.

As I mentioned in my previous post, Conqat’s configuration mechanism is rather complex - the price of being as flexible as it is. The authors state the following (emphasis mine):

The downside of the tool’s design is a fairly complex configuration mechanism. Setting up a complete quality assessment configuration results in a configuration file of about 300 lines and requires thorough understanding of the processors involved. However, the aim was to develop a tool which is configured rather infrequently and run repeatedly with the same configuration (or a subset thereof). Besides that we expect the typical user to be an expert in the field and therefore don’t consider this issue a problematic one.

Granted, these types of tools are usually configured once and only fine-tuned thereafter. But a 300-line XML configuration to enable a thorough set of analyses? Definitely the kind of thing I’d expect to narrow down the potential user base of the application. And being an expert in quality analysis doesn’t make one an expert in configuring an analysis tool they happened to stumble upon.

Personally, I’d go about distributing the tool with sensible defaults, just requiring the user to set the source directory, and off you go. As it stands, just to get the LOC/Javadoc ratio analysis (which is quite irrelevant IMHO, but that’s another blog post) and the clone detection working, I needed to browse through the wiki for 15 minutes or so. In that sense, Conqat flunks the “five-minute test”, thus scaring away some potential users. In its defense, Conqat *is* an academic project and not a polished commercial / open-source tool.

A higher-level configuration format / tool would definitely come in handy. Personally, I’d just like to click on some checkboxes to enable certain analyses. Or do the corresponding thing in a higher level configuration file. Hide the flexibility under the hood and only expose it as necessary. Like so:


analyses: loc, clone, pmd
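To make the idea a bit more concrete, here is a rough Python sketch of how such a one-liner could be expanded into a verbose XML configuration behind the scenes. Everything in it is an assumption on my part: the element names, the attribute names and the processor class names are invented for illustration and are not Conqat’s actual configuration format.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from short analysis names to processor classes.
# These class names are made up; they are not Conqat's real ones.
PROCESSORS = {
    "loc": "example.LocAnalyzer",
    "clone": "example.CloneDetector",
    "pmd": "example.PmdRunner",
}


def parse_analyses(line: str) -> list[str]:
    """Parse a line such as 'analyses: loc, clone, pmd' into a list of names."""
    key, _, value = line.partition(":")
    if key.strip() != "analyses":
        raise ValueError(f"expected an 'analyses:' line, got {line!r}")
    return [name.strip() for name in value.split(",") if name.strip()]


def expand(line: str, source_dir: str) -> str:
    """Expand the shorthand into a verbose XML configuration string."""
    root = ET.Element("config")
    ET.SubElement(root, "source", {"dir": source_dir})
    for name in parse_analyses(line):
        if name not in PROCESSORS:
            raise ValueError(f"unknown analysis: {name}")
        ET.SubElement(root, "processor", {"name": name, "class": PROCESSORS[name]})
    return ET.tostring(root, encoding="unicode")


print(expand("analyses: loc, clone, pmd", "src"))
```

The user only ever sees the short line and the source directory; the full XML stays available for those who need to fine-tune it.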

Got a couple of quick answers to my questions from two of the Conqat guys. Also received a copy of the (not-yet-publicly-available) CloneClipse tool that can be used to visualize duplicates found in code. Haven’t had a chance to try it out yet, as I’ve been delving a bit deeper into Conqat’s design and internals.

The architecture seems extremely flexible, but this does come at the cost of everything being a bit tedious to set up: you’re basically defining processors, and then piping and filtering them - using XML. If I’ve understood correctly, Conqat provides the concept of blocks to configure things on a higher level.
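The pipes-and-filters idea itself is simple enough when sketched in plain code. The following Python snippet is only my own illustration of the general pattern, with invented processor names. It says nothing about how Conqat actually implements its processors.

```python
from typing import Callable, Iterable

# A processor takes a stream of items and returns a new stream.
Processor = Callable[[Iterable[str]], Iterable[str]]


def pipeline(*processors: Processor) -> Processor:
    """Chain processors so that the output of one feeds the next."""
    def run(items: Iterable[str]) -> Iterable[str]:
        for processor in processors:
            items = processor(items)
        return items
    return run


def only_java(paths: Iterable[str]) -> Iterable[str]:
    """Filter: keep only Java source files."""
    return (p for p in paths if p.endswith(".java"))


def skip_tests(paths: Iterable[str]) -> Iterable[str]:
    """Filter: drop files whose name marks them as tests."""
    return (p for p in paths if not p.endswith("Test.java"))


# Hypothetical pipeline: select the files an analysis should look at.
analysis_input = pipeline(only_java, skip_tests)
print(list(analysis_input(["Foo.java", "FooTest.java", "README.txt"])))
```

Each filter stays tiny and reusable, and the wiring is where the flexibility (and the tedium) lives, whether you do it in code or in XML.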

In other news, I found a colleague who’s willing to work as an instructor for my thesis. The number of instructors for a thesis done for a company is bewildering: you need two instructors from the university (namely, a professor and then somebody with a degree acting as an assistant or similar) and one from the employer.

The initial subject of my thesis is something along the lines of “Continuous quality control in agile software projects”. I hope to elaborate on the subject in a blog entry later, once I finish and manage to return the official 2-page specification of the subject to my thesis instructors.

While I was looking for material related to the subject, I ran across Conqat, a tool for continuous quality assessment developed over at the Technical University of Munich. After a quick evaluation, it seems pretty useful and something I’ll definitely investigate further. I sent a short inquiry to the authors regarding any future developments of the project.

“code discipline? i like it!”

The first words of my (potential) thesis supervisor upon hearing the preliminary subject of my thesis.

So, on a rainy September afternoon, I’m sitting here in the library of my university faculty, tying up some loose ends and starting a project that’s been on my todo list for a long while now: a master’s thesis in computer science. I’ll document my progress here. Stay tuned.