Issue #6 | May 2012

Welcome

Welcome to the May edition of the OSS Watch newsletter. We are delighted to start with a guest blog post from Ross Gardler of OpenDirective. Ross' experience as Vice President of Community Development at The Apache Software Foundation and a mentor at the Outercurve Foundation puts him in the perfect position to consider 'What makes a community led project work?'

We follow this with a blog post from Rowan Wilson, who looks at how standards creation bodies try to get patent owners to agree to 'fair, reasonable and non-discriminatory' (FRAND) terms in order to allow widespread use of a standard. In particular, Rowan tells us how this relates to the UK Government’s Cabinet Office seeking to create a policy around the use of open standards in government IT.

Finally, Sander van der Waal blogs about a very pertinent issue facing many researchers: what to do with the data collected as part of a research project. Sander urges us not to keep our data under the desk and introduces us to a project called DataStage that may well provide the solution to this perennial problem.

We hope that you are enjoying the newsletter and if you have any comments about it, or anything else, please do get in touch at info@oss-watch.ac.uk.

In this issue

Blog: What makes a community led project work?

Ross Gardler considers 'What makes a community led project work?' using his wealth of experience at The Apache Software Foundation.

Blog: FRAND or FOSS?

Rowan Wilson looks at the use of FRAND terms in relation to standards and the Cabinet Office's work to use open standards in government IT.

Blog: Don't keep your data under your desk

Sander van der Waal urges us not to keep our data under the desk and introduces us to a project called DataStage.

From the Blog

What makes a community led project work?

Published as a guest post on May 8, 2012

This guest post has been contributed by Ross Gardler of OpenDirective. Ross is Vice President of Community Development at The Apache Software Foundation and a mentor at the Outercurve Foundation. Ross has been active in open development of open source software for over ten years.

OSS Watch has been participating in the development of Apache Rave, a ‘next-generation portal engine, supporting (Open)Social Gadgets as well as W3C widgets’. As Sander observes in this blog, the Rave ecosystem is made up of a ‘diverse range of collaborators’ from both the academic and commercial sectors. These partners are sharing resources in order to build a critical piece of software at lower cost as well as to increase innovation around that product.

A few days ago I posted an evaluation of the Apache OpenOffice project’s journey through the Apache Incubator (all code entering the Apache Software Foundation (ASF) must pass through the incubator). That post looked at what makes an Apache project different from many other open source projects. This post repeats many of the same points, but rather than examine them from the point of view of OpenOffice I will examine why the predominantly academic team behind Apache Rave chose to go to the ASF.

In Apache projects, a Project Management Committee (PMC) oversees each project on behalf of its users, contributors, committers and the foundation itself. Upon entering incubation the PMC is guided by mentors from the foundation. Upon graduation mentors either retire or become equal members of the PMC. For the Rave community the provision of mentors meant that the project team could avoid the mistakes of many other open source projects. As a result, the team got an honourable mention in the Black Duck Open Source Rookie of the Year awards. Not bad for a team with no significant knowledge of open source software development on a large scale. Now it has graduated, it no longer has mentors actively overseeing its work, but it still has the backing of over 100 full Apache projects and another 50-odd incubating projects.

New committers and PMC members are elected by the PMC based on merit. It should be relatively easy for anyone to gain influence on an Apache project. In the ASF this is achieved through rewarding merit. If you contribute to the project you are rewarded with influence over the project. In environments where staff turnover can be high, such as academic research, this is important with respect to continuity. It also removes the opportunity for someone to insist on a level of control based purely on the cash they wield. In an Apache project it is all about the delivery.

All decisions unrelated to individuals happen on the public mailing list; discussion on the private list is kept to a minimum. This behaviour has no special bearing on academic projects compared to non-academic projects. For both types this rule ensures maximum inclusivity, which results in maximum engagement with potential contributors.

‘If it didn’t happen on the dev list, it didn’t happen’ – meaning no decision about the project can be made outside of the public development list. Proposals can be drawn up elsewhere, but decisions occur on the public list. Academic projects, like open source projects in general, often involve collaborators from a variety of geographic regions. This can make it difficult to ensure that everyone is kept informed and engaged. Apache projects require that all significant decisions are made in public so that no participant (or potential participant) is excluded from the process.

Where possible, decisions are made by consensus reached through discussion. There are voting rules but the ASF prefers not to have to vote. Apache Rave began life as a merger between three pre-existing projects. It was important that all three parties were equally engaged in the project. Had there been a pre-defined leader, this would probably have made some participants feel less engaged. Initially the consensus-driven approach can be hard to understand; however, over time natural leaders emerge in specific areas of the project. At this point consensus is easily achieved since each decision is led by the person best equipped to lead it.

Releases are created according to the ASF’s licence requirements. The Apache License is a permissive licence that allows anyone to do anything they want with the code. This allows for maximum flexibility in business cases for engaging with the project which in turn encourages third party contributions. Whilst conforming with Apache policies is more onerous than might be found elsewhere, they are designed to ensure that people can use and contribute to your software with minimal legal risk. Risk is something that universities and companies alike tend to avoid.

Trademarks and logos used by ASF projects belong to the ASF. Protecting trademarks is an important part of open source software. By running the Rave project inside the ASF much of the legal infrastructure and experience is in place should an issue arise in the future.

Apache projects are managed by a diverse group of people, each representing their own interests within the project. Apache decision making processes prevent ‘block votes’ controlling the process by ensuring each voice is equally loud. A number of people are contributing to Apache Rave, each with their own motivations. Each contributor must be assured that what they do today will still be useful tomorrow. Apache projects adopt a model that means it is not possible for third parties to gain control of a project. Consequently, researchers and product developers do not run the risk of losing influence over the code.

As can be seen from the above list of required behaviours found in Apache projects, the focus is on ensuring the project provides maximum opportunities for collaboration and innovation. There are other ways of achieving this but for the initial participants in Apache Rave (Universities of Bolton, Oxford and Indiana, SurfNet, Mitre Corp. and Hippo) the ‘Apache Way’ was deemed to be the most suitable. The same can be said of Apache Wookie which is used in Rave and was also helped by OSS Watch as it moved to Apache.

If your project wants to explore the opportunities that foundations (not just the ASF) can offer, OSS Watch is here to help.


FRAND or FOSS?

Published by Rowan Wilson on May 4, 2012

Standards in technology are generally considered to be a good thing. Having documented technologies that can be implemented by all means that businesses can compete on equal terms and consumers benefit from the effects of this competition. Of course, before a technology can be standardised, individual technology players need to do the work of innovation to develop the techniques the standard will encompass. Sometimes these technology players will have sought to protect their investment in innovation by obtaining a patent for the innovative technology they have created. Patents are designed to provide a monopoly over a specific technological process for the owner, so how does this monopoly fit in with the idea of a standard?

The answer is that it doesn’t, really. In situations where implementing a standard would necessarily infringe on someone’s patent, the standards creation bodies will usually try to get the patent’s owner to agree some terms which will guarantee them a return for their investment but which will still allow everyone in the market to actually use the standard in their products. These kinds of terms are often referred to as RAND or FRAND – standing for (fair), reasonable and non-discriminatory.

FRAND is a slippery term. There’s no single definition, which makes determining what is and is not FRAND hard. Most people agree that the general principle behind FRAND is that the fees or other requirements for use of the patents in question are not ridiculously high and are the same for anyone who wishes to implement the standard, whether your best friend or fiercest competitor.

That sounds like a good idea to most people, and for more traditional hardware and closed source implementations of standards it arguably is. There can be problems, however, when software under a free or open source licence wishes to implement a standard available under FRAND terms. For example, the GNU GPL family of licences all contain conditions that say – in essence – that if a distributor of the software is forced to pay for the use of a patent in the software, they must either cease distribution or obtain a licence for everyone (the schoolroom chewing gum scenario). These conditions are designed to deter patent owners from pursuing distributors of GPL software, but they mean that payable FRAND standards and GPL software do not play well together.

Even where the licence is not GPL, there can be problems with the interaction between FRAND and FOSS. One way in which patent owners make their patents available for use in a standard is by issuing a ‘non-assert’ promise. These are unilateral undertakings to not assert their patent rights, and in this context they are usually conditional on the patent being used in an implementation of the standard (not unreasonably). However in the context of open development, this can be something of a nightmare. You may write a piece of code that implements the standard and release it under a FOSS licence, confident that you are protected from patent litigation by the non-assert. An unwary downstream developer looks at your code – specifically the bit that implements the patent – and thinks: “that’s a nice bit of code – I’ll use that for my next project…” Of course, unless by some happy accident their next project is also implementing the standard then their use of the same code will not be protected by the non-assert, creating a potentially very dangerous problem.

The question of the compatibility of FRAND terms with FOSS software has become a vexed one recently due to the UK Government’s Cabinet Office seeking to create a policy around the use of open standards in government IT. The idea here is to reduce the currently crippling costs of government IT systems by opening the procurement process up to more competition. One of the perceived problems with the current situation is that there are only a few providers of solutions who can cope with the government’s massive requirements, and that the monolithic solutions they provide are often very hard to substitute once they are in place. The solution, or part of it anyway, is to break up the requirements into smaller deliverables that could be provided by more and smaller companies. How do you get these smaller solutions to work together? Use standards, preferably ‘open’ ones. That ought to create a level playing field for all sizes of providers, and alongside that make it easier to pitch FOSS solutions – with their problems with more restrictive standards and tendency to be supported by SMEs – to government.

Initially the Cabinet Office just stated that they would mandate open standards in future government procurements. Unfortunately this ran into a problem of definition. Just as with FRAND, no-one has a single, snappy definition of what an open standard actually is. It’s easy to assume – with Justice Potter Stewart – that we will know one when we see one, but in practice there are polarised views in this area. The Cabinet Office’s initial definition was not to everyone’s liking. To resolve this potential confusion, not to say conflict, the Cabinet Office launched a consultation exercise to help pin down exactly what an open standard is, according to the largest possible group of respondents. The deadline for this has since been extended after it emerged that a perception of bias might have been introduced by the conduct of the process.

Some evidence of the ructions that led to the consultation exercise can be seen in the documents columnist Glyn Moody obtained through a Freedom of Information request. I will not attempt to summarise this weighty sheaf, but I would recommend glancing through it if you want to see how lobbying of the government over IT matters looks in its naked state. At issue is the idea that – as in the Cabinet Office’s initial definition – open standards should be entirely royalty free. Now obviously ‘at no cost’ is about as low a barrier to entry as one can get, at least in monetary terms, so it’s easy to see why the Cabinet Office adopted this definition from its original home at the W3C. For one thing, it would get around the GPL-compatibility issue mentioned above, and if used instead of a non-assert, also the ‘mode of use’ problem I have cited. However it would also exclude some existing technical standards (although not many – most are already royalty free), and clearly some players are not going to be happy with that…

OSS Watch is interested in the outcome of this process because – as a non-advocacy group – we are keen that all potential solutions are able to be assessed on their merits alone. We would strongly recommend that everyone responds to the UK government consultation exercise, in order that a truly communal definition of open standards can be achieved.


Don’t keep your data under your desk

Published by Sander van der Waal on May 2, 2012

It is a well-known problem for researchers. Data is being collected for a research project and no decision has been made about how to manage the data during the project. Naturally, once you have finalised the project and start publishing the end results, you may deposit your final dataset in an institutional repository such as your university’s DSpace or EPrints repository, or you may even put it in Dryad. However, that is not sufficient to keep your data safe while you are still working on it. Often, such data ends up on a computer that just happens to lie around in the office or department, or even on the researcher’s local machine.

People that are conscious about back-up issues may be using a solution like Dropbox, SkyDrive or Google Drive, but some issues exist around data ownership and rights that may prevent you from wanting to use these services.

So what would be easier than just saving your data in a folder, as you would with tools like Dropbox, but having it backed up by the institution, version-controlled automatically and kept within the trusted boundary of your organisation, while still allowing you to optionally share the folders with your research group, or a wider group of people, whichever is appropriate?

This is what the open source tool DataStage offers you. Developed as part of the DataFlow project, it is a piece of software that will be installed at, usually, the departmental level of your institution, but it can also be hosted in a virtual ‘cloud’ infrastructure. It allows you as a user to simply map a network drive to it. You save files as normal, and everything will be handled for you. Near the end of the project, when you start publishing and want to make the datasets available to a wider public, you can push any dataset to a SWORD-compliant repository, such as the ones mentioned above or to a DataBank instance.

The beauty of an open source project like DataStage is that anyone is welcome to use the software and contribute towards its ongoing development. You can imagine there are many more use cases for a tool like this, which are unrelated to research data. Take for example the popular Raspberry Pi project. In a classroom situation where all the kids have their own little computer, they can submit their homework via DataStage to the teacher, who can centrally check everything on the main server and mark their work. This smart alternative application was highlighted by David Shotton in his presentation during the DataFlow Launch Workshop on 2 March.

Are you curious about what DataStage can do for you? Come and download our beta release to try it out and join us on the DataFlow mailing list to tell us about your experiences and what may be improved. We would love to hear from you!


This newsletter contains Creative Commons licensed photos by Flickr users CRASH:candy, tomas carrillo, ian boyd, and cindy47452.