Given how much we use open source software, it's in our interest to more actively support it. We therefore made the decision to open-source some of our own libraries. This article discusses some of the choices and challenges that we faced on this journey.
What to Open Source
Over the years, we’ve extracted commonly used functionality into a number of libraries. These are mainly .NET libraries which we distribute as NuGet packages via an internal company NuGet feed. Choosing which libraries were suitable to be published as open source was the first decision to make, and there were a few things we had to think about:
- Usefulness – the libraries had to be useful to the wider development community, so nothing too specific to our own use cases
- Uniqueness – libraries solving a problem that isn’t already solved by someone else were to be preferred
- Client Confidentiality – we wouldn’t publish anything that would expose client IP
- Dependencies – ideally, we wanted the libraries to be standalone with no dependencies so that other developers could use and contribute to the code as easily as possible
We initially chose three libraries that fit the above criteria, although we may publish more in future:
- Random contains a number of extensions to the .NET Random class to generate randomized data, such as random dates or characters
- Typescript generates TypeScript classes and interfaces from C# code, solving the age-old problem of keeping server-side and client-side models in sync
- Locality is a C# library to interact with the Geonames API
Benefits of an Open Source Presence
There are numerous benefits to publishing some of our own code as open source libraries, some of which are discussed below. The process was going to take time, and the benefits had to be examined to make sure that the time investment would be worth it.
Giving Back to the Community
The open source ecosystem is vast, and developers now expect to find a project for almost anything they need. Even so, we believed we had libraries that could fill some of the gaps and prove useful to other developers and organisations. Publishing them allowed us to give back to the community.
There is a saying in software (Linus's law): “given enough eyeballs, all bugs are shallow”. The implication is that open-sourcing software improves its quality by increasing the number of people who can potentially find issues. Once our code is published, external contributors can find bugs, provide feedback and even add features.
It’s a Technical Challenge
Let’s face it, developers love a challenge! The various technical challenges involved in open-sourcing software were exciting to us as they were an opportunity to expand our knowledge in some fairly esoteric areas such as code signing.
The Challenges we Faced
Getting to the point of being able to publish the libraries certainly wasn’t straightforward. Throughout this process, we faced a number of challenges and had to make many difficult decisions, some of which were expected and some of which caught us by surprise. We hope that by sharing the hurdles that we overcame, we will be able to help other organisations who are embarking on the same journey.
Open source software is generally issued under a license such as GPL, MIT or Apache 2.0. We chose the MIT license as it is popular and widely used by other projects that we rely on, such as .NET Core. It’s also fairly permissive in nature; for example, it allows commercial use, which is important if the libraries are ever to be used in commercial projects by other organisations.
.NET libraries are distributed using the NuGet package management system, so we published our open source libraries as NuGet packages. To prevent people from releasing packages under our name, we decided to sign our packages using a code signing key. NuGet has functionality for doing this out-of-the-box, but it wasn't straightforward. The main issue we found is that the ‘dotnet pack’ command does not (at the time of writing, at least) support signing packages, leading us to use NuGet.exe instead, after ensuring that the latest version of NuGet was installed on the build server. The ‘sign’ command is not very well documented, so it took some trial-and-error to arrive at the correct arguments:
sign "$(Build.ArtifactStagingDirectory)\*.nupkg" -CertificatePath "$(certificate.secureFilePath)" -CertificatePassword "$(CertificatePassword)"
.NET provides mechanisms to strong-name and digitally sign assemblies (this blog post describes the difference between them). Our primary reason for strong-naming the packaged assemblies was that strong-named assemblies can only reference other strong-named assemblies, so without it anyone strong-naming their own assemblies would not have been able to use our packages. This is likely to include a lot of enterprises and anyone distributing code to third parties, so skipping this step would have significantly reduced our potential audience.
Digitally signing an assembly allows consumers to confirm that the assembly did originate from the claimed publisher (e.g. Audacia) and that it hasn’t been tampered with. Without signing, a malicious third party could distribute an altered assembly in our name and consumers would have no way to determine that this had happened. There is a clear reputational risk here, so we decided that any publicly distributed assembly must be signed.
When setting up CI/CD pipelines for the open source repositories, we had to be careful not to allow people to maliciously edit steps in our build scripts. Azure DevOps protects against such attacks by building a pull request only once a member of the Audacia organisation has approved it.
We also had problems when building pull requests raised by non-contributors from a fork of a repository. The build failures were specifically failures to sign the assembly, so we disabled signing for builds from forks. This was safe to do as the pull request builds do not publish any packages and are purely for validation purposes, i.e. to ensure that pull requests don’t break the build. To publish a package, a build has to be started by a member of our Azure DevOps organisation.
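One way to express the fork rule in a build script is to check the predefined Azure Pipelines variable System.PullRequest.IsFork, which scripts see as the environment variable SYSTEM_PULLREQUEST_ISFORK. The snippet below is a minimal sketch under that assumption, not the exact pipeline configuration described above; the helper name should_sign and the messages it prints are hypothetical.

```shell
#!/bin/sh
# Sketch only: decide whether the current build should run its signing step.
# Azure Pipelines sets System.PullRequest.IsFork to "True" for pull requests
# raised from a fork; scripts read it as SYSTEM_PULLREQUEST_ISFORK.
# should_sign is a hypothetical helper name, not part of any tool.
should_sign() {
  case "${SYSTEM_PULLREQUEST_ISFORK:-False}" in
    [Tt]rue) return 1 ;;  # build of a pull request from a fork: skip signing
    *)       return 0 ;;  # any other build (or variable unset): sign as normal
  esac
}

if should_sign; then
  echo "Signing enabled for this build"
else
  echo "Fork pull request detected: skipping signing"
fi
```

A signing step would then only run when should_sign succeeds, leaving fork builds as validation-only builds that never produce a signed, publishable package.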
We included each repository’s commit history when we uploaded the code to GitHub, so that future contributors can see the full context behind changes that have been made. We had to be careful that there was nothing in our commit history that compromised security, such as hard-coded credentials, or that risked reputational damage, such as inappropriate commit messages. Staff security was also a priority; all contributors to the libraries were consulted prior to publication, to ensure that they were happy to have their names published in the public commit history.
Every developer writes utility programs and helper libraries to assist with common tasks and reuse code. Often engineering teams will consolidate these programs as shared packages and distribute them for all developers to benefit from. Most teams might stop there, effectively gatekeeping useful code that could be shared with the rest of the community.
Publishing some of our source code publicly has been a way for us to give something back to the open source community. We would encourage other companies to think about establishing an open source presence, as it is something from which we can all benefit. Now that we have an established process for open-sourcing a library, we will be continuing this journey and moving more libraries out into the open source world.