Drupal and Composer: Part 1 — Understanding Composer
As any developer working with Drupal 8 knows, working with Composer has become an integral part of working with Drupal. This can be daunting for those without previous experience working with the command line, and can still be a confusing experience for those who have it. This is the first post in an exploratory series of blog posts I will be writing on Composer, hopefully clearing up some of the confusion around it. The four blog posts on this topic will be as follows:
- Part 1: Understanding Composer
- Part 2: Managing a Drupal 8 site with Composer
- Part 3: Converting Management of an Existing Drupal 8 Site to Composer (Coming Soon)
- Part 4: Composer for Module Developers (Coming Soon)
So without further ado, let’s get started.
Composer: What is it?
The Wikipedia page (https://en.wikipedia.org/wiki/Composer_(software)) describes Composer as follows:
That’s an accurate description, though a little wordy. So let’s break it down a little further to understand what it means.
Programmers like to use the term DRY - Don’t Repeat Yourself. This means that whenever possible, code should be re-used, rather than re-written. Traditionally, this referred to code within the codebase of a single application, but with Composer, code can now be shared between applications as well. DRY is another way of saying don’t re-invent the wheel; if someone else has already written code that does what you want to do, rather than writing code that does the same thing, it’s better to re-use the code that has already been written. For example, the current standard for authentication (aka logging in) to remote systems is the OAuth 2 protocol. This is a secure protocol that allows sites or applications to authenticate with other sites, such as Facebook, Google, Twitter, Instagram, and countless others. Writing OAuth 2 integrations is tricky, as the authentication process is somewhat complex. However, other developers have written code that handles OAuth 2 integration, and they have released this code on the internet in the form of a library. A library is basically a set of code that can be re-used by other sites. Using Composer, developers can include this library in a project, and use it to authenticate to the remote API, saving the developer from having to write that code.
Composer allows developers to do the following:
- Download and include a library into a project, with a single command
- Download and include any libraries that library is dependent upon
- Check that system requirements are met before installing the library
- Ensure there are no version conflicts between libraries
- Update the library and its dependencies with a single command
So how does Composer work?
Composer itself is a software program. After a user has installed Composer, they can then say ‘Composer: download Library A to my system’. Composer searches remote repositories for libraries. A repository is a server that provides a collection of libraries for download. When Composer finds Library A in a repository, it downloads the library, as well as any libraries that Library A is dependent upon.
In this article, the term Library is used. Libraries are also known as Packages, and referred to as such on https://getcomposer.org/
A project is the codebase, generally for a website or application, that is being managed by Composer.
By default, the main repository Composer looks at is https://packagist.org/. This is a site that has been set up specifically for Composer, and contains thousands of public libraries that developers have provided for use. When a user says ‘Composer download Library A’, the Composer program looks for Library A on https://packagist.org/, the main public Composer repository, and if it finds the Library, it downloads it to your system. If Library A depends upon (aka requires) Library B, then it will also download Library B to your system, and so on. It also checks to make sure that your system has the minimum requirements to handle both Library A and Library B and any other dependencies, and also checks if either of these packages have any conflicts with any other libraries you've installed. If any conflicts are found, Composer shows an error and will not install the libraries until the conflicts have been resolved.
While packagist.org is the default repository Composer searches, projects can also define custom repositories that Composer will search for libraries. For example, many developers use GitHub or Bitbucket, popular services that provide code storage, to store their code in the cloud. A project owner can set up Composer to look for libraries in their private GitHub, Bitbucket, or other repositories, and download libraries from these repositories. This allows for both the public and private code of a project to be managed using Composer.
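As a rough sketch of what declaring a custom repository can look like, the snippet below writes a minimal composer.json that adds a private Git repository next to the default packagist.org. The repository URL, the example-org vendor name and the library name are placeholders I made up for illustration; they are not real packages.

```shell
#!/bin/sh
# Sketch only: the URL and the "example-org" names are hypothetical placeholders.
# Work in a scratch directory so no real project is touched.
demo_dir="$(mktemp -d)"

# A "vcs" repository tells Composer to read libraries straight from a Git host.
cat > "$demo_dir/composer.json" <<'EOF'
{
    "name": "example-org/my-site",
    "repositories": [
        {
            "type": "vcs",
            "url": "git@github.com:example-org/private-library.git"
        }
    ],
    "require": {
        "example-org/private-library": "^1.0"
    }
}
EOF

# Show the resulting project definition.
cat "$demo_dir/composer.json"
```

With Composer installed, running composer config repositories.private vcs followed by the repository URL inside a project should add a comparable entry without hand-editing the file.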
What happens when I install a library?
Composer manages projects on a technical level using two files: composer.json and composer.lock. First we’ll look at the composer.json file. This file describes the project. If a developer is using private repositories, the repositories will be declared in this file. Any libraries that the project depends on are written in this file. This file can also be used to set specific folder locations into which libraries should be installed, or set up scripts that are executed as part of the Composer install process. It’s the outline of the entire project.
Each library has a name. The name is made up of two parts. The first is a namespace (Composer's own documentation calls this the vendor name), which is an arbitrary string that is often a company name or a GitHub user name. The second part is the library name. The two parts are separated by a forward slash, and are written in lower case. Drupal modules are all part of the drupal namespace. Libraries are installed using Composer’s require command. Drupal modules can be installed with commands like:
# Drupal core.
composer require drupal/core
# Drupal module.
composer require drupal/rules
# Drupal theme.
composer require drupal/bootstrap
When the above commands are run, Composer downloads the library and its dependencies, and adds the library to the composer.json file to indicate that your project uses the library. This means that composer.json is essentially a metadata file describing the codebase of your project, where to get that code, and how to assemble it.
Composer and Git, Multiple Environments and Multiple Developers
Composer and Git work really well with each other. To understand how, let’s first look at traditional site management using Git. Developer A is creating a new Drupal project, purely managed with Git:
- Developer A downloads Drupal core
- Developer A creates a new Git repository for the code they have downloaded, and commits the code to the repository
- Developer A pushes the code to a central repository (often GitHub or Bitbucket)
- Developer A sets up the production server, and checks out (aka pulls) the code to this server.
This all sounds good, and it actually works very well. Now let’s imagine that Developer B comes onto the project. Developer B uses Git to download the code from the central repository. At this point, the codebase in Git exists in four locations:
- Developer A’s computer
- Developer B’s computer
- The central repository
- The production server
At the moment, the codebase only consists of Drupal core. The Drupal core code is being managed through Git, which would allow for changes to be tracked in the code, yet it’s very unlikely that either Developer A or Developer B, or indeed any other developers that come on the project, will actually ever edit any of these Drupal core files, as it is a bad practice to edit Drupal core. Drupal core only needs to be tracked by developers who are developing Drupal core, not by projects that are simply using it. So the above setup results in sharing and tracking a bunch of code that is already shared and tracked somewhere else (on Drupal.org).
Let’s look at how to start and use Composer to manage a project. Note that this is NOT the best way to use Composer to manage a Drupal site, and is simply an example to show how to use Composer (see part 2 of this series for specifics on how to use Composer to manage a Drupal site).
- Developer A creates a new project folder and navigates into it.
- Developer A initializes the project with composer init, which creates a composer.json file in the project folder
- Developer A adds the Drupal repository at https://packages.drupal.org/8 to composer.json, so that Drupal core, modules and themes can be installed using Composer
- Developer A runs composer require drupal/core, which installs Drupal core to the system, as well as any dependencies. It also creates composer.lock (which we'll look at further down the article)
- Developer A creates a new Git repository, and adds composer.json and composer.lock to the Git repository
- Developer A pushes composer.json and composer.lock to the central repository
- Developer A sets up the production server, and checks out the code to this server. At this point, the code consists only of the composer.json and composer.lock files. Additional servers can be set up by checking out the code to any server.
- Developer A runs composer install on the production server. This pulls all the requirements and dependencies for the project as they are defined in composer.json, at the versions recorded in composer.lock
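The steps above can be sketched as two small shell functions. This is only an illustration under a few assumptions: Composer and Git are already installed, the project name example/my-site and the remote URLs in the usage comments are placeholders, and keeping the vendor folder out of Git via .gitignore is my addition to match the idea that only the two Composer files are tracked.

```shell
#!/bin/sh
# Sketch of the steps above. "example/my-site" and the remote URLs in the
# usage comments are hypothetical; Composer and Git must already be installed.

# Developer A: create the project and push only the Composer files to Git.
start_project() (
  set -e    # stop at the first failing step
  mkdir -p "$1"
  cd "$1"
  composer init --no-interaction --name="example/my-site"
  # Register the Drupal repository so drupal/* libraries can be found.
  composer config repositories.drupal composer https://packages.drupal.org/8
  # Downloads Drupal core and writes composer.json and composer.lock.
  composer require drupal/core
  # Assumption: downloaded libraries live in vendor/ and stay out of Git.
  printf '/vendor/\n' > .gitignore
  git init
  git add composer.json composer.lock .gitignore
  git commit -m "Add Composer project definition"
  git remote add origin "$2"
  git push -u origin HEAD
)

# Production server or Developer B: fetch the files, then build the codebase.
install_project() (
  set -e
  git clone "$1" "$2"
  cd "$2"
  # Installs the versions recorded in composer.lock.
  composer install
)

# Example calls (not run here):
#   start_project my-site git@example.com:example/my-site.git
#   install_project git@example.com:example/my-site.git my-site
```

Each function body runs in a subshell, so the cd inside it does not change the directory of the shell that calls it.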
Now when Developer B comes on the project, Developer B uses Git to download the codebase to their local computer. This codebase contains only composer.json and composer.lock. However, when they run composer install they will end up with the exact same codebase as the production server and on Developer A’s machine.
Now the codebase exists in the same four locations, however the only code being tracked in the Git repository is the two files used to define the Composer managed project. When an update is made to the project, it is handled by running composer update drupal/core, which updates composer.lock (composer.json only changes if the required version constraint itself changes). The updated files are then committed to the Git repository, as they are the files specific to our project.
The difference between the traditional Git method, and the above method using Composer, is that now Drupal core is considered to be an external library, and is not taking up space unnecessarily in our project's Git repository.
Projects can, and pretty much always do, have versions. Drupal 8 uses semantic versioning, meaning that it goes through versions 8.1, 8.2, 8.3… and so on. At the time of writing the current version is 8.6.3. If a new security fix is released, it will be 8.6.4. In time, 8.7.0 will be released. Composer allows us to work with different versions of libraries. This is a good thing, however it opens up the risk of developers on a project working with different versions of a library, which in turn opens up the possibility of bugs. Composer fortunately is built to deal with versions, as we will look at next.
Tracking Project Versions
So how does Composer handle versions, allowing developers to ensure they are always using the same library versions? Enter the composer.lock file. The composer.lock file essentially acts as a snapshot of all the versions of all the libraries managed by composer.json. Again, I’ll refer back to the Composer managed site described above. When we first run composer require drupal/core in our project, a few things happen:
- The current (most recent) version of Drupal is downloaded to the system
- All libraries that Drupal depends on are also downloaded to the system
- composer.json is updated to show that Drupal is now a dependency of your project
- composer.lock is created/updated to reflect the current versions of all Composer managed libraries
So composer.json tracks which libraries are used, and composer.lock is a snapshot tracking which versions of those libraries are currently being used on the project.
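To make that split a little more concrete, here is a small sketch that writes cut-down versions of both files and pulls out the relevant value from each. The fragments are illustrative only: real composer.json and composer.lock files contain many more fields, and the ^8.5 constraint is an assumed example value rather than something taken from a real project.

```shell
#!/bin/sh
# Illustrative fragments only: real composer.json and composer.lock files
# contain more fields. The "^8.5" constraint is an assumed example value.
demo_dir="$(mktemp -d)"

# composer.json: WHICH libraries the project uses, as a version constraint.
cat > "$demo_dir/composer.json" <<'EOF'
{
    "require": {
        "drupal/core": "^8.5"
    }
}
EOF

# composer.lock: the EXACT version that was resolved and installed.
cat > "$demo_dir/composer.lock" <<'EOF'
{
    "packages": [
        {
            "name": "drupal/core",
            "version": "8.5.6"
        }
    ]
}
EOF

# Pull the two values out to compare them side by side.
constraint="$(sed -n 's/.*"drupal\/core": "\(.*\)".*/\1/p' "$demo_dir/composer.json")"
locked="$(sed -n 's/.*"version": "\(.*\)".*/\1/p' "$demo_dir/composer.lock")"
echo "composer.json allows: $constraint"
echo "composer.lock pins:   $locked"
```

The constraint describes a range of acceptable versions, while the lock entry names the single version every developer and server should end up with.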
Synchronizing Project Versions
The problem with developers using different versions of libraries is that one developer may write code that only works with the version of a library they have installed, while other developers are still on an older version or have already moved to a newer one. Composer projects manage library versions using the commands composer install and composer update. These commands do different things, so next we'll look at the differences between them.
Composer Install and Composer Update
Imagine that Composer didn’t track versions. The following situation would happen (again, this is NOT how it actually works):
- Drupal 8.5.6 is released.
- Developer A creates a new project, and sets Drupal core as dependency in composer.json. Developer A has Drupal 8.5.6
- Drupal 8.6.0 is released
- Developer B clones the Git project, and installs the codebase using composer install. Composer downloads Drupal core. Developer B has Drupal 8.6.0
The two developers are now working on different versions of Drupal. This is dangerous, as any code they write/add may not be compatible with each other's code. Fortunately Composer can track library versions. When a user runs composer install, the versions defined in composer.lock are installed. So when Developer B runs composer install, Drupal 8.5.6 is installed, even though Drupal 8.6.0 has been released, because 8.5.6 is listed in composer.lock as the version being used by the project. As such, developers working on Composer managed projects should run composer install each time they pull updates from remote Git repositories containing Composer managed projects.
As has been discussed, the composer.lock file tracks the versions of libraries currently used on the project. This is where the composer update command comes in. Let’s review how to manage version changes for a given library (this is how it actually works):
- Drupal 8.5.6 is released.
- Developer A creates a new project, and sets Drupal core as dependency. The composer.lock file records the version of Drupal core used by the project as 8.5.6.
- Drupal 8.6.0 is released
- Developer B clones the Git project, and installs the codebase using composer install. The composer.lock file lists the version of Drupal core being used on the project as 8.5.6, so it downloads that version.
- Developer A sees that a new version of Drupal has been released. Developer A runs composer update drupal/core. Composer installs Drupal 8.6.0 to their system, and updates composer.lock to show the version of Drupal core in use as 8.6.0.
- Developer A commits this updated composer.lock to Git, and pushes it to the remote repository.
- Developer B pulls the Git repository, and gets the updated composer.lock file. Developer B then runs composer install, and since the version of Drupal core registered as being used is now 8.6.0, Composer updates the code to Drupal 8.6.0.
Now Developer A and Developer B both have the exact same versions of Drupal on their system. And still the only files managed by Git at this point are composer.json and composer.lock.
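The round trip described above might look roughly like this on the command line. It is a sketch, assuming Composer and Git are installed and that the project has a Git remote named origin. The --with-dependencies flag is my addition: it lets Composer also update the libraries that drupal/core itself depends on.

```shell
#!/bin/sh
# Sketch of the update round trip described above. Assumes Composer and Git
# are installed and the project has a remote named "origin" (an assumption).

# Developer A: move to the new release and share the new snapshot.
publish_core_update() (
  set -e
  # Rewrites composer.lock with the newly resolved versions.
  composer update drupal/core --with-dependencies
  git add composer.lock
  git commit -m "Update drupal/core"
  git push origin HEAD
)

# Developer B: fetch the new snapshot and bring local code in line with it.
sync_with_lock() (
  set -e
  git pull
  # Installs exactly what the pulled composer.lock lists.
  composer install
)

# Example calls (not run here), each from inside the project folder:
#   publish_core_update
#   sync_with_lock
```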
Tying it all together
Developers should always run composer install any time they see that a commit has made changes to the composer.lock file, to ensure that they are on the same codebase as all other developers. Developers should also always run composer install any time they switch Git branches, such as between a production and a staging branch. The dependencies of these branches may be very different, and running composer install will update all dependencies to match the current composer.lock snapshot. The composer update command should only be used to update to new versions of libraries, and the composer.lock file should always be committed after running composer update. Finally, any time a developer adds a new dependency to the project, they need to commit both the composer.json file and the composer.lock file to Git.
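One way to avoid forgetting the composer install step is to let a small script check whether composer.lock was touched by the last pull. The sketch below is one possible approach, not something Composer provides: lock_changed is a helper name I made up, and the script relies on Git's ORIG_HEAD reference, which Git sets when a pull or merge moves the current branch.

```shell
#!/bin/sh
# Sketch: decide whether "composer install" is needed after pulling.
# lock_changed is a hypothetical helper name, not part of Composer or Git.

# Reads changed file paths on stdin; succeeds if composer.lock is among them.
lock_changed() {
  grep -qxF 'composer.lock'
}

# Files changed by the last pull/merge. Empty when there is nothing to compare,
# for example outside a Git repository or before the first pull.
changed_files="$(git diff --name-only ORIG_HEAD HEAD 2>/dev/null || true)"

if printf '%s\n' "$changed_files" | lock_changed; then
  echo "composer.lock changed: run composer install"
else
  echo "composer.lock unchanged"
fi
```

Saved as .git/hooks/post-merge and made executable, a script like this would run after each successful git pull. A similar check in a post-checkout hook could cover switching branches.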
Before moving on to the next blog post in this series, you should understand the following:
- What the composer.json file does
- What the composer.lock file does
- When to use composer install
- When to use composer update
- How Git and Composer interact with each other
In Part 2, we'll look specifically at building and managing a Drupal project using Composer.