A Version Control System (VCS) tracks changes made to a digital asset over time. By digital asset, we mean a file, image, video or any other blob of data on a computer. Any digital entity that changes over time can be tracked using a Version Control System. This tutorial is targeted at software engineers, so we will use code files as examples of the digital assets we want to track. Keep in mind, however, that a Version Control System can track changes to any type of digital asset. If you want to keep track of all your documents over time, you can do it with a VCS.
Anyone who has written code has probably kept different versions of their files at some point. Whether it's adding documentation to your code, rearranging methods, or adding or removing classes or resources, we have all maintained several versions. Most of the time, these versions are maintained by giving each file a suitable name, as shown below:
- GetCustomer_1.java
- GetCustomer_14Mar.java
- GetCustomer_Final.java
Naturally, you must have wished for an organized way of storing these versions so that you can see the progressive changes made to the code. Additionally, if you work in a team with multiple people working on different features, maintaining versions of all features quickly turns into a nightmare. This is just the tip of the iceberg of problems you would face without a Version Control System. Without one, teams would waste a good amount of time fixing code management issues rather than coding. Some of the problems we can face in code management are listed below.
Challenges in Code Management
1. Maintaining multiple copies of the same code files based on changes done on different dates - Let us take the example of a hypothetical feature to get customer details. While working on this feature, one would save copies of GetCustomer.java under different names to track its versions. For example, code completed at the end of 15 Feb would be named GetCustomer_15Feb.java, followed by GetCustomer_16Feb.java and so on. If you want to save the code at multiple checkpoints (temporary stable versions of the code) in a day, the naming convention for the file becomes even more complicated.
2. Track changes and code reusability - Suppose you're working in a team of 6 members writing code for a particular feature. The team has to ensure that there is no duplicate or ambiguous code, that members reuse code already developed by their teammates, and that everyone uses the latest version of the code developed by others. Doing this by hand is an excruciating task. Every day the team will spend a lot of time merging code to create the latest stable version, and the task becomes even more daunting if team members work in different locations. To add to the complexity, someone has to manually keep track of who updated which part of the code.
3. Trace the file's history to find the version of the code which introduced the bug - Code undergoes regular changes either during development or maintenance phases. It's quite possible to inadvertently change something that causes the code to misbehave and not work as expected. In such situations, it'd be helpful to view the timeline of changes in code to pinpoint the exact cause of the bug. At this point shuffling through a lot of old code files to find out the culprit change becomes very hard.
4. Maintain back-up of your files - If your hard disk gets corrupted or infected by a malicious virus, the working copy of your code is highly vulnerable. Without a proper backup of your code, you are very likely to lose all of your work.
5. Comment out blocks of code to disable a functionality without deleting it - When the code is still in development, you might write a small sample piece of code to see how a particular method behaves. Irrespective of whether this experimental block of code tests your feature or not, you'd like to retain it for reference. It's not feasible to maintain multiple versions of files by commenting and un-commenting different blocks of code.
These issues eventually eat into the productivity of the team. Team members spend a lot of time managing the code instead of writing it, and as more code is added, this overhead grows until the project's progress slows down. Version Control Systems were built to avoid exactly these problems. Let us see what a Version Control System is and how it can help us.
What is a Version Control System?
You can think of a Version Control System (aka VCS) as a kind of database. It lets you save a snapshot of the complete project at any point in time. Every change made to the project files is tracked, along with:
- Who made the change
- Why the change was made (references to problems fixed or features added in the change)
Later, when you need to look at an older snapshot/version, the VCS shows exactly how it differs from the previous one.
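To make the "database of snapshots" idea concrete, here is a minimal sketch in Python. It is not how Git or any real VCS is implemented; the class names `Snapshot` and `TinyVcs`, the author names, and the commit messages are all hypothetical and chosen only for illustration. Each snapshot stores the project files together with who made the change and why, and two versions of a file can be compared using the standard `difflib` module.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """One saved state of the whole project (hypothetical structure)."""
    files: dict          # file name -> file content
    author: str          # who made the change
    message: str         # why the change was made
    timestamp: datetime  # when the snapshot was saved


@dataclass
class TinyVcs:
    """Toy in-memory history, for illustration only."""
    history: list = field(default_factory=list)

    def commit(self, files, author, message):
        # Copy the dict so later edits to the working files
        # cannot alter a snapshot that is already stored.
        snap = Snapshot(dict(files), author, message, datetime.now(timezone.utc))
        self.history.append(snap)
        return len(self.history) - 1  # version number of the new snapshot

    def diff(self, old, new, name):
        """Return unified-diff lines for one file between two versions."""
        a = self.history[old].files.get(name, "").splitlines()
        b = self.history[new].files.get(name, "").splitlines()
        return list(
            difflib.unified_diff(a, b, f"{name}@v{old}", f"{name}@v{new}", lineterm="")
        )


vcs = TinyVcs()
v0 = vcs.commit({"GetCustomer.java": "class GetCustomer {}"},
                "asha", "Add GetCustomer skeleton")
v1 = vcs.commit({"GetCustomer.java": "class GetCustomer { int id; }"},
                "ravi", "Add customer id field")

changes = vcs.diff(v0, v1, "GetCustomer.java")
print("\n".join(changes))
```

Real tools store history far more efficiently and persist it to disk, but the shape is the same: every version carries its content, its author and its reason, so there is no need for file names like GetCustomer_Final.java.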
Version Control Systems (VCS) are tools that help teams manage and track changes in code over time.
Note: Git is the most popular Version Control System.
With reference to the image above, on 24 Feb, 17 your project has a new file added to it. This file could be a source-code file, a properties file, an image or any other type of file. Once your project is tracked by a VCS, any addition, edit or deletion of files in your project is tracked and maintained. In short, every time a change to the project is saved, the VCS creates and stores a snapshot in the form of a version.
Note: Snapshot is the entire state of your project at a given point of time.
Version Control System is also popularly known as:
- Revision Control System
- Configuration Management System
- Source Control Management System
Despite the different names, these tools all help in maintaining and tracking changes in your project over time. Consequently, teams can focus on business logic and improve their productivity. Using a VCS has become a standard professional practice that modern software teams follow to manage their code.
Next, let's skim through the primary features of a VCS and how they help in overcoming the challenges listed above:
Features of VCS
- Maintain independent branches of code for team members to track their respective changes
- Ease of comparison of code across different branches
- Merging of code from several branches of multiple team members
- The convenience of tracing changes in code to find the version that introduced a bug
- Annotate changes with the name of author and message in a particular version of the code. This helps to easily identify the intent of the change
- Simple comparison across versions to help resolve conflicts in code during merging
- Revert changes made to the code to any state from its history.
These features demonstrate the flexibility a VCS provides for code management. Consequently, VCS has become an integral part of the workflow of software teams. Don't fret if a lot of the terms in this list of features don't make sense to you right now. A detailed explanation of these terms will follow in upcoming tutorials. Learning about VCS is an essential step towards becoming an effective and efficient software engineer.
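As an illustration of one feature from the list above, tracing changes to find the version that introduced a bug, the sketch below searches a stored history with a binary search. This is the general idea behind commands such as `git bisect`, not a description of how any particular tool implements it. The function name `first_bad_version`, the sample history and the `is_good` check are hypothetical, and the sketch assumes the first version is good, the last is bad, and the bug stays in every version after it appears.

```python
def first_bad_version(versions, is_good):
    """Return the index of the first version for which is_good() is False.

    Assumes versions[0] is good, versions[-1] is bad, and that once the
    bug appears it remains in every later version.
    """
    lo, hi = 0, len(versions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(versions[mid]):
            lo = mid   # bug was introduced after mid
        else:
            hi = mid   # bug was introduced at mid or earlier
    return hi


# Hypothetical history: each entry is the discount rate used by that version.
history = [0.10, 0.10, 0.10, 1.10, 1.10, 1.10, 1.10]


def is_good(rate):
    # A valid discount rate lies between 0 and 1.
    return 0 <= rate <= 1


culprit = first_bad_version(history, is_good)
print(f"Bug introduced in version {culprit}")
```

Because every version is stored and numbered, only a handful of versions need to be checked, rather than every old copy of the file.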
Information nugget: Git is a widely popular Version Control System developed by Linus Torvalds. More specifically, Git is a Distributed VCS.
Top Version Control Systems
There are many Version Control Systems available, and some of the most widely used are:
- Git
- CVS
- SVN
- Assembla
- Mercurial
- Bazaar
In the next tutorial, let's take a look at the different types of VCS and how they evolved.