Git – VSO (Onboarding/Migration from TFS)

Overview

Before we start with Git (in VSO), let’s revisit the difference between a centralized VCS (version control system) and a distributed VCS. With a centralized VCS, repositories are stored on a central server and developers check out a working copy, while with a distributed VCS, each developer maintains a copy of the entire repository with its history. Carrying this forward, TFS stores the entire repository along with its history on a central server, while with Git, developers clone the entire repository, including its full history, to their own machines. So, as we can see, the version control paradigm itself is different when it comes to Git.

Advantages of moving to Git-

  • Branching and merging – Branching and merging are a breeze in Git. Every time a developer has to test something on the current branch, all they have to do is create a new branch in their own local repo, without touching the main branch’s code. (Since everybody maintains the entire source base, VSO acts as the reference copy, which can be thought of as the untouched code base.)
  • Offline repo access – Every change made, every check-in, and the entire change history are available offline. One never has to be online to retrieve the history of any check-in (think: relief from the VPN connection).
  • Saving your team from build breaks – You make every change on your own branch, which removes the pressure of a rushed check-in and keeps potentially unstable changes out of your team’s reference repo.
  • You commit constantly, thus saving your work every time you code – As most developers code on their own local branch (without affecting the central repo), you can always check in your code to your own branch. This prevents the loss of changes that can happen when doing a blind “get latest and overwrite”, especially during peak periods of important code changes.

Disadvantages of moving to Git-

  • Initial learning curve – The first and most complained-about disadvantage of using Git is the initial learning curve.
  • Code duplication – Everyone maintains a full copy of the code base, which leads to redundant copies of the source code. But this can also be viewed as a boon, as no one has to depend on the central server for the code base. Anyone can clone the Git repo from another developer’s machine without going through the Internet.

Before we begin

First off, this document relies heavily on command line tools. So, be warned.

The tools that might come in handy.

  • Git (obviously)
  • PoshGit (for a rich PowerShell integration)

Other things to do/keep in mind:

  • While installing Git, use the option to integrate with Windows command prompt and also let Git be added to the PATH variable.
  • Enable Alternate Credentials in VSO so that your PowerShell/Cmd commands can talk directly to VSO. You can do that by opening your VSO account and following the steps shown in the pictures below.
    [Images: Profile, User Profile]
    You can also set a shorthand username if you want and then use it whenever prompted at the PowerShell/Cmd prompt.

Creating a Git repo

With this small introductory difference, let’s start with the basics in Git.

There are two things that can happen when it comes to project tracking in Git.

  1. You are starting the project. So, you basically need to import your project into Git.
  2. You are cloning an existing Git project from another Git server.
  • Let’s start with the first case. If you are starting a new project and you want Git to track it, you first have to set up your project location for Git. This is done with a very simple command:
    git init
    The above command creates a new hidden subdirectory, .git, which contains all the necessary files, including the repo skeleton. Keep in mind that initializing the folder has not added any of your files to tracking; only the folder has been set up for Git.
    If you use PoshGit (just launch PowerShell in the current folder and run git init, or open PowerShell in an existing Git folder, and PoshGit picks it up automatically), you will see a count in red letters beside the branch name master. This count is the number of files/folders that are not yet tracked by Git. We can now add all these files to Git tracking with the following command:

git add .   # Stages all the files and folders under the current folder. You can also stage an individual file by giving its name instead.

Let’s commit these changes for an initial commit and our first git project is ready.

git commit -m "First commit"

Note: There will be cases where you want some files to exist locally but don’t want them checked in. The prime examples are the bin and obj folders. You can exclude them by listing their paths in a .gitignore file, which you create in the root folder of your repository if it does not already exist.
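
The first case above can be sketched end to end as follows. This is a minimal illustration rather than a prescribed setup: the temporary folder, the file names, and the demo identity are hypothetical, and the .gitignore is written before anything is staged so that bin and obj never get tracked.

```shell
set -e
# Hypothetical throwaway project used only for this illustration.
project=$(mktemp -d)
cd "$project"
mkdir src bin obj
echo 'class Program {}' > src/Program.cs
echo 'binary output' > bin/app.exe
echo 'intermediate output' > obj/app.obj

git init -q
# Local identity so the commit works on a machine with no global Git config.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Ignore the build output folders before staging anything.
printf 'bin/\nobj/\n' > .gitignore

git add .
git commit -q -m "First commit"

# List what Git is actually tracking.
tracked=$(git ls-files)
echo "$tracked"
```

The listing should contain .gitignore and src/Program.cs but nothing under bin or obj.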

  • Now, let’s look at the second case, which is the more common one. If you need to get an existing Git repo so that you can start contributing, you can do that with a simple command:

git clone <VSO_URL> FolderName

The above command creates a directory FolderName, initializes a .git directory inside it, and checks out a working copy of the latest version of the main branch, with its full history available locally. Cloning in Git can be equated with creating a workspace and doing a “Get latest” in TFS.

Note: Checkout in TFS is not the same as checkout in Git. In TFS, checking out a file opens it for editing. In Git, “git checkout BranchName” switches the working copy to the latest commit of that branch, while “git checkout FileName” restores that file to its last committed state. So, if we check out a file that has uncommitted edits, we lose those edits; checking out the branch we are already on leaves them in place.
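
A small sketch of the two Git checkout behaviours, run in a throwaway repository. The file name notes.txt and the demo identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "First commit"

# Name of the branch we are on (master or main, depending on Git settings).
branch=$(git symbolic-ref --short HEAD)

# Edit the file without committing.
echo "edited" > notes.txt

# Checking out the branch we are already on keeps the uncommitted edit.
git checkout -q "$branch"
after_branch_checkout=$(cat notes.txt)

# Checking out the file restores it to the last commit and discards the edit.
git checkout -- notes.txt
after_file_checkout=$(cat notes.txt)

echo "after branch checkout: $after_branch_checkout"
echo "after file checkout:   $after_file_checkout"
```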

Migrating from TFS to Git in VSO

The simplest way to migrate from TFS with the full commit history is to use the Codeplex project git-tf, which depends on the JRE. With git-tf installed, you can migrate your commits to Git with the following command:

git tf clone http://TFSPath:8080/tfs/CollectionName $/Project/Main FolderName --deep

This will clone the TFS project with its entire commit history into the folder FolderName. A sample output is shown below.

[Image: sample git-tf clone output]

The problem we face now is that the migrated check-ins carry the users’ domain aliases instead of the email addresses that Git uses to identify users. If we run “git log” to check the commits, we see something like this:

[Image: git log output showing domain aliases]

As is evident from the above image, we have the domain aliases instead of the emails. We need to rewrite the history of these commits to reflect the email addresses.

Git provides the filter-branch command to rewrite history, which makes it a very powerful tool. Using it on an ongoing project is considered bad practice, because it changes commits that other developers may already have, so use it with caution and only in rare emergencies.

We are going to use filter-branch with a small shell script to rewrite the Git history and map the users to their commits. The following script replaces an alias seen in the above git log with the corresponding email address:

git filter-branch -f --env-filter '
# Note the two hyphens before env-filter. "alpha" is the assumed alias under the blurred line above.
# Inside double quotes, \\ yields one literal backslash, so this matches FAREAST\alpha.
ALIAS="FAREAST\\alpha"
CORRECT_EMAIL="alpha@microbeta.com"
if [ "$GIT_COMMITTER_EMAIL" = "$ALIAS" ]
then
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$ALIAS" ]
then
    export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi' -- --all

You can find the list of all the aliases that need updating with the following command:

git log --format='%aE' | sort -u
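
To see the whole rewrite in one place, here is a sketch that fabricates a one-commit repository whose author and committer email is the alias FAREAST\alpha, lists the aliases, rewrites them, and lists them again. The repository and the demo identity are hypothetical; the alias and the target address are the ones assumed in the script above. A case statement is used instead of if blocks only because it extends more easily to several aliases. FILTER_BRANCH_SQUELCH_WARNING silences the advisory that recent Git versions print before filter-branch runs.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > a.txt
git add a.txt
# Simulate a migrated commit that carries a domain alias instead of an email.
GIT_AUTHOR_EMAIL='FAREAST\alpha' GIT_COMMITTER_EMAIL='FAREAST\alpha' \
    git commit -q -m "Imported changeset"

before=$(git log --format='%aE' | sort -u)

FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --env-filter '
case "$GIT_AUTHOR_EMAIL" in
    "FAREAST\\alpha") export GIT_AUTHOR_EMAIL="alpha@microbeta.com" ;;
esac
case "$GIT_COMMITTER_EMAIL" in
    "FAREAST\\alpha") export GIT_COMMITTER_EMAIL="alpha@microbeta.com" ;;
esac
' -- --all >/dev/null 2>&1

after=$(git log --format='%aE' | sort -u)
echo "before: $before"
echo "after:  $after"
```

Adding one more pattern line per alias inside each case block would cover a whole team in a single pass.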

Difference between local branch and remote/origin branch

One of the core fundamentals where Git and TFS differ is their branching mechanism. As Git is a distributed VCS, the notion of branching exists in every developer’s repo. A developer can therefore create a branch on their own local system without touching the central reference server, even when working against VSO. So, the branches on a developer’s machine can look like this:

[Image: Local repo]

Even when the remote reference repo looks like this:

[Image: Remote repo]

As can be seen from the two pictures above, the local repo has an extra branch, Harsh2, which doesn’t exist on the remote reference repo in VSO. In TFS, by contrast, branching happens on the central TFS server, and we just map our workspaces and check out files for editing.
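
The same situation can be reproduced locally. In this sketch a bare repository in a temporary folder stands in for the VSO remote (an assumption made so the example runs offline); the folder names and the demo identity are hypothetical, and Harsh2 is the branch name from the pictures.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A local bare repository plays the role of the central reference repo.
git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/main

git clone -q server.git dev 2>/dev/null
cd dev
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo hello > readme.txt
git add readme.txt
git commit -q -m "First commit"
git push -q -u origin main

# Create a branch locally; nothing is sent to the server.
git branch Harsh2

local_branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
server_branches=$(git --git-dir=../server.git for-each-ref --format='%(refname:short)' refs/heads)

echo "local branches:"
echo "$local_branches"
echo "server branches:"
echo "$server_branches"
```

The local list contains both Harsh2 and main, while the server list contains only main, until Harsh2 is explicitly pushed.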

Branching and Merging

No introduction to Git is good enough without actually explaining how Git manages branching and merging. Branching in Git is as simple as

git branch BranchName

You can, then, switch to this branch by

git checkout BranchName

Or simply, you can use a shorthand

git checkout -b BranchName

to create the branch BranchName and switch to it in one step.

The branching mechanism is so simple that developers branch out from the feature branch even for a bug fix, so that they don’t end up checking in faulty code by mistake. Once done with the required changes, they switch back to the feature branch, do a merge, and then delete the test branch if they see fit.

Let’s take this in more detail with a use case.

Your team Alpha is working on project Omega which uses Git as VCS. A bug suddenly came up in the main branch which needs your immediate attention. How will you go about it?

  • Stage and commit all the changes that you were working on (in your local branch).
    • Command: git add .
      git commit -m "Some relevant message"
  • Switch to the main branch (say, main).
    • Command: git checkout main
  • Create a new branch from the main branch for the hotfix and switch to that branch.
    • Command: git checkout -b hotfix
  • Work out all your changes for the fix, then stage and commit them.
    • Command: git add .
      git commit -m "Hotfix for the bug abcd"
  • Switch back to the main branch and merge.
    • Command: git checkout main
      git merge hotfix
  • Now that you are done with the fix and the merge, you can delete the hotfix branch, switch back to your local dev branch, and carry on with your work.
    • Command: git branch -d hotfix   # -d is the delete option

As can be seen above, branching and merging is obviously a breeze in Git.
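
For reference, the hotfix steps can be run start to finish in a scratch repository. This is only a sketch: the repository, the file names, the feature branch name, and the demo identity are invented for the illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Your ongoing work lives on a local branch.
git checkout -q -b feature
echo "wip" > feature.txt
git add feature.txt
git commit -q -m "Work in progress on feature"

# The bug shows up: branch off main for the fix.
git checkout -q main
git checkout -q -b hotfix
echo "v1 fixed" > app.txt
git add app.txt
git commit -q -m "Hotfix for the bug abcd"

# Merge the fix into main and delete the hotfix branch.
git checkout -q main
git merge -q hotfix
git branch -d hotfix >/dev/null

# Back to the interrupted work.
git checkout -q feature

main_content=$(git show main:app.txt)
current=$(git symbolic-ref --short HEAD)
if git show-ref --verify --quiet refs/heads/hotfix; then
    hotfix_state=present
else
    hotfix_state=deleted
fi
echo "main has: $main_content; current branch: $current; hotfix branch: $hotfix_state"
```

Note that the fix is on main only; the feature branch picks it up the next time main is merged into it.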

Often, the files a developer is working on are being changed by other developers too, which can lead to conflicts when merging. One can resolve them with git mergetool, or do the merge inside VS and then check in.

Note: At any point during development, if you want to check the status of the changes in your project, you can use the command “git status”. It lists all the files that have been changed, added, deleted, or renamed. Git not only keeps track of the files it is already tracking in the Git folder; it also sees files that have been created but are not yet tracked. “git status” shows these in a separate bucket as untracked files. A sample “git status” output could look like this:
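
A quick way to see these buckets is the compact form, “git status --short”, in a scratch repository. The file names and the demo identity below are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo a > tracked.txt
echo b > old.txt
git add tracked.txt old.txt
git commit -q -m "First commit"

echo change >> tracked.txt   # modified, not yet staged
git rm -q old.txt            # deletion, staged
echo new > untracked.txt     # created, never added

status=$(git status --short)
echo "$status"
```

In the short format, each line starts with a two-character code: " M" marks a modified file that is not staged, "D " a staged deletion, and "??" an untracked file.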

[Image: sample git status output]

Rollbacks

The simplest way to rollback a change on a file to the last commit is

git checkout FileName

If you want to roll back all your changes to the tracked files in the current folder, you can try

git checkout -- .

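If, however, you want to reset the entire repo to the last committed state, you can use

git reset --hard   # Note the two hyphens before hard

You need to be cautious here, however, as a hard reset discards all uncommitted changes to tracked files and they cannot be recovered. Run without a commit argument, it resets to the last commit, so running it a second time changes nothing further. To move the repo one commit back, name the target explicitly, for example “git reset --hard HEAD~1”; each repetition of that command drops one more commit from the branch.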
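
The difference between a plain hard reset and a reset to HEAD~1 can be checked in a scratch repository. The file name and the demo identity here are made up for the sketch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo v1 > f.txt
git add f.txt
git commit -q -m "first"
echo v2 > f.txt
git commit -q -am "second"

# An uncommitted edit, thrown away by a plain hard reset.
echo "uncommitted" > f.txt
git reset -q --hard
after_plain=$(cat f.txt)

# Repeating the plain reset does not move the branch.
git reset -q --hard
commits_after_plain=$(git rev-list --count HEAD)

# Naming HEAD~1 moves the branch one commit back.
git reset -q --hard HEAD~1
after_back=$(cat f.txt)
commits_after_back=$(git rev-list --count HEAD)

echo "plain reset: content=$after_plain commits=$commits_after_plain"
echo "HEAD~1 reset: content=$after_back commits=$commits_after_back"
```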

If you want to remove untracked files (files which you had created but not added to Git tracking), you can try

git clean -d -f   # d: directories, f: force

Again, you need to be cautious here, as there is no going back. If you want to preview the damage you are about to do with the above command, you can do a dry run by using the option --dry-run (two hyphens before dry-run) or -n.
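
A sketch of the preview-then-delete sequence in a scratch repository; the file and folder names and the demo identity are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo keep > tracked.txt
git add tracked.txt
git commit -q -m "First commit"

# Untracked leftovers: one file and one folder.
mkdir scratch
echo tmp > scratch/notes.txt
echo tmp > stray.log

# Dry run: reports what would be removed and deletes nothing.
preview=$(git clean -d -n)
echo "$preview"
if [ -e stray.log ] && [ -d scratch ]; then
    existed_after_preview=yes
else
    existed_after_preview=no
fi

# The real thing: untracked files and folders are gone for good.
git clean -q -d -f
```

Tracked files such as tracked.txt are left alone by git clean.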

Fetch/Pull/Push (and Sync in VS):

When you are done with all the changes in your local branch and have committed everything, you will probably want to sync these changes to the central server. But before you do that, you probably want to do a “Get latest” and merge if any changes exist. The commands used for this are fetch and pull.

  • git fetch
  • git pull

In simplest terms, “git pull” does a “git fetch” and follows it up with a “git merge”. One can do a “git fetch” at any time to update the remote-tracking branches; this won’t affect the local branches. Developers, in fact, do this on a regular basis to keep their remote-tracking branches up to date. You do a “git pull” when you think your local branch is ready to be updated with the remote changes. “Sync”, however, is not a Git concept. Sync in Visual Studio performs a pull followed by a push.

A beautiful thing about Git is that if you have committed your changes to your local branch and then do a “git pull”, any merge that happens on your local branch is itself recorded as a new commit on that branch. So when you push to the remote branch, you will see that you actually made two commits instead of one: the first is your change commit and the second is a merge commit, if Git had to do a merge on “git pull”.

Now that you have pulled all the remote changes and merged them with your own, you will want to push the result to VSO. You can do so by simply running the following command:

git push

Note: You would be asked to enter your alternate credentials when using the above commands from PowerShell.
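
The fetch, pull, and push behaviour described in this section can be observed offline. In this sketch a bare repository in a temporary folder stands in for VSO, and two clones named alice and bob stand in for two developers; all of these names are hypothetical. The options --no-rebase and --no-edit are added to “git pull” only so that it merges without prompting, whatever the local Git configuration is.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A local bare repository stands in for the VSO remote.
git init -q --bare server.git
git --git-dir=server.git symbolic-ref HEAD refs/heads/main

# Alice creates the first commit and publishes it.
git clone -q server.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q -u origin main
cd ..

# Bob clones, commits, and pushes.
git clone -q server.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo bob > bob.txt
git add bob.txt
git commit -q -m "Change from Bob"
git push -q
cd ..

# Alice commits locally, then fetches: only origin/main moves.
cd alice
echo alice > alice.txt
git add alice.txt
git commit -q -m "Change from Alice"
local_before=$(git rev-parse main)
git fetch -q
local_after_fetch=$(git rev-parse main)
remote_after_fetch=$(git rev-parse origin/main)

# Pull merges origin/main into main, which records a merge commit.
git pull -q --no-rebase --no-edit
merge_commits=$(git rev-list --count --merges HEAD)
total_commits=$(git rev-list --count HEAD)

# Push both the change commit and the merge commit to the server.
git push -q
server_head=$(git --git-dir=../server.git rev-parse main)
echo "merge commits: $merge_commits of $total_commits"
```

After the push, the server’s main branch points at Alice’s merge commit, which is the second of the two commits mentioned above.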

Some other issues and their workarounds:

  • Currently, if a file path exceeds the Windows path-length limit (MAX_PATH, 260 characters), you won’t be able to check in or check out that file. There are two ways to solve this problem.
    • Keep your root project folder as a first-level folder on your drive and keep its name as short as possible.
    • You can create a virtual drive on your project folder itself, using Command Prompt or PowerShell, and then do the check-in from there. This shortens your directory path. Let’s say you want to create a virtual drive on the path below: you use the command subst to substitute the path with a virtual drive (the general form is subst <DriveLetter>: <Path>) and then switch to the drive just created.
      [Image: Subst]
      After you are done with all the commits, you can switch back to your original drive and then delete the virtual drive with the option /d.
      [Image: Subst - Delete]
      If, however, you forget to delete the drive, it is removed automatically on a system reboot.
  • You can also impersonate someone else and commit under their name through Visual Studio’s Git settings. But you can still find out in VSO who actually made the commit; you just have to view the complete Git commit in VSO to see who did it.

Published by

Harsh

Developer at Microsoft by the day, a wannabe physicist by the night.
