If you've been curious about GitHub then this short tutorial in the Open source Java projects series is for you. Get an overview of the source code repository that has changed the way that many developers work, both individually and collaboratively. Then try GitHub for yourself, using common Git commands to branch and commit your own open source project.
GitHub is a social coding website and source-code hosting service that uses Git as its version control system. Launched in 2008, GitHub already boasts nearly 1.7 million people hosting nearly 3 million repositories. Like most social networks, GitHub allows users to create and follow feeds associated with each other's projects. It also extends the social paradigm to include network graphs that show repository usage. You can think about GitHub as a social network, a la Facebook, but just for software developers.
Bringing together social elements with a free repository to host open source projects, GitHub aims to cultivate a supportive and active community for the betterment of the software industry. The more active a project is, the more people will find it, and hopefully contribute to it. GitHub also offers commercial project support at a nominal cost.
In addition to following projects, GitHub allows users to follow individual software developers. This makes it easy to keep up with what friends and colleagues are doing and review their code, as well as seek out well-known programmers and follow their work. A regularly updated feed presents an opportunity to watch someone practice their craft. For developers, there's a lot to learn from studying each other's code and methodology; for instance, being able to see what code other developers push to their projects, and when, is a great way to learn at a high level about the release development cycle.
Follow the story on JavaWorld
Social coding with GitHub enables developers to learn from each other in a new way while storing and updating code using a popular, well-featured version control system. In this edition of Open source Java projects I will help you get started with GitHub. First I'll provide an overview of the platform, then introduce some Git basics, including command-line options that you'll use frequently in GitHub. Finally, I'll walk through a simple commit example that demonstrates the everyday power of this distributed code repository.
Get started with GitHub
GitHub accounts come in several flavors, grouped by individual or commercial account and by public or private repository. Open source developers are allowed unlimited public repositories, or for a small fee can choose to host between five and 20 private repositories. Commercial developers pay more (about twice as much as open source developers as of this writing) and can scale up to 125 private repositories. See the GitHub homepage for a complete listing of plans and pricing.
You will need a GitHub account in order to follow along with this article. Go to the GitHub website and click on the Signup and Pricing link at the top of the page. Click "Create a free account" and complete the account-creation process.
If you want setup instructions for your operating system, see the GitHub tutorial. Note that the installation process automatically installs a GUI client and prompts you to manually install GitHub's command-line tools. I recommend that you take this option in case you ever want to do something quickly on the command line.
Git: A primer
You will need to be at least somewhat familiar with Git in order to use GitHub effectively. A point of interest to most geeks is that Git was designed and developed by Linus Torvalds, the creator of Linux. In this section I provide an overview of Git and describe how it works. Toward the end of the article I review a few of the more popular commands to help you become productive quickly. This is by no means an exhaustive tutorial, but it should help you get started.
When software developers think about a version control system (VCS), we tend to think of a central repository that we'll use to download source code, make changes locally, and then submit those changes back to the central repository. Git is a little different. It is a distributed version control system, meaning that there doesn't have to be a single central repository; instead, every developer works with a complete clone of the repository. So the "master repository" exists somewhere (like in GitHub) but we work locally on clone repositories.
Git's distributed architecture provides a significant benefit over non-distributed version control systems in that developers can check code in and out, create branches, and more, all on their local machines. For a major change in a traditional VCS you would create a personal branch and check code into that branch. When you were done with your changes, you would merge that branch into the main branch.
Distributed version control
Git changes the version control paradigm because you can work locally and merge all of your changes in a single commit (you can keep your local history when you merge or you can combine all changes into one check-in). So the central repository is not littered with branches and dozens of historical notes, but contains only information about the feature changes that have been made. In essence, Git uses branches as they were intended: to develop a new feature set, to maintain a release, or to fix bugs associated with a release.
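The "combine all changes into one check-in" option can be sketched with a squash merge. This is only an illustration under stated assumptions: the repository name squash-demo, the file app.txt, the branch name feature, and the user identity are all placeholders invented for the example, and the main branch name is read from Git rather than assumed, because newer Git installations may call it main instead of master.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing here touches a real project.
workdir=$(mktemp -d)
cd "$workdir"
git init -q squash-demo
cd squash-demo

# Placeholder identity, stored only in this repository's config.
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Ask Git what the main line is called (master or main, depending on defaults).
mainline=$(git symbolic-ref --short HEAD)

# Do the feature work on a local branch, in several small commits.
git checkout -q -b feature
echo "step 1" >> app.txt
git commit -q -am "Feature: step 1"
echo "step 2" >> app.txt
git commit -q -am "Feature: step 2"

# Fold the whole branch into a single commit on the main line.
git checkout -q "$mainline"
git merge -q --squash feature    # stages the combined changes, does not commit
git commit -q -m "Add feature"

# The main line now holds two commits: the initial one and "Add feature".
git log --oneline
```

Running a plain git merge feature in place of the squash would instead carry both feature commits onto the main line, which is the "keep your local history" option.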
When you install Git on your local machine and "clone" a repository, you receive the entire repository, including historical information about all of the source code in the project. You then work against your local repository, adding new files, removing files, and changing files in a staging environment until you actually commit them to the local repository. Git maintains versioning information about all of your changes and you can easily roll back to any point in your history. Finally, when you are ready, you can synchronize your local repository with a remote one.
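As a minimal sketch of that local cycle (stage, commit, and go back to an earlier version), the commands below run entirely on your own machine. The directory name demo, the file notes.txt, and the user identity are placeholders chosen for this example, not anything from a real project.

```shell
#!/bin/sh
set -e

# Work in a throwaway directory so nothing here touches a real project.
workdir=$(mktemp -d)
cd "$workdir"

git init -q demo                 # creates demo/.git, the local repository
cd demo

# Placeholder identity, stored only in this repository's config.
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "first draft" > notes.txt
git add notes.txt                # stage the new file
git commit -q -m "Add notes"     # record the staged snapshot locally

echo "second draft" > notes.txt
git add notes.txt                # stage the modification
git commit -q -m "Revise notes"

git log --oneline                # two commits, newest first

# Bring back the file as it was one commit ago, without rewriting history.
git checkout -q HEAD~1 -- notes.txt
cat notes.txt
```

Nothing in this sketch contacts a server; every command reads from or writes to the .git directory inside demo.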
Changes are synchronized to a remote repository via a push while changes in a remote repository are synchronized with your local repository via a pull. Because you have a full clone of the repository locally, you are not limited to simply working against that repository's main branch. You can create branches to contain your changes and then either push or pull them as appropriate.
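Push and pull can be tried without a network connection by letting a bare repository on disk stand in for the remote. The sketch below makes that assumption: remote.git plays the role that GitHub plays in this article, and the clone names alice and bob, the branch name greeting, and the user identity are all hypothetical.

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
cd "$workdir"

# A bare repository on disk stands in for the remote server.
git init -q --bare remote.git

# First clone: Git warns that the repository is empty, which is expected here.
git clone -q remote.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "hello" > README
git add README
git commit -q -m "Add README"
mainline=$(git symbolic-ref --short HEAD)   # master or main, depending on defaults
git push -q -u origin "$mainline"           # local commits go to the remote

# Second clone, made after the first push.
cd "$workdir"
git clone -q remote.git bob

# Alice pushes another commit on the main line...
cd "$workdir/alice"
echo "pushed from alice" >> README
git commit -q -am "Extend README"
git push -q

# ...and Bob brings it into his clone with a pull.
cd "$workdir/bob"
git pull -q --ff-only

# Branches travel the same way: Alice pushes one, Bob fetches and checks it out.
cd "$workdir/alice"
git checkout -q -b greeting
echo "work in progress" >> README
git commit -q -am "Start greeting work"
git push -q -u origin greeting

cd "$workdir/bob"
git fetch -q origin
git checkout -q greeting
cat README
```

The --ff-only flag is a cautious choice for the sketch: the pull succeeds only if Bob's clone can simply move forward to the remote's state.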
See Resources if you need a more complete tutorial introduction to Git. I'll focus on GitHub for the remainder of this article.
Social coding with GitHub
After you have created a GitHub account you can start following the work of other software developers or watching projects that interest you. You can find people or projects to follow by searching for them directly; or, if you're looking for ideas you can use GitHub's "Explore" function to find projects based on your interest. Explore GitHub displays trending repositories as well as featured ones. In addition to these, you can explore all repositories by clicking on the "Repositories" button on the toolbar. If you want to search for projects coded in a specific programming language you can click "Languages" on the toolbar, then choose the language that you want to explore. Figure 1 shows the most watched (i.e., trending) Java repositories at the time of this writing.
Storm was the most watched Java repository on GitHub at the time that I checked. Once you find a project that you're interested in, click on it and you'll see a "Watch" option, as shown in Figure 2.
If you click "Watch" then you'll be subscribed to follow the project and will be able to see changes made to it on your GitHub homepage. Figure 3 shows my GitHub homepage, which contains a listing of updates to various Spring projects.
Following individual developers works the same way as following projects. For example, I recently decided to follow my friend Tom Akehurst, as shown in Figure 4.
From a social perspective, GitHub empowers you to easily find developers and projects that you might be interested in and receive updates about them. Of course, the flip side is also true: GitHub is an excellent place to show off your work and get feedback and recognition from your peers.
Using GitHub for project updates and maintenance
Remember that the main driver behind GitHub is to promote the development of open source software projects. So if you've built something good, why not contribute your code to GitHub and make it freely available to the world?
You can create a new repository from the GitHub website or from the client application on your desktop. In this section we'll first walk through creating a repository from the website, then I'll show you how to update and commit a file via the command line.
Setting up a GitHub repository
Assuming that you have an account set up, log in and you'll see something similar to what's shown in Figure 5. Click "Create a Repository" and you'll be guided through the process. First, you'll click on the first item in your top-right toolbar, "Create a New Repo."
Click that link and give your repository a name and description. I created a new repository to host my GeekCap utilities, a set of helper classes that includes sorting algorithms and a re-sortable list, a class that easily extracts icons from the Java Look-and-Feel Graphics Repository, ZIP utilities, and more. While not the coolest project, I included it because most of my other projects use one or more of these utilities, so it's good to have them stored in an accessible place. I named my project geek-util and gave it a description: "Geekcap Utilities: helpful classes that are used by other Geekcap.com projects."
Once your project is set up you should see a screen like the one shown in Figure 6.
Figure 6. Repository created
The screenshot in Figure 6 shows a listing of what you can do with your new repository, as well as an example of creating a README file and pushing it into your repository. I have an existing Maven project that I need to add for the first time, so I start by adding my pom.xml file and my src directory. Below are the Git commands that I entered for the initial push of the project into the repository:
Listing 1. Git commands for creating a repository
git init
git add src
git add pom.xml
git commit -m 'Initial commit'
git remote add origin https://github.com/geekcap/geek-util.git
git push -u origin master
Here's where familiarity with Git is important if you want to use GitHub. Fortunately, the main Git commands are relatively intuitive:
- git init creates an empty Git repository. Specifically, this creates the .git directory, which the git command will recognize as a repository.
- git add adds files to the repository; in this case I added my src directory and my pom.xml file.
- git commit commits changes to the repository. All I did was to add the pom.xml file and the src directory. You would also use this command after modifying the contents of a file or deleting files via the git rm command.
- git remote add origin adds the specified URL as the origin server for the Git repository. As you saw in Figure 6, the origin server is created on GitHub for you and the URL is provided in the setup documentation.
- git push uploads all committed changes to the specified server. In this case I've pushed the initial commit that contains the pom.xml file and the src directory to the origin server, which I previously set.
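To see what each of these commands leaves behind, you can replay Listing 1 against a local stand-in and inspect the result after each step. The sketch below is an illustration only: origin.git is a bare repository on disk used in place of the GitHub URL, the pom.xml and src contents are empty placeholders, the user identity is made up, and the branch name is read from Git because newer installations may create main rather than master.

```shell
#!/bin/sh
set -e

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for the repository that GitHub creates for you.
git init -q --bare origin.git

# Placeholder project files; a real Maven project would already have these.
mkdir -p geek-util/src
cd geek-util
echo "<project/>" > pom.xml
echo "// placeholder source file" > src/Placeholder.java

git init -q
git config user.name "Example Developer"    # placeholder identity
git config user.email "dev@example.com"

git add src
git add pom.xml
git status --short              # both paths are listed as added (A)

git commit -q -m 'Initial commit'
git log --oneline               # one commit so far

git remote add origin "$workdir/origin.git"
git remote -v                   # origin is listed for fetch and for push

branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"
git ls-remote --heads origin    # the remote now has the pushed branch
```

Commands such as git status, git log, git remote -v, and git ls-remote only report state, so they are safe to run at any point while you are getting used to the workflow.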
You can use Git from your IDE or from the command line; I just happen to be a command-line junkie. Executing git help shows the most common commands, which are summarized in Listing 2.