Creating a dynamic d3 visualization from the GitHub API

GitHub Repo Visualization
Try out the GitHub Repo Visualization here

As someone who works with data on a daily basis, I’m always impressed and inspired by interactive charts and dashboards. I’ve built plenty of dynamic dashboards within Excel (here) and, more recently, within Google spreadsheets (here, here and here), but never my own custom web charts. I’ve wanted to learn d3 for a while, but until recently I didn’t have the necessary JavaScript chops to do this.

This year I’ve focussed on deepening my coding skills, so I’ve finally been able to give d3 a proper go. And let me tell you, it’s brilliant. It’s exciting to hook up a data source to a custom chart that changes dynamically, and be able to see it on a live website, which other people can view.

In this post, I’m going to discuss the steps I took to create this d3 visualization of the GitHub API.

The app is live here!

1. Introduction to d3 visualization

d3.js is a JavaScript library created by Mike Bostock (website, GitHub, Twitter), formerly of the New York Times. You’ve probably seen d3 in action on the web, for example on the New York Times website here or here. It’s one of the most widely used web charting libraries, so even if you’re not a NYT reader, chances are you’ve come across it.

Because d3 is a JavaScript library, you’ll need to know at least basic JavaScript to get started with it. It’s also important to have a reasonably good grasp of HTML/CSS, as you’ll need these when creating d3 charts. Alright, ready to go then?

If you’re new to d3, I’d recommend starting with this excellent tutorial from Scott Murray: Interactive Data Visualization for the Web. It introduces some of the fundamental concepts in a very clear way.

2. Getting data from the GitHub API

GitHub is a web-based Git repository hosting service, which hosts copies of your code in the cloud. It’s synced with the Git version control system so that a history of changes is stored along with a copy of your source code. It also incorporates a rich social layer so you can do things like follow or be followed by other individuals/organizations and collaborate on projects together. This makes the API a veritable treasure trove of information, just crying out to be mined and visualized!

Let’s dive right into the GitHub API and check out some of the data available and see what the API endpoints are. The full documentation for the API can be found here.

I started by looking at the endpoint for my username, as follows:

Try putting your own username into the API and see what you get back. It returns a JSON packet full of information and links to other API endpoints, which shows us places to explore next. For example, here’s my user information:

Check out the line I’ve highlighted (line 14). It’s another API endpoint, so grab that and open it in your browser. It displays information about all of my repositories in GitHub.

The JSON data returned by this endpoint contains a list of all my repositories, with links to many more API endpoints containing details for specific repositories. Searching through this, I found the API endpoint for each specific repository (e.g. see the highlighted line, line 32 of this code snippet):

Navigating to this endpoint for the github_api_viz repository:

turns up all the information and API endpoints specific to this repo, and within this I see my final endpoint:

which gives an output showing the languages found in the repo, and the number of bytes of code per language. The output from this endpoint is as follows:

Cool! This looks exactly like the sort of data that we can turn into a chart.

To summarize, I dig through the GitHub API, going one level deeper each time until I get the language data I’m after:
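As a rough sketch, the four levels can be written as small URL builders. The paths are the public GitHub REST API routes for users, repositories and languages; the function names and the "some-user" account are placeholders of mine for illustration, not part of the app's code.

```javascript
// Base URL of the public GitHub REST API.
const API_ROOT = "https://api.github.com";

// Level 1: a user's profile.
function userUrl(username) {
  return API_ROOT + "/users/" + encodeURIComponent(username);
}

// Level 2: the list of that user's repositories.
function reposUrl(username) {
  return userUrl(username) + "/repos";
}

// Level 3: a single repository.
function repoUrl(owner, repo) {
  return API_ROOT + "/repos/" + encodeURIComponent(owner) +
    "/" + encodeURIComponent(repo);
}

// Level 4: the language breakdown for that repository.
function languagesUrl(owner, repo) {
  return repoUrl(owner, repo) + "/languages";
}

// "some-user" is a placeholder account name.
console.log(languagesUrl("some-user", "github_api_viz"));
// https://api.github.com/repos/some-user/github_api_viz/languages
```

In practice you don't need to build these by hand, since each JSON response already contains the link to the next level.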


There’s so much more in the API, so I encourage you to explore. This language endpoint is simply what I’ve chosen to use for this particular project. I’m working on another visualization using the commit history of a particular repo, which I’ll share once I have something concrete.

So, the first step with our application is to write some JavaScript code to dig through the API endpoints and grab the language data to pass to our d3 chart.

I used jQuery to call the API. For example, this snippet shows me making a GET request to the API to get user information and then displaying that user’s login name in my browser window:
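A minimal sketch of such a request, assuming jQuery is loaded on the page as the global `$`. The function name `showLogin`, the "some-user" account and the `#login` element are hypothetical, chosen only for illustration.

```javascript
// Assumes jQuery is available as the global `$`.
// `username` and `selector` are illustrative parameters.
function showLogin(username, selector) {
  const url = "https://api.github.com/users/" + encodeURIComponent(username);

  // $.getJSON issues a GET request and parses the JSON response.
  return $.getJSON(url, function (user) {
    // `login` is the account name field in the returned user object.
    $(selector).text(user.login);
  });
}

// Example call in a page that contains <p id="login"></p>:
// showLogin("some-user", "#login");
```

Unauthenticated requests to the GitHub API are rate limited, so a real app would also want to handle a failed request.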



I call the repositories and languages endpoints in a similar manner, which can be seen in the JavaScript code here.

3. Creating the d3 chart

OK, let’s discuss the fun d3 component. As you can see from the GIF at the start of this post, I’ve decided on a bar chart to represent the data. I think it’s the most appropriate choice, given that the data is a list of languages, each with a byte count (e.g. [ {JavaScript: 14977}, {HTML: 1227}, {CSS: 551} ] ), based on the API output shown above.

There are two main parts to creating this chart: i) setting it up; and ii) updating it when I load new data. Let’s deal with each in turn.

Setting up the initial chart

The first step is to create some variables for our d3 chart. It’s good practice to avoid magic numbers (i.e. unique values with unexplained meaning) in your code, so let’s assign a width (w), height (h) and margin to variables. That way we can use them throughout our code and always know what they mean. It has the additional benefit of making it much easier to change their values throughout our program, by simply changing the variable declarations.
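For example, the declarations might look like the sketch below. The numbers are illustrative values I've picked, not the ones used in the app, and `plotWidth` and `plotHeight` are hypothetical helper names.

```javascript
// Illustrative values only.
const w = 600;      // total SVG width in pixels
const h = 400;      // total SVG height in pixels
const margin = 40;  // space reserved on each side for axes and labels

// Drawing area left over once the margins are taken off both sides.
const plotWidth = w - 2 * margin;
const plotHeight = h - 2 * margin;

console.log(plotWidth, plotHeight); // 520 320
```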

The next step is to create an SVG element to contain the d3 chart:
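A minimal sketch of this step, assuming d3 is loaded on the page as the global `d3`. The function name `createSvg` and the `#chart` container are hypothetical.

```javascript
// Appends an <svg> element of the given size to a container element
// and returns the d3 selection for it.
function createSvg(containerSelector, w, h) {
  return d3.select(containerSelector)
    .append("svg")
    .attr("width", w)
    .attr("height", h);
}

// Usage in the browser, with <div id="chart"></div> on the page:
// const svg = createSvg("#chart", 600, 400);
```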

Next I define the x and y scales. These are functions that map our data onto the physical space (width for x, height for y) that we have allotted our chart. The x scale is ordinal (to display the language names evenly spaced) and the y scale is linear (it’s a straightforward number, which shows the number of bytes of code in each language). The code is as follows:
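Here is a sketch of the two scales, assuming d3 v4 or later, where the ordinal bar scale is `d3.scaleBand()` (in d3 v3 the equivalent was `d3.scale.ordinal()` with `rangeRoundBands`). The function name `createScales` and the 0.1 padding are my own choices for illustration.

```javascript
// Builds the x (band) and y (linear) scales for a chart of size w x h.
// Domains are left unset here; they are filled in once data arrives.
function createScales(w, h, margin) {
  const xScale = d3.scaleBand()
    .range([margin, w - margin]) // left edge to right edge
    .padding(0.1);               // gap between bars, as a fraction of the step

  const yScale = d3.scaleLinear()
    .range([h - margin, margin]); // inverted: SVG y grows downwards

  return { xScale: xScale, yScale: yScale };
}

// Usage in the browser:
// const scales = createScales(600, 400, 40);
```

The y range is inverted because the SVG origin is the top-left corner, so a larger byte count has to map to a smaller y coordinate.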

Then I create actual x and y axes, mapping the data to the SVG element, as follows:
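A sketch of the axes, again assuming d3 v4 or later (`d3.axisBottom` and `d3.axisLeft`; d3 v3 used `d3.svg.axis()` with `.orient()`). The function name `drawAxes` and the class names "x axis" and "y axis" are assumptions made for this example.

```javascript
// Creates axis generators for the two scales and draws each one
// inside its own <g> group on the SVG selection.
function drawAxes(svg, xScale, yScale, h, margin) {
  const xAxis = d3.axisBottom(xScale);
  const yAxis = d3.axisLeft(yScale);

  // x axis sits along the bottom of the plotting area.
  svg.append("g")
    .attr("class", "x axis")
    .attr("transform", "translate(0," + (h - margin) + ")")
    .call(xAxis);

  // y axis sits along the left edge.
  svg.append("g")
    .attr("class", "y axis")
    .attr("transform", "translate(" + margin + ",0)")
    .call(yAxis);

  return { xAxis: xAxis, yAxis: yAxis };
}
```

Returning the axis generators makes it possible to call them again later, once the scale domains change.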

This gives me the basis for creating a chart. In my application, I’ve also added labels and a title to the chart, which can be found in the code on GitHub.

Updating the chart to show data

I’ve put the repository languages data into an array called dataset, which I can pass to my d3 chart. How you do this will depend on how you’ve set up your API calls and application architecture. There are many ways to do it, no doubt some better than mine, but it works for now, and I’ll refactor the code as my experience increases.
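One possible way to build that array is sketched below, assuming the languages endpoint returns a single object keyed by language name. The helper name `toDataset` is hypothetical, and the figures are the ones quoted in this post.

```javascript
// Shape of a languages response, using the figures from this post.
const languages = { JavaScript: 14977, HTML: 1227, CSS: 551 };

// Turns { name: bytes, ... } into [ {name: bytes}, ... ].
function toDataset(languagesObject) {
  return Object.keys(languagesObject).map(function (name) {
    const entry = {};
    entry[name] = languagesObject[name];
    return entry;
  });
}

const dataset = toDataset(languages);
console.log(JSON.stringify(dataset));
// [{"JavaScript":14977},{"HTML":1227},{"CSS":551}]
```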

Assuming we have this array, dataset, which contains one key/value object per language in the repo, as follows:

[ {JavaScript: 14977}, {HTML: 1227}, {CSS: 551} ]

then I can add bars to my chart with the following steps:

First, update the scales to reflect the new dataset:

Second, update the x and y axes:

Third, create variables for the bars, essentially selecting these elements in the SVG canvas (even though we haven’t created them yet):

Fourth, add any new bars to the variable bars with:

Fifth, remove any bars if necessary:

And finally, sixth, update the bars display, adding a transition to do this smoothly:
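The six steps can be sketched as one update function. This assumes d3 v4 or later (for `selection.merge` and `scale.bandwidth`), a band x scale and linear y scale like those described earlier, and axis groups carrying the classes "x axis" and "y axis". The names `updateChart`, `langName` and `langBytes`, the "bar" class and the 500 ms duration are my own choices for illustration.

```javascript
// Each dataset entry looks like {JavaScript: 14977}: one key, one value.
function langName(d) { return Object.keys(d)[0]; }
function langBytes(d) { return d[langName(d)]; }

function updateChart(svg, dataset, xScale, yScale, xAxis, yAxis, h, margin) {
  const baseline = h - margin; // y coordinate of the x axis

  // 1. Update the scale domains to reflect the new dataset.
  xScale.domain(dataset.map(langName));
  yScale.domain([0, d3.max(dataset, langBytes) || 0]);

  // 2. Redraw the axes against the updated scales.
  svg.select(".x.axis").call(xAxis);
  svg.select(".y.axis").call(yAxis);

  // 3. Select the bars and bind the data, keyed by language name.
  const bars = svg.selectAll("rect.bar").data(dataset, langName);

  // 4. Add a rect for each new language, starting flat on the baseline.
  const entered = bars.enter()
    .append("rect")
    .attr("class", "bar")
    .attr("x", function (d) { return xScale(langName(d)); })
    .attr("width", xScale.bandwidth())
    .attr("y", baseline)
    .attr("height", 0);

  // 5. Remove bars whose language is no longer in the dataset.
  bars.exit().remove();

  // 6. Move new and existing bars to their final size with a transition.
  entered.merge(bars)
    .transition()
    .duration(500)
    .attr("x", function (d) { return xScale(langName(d)); })
    .attr("width", xScale.bandwidth())
    .attr("y", function (d) { return yScale(langBytes(d)); })
    .attr("height", function (d) { return baseline - yScale(langBytes(d)); });
}
```

Keying the data join by language name means a bar for, say, HTML stays the same element when a different repo is loaded, so it animates to its new height rather than being recreated.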

And that’s a wrap!

See the app in action here. The full source code can be viewed here on GitHub.

4. Conclusion and next steps

Although it’s a relatively simple dataset and chart, it involves a fair amount of code to get the data from the API into a format suitable for d3. d3 is highly customizable, which is fantastic and one of the reasons it’s so popular, but it means we have to work quite hard, and write quite a bit of code, to create this relatively simple bar chart. The Scott Murray book I mentioned earlier is a great place to start learning, as he walks through all the steps to create a bar chart.

Looking to the future, I’m keen to create more d3 visualizations based on the GitHub API. I’m looking at creating another bar chart showing the lines of code added and deleted in each commit to a repository, with the additions as positive bars above the x-axis and the deletions as negative bars below the x-axis. I’ll be sure to post an update here when I’ve got it working.
