GitHub API: Retrieve all commits for all branches of a repo

I ran into exactly the same problem. I did manage to retrieve all the commits for all branches of a repository, although the approach is probably not very efficient because of how the API works.

Approach to retrieve all commits for all branches in a repository

As you mentioned, first you gather all the branches:

# https://api.github.com/repos/:user/:repo/branches
https://api.github.com/repos/twitter/bootstrap/branches

The key piece you are missing is that the v3 API lists commits starting from a reference commit, which you pass as the sha parameter of the "list commits on a repository" call. So when you collect the branches, make sure you also pick up each branch's latest sha:

Trimmed result of branch API call for twitter/bootstrap

[
  {
    "commit": {
      "url": "https://api.github.com/repos/twitter/bootstrap/commits/8b19016c3bec59acb74d95a50efce70af2117382",
      "sha": "8b19016c3bec59acb74d95a50efce70af2117382"
    },
    "name": "gh-pages"
  },
  {
    "commit": {
      "url": "https://api.github.com/repos/twitter/bootstrap/commits/d335adf644b213a5ebc9cee3f37f781ad55194ef",
      "sha": "d335adf644b213a5ebc9cee3f37f781ad55194ef"
    },
    "name": "master"
  }
]
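
As a minimal sketch, the trimmed response above can be reduced to a mapping from branch name to latest sha. The helper name branch_heads is hypothetical, and the payload is just the two branches shown above with the url fields left out.

```python
import json

# Trimmed response from the branches endpoint, as shown above (url fields omitted).
BRANCHES_JSON = """
[
  {"commit": {"sha": "8b19016c3bec59acb74d95a50efce70af2117382"}, "name": "gh-pages"},
  {"commit": {"sha": "d335adf644b213a5ebc9cee3f37f781ad55194ef"}, "name": "master"}
]
"""


def branch_heads(branches):
    """Map each branch name to the sha of its latest commit."""
    return {branch["name"]: branch["commit"]["sha"] for branch in branches}


heads = branch_heads(json.loads(BRANCHES_JSON))
for name, sha in heads.items():
    print(name, sha)
```

Each value in heads is then the starting sha for walking that branch.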

Working with last commit’s sha

As we can see, the two branches have different shas; these are the shas of the latest commit on each branch. You can now walk through each branch, starting from its latest sha:

# With the sha parameter set to the branch's latest sha
# https://api.github.com/repos/:user/:repo/commits
https://api.github.com/repos/twitter/bootstrap/commits?per_page=100&sha=d335adf644b213a5ebc9cee3f37f781ad55194ef

The above API call lists the 100 most recent commits on the master branch of twitter/bootstrap. To get the next 100 commits, you have to tell the API which commit to start from. We can use the sha of the last commit in the response (7a8d6b19767a92b1c4ea45d88d4eedc2b29bf1fa in the current example) as input for the next API call. The listing starts at the given sha, so that commit shows up again as the first entry of the next response:

# Next API call for commits (use the last commit's sha)
# https://api.github.com/repos/:user/:repo/commits
https://api.github.com/repos/twitter/bootstrap/commits?per_page=100&sha=7a8d6b19767a92b1c4ea45d88d4eedc2b29bf1fa

Repeat this until the sha of the last commit in the response is the same as the sha parameter you passed in the call. At that point there is nothing further to fetch for this branch.
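
The loop described above could be sketched roughly as follows. This is only an illustration: the names fetch_commit_page, walk_commits and fake_fetch_page are hypothetical, and the sketch assumes that every response begins with the commit named by the sha parameter, so that entry is skipped on every page after the first. The fake history at the bottom stands in for the API so the loop can be tried offline.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def fetch_commit_page(user, repo, sha, per_page=100):
    """Fetch one page of commits starting at `sha` (unauthenticated, rate-limited)."""
    query = urllib.parse.urlencode({"per_page": per_page, "sha": sha})
    url = f"{API_ROOT}/repos/{user}/{repo}/commits?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def walk_commits(start_sha, fetch_page):
    """Yield commit shas by repeatedly paging from the last sha seen."""
    sha = start_sha
    first_page = True
    while True:
        page = fetch_page(sha)
        # Assumption: pages after the first repeat the commit they start from.
        commits = page if first_page else page[1:]
        first_page = False
        for commit in commits:
            yield commit["sha"]
        # Stop when the last commit returned is the one we asked for.
        if not page or page[-1]["sha"] == sha:
            break
        sha = page[-1]["sha"]


# Offline stand-in for the API: a made-up linear history, two commits per page.
HISTORY = ["c5", "c4", "c3", "c2", "c1"]


def fake_fetch_page(sha, per_page=2):
    start = HISTORY.index(sha)
    return [{"sha": s} for s in HISTORY[start:start + per_page]]


print(list(walk_commits("c5", fake_fetch_page)))
```

Against the real API, the same loop would be driven with something like walk_commits(head_sha, lambda sha: fetch_commit_page("twitter", "bootstrap", sha)).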

Next branch

That is it for one branch. Now apply the same approach to each of the other branches, again working from the branch's latest sha.


There is one big issue with this approach: since branches share many identical commits, you will see the same commits over and over again as you move on to another branch.
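
One simple way to cope with the repeats, sketched here with made-up short shas, is to keep track of the shas already seen and keep each commit only once. The helper name unique_commits is hypothetical. Note that this only filters the results; it does not reduce the number of API calls.

```python
def unique_commits(branch_commits):
    """Collect each sha once, remembering the first branch it was seen on."""
    seen = {}
    for branch, shas in branch_commits.items():
        for sha in shas:
            # setdefault keeps the first branch and ignores later repeats.
            seen.setdefault(sha, branch)
    return seen


# Hypothetical, shortened shas: the two branches share c2 and c1.
branch_commits = {
    "master": ["c3", "c2", "c1"],
    "gh-pages": ["d1", "c2", "c1"],
}

unique = unique_commits(branch_commits)
print(len(unique))  # prints 4
```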

I can imagine that there is a much more efficient way to accomplish this, but this worked for me.
