Vulnerable Git repositories in the top 1m domains


Disclaimer: this information is for research purposes only. It is intended to help website owners and software engineers better secure their systems.


TLDR:

At the time of our research (late 2017), our crawler found 6,242 vulnerable git repositories among the top 1,000,000 domains.

That is 0.62 % of the most popular websites in the world*.

*We only tested root directories (e.g. example.com/.git). Testing all links and folders would more than likely yield many more results.

What is a git repository?

Git is a revision control system (or version control system), and a git repository is the store that holds a project's files together with their full history. Git is the most popular version control tool among software developers.

According to the git man page (man git): "Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals."

What is a vulnerable git repository?

A 'vulnerable' git repository is a git repo which has been left unintentionally public on the internet. There are plenty of cases where git repositories should be left on the internet - an obvious example being open source software. However, many git repositories that should not be public are. The most common mistake developers make is disabling directory listing and assuming this means their repository is safe.

What does this mean? Let's explain with an example.

In short, developers assume that because their repository cannot be accessed at example.com/.git/ they are safe. However, often the raw files - objects, configuration, commits - are still public.

It doesn't take a great deal of effort to find the latest commit by viewing the refs/heads/master file, which contains the hash of the commit at the tip of the master branch. From there, you can usually work backwards through the history to download the entire repo. This process can be automated, and open source tools that do so exist on GitHub.
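To make the first step concrete, here is a minimal sketch of how the contents of a ref file map to a file path inside the repository. Git stores loose objects under objects/, using the first two characters of the hash as a directory name and the rest as the file name. The function name looseObjectPath and the hash below are placeholders for illustration, and the sketch assumes a SHA-1 repository whose objects have not been packed (packed objects live in pack files instead, so the loose file may not exist).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sha1Hex matches a full 40-character hexadecimal SHA-1 object name.
var sha1Hex = regexp.MustCompile(`^[0-9a-f]{40}$`)

// looseObjectPath (hypothetical helper) maps the contents of a ref file,
// a commit hash followed by a newline, to the relative path of the
// matching loose object inside the repository.
func looseObjectPath(refContents string) (string, error) {
	hash := strings.TrimSpace(refContents)
	if !sha1Hex.MatchString(hash) {
		return "", fmt.Errorf("not a SHA-1 object name: %q", hash)
	}
	// The first two hex characters are the directory, the remaining 38 the file.
	return ".git/objects/" + hash[:2] + "/" + hash[2:], nil
}

func main() {
	// Placeholder hash, for illustration only.
	path, err := looseObjectPath("0123456789abcdef0123456789abcdef01234567\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path) // .git/objects/01/23456789abcdef0123456789abcdef01234567
}
```

If that path is served by your web server, the commit object and everything it points to can be fetched one file at a time, even with directory listing switched off.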

As a developer, when using git repositories in your website root, please check example.com/.git/config, not just example.com/.git/.

How did we collect the data?

A simple Go program was written to request the git config file from each domain, testing both the http and https versions, for example http://example.com/.git/config and https://example.com/.git/config.

If a domain returned a 200 response, the config file was parsed to check whether it was a real git config file.
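The check can be sketched roughly as follows. This is not the original crawler; it is a minimal illustration you could point at your own site. The function names configExposed and looksLikeGitConfig are hypothetical, and the heuristic (a [core] section containing a repositoryformatversion key) is our assumption about what "a real git config file" looks like. The status code alone is not enough, because many sites answer 200 with an HTML error page for any path. The example runs against a local test server, so no real domain is contacted.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// looksLikeGitConfig (assumed heuristic) reports whether body contains a
// [core] section with a repositoryformatversion key.
func looksLikeGitConfig(body string) bool {
	inCore, found := false, false
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		switch {
		case strings.HasPrefix(line, "["):
			// A new section header starts; track whether it is [core].
			inCore = strings.EqualFold(line, "[core]")
		case inCore && strings.HasPrefix(strings.ToLower(line), "repositoryformatversion"):
			found = true
		}
	}
	return found
}

// configExposed fetches baseURL + "/.git/config" and reports whether the
// response is a 200 whose body looks like a git config file.
func configExposed(client *http.Client, baseURL string) (bool, error) {
	resp, err := client.Get(baseURL + "/.git/config")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, nil
	}
	// Cap the read at 64 KiB; real config files are small.
	body, err := io.ReadAll(io.LimitReader(resp.Body, 64<<10))
	if err != nil {
		return false, err
	}
	return looksLikeGitConfig(string(body)), nil
}

func main() {
	// Local test server standing in for a misconfigured website.
	mux := http.NewServeMux()
	mux.HandleFunc("/.git/config", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "[core]\n\trepositoryformatversion = 0\n\tbare = false\n")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	client := &http.Client{Timeout: 10 * time.Second}
	exposed, err := configExposed(client, srv.URL)
	fmt.Println(exposed, err) // true <nil>
}
```

To cover both schemes, call configExposed once with the http:// base URL and once with the https:// one.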

Why are there so many vulnerable git repositories?

As described above, this is due to a simple mistake which can be made by anyone. In separate research, we found vulnerable repositories at multiple conglomerate-sized companies (including a major search engine) with public bug bounty programs. The mistake originates from thinking you are safe once directory listing is disabled.

Impact

Developers often commit all kinds of data to a git repository - including database passwords, secret keys, and even SSH keys. If your website is vulnerable, an attacker could potentially access all data stored in your git history.

From there, they could use the data to launch further attacks or gain deeper access into your system.

Resolution

By far the best way to solve this is to avoid putting your git repository in your website root. Often content management systems and web application frameworks include all code within the web root. This is terrible practice. Your application can nearly always run with only one file in the website root (for example index.php), by changing the references within it.

If this is not possible for you, then you should use a post-receive hook to push your code from a bare repository to the website root. Alternatively, leave it in your website root and ensure you disable not only directory listing, but direct file access as well.


- Published 14 March 2018. Article written by Oli.