Scraping Gambling Websites (and crypto markets) for Fun and Profit

How to use web scraping to find arbitrage opportunities on popular bookies' websites (and cryptocurrency markets)

An interest in the fluctuations of cryptocurrency markets led me to arbitrage, which eventually led me to look for arbitrage opportunities between Australian gambling websites such as TAB and Sportsbet.

This article will explain some of my conclusions, and show some of the code and ideas used to collect and process data.

What is arbitrage?

Arbitrage is buying and selling the same thing (generally a currency) at the same time in order to profit from price inconsistencies.

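For example, you could convert one currency into a second, the second into a third, and the third back into the first. If the three exchange rates are inconsistent with each other, you end up with more than you started with.

This is an example of three-way arbitrage, and of course it is virtually impossible in practice due to the high fees charged by currency exchanges.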

However, arbitrage is possible in cryptocurrency markets due to their high degree of fluctuation, and it's also possible in sports betting markets.

Read more about crypto market arbitrage or sports betting arbitrage

Crypto Market Arbitrage

Web scraping wasn't used to find arbitrage opportunities here; however, similar techniques were used, so it is worth discussing.

Let's define some cryptocurrency markets:

ltc_btc,eth_btc,bch_btc,btc_usdt,eth_usdt,ltc_usdt,bch_usdt,etc_eth,etc_btc,etc_usdt

Now we need paths through these markets that might contain arbitrage opportunities. At first glance you could come up with some manually (e.g. btc->ltc->usdt->btc), but it is better to generate them with a function so we cover all possibilities. We used a doubly linked list:

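        // In Golang we might create a doubly linked list like this.

        // Node is one market traded in one direction, linked to its
        // neighbours in the path.
        type Node struct {
            StartCurrency   string
            FinishCurrency  string
            Prev            *Node // nil for the head node
            Next            *Node // nil for the last node
        }

        // Path points at the first node of one candidate route.
        type Path struct {
            Head            *Node
        }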

We then brute-forced all possible paths from all possible starting currencies.

If we start from each of the currencies above, we find 41 possible paths, each between two and four markets long: for example, ltc_btc->ltc_btc or ltc_btc->ltc_usdt->btc_usdt.

The next step was to collect some data to try and find arbitrage opportunities. Thankfully I was able to use okex.com's API here.

Once data was imported, the next step was just a matter of updating the data regularly and searching for arbitrage opportunities.

To do this, we could add a couple more items to our Node struct:

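        type Node struct {
            StartCurrency   string
            FinishCurrency  string
            FromPrice       float64
            ToPrice         float64
            ProfitLoss      float64 // profit or loss at this node
            Prev            *Node
            Next            *Node
        }

        // Path points at the first node and carries the net result.
        type Path struct {
            Head            *Node
            ProfitLoss      float64 // net profit or loss of the whole path
        }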

The new fields above could be used to calculate the profit or loss at each node, and to show the net profit or loss of each path.

What was the result of all this?

A literal money-generating machine - I became an overnight millionaire with just a few hundred lines of Golang and some MySQL!

That was sarcasm. However, there were loads of opportunities with quite high margins, up to 0.05%. This may not seem like a lot, but if you are trading instantly and there are hundreds of these opportunities each day, you could turn a high profit in the short term.

Downfalls

Despite the potential (hundreds of possible ~0.05% profit trades per day) I decided not to pursue this further for a couple of reasons:

  1. The trades would have to be actioned instantly. To achieve this, one would need a fairly large 'bank' held in each currency.
  2. Crypto market pricing fluctuates dramatically. This point is interesting as it is also the reason arbitrage situations are so common, however this also brings a huge risk: if we were to store a large amount of funds in each currency to trade arbitrage, how soon would our short term profit turn into -40%, or -300%?

Sports Betting Arbitrage

Considering the wild fluctuations of crypto markets led me to search for a market with more consistency, but with enough fluctuation for arbitrage opportunities to appear.

Eventually I found sports betting was a logical alternative: a relatively consistent market with some rather wild 'in game' fluctuations, and some minor price differences between different bookies.

There were a couple of paths I could take here.

1. Betfair

The first was to use Betfair.com.au's API to explore different markets and find arb situations within Betfair, without using a second or third gambling provider. This proved difficult, which makes sense: anyone can create a bot which looks for the obvious arb situations (such as backing both sides of a tennis match when both are priced above 2.0), so those are presumably snapped up quickly.

2. Web scraping Bookies sites

Collecting data from bookies' websites allows us to compare prices for the same market across bookies, and to back opposing outcomes wherever those prices diverge. In Australia there are not many bookies, so this is a somewhat difficult task - they seem to have aligned their odds reasonably well with each other.

Arbitrage between sports betting bookies is not illegal; however, they will ban you if they find out, as it is one of the few techniques with which you can turn a long-term gambling profit.

I have read countless articles on the internet claiming that 'web scraping in 2019+ is becoming impossible due to the complex nature of websites'. That is in fact absolute bullsh!t. Often these people claim that AJAX and JavaScript are too complex for their HTML-parsing web scraper, but what they fail to realise is that there is more than one way of obtaining data. Generally an AJAX-heavy website is calling API endpoints, which means you can likely get your hands on some nicely formatted JSON (much nicer than HTML!)

For example, simply visiting tab.com.au and sportsbet.com.au, opening Chrome developer tools and clicking the network tab will show us a number of API endpoints returning some already nicely structured data.

All we would then have to do is collect data from these endpoints - throw it into a database - and compare market names, outcomes and pricing data from the two bookies. If we add a couple more bookies, we increase our chances of finding an arbitrage situation.

Conclusion

From my somewhat limited research (two bookies and just a few hundred markets), arbitrage situations in Australia are hard to come by, although they do exist. With such a low number of bookies, they all seem to align their odds. Overseas, with a larger number of providers, this may prove less difficult.

- published 30 December 2018. Article written by Oli.


If you wish to get your hands on some betting data scraped from publicly available sources, feel free to contact us here :-)