Tuesday, November 20, 2012

Blocking Phishing Sites

The other day I was asked how the popular browsers stop you from visiting malicious websites. An interesting question: that the browser has to check back with some database seems logical, but the actual inner workings were a mystery to me, so I looked it up.

Firefox, Safari and Chrome all use the same technology called the Google Safe Browsing API. Microsoft Internet Explorer uses a technology called SmartScreen Filter.

Google has a complete website with the technical details of the Safe Browsing API. Basically it works like this:

You type in the website you want to visit in your browser:
http://www.somehost.com/path/page.html?args

Your browser will need to check the following paths against the database:

http://www.somehost.com/path/page.html?args
http://www.somehost.com/path/page.html
http://www.somehost.com/path/
http://www.somehost.com/
http://somehost.com/path/page.html?args
http://somehost.com/path/page.html
http://somehost.com/path/
http://somehost.com/
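
To make the expansion concrete, here is a minimal Python sketch that produces the same eight strings from the example URL. The function name lookup_candidates is my own invention, and the rules are only what can be read off the list above (strip leading host labels down to two, strip the query, then walk up the directories); the real API has more rules, so treat this as an illustration rather than the actual client logic.

```python
from urllib.parse import urlsplit


def lookup_candidates(url):
    """Return the URL variants to check, most specific first.

    Hypothetical helper: it mirrors the list in this post, not the
    full set of rules a real Safe Browsing client applies. It does
    not special-case IP address hosts.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path or "/"

    # Host variants: the full host, then shorter suffixes down to two labels.
    labels = host.split(".")
    hosts = [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]

    # Path variants: full path with query, full path, then each parent directory.
    paths = []
    if parts.query:
        paths.append(path + "?" + parts.query)
    paths.append(path)
    directories = path.split("/")[1:-1]  # directory components only
    for depth in range(len(directories), -1, -1):
        prefix = "/" + "".join(d + "/" for d in directories[:depth])
        if prefix not in paths:
            paths.append(prefix)

    return [f"{parts.scheme}://{h}{p}" for h in hosts for p in paths]


if __name__ == "__main__":
    for candidate in lookup_candidates("http://www.somehost.com/path/page.html?args"):
        print(candidate)
```

Running it on the example URL prints the eight variants in the same order as the list above.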


As you can see, this is quite a lot of strings, and looking up strings in a database is usually slow. The trick used here is that a hash is calculated for each string and only a 4-byte prefix of it is sent to the database. In case of a match, the database returns all the full hashes that start with that prefix, and the client can then check whether its own full hash is in the returned list.

If the full hash matches, the end user is warned about it; otherwise the page gets loaded.
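
The prefix trick can be sketched in a few lines of Python. This is a rough illustration under my own assumptions: I use SHA-256 as the hash function (the description above does not name one), and make_fake_database, fetch_full_hashes and is_blocked are made-up names standing in for the real database and client code.

```python
import hashlib

PREFIX_LEN = 4  # bytes, the prefix length mentioned above


def full_hash(expression):
    """Full hash of one URL variant (SHA-256 is an assumption here)."""
    return hashlib.sha256(expression.encode("utf-8")).digest()


def make_fake_database(bad_urls):
    """Build a stand-in for the remote database.

    Returns a function that maps a 4-byte prefix to the set of full
    hashes starting with that prefix.
    """
    table = {}
    for url in bad_urls:
        digest = full_hash(url)
        table.setdefault(digest[:PREFIX_LEN], set()).add(digest)
    return lambda prefix: table.get(prefix, set())


def is_blocked(candidates, fetch_full_hashes):
    """Return True if any candidate's full hash is in the list
    returned for its prefix."""
    for candidate in candidates:
        digest = full_hash(candidate)
        # Only the short prefix leaves the client.
        matches = fetch_full_hashes(digest[:PREFIX_LEN])
        # The full comparison happens locally.
        if digest in matches:
            return True
    return False


if __name__ == "__main__":
    database = make_fake_database(["http://somehost.com/path/"])
    print(is_blocked(["http://www.somehost.com/", "http://somehost.com/path/"], database))
    print(is_blocked(["http://example.org/"], database))
```

A nice side effect of this design is that the database only ever sees a short prefix, which many different URLs can share, instead of the full address you are visiting.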

As sources for their database they mention Antiphishing.org and Stopbadware.org, but you can be pretty sure that these are not the only sources.

On the Windows Live blog I found more information on how the Microsoft SmartScreen Filter works. The example Microsoft gives is the following:

Let's say a malicious website is hosted at canada-pharmacy.us. The URL gets marked as "bad" in the database, and the IP address hosting it is marked as "bad" too. SmartScreen then generalizes this to IP addresses in the neighborhood. This is done based on ASN blocks, the way IP addresses are grouped by owner.
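
Microsoft does not publish how this generalization is implemented, so the following Python sketch is only my guess at the general idea: an address becomes suspicious when it belongs to the same owner as a known bad address. The BLOCKS table, the AS numbers and the function names are all hypothetical; the address ranges and AS numbers used are ones reserved for documentation.

```python
import ipaddress

# Hypothetical table mapping address blocks to the AS number that owns them.
# Both the ranges and the AS numbers are reserved for documentation use.
BLOCKS = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
}


def asn_for(ip):
    """Return the AS number owning this address, or None if unknown."""
    address = ipaddress.ip_address(ip)
    for network, asn in BLOCKS.items():
        if address in network:
            return asn
    return None


def suspicious_by_neighborhood(ip, known_bad_ips):
    """Return True if ip shares an owner (ASN) with any known bad address."""
    asn = asn_for(ip)
    if asn is None:
        return False
    return any(asn_for(bad) == asn for bad in known_bad_ips)


if __name__ == "__main__":
    bad = ["192.0.2.200"]
    print(suspicious_by_neighborhood("192.0.2.10", bad))    # same block as the bad address
    print(suspicious_by_neighborhood("198.51.100.5", bad))  # different owner
```

A real system would presumably weigh this as one signal among many instead of blocking outright, since plenty of innocent sites share a hosting provider with a bad one.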

DNS server ratings are also part of the SmartScreen technology. According to the blog, DNS servers that seem to know just a little too much about abusive domains are given a lower rating. Unfortunately, for a techie this is a meaningless description.

SmartScreen's telemetry comes from reports from end users, third parties, traffic from URLs showing up in e-mails and logs. These feeds go into machine learning algorithms that either flag or pass a URL. When the algorithm is in doubt, the information is handed to an analyst who does the necessary research.

In conclusion, I would say: make sure you use a browser that has such a technology enabled. It is not perfect, but it is free and better than having nothing.

