TechCrunch » Google Presents Code Search
Google have today launched Code Search, a search engine and index of source code that is collected from publicly available sources. Google claims that the new code search engine will be able to find almost any code that its crawler can find, but in a few specific searches I did I couldn’t find some code that I had hosted on my own server – but this is sure to improve. It does seem that the Google index of source code is a lot broader than those found at competing sites Krugle and Koders. For instance, Google Code Search will index the content of zip and tarball files on open source sites such as openssl.org, while the other search sites seem to return a lot of results from SourceForge and a few other centralized repositories.
The first thing you notice at Google Code Search is that you can use regular expressions in the query field when searching, and there are a lot of search options to help you further refine what you are looking for. On the front page of Google Code Search there is a nice overview with some pointers on using the service.
I was reading the above article and going through the comments when I realized that hundreds of people are at risk. Some people make backups of their WordPress install and, without thinking, just leave the zip file there. I say “without thinking” because, with Google’s new Code Search and a well-chosen query, anyone can turn up hundreds of database usernames and passwords.
I quickly double-checked, and the default install blocks access to the config file, but those who made a backup and then left the file there to be indexed by Google… they should change their database credentials.
Tags: Google, Interesting, Programming, Scary, Security, Wordpress




[...] Update: Patrick has an interesting post about the potential security holes created by codesearch . [...]
What I just posted to the wp-hackers list in response to an email about your post:
There really is nothing to worry about because if Google can spider it now, it could before, and people could download your backups.
The way I looked at it was this: if you uploaded your blog to a folder which a spider hit… and then you extracted the zip and got your blog going… the zip was still there… so it wouldn’t give a 404 and be removed. A person leaves it there “just in case”, not realizing that there is a security issue with it.
Or they put it in a backups folder or something with directory listing on, and somehow linked to that folder in a way that spiders hit it.
Once a spider knows a folder is there… it will scan it unless told not to, even if it hasn’t been linked to in a long time… (as long as there is SOMETHING in there).
There is something to worry about, though. You’re right, if you had a backup tarball in place before, the information was there all along. Someone could download your backups and grep them for passwords or the right filenames.
But Google just made it ten million times easier to find. Before, J.Random.Hacker had to download every backup tarball on the internet (or just yours, by a lucky strike). Now they can skip that step.
So … .htaccess your backups, or don’t keep them in the webroot.
Not just WordPress security, of course. Anything with a password in a config file that is protected by virtue of getting special treatment from the webserver.
db.inc.php … config.php … wp-config.php … .htaccess … configuration.php
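To check your own server for the problem described above, something along these lines could help. It is only a rough sketch: it walks a directory you point it at, opens any zip or tar archive it finds, and reports the ones containing files named like the configs listed above. The names `SENSITIVE_NAMES`, `archive_members` and `find_risky_archives` are made up for this example, and the filename list is just the one from this thread, not a complete inventory.

```python
import tarfile
import zipfile
from pathlib import Path

# Config file names taken from the list above (assumption: extend for your own apps).
SENSITIVE_NAMES = {
    "db.inc.php",
    "config.php",
    "wp-config.php",
    ".htaccess",
    "configuration.php",
}


def archive_members(path):
    """Return the member names of a zip or tar archive, or [] if it is neither."""
    name = str(path)
    try:
        if zipfile.is_zipfile(name):
            with zipfile.ZipFile(name) as zf:
                return zf.namelist()
        if tarfile.is_tarfile(name):
            with tarfile.open(name) as tf:
                return tf.getnames()
    except OSError:
        # Unreadable file (permissions, broken link): skip it.
        return []
    return []


def find_risky_archives(webroot):
    """Map each archive under webroot to the sensitive file names found inside it."""
    findings = {}
    for path in Path(webroot).rglob("*"):
        if not path.is_file():
            continue
        hits = sorted(
            {m for m in archive_members(path) if Path(m).name in SENSITIVE_NAMES}
        )
        if hits:
            findings[str(path)] = hits
    return findings
```

Calling `find_risky_archives("/path/to/your/webroot")` returns a dictionary of archive paths and the matching entries inside each. Anything it reports should be moved out of the webroot or blocked from web access, and the passwords inside should be changed if the archive has been publicly reachable.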
[...] UPDATE: Just found this on the WordPress wp-hackers mailing list, about finding WordPress database usernames and passwords on Google’s codesearch… Can’t protect stupid users from themselves… [...]
[...] Patrick has an interesting post about the potential security holes created by codesearch Google Presents Code Search and its threat to WordPress security. This is similar to problems with leaving any file on your server, even if it’s not directly web [...]