There is a lot of data on the web meant to be looked at by people, but how do you turn it into a spreadsheet you could actually analyze statistically?
The technique of turning web pages intended for people into structured data sets intended for computers is called "screen scraping." It has just been made easier by a wiki/community site:
http://scraperwiki.com/.
They provide libraries to extract information from PDF and Excel files, to fill in forms automatically, and so on. Moreover, the community aspect should allow researchers doing similar things to connect with each other. It's very good.
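To give a flavor of what screen scraping looks like in practice, here is a minimal sketch using only Python's standard library: an HTML table meant for human eyes is turned into rows a program can analyze. The table contents and column names below are invented for illustration; a real scraper would fetch the page with urllib.request instead of a hard-coded string.

```python
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collect the cell text of every row in an HTML table."""
    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows
        self._row = None      # row currently being built
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

# Hypothetical page content, made up for this example.
page = """
<table>
  <tr><th>Road</th><th>Accidents</th></tr>
  <tr><td>A565</td><td>12</td></tr>
  <tr><td>B5193</td><td>3</td></tr>
</table>
"""

scraper = TableScraper()
scraper.feed(page)
print(scraper.rows)
# [['Road', 'Accidents'], ['A565', '12'], ['B5193', '3']]
```

From here the rows could be written out with the csv module, which is exactly the kind of "web page in, spreadsheet out" step that scraping is about.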
Road Accidents -
http://scraperwiki.com/scrapers/show/sefton-mbc-road-accidents/

Port of London Arrivals -
http://scraperwiki.com/scrapers/show/port-of-london-arrivals/

You can already find collections of structured data online. Examples are Infochimps ("find the world's data"):
http://infochimps.org/datasets

Freebase ("An entity graph of people, places and things, built by a community that loves open data."):
http://www.freebase.com/

There's also a repository system for data, TheData ("An open-source application for publishing, citing and discovering research data"):
http://thedata.org/home