
Bookmarks tagged with “scraper”

17 bookmarks by garrettc


sqlite-html

A SQLite extension for querying, manipulating, and creating HTML elements.




Scrapism

“Web scraping describes techniques for automatically downloading and processing web content, or converting online text and other media into structured data that can then be used for various purposes. In short, the user writes a program to browse and analyze the web on their behalf, rather than doing so manually.”



A javascript rendering service — Splash 3.5 documentation

Splash is a JavaScript rendering service. It’s a lightweight web browser with an HTTP API, implemented in Python 3 using Twisted and Qt 5.



trafilatura

Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments) - adbar/trafilatura





Scrapy

An open source and collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.




Workbench – The data journalism platform of the future

“Valuable data is found by reporters on a daily basis but analytic processes are intimidating. Data teams get flooded with requests, reporters lose autonomy, and important stories are lost. We’re building Workbench to give reporters the power of data-science, no code required.”




MattMcFarland/SUq

SUq - A nodejs Scraping Utility for lazy people. MIT Licensed



Morph

Get structured data out of the web. All code and collaboration through GitHub. Write your scrapers in Ruby, Python, PHP or Perl. Simple API to grab data. Schedule scrapers or run manually.




node.io

A distributed data scraping and processing framework running on node.js.



Scraping for Journalism: A Guide for Collecting Data

Dan Nguyen gives an overview of how to pull data from Flash, PDF and HTML using a wide set of tools.



Vaguely live map of trains in the United Kingdom

Another nifty web app from Matthew Somerville, he of TheyWorkForYou and PledgeBank. View near-real-time information about trains approaching particular stations.
