data-services-helpers

A module containing classes and functions previously used by The Sensible Code Company's Data Services team.

(The Sensible Code Company is now Cantabular Ltd.)

Warning: This is now in maintenance mode.

It has been updated to work with Python 3.14 but there are no guarantees on future maintenance.

Installation

For the current release:

pip install dshelpers

Usage

batch_processor

with batch_processor(callback_function, batch_size=2000) as b:
    # loop to make rows here
    b.push(row)

Here, push queues items in a list on the batch_processor. When the context manager exits, it calls callback_function with the list of queued items.

Often used to bundle multiple calls to scraperwiki.sqlite.save when saving data to a database.
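
To illustrate the pattern described above, here is a minimal, self-contained sketch of a batching context manager. It is not the module's own code: BatchProcessorSketch and save_rows are hypothetical names, and flushing as soon as the queue reaches batch_size is an assumption about what batch_size means.

```python
class BatchProcessorSketch:
    """Illustrative stand-in; not the dshelpers implementation."""

    def __init__(self, callback, batch_size=2000):
        self.callback = callback
        self.batch_size = batch_size
        self.queue = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Hand any remaining items to the callback when the block exits.
        self.flush()
        return False  # do not swallow exceptions

    def push(self, item):
        self.queue.append(item)
        # Assumption: a full batch is flushed once it reaches batch_size.
        if len(self.queue) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.queue:
            self.callback(self.queue)
            self.queue = []


saved_batches = []


def save_rows(rows):
    # Hypothetical callback; a real one might write the rows to a database.
    saved_batches.append(list(rows))


with BatchProcessorSketch(save_rows, batch_size=2) as b:
    for n in range(5):
        b.push({"id": n})
```

Note that the callback is passed as a function object, not called, so the processor can invoke it later with each list of rows.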

install_cache

install_cache(expire_after=12 * 3600, cache_post=False)

Installs a requests_cache cache; requires the requests-cache package.

expire_after is the cache expiry time in seconds.

cache_post controls whether HTTP POST requests are cached as well.

download_url

download_url(url, back_off=True, **kwargs)

Retrieve the content of url, by default using requests.request('GET', url), and return response.content as a file-like StringIO object. If back_off=True, a failed request is retried with backoff; otherwise, only one attempt is made.

The **kwargs can be arguments that requests recognises, e.g. method or headers.
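
The retry behaviour can be pictured with a generic retry-with-backoff sketch. This is not the module's retry logic, which is not documented here: fetch_with_backoff, its parameters, the doubling delay and the choice to wrap bytes in a BytesIO are all assumptions made for illustration. The fetcher is a stand-in so the example runs without network access.

```python
import io
import time


def fetch_with_backoff(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            # Wrap the returned bytes in a file-like object.
            return io.BytesIO(fetch())
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2


# Demonstration with a fake fetcher that fails twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return b"hello"


# Record the requested waits instead of actually sleeping.
result = fetch_with_backoff(flaky_fetch, sleep=waits.append)
```

Injecting sleep keeps the demonstration instant; in real use the default time.sleep would pause between attempts.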

request_url

request_url(url, back_off=True, **kwargs)

Like download_url, but returns the response object instead of a file-like object.

Tests

Run with make test.
