The Provider object is responsible for retrieving metadata about a given URL. Its request() method takes a URL and any parameters and sends them to an endpoint. The endpoint should return a JSON dictionary containing metadata about the resource, which is then returned to the caller.
Retrieve information about the given url. By default, this makes an HTTP GET request to the endpoint. The url is sent to the endpoint along with any parameters specified in extra_params and any parameters specified when the class was instantiated.
Raises a ProviderException if the URL is not accessible or the API times out.
| Return type: | a dictionary of JSON data |
|---|---|
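The way the URL and the two sets of parameters are combined into one GET request can be sketched as follows. This is a minimal illustration, not the library's implementation: `build_request_url` is a hypothetical helper, the `maxwidth` and `maxheight` parameters are example values, and the assumption that per-request parameters override those given at instantiation is mine.

```python
from urllib.parse import urlencode


def build_request_url(endpoint, url, base_params=None, **extra_params):
    """Hypothetical helper: compose the GET URL a provider might request."""
    params = dict(base_params or {})  # parameters given at instantiation
    params.update(extra_params)       # per-request parameters (assumed to win)
    params['url'] = url               # the resource being looked up
    # Sort the items so the resulting query string is deterministic.
    return endpoint + '?' + urlencode(sorted(params.items()))


request_url = build_request_url(
    'http://www.youtube.com/oembed',
    'http://www.youtube.com/watch?v=54XHDUOHuzU',
    base_params={'maxwidth': 400},
    maxheight=300,
)
print(request_url)
```

The resource URL is percent-encoded so that it can travel safely as a single query-string value.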
A registry for encapsulating a group of Provider instances. It has optional caching support.
Handles matching regular expressions to providers. URLs are sent to the registry via its request() method. The registry checks whether it has a provider that matches the URL and, if so, requests the metadata from that provider instance.
| Parameters: | cache – the cache simply needs to implement two methods, .get(key) and .set(key, value). |
|---|---|
Register the provider with the given regex.
Example:
registry = ProviderRegistry()
registry.register(
    r'http://\S*.youtu(\.be|be\.com)/watch\S*',  # raw string keeps the regex escapes intact
    Provider('http://www.youtube.com/oembed'),
)
Retrieve information about the given url if it matches a regex in the instance’s registry. If no provider matches the URL, a ProviderException is raised; otherwise, the URL and parameters are dispatched to the matching provider’s Provider.request() method.
If a cache was specified, the resulting metadata will be cached.
| Return type: | a dictionary of JSON data |
|---|---|
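The dispatch-and-cache flow can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: `SketchRegistry`, `StubProvider`, and `DictCache` are hypothetical stand-ins, the URL is assumed to be the cache key, matching is assumed to use `re.match`, and `LookupError` stands in for ProviderException.

```python
import re


class DictCache:
    """Smallest object satisfying the cache interface: get(key) and set(key, value)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class StubProvider:
    """Stand-in provider: returns canned metadata instead of calling an endpoint."""

    def __init__(self, name):
        self.name = name
        self.calls = 0

    def request(self, url, **extra_params):
        self.calls += 1
        return {'provider': self.name, 'url': url}


class SketchRegistry:
    """Illustrative registry: regex-to-provider lookup with an optional cache."""

    def __init__(self, cache=None):
        self.cache = cache
        self._providers = []  # list of (compiled regex, provider) pairs

    def register(self, regex, provider):
        self._providers.append((re.compile(regex), provider))

    def request(self, url, **extra_params):
        if self.cache is not None:
            cached = self.cache.get(url)
            if cached is not None:
                return cached
        for pattern, provider in self._providers:
            if pattern.match(url):
                data = provider.request(url, **extra_params)
                if self.cache is not None:
                    self.cache.set(url, data)
                return data
        # Stand-in for the ProviderException described above.
        raise LookupError('no provider matches %s' % url)


cache = DictCache()
registry = SketchRegistry(cache=cache)
youtube = StubProvider('youtube')
registry.register(r'http://\S*.youtu(\.be|be\.com)/watch\S*', youtube)

url = 'http://www.youtube.com/watch?v=54XHDUOHuzU'
first = registry.request(url)   # dispatched to the provider
second = registry.request(url)  # served from the cache
```

In this sketch the second call never reaches the provider, which is the point of passing a cache to the registry.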
Create a ProviderRegistry and register some basic providers, including YouTube, Flickr, and Vimeo.
| Parameters: | cache – an object that implements simple get and set |
|---|---|
| Return type: | a ProviderRegistry with a handful of providers registered |
Create a ProviderRegistry and register as many providers as are supported by embed.ly. Valid services are fetched from http://api.embed.ly/1/services/python, parsed, and then registered.
| Return type: | a ProviderRegistry with support for embed.ly |
|---|---|
# if you have an API key, you can specify that here
pr = bootstrap_embedly(key='my-embedly-key')
pr.request('http://www.youtube.com/watch?v=54XHDUOHuzU')
Create a ProviderRegistry and register as many providers as are supported by noembed.com. Valid services are fetched from http://noembed.com/providers, parsed, and then registered.
| Return type: | a ProviderRegistry with support for noembed |
|---|---|
# additional request parameters, such as nowrap, can be specified here
pr = bootstrap_noembed(nowrap=1)
pr.request('http://www.youtube.com/watch?v=54XHDUOHuzU')
Parse a block of text, converting all links by passing them to the given handler. Links contained within a block of text (i.e. not on their own line) will be handled as well.
Example input and output:
IN: 'this is a pic http://example.com/some-pic/'
OUT: 'this is a pic <a href="http://example.com/some-pic/"><img src="http://example.com/media/some-pic.jpg" /></a>'
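The example input and output above can be reproduced with a small sketch. It is not the library's parser: `convert_links` and `image_handler` are hypothetical names, the URL pattern is deliberately simplified, the handler signature is an assumption, and metadata comes from a hard-coded dictionary instead of a provider.

```python
import re

# Simplified URL pattern (an assumption; real-world URL matching is more involved).
URL_RE = re.compile(r'https?://[^\s<>"]+')


def convert_links(text, lookup, handler):
    """Replace each URL that has metadata with handler(url, metadata)."""
    def replace(match):
        url = match.group(0)
        metadata = lookup(url)
        if metadata is None:
            return url  # URLs without metadata are left untouched in this sketch
        return handler(url, metadata)
    return URL_RE.sub(replace, text)


def image_handler(url, metadata):
    """Render a photo resource as a linked image."""
    return '<a href="%s"><img src="%s" /></a>' % (url, metadata['url'])


# Canned metadata standing in for a provider response.
FAKE_METADATA = {
    'http://example.com/some-pic/': {
        'type': 'photo',
        'url': 'http://example.com/media/some-pic.jpg',
    },
}

result = convert_links(
    'this is a pic http://example.com/some-pic/',
    FAKE_METADATA.get,
    image_handler,
)
print(result)
```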
Very similar to the above parse_text_full() except URLs on their own line are rendered using the given handler, whereas URLs within blocks of text are passed to the block_handler. The default behavior renders full content for URLs on their own line (e.g. a flash player), whereas URLs within text are rendered simply as links so as not to disrupt the flow of text.
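The distinction between URLs on their own line and URLs inside running text can be sketched like this. The function and handler names are hypothetical, the handlers take only the URL (the library's handler signatures may differ), and the URL pattern is simplified.

```python
import re

# Simplified URL pattern (an assumption).
URL_RE = re.compile(r'https?://[^\s<>"]+')


def convert_by_position(text, handler, block_handler):
    """Use handler for URLs alone on a line, block_handler for URLs within text."""
    rendered_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if URL_RE.fullmatch(stripped):
            # The whole line is a single URL: render full content.
            rendered_lines.append(handler(stripped))
        else:
            # URLs embedded in text: render something unobtrusive.
            rendered_lines.append(
                URL_RE.sub(lambda m: block_handler(m.group(0)), line)
            )
    return '\n'.join(rendered_lines)


def full_handler(url):
    return '<iframe src="%s"></iframe>' % url


def link_handler(url):
    return '<a href="%s">%s</a>' % (url, url)


text = 'http://example.com/video/1\nsee http://example.com/video/2 for more'
rendered = convert_by_position(text, full_handler, link_handler)
print(rendered)
```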
Parse HTML intelligently, rendering URLs that stand on their own within block elements as full content (e.g. a flash player), whereas URLs within text are passed to the block_handler, which by default renders a simple link. Note also that URLs already enclosed within an <a> tag are skipped.
Note
Requires BeautifulSoup or beautifulsoup4.
Extract all URLs from a block of text, and additionally get any metadata for URLs we have providers for.
| Return type: | returns a 2-tuple containing a list of all URLs and a dictionary keyed by URL containing any metadata. If a provider was not found for a URL it is not listed in the dictionary. |
|---|---|
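The shape of the return value is easiest to see in a small sketch. `extract_urls` is a hypothetical function, not the library's implementation. The URL pattern is simplified, the `lookup` callable stands in for a provider registry, and dropping duplicate URLs is an assumption.

```python
import re

# Simplified URL pattern (an assumption).
URL_RE = re.compile(r'https?://[^\s<>"]+')


def extract_urls(text, lookup):
    """Return (list of URLs found, dict of metadata keyed by URL)."""
    all_urls = []
    metadata = {}
    for url in URL_RE.findall(text):
        if url in all_urls:
            continue  # duplicates are dropped in this sketch (assumption)
        all_urls.append(url)
        data = lookup(url)
        if data is not None:
            metadata[url] = data  # only URLs with a known provider appear here
    return all_urls, metadata


# Canned metadata standing in for provider responses.
KNOWN = {'http://example.com/a': {'title': 'A'}}

text = 'first http://example.com/a then http://example.com/b and again http://example.com/a'
urls, found = extract_urls(text, KNOWN.get)
print(urls)
print(found)
```

Every URL appears in the list, but only those with metadata appear in the dictionary.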
Extract all URLs from an HTML string, and additionally get any metadata for URLs we have providers for. Same as extract() but for HTML.
Note
URLs within <a> tags will not be included.
| Return type: | returns a 2-tuple containing a list of all URLs and a dictionary keyed by URL containing any metadata. If a provider was not found for a URL it is not listed in the dictionary. |
|---|---|
A reference implementation for the cache interface used by the ProviderRegistry.
Retrieve the value stored under key from the cache, or None if it is not present.
Set the cache key key to the given value.
A cache that uses pickle to store data.
Note
To use this cache class be sure to call load() when initializing your cache and save() before your app terminates to persist cached data.
Load the pickled data into memory.
Store the internal cache to an external file.
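The load-on-start, save-on-exit pattern described in the note can be sketched with the standard library. `FilePickleCache` is a hypothetical class written for illustration. It follows the get/set interface plus load() and save(), but it is not the library's own cache class, and its constructor argument is an assumption.

```python
import os
import pickle
import tempfile


class FilePickleCache:
    """Illustrative cache: an in-memory dict persisted to disk with pickle."""

    def __init__(self, filename):
        self.filename = filename
        self._cache = {}

    def get(self, key):
        return self._cache.get(key)

    def set(self, key, value):
        self._cache[key] = value

    def load(self):
        # Nothing to load on the very first run.
        if os.path.exists(self.filename):
            with open(self.filename, 'rb') as fh:
                self._cache = pickle.load(fh)

    def save(self):
        with open(self.filename, 'wb') as fh:
            pickle.dump(self._cache, fh)


path = os.path.join(tempfile.mkdtemp(), 'cache.pkl')

# First "run" of the application: load, use, then save before exiting.
first = FilePickleCache(path)
first.load()
first.set('http://example.com/a', {'title': 'A'})
first.save()

# A later "run" sees the persisted data after calling load().
second = FilePickleCache(path)
second.load()
restored = second.get('http://example.com/a')
print(restored)
```

Without the call to save(), the second instance would start empty, which is why the note asks for both calls. Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.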
A cache that uses Redis to store data.
Note
Requires the redis-py library (pip install redis).