News
counts(*, version=None, per_platform=False)
Retrieve the number of news assets in the metadata catalogue.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
version | str \| None | The version of the endpoint (default is None). | None |
per_platform | bool | Whether to list counts per platform (default is False). | False |
Returns:
Type | Description |
---|---|
int \| dict[str, int] | The number of news assets in the metadata catalogue. If per_platform is True, a dictionary with platform names as keys and the number of news assets from that platform as values. |
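Because `counts` returns either an `int` or a `dict[str, int]` depending on `per_platform`, calling code may want to handle both shapes uniformly. The following is a minimal sketch, assuming a hypothetical helper `total_count` (not part of the library) and made-up platform names and numbers:

```python
from __future__ import annotations


def total_count(result: int | dict[str, int]) -> int:
    """Collapse either return shape of counts() into a single total."""
    if isinstance(result, dict):
        # per_platform=True: sum the per-platform counts
        return sum(result.values())
    # per_platform=False: the result is already the total
    return result


# Made-up values standing in for counts() and counts(per_platform=True).
plain = total_count(42)
grouped = total_count({"platform_a": 30, "platform_b": 12})
print(plain, grouped)
```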
get_asset(identifier, *, version=None, data_format='pandas')
Retrieve metadata for a specific news asset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | int | The identifier of the news asset to retrieve. | required |
version | str \| None | The version of the endpoint (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a dict. | 'pandas' |
Returns:
Type | Description |
---|---|
Series \| dict | The retrieved metadata for the specified news asset. |
get_asset_from_platform(*, platform, platform_identifier, version=None, data_format='pandas')
Retrieve metadata for a specific news asset, identified by its external platform identifier.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
platform | str | The platform from which the news asset is retrieved. | required |
platform_identifier | str | The identifier under which the news asset is known on the platform. | required |
version | str \| None | The version of the endpoint (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a dict. | 'pandas' |
Returns:
Type | Description |
---|---|
Series \| dict | The retrieved metadata for the specified news asset. |
get_assets_async(identifiers, *, version=None, data_format='pandas')
async
Asynchronously retrieve metadata for a list of news identifiers.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifiers | list[int] | The list of identifiers of the news assets to retrieve. | required |
version | str \| None | The version of the endpoint (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a list of dicts. | 'pandas' |
Returns:
Type | Description |
---|---|
DataFrame \| list[dict] | The retrieved metadata for the specified news assets. |
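Since `get_assets_async` is a coroutine, it has to be awaited from within an event loop. The sketch below illustrates that calling pattern only. `fake_get_assets_async` is a hypothetical stand-in that mimics the documented signature with `data_format='json'`, and the records it returns are invented.

```python
import asyncio


async def fake_get_assets_async(identifiers, *, version=None, data_format="json"):
    """Hypothetical stand-in: returns one made-up dict per identifier."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return [{"identifier": i, "name": f"news-{i}"} for i in identifiers]


async def main():
    # In real code the stand-in would be replaced by the library coroutine.
    return await fake_get_assets_async([1, 2, 3])


records = asyncio.run(main())
print([r["identifier"] for r in records])
```

In a notebook or any other context where an event loop is already running, the coroutine would be awaited directly instead of being passed to `asyncio.run`.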
get_content(*, identifier, distribution_idx=0, version=None)
Retrieve the data content of a specific news asset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | int | The identifier of the news asset. | required |
distribution_idx | int | The index of a specific distribution from the distribution list (default is 0). | 0 |
version | str \| None | The version of the endpoint (default is None). | None |
Returns:
Type | Description |
---|---|
bytes | The data content for the specified news asset. |
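The return value is raw `bytes`, so the caller decides how to store or decode it. A minimal sketch of writing such a payload to disk follows. The helper `save_content`, the file name, and the placeholder payload are all assumptions made for illustration; no library call is made here.

```python
import tempfile
from pathlib import Path


def save_content(content: bytes, target: Path) -> int:
    """Write raw bytes to target and return the number of bytes written."""
    return target.write_bytes(content)


# Placeholder for the bytes that get_content would return.
content = b"example payload"

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "news_content.bin"
    written = save_content(content, path)
    round_trip = path.read_bytes()  # read back to confirm nothing was altered
```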
get_list(*, platform=None, offset=0, limit=10, version=None, data_format='pandas')
Retrieve a list of news assets from the catalogue.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
platform | str \| None | Return only metadata of news assets from this platform (default is None). | None |
offset | int | The offset for pagination (default is 0). | 0 |
limit | int | The maximum number of items to retrieve (default is 10). | 10 |
version | str \| None | The version of the endpoint (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a list of dicts. | 'pandas' |
Returns:
Type | Description |
---|---|
DataFrame \| list[dict] | The retrieved metadata in the specified format. |
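The `offset` and `limit` parameters allow the catalogue to be walked page by page. The sketch below shows one way to do that. `fake_get_list` is a hypothetical stand-in serving 25 invented records, and the stopping rule (a page shorter than `limit` marks the end) is an assumption about the endpoint's behaviour rather than something stated above.

```python
def fake_get_list(*, offset=0, limit=10, data_format="json"):
    """Hypothetical stand-in for get_list: serves slices of 25 made-up records."""
    catalogue = [{"identifier": i} for i in range(25)]
    return catalogue[offset : offset + limit]


def iterate_all(page_size=10):
    """Yield every record by advancing offset until a short page is returned."""
    offset = 0
    while True:
        page = fake_get_list(offset=offset, limit=page_size)
        yield from page
        if len(page) < page_size:
            break  # assumed end-of-catalogue signal
        offset += page_size


all_records = list(iterate_all())
print(len(all_records))
```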
get_list_async(*, offset=0, limit=100, batch_size=10, version=None, data_format='pandas')
async
Asynchronously retrieve a list of news assets from the catalogue in batches.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
offset | int | The offset for pagination (default is 0). | 0 |
limit | int | The maximum number of items to retrieve (default is 100). | 100 |
batch_size | int | The number of items in a batch (default is 10). | 10 |
version | str \| None | The version of the endpoint (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a list of dicts. | 'pandas' |
Returns:
Type | Description |
---|---|
DataFrame \| list[dict] | The retrieved metadata in the specified format. |
Raises:
Type | Description |
---|---|
ValueError | Batch size must be larger than 0. |
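To make the relationship between `offset`, `limit`, and `batch_size` concrete, the sketch below computes the `(offset, limit)` pair each batch would cover. The helper `batch_requests` is hypothetical, and how the library actually splits its requests is not stated here; this is one plausible scheme, including the documented `ValueError` for a non-positive batch size.

```python
def batch_requests(offset=0, limit=100, batch_size=10):
    """Return (offset, limit) pairs that together cover `limit` items."""
    if batch_size <= 0:
        raise ValueError("Batch size must be larger than 0.")
    end = offset + limit
    # The last batch is shortened when limit is not a multiple of batch_size.
    return [
        (start, min(batch_size, end - start))
        for start in range(offset, end, batch_size)
    ]


default_batches = batch_requests()
uneven_batches = batch_requests(limit=25, batch_size=10)
print(len(default_batches), uneven_batches)
```

With the documented defaults this yields ten batches of ten items each.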
search(query, *, platforms=None, offset=0, limit=10, search_field=None, get_all=True, version=None, data_format='pandas', asset_type)
Search the metadata of news assets using the Elasticsearch endpoint of the AIoD metadata catalogue.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | The string to be matched against the search fields. | required |
platforms | list[str] \| None | The platforms to filter the search results. If None, results from all platforms are returned (default is None). | None |
offset | int | The offset for pagination (default is 0). | 0 |
limit | int | The maximum number of results to retrieve (default is 10). | 10 |
search_field | None \| Literal['name', 'issn', 'description_html', 'description_plain'] | The specific field to search within. If None, the query is matched against all fields (default is None). | None |
get_all | bool | If True, a request to the database is made to retrieve all data. If False, only the indexed information is returned (default is True). | True |
version | str \| None | The version of the endpoint to use (default is None). | None |
data_format | Literal['pandas', 'json'] | The desired format for the response (default is "pandas"). For "json", the returned value is the decoded JSON, in this case a list of dicts. | 'pandas' |
Returns:
Type | Description |
---|---|
DataFrame \| list[dict] | The retrieved metadata in the specified format. |