News

counts(*, version=None, per_platform=False)

Retrieve the number of news assets in the metadata catalogue.

Parameters:

Name Type Description Default
version str | None

The version of the endpoint (default is None).

None
per_platform bool

Whether to list counts per platform (default is False).

False

Returns:

Type Description
int | dict[str, int]

The number of news assets in the metadata catalogue. If per_platform is True, a dictionary is returned instead, with platform names as keys and the number of news assets from each platform as values.
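Because the return type depends on per_platform, calling code may need to handle both shapes. The following is a minimal sketch: `total_news` is a hypothetical helper (not part of the library), and the platform names and numbers are invented for illustration.

```python
from __future__ import annotations


def total_news(result: int | dict[str, int]) -> int:
    """Collapse either return shape of counts() into a single total."""
    if isinstance(result, dict):
        # per_platform=True: sum the per-platform counts.
        return sum(result.values())
    # per_platform=False: the result is already the total.
    return result


# Illustrative values only; real counts come from the catalogue.
print(total_news(42))
print(total_news({"platform_a": 30, "platform_b": 12}))
```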

get_asset(identifier, *, version=None, data_format='pandas')

Retrieve metadata for a specific news asset.

Parameters:

Name Type Description Default
identifier int

The identifier of the news to retrieve.

required
version str | None

The version of the endpoint (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a dict.

'pandas'

Returns:

Type Description
Series | dict

The retrieved metadata for the specified news.
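With data_format="json" the result is a plain dict, so it can be handled with the standard json module. A small sketch, assuming a hypothetical record (the field names and values below are placeholders, not the real schema):

```python
import json

# Hypothetical record standing in for get_asset(..., data_format="json").
asset = {"identifier": 1, "name": "Example news item"}

text = json.dumps(asset, indent=2)  # serialise for storage or logging
restored = json.loads(text)         # decode back into a dict
```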

get_asset_from_platform(*, platform, platform_identifier, version=None, data_format='pandas')

Retrieve metadata for a specific news asset, identified by its external platform identifier.

Parameters:

Name Type Description Default
platform str

The platform from which the news asset is retrieved.

required
platform_identifier str

The identifier under which the news is known by the platform.

required
version str | None

The version of the endpoint (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a dict.

'pandas'

Returns:

Type Description
Series | dict

The retrieved metadata for the specified news.
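The bare `*` at the start of the signature makes every parameter keyword-only. The sketch below uses a local stand-in with the documented signature (it only echoes its arguments and does not contact the catalogue) to show what happens when the arguments are passed positionally. The platform name and identifier are made up.

```python
def get_asset_from_platform(*, platform, platform_identifier,
                            version=None, data_format="pandas"):
    """Stand-in with the documented signature; echoes its arguments."""
    return {"platform": platform, "platform_identifier": platform_identifier}


# Keyword arguments are required because of the bare `*` in the signature.
record = get_asset_from_platform(
    platform="example_platform", platform_identifier="abc-123"
)

try:
    get_asset_from_platform("example_platform", "abc-123")  # positional call
except TypeError as err:
    positional_error = str(err)
```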

get_assets_async(identifiers, *, version=None, data_format='pandas') async

Asynchronously retrieve metadata for a list of news identifiers.

Parameters:

Name Type Description Default
identifiers list[int]

The list of identifiers of the news to retrieve.

required
version str | None

The version of the endpoint (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a list of dicts.

'pandas'

Returns:

Type Description
DataFrame | list[dict]

The retrieved metadata for the specified news.
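Since this function is a coroutine, it has to be awaited or driven by an event loop. The following sketch uses a local stand-in coroutine with the documented signature (no network access, placeholder records) to show both call styles.

```python
import asyncio


async def get_assets_async(identifiers, *, version=None, data_format="pandas"):
    """Stand-in coroutine with the documented signature; no network access."""
    await asyncio.sleep(0)  # yield control, as a real request would
    return [{"identifier": i} for i in identifiers]


async def main():
    # Inside a coroutine, await the call directly.
    return await get_assets_async([1, 2, 3], data_format="json")


# From synchronous code, asyncio.run drives the coroutine to completion.
assets = asyncio.run(main())
```

In an environment that already runs an event loop, such as a Jupyter notebook, `asyncio.run` raises a RuntimeError; there the coroutine can be awaited directly.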

get_content(*, identifier, distribution_idx=0, version=None)

Retrieve the data content of a specific news asset.

Parameters:

Name Type Description Default
identifier int

The identifier of the news asset.

required
distribution_idx int

The index of a specific distribution from the distribution list (default is 0).

0
version str | None

The version of the endpoint (default is None).

None

Returns:

Type Description
bytes

The data content for the specified news.
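The return value is raw bytes, so it is up to the caller to store or decode it. A minimal sketch, assuming the content happens to be UTF-8 text (the bytes below are a stand-in, not real catalogue data):

```python
import tempfile
from pathlib import Path

# Stand-in for the bytes returned by get_content(identifier=...).
content = "Example news body".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "news_content.bin"
    target.write_bytes(content)       # persist the raw bytes unchanged
    round_trip = target.read_bytes()  # read them back for comparison

# Decode only if the distribution is known to be UTF-8 text.
text = content.decode("utf-8")
```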

get_list(*, platform=None, offset=0, limit=10, version=None, data_format='pandas')

Retrieve a list of news from the catalogue.

Parameters:

Name Type Description Default
platform str | None

Return metadata of news assets of this platform (default is None).

None
offset int

The offset for pagination (default is 0).

0
limit int

The maximum number of items to retrieve (default is 10).

10
version str | None

The version of the endpoint (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a list of dicts.

'pandas'

Returns:

Type Description
DataFrame | list[dict]

The retrieved metadata in the specified format.
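The offset and limit parameters can be combined to page through the catalogue. The sketch below uses an in-memory stand-in for `get_list` with the documented signature, and a hypothetical `fetch_all` helper. It assumes that the end of the catalogue is signalled by a page shorter than limit, which the reference above does not state explicitly.

```python
_CATALOGUE = [{"identifier": i} for i in range(1, 26)]  # 25 fake records


def get_list(*, platform=None, offset=0, limit=10, version=None,
             data_format="pandas"):
    """Stand-in with the documented signature, backed by an in-memory list."""
    return _CATALOGUE[offset:offset + limit]


def fetch_all(page_size=10):
    """Collect every record by advancing offset until a short page arrives."""
    items, offset = [], 0
    while True:
        page = get_list(offset=offset, limit=page_size, data_format="json")
        items.extend(page)
        if len(page) < page_size:
            return items
        offset += page_size
```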

get_list_async(*, offset=0, limit=100, batch_size=10, version=None, data_format='pandas') async

Asynchronously retrieve a list of news from the catalogue in batches.

Parameters:

Name Type Description Default
offset int

The offset for pagination (default is 0).

0
limit int

The maximum number of items to retrieve (default is 100).

100
batch_size int

The number of items in a batch (default is 10).

10
version str | None

The version of the endpoint (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a list of dicts.

'pandas'

Returns:

Type Description
DataFrame | list[dict]

The retrieved metadata in the specified format.

Raises:

Type Description
ValueError

Batch size must be larger than 0.
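To illustrate how offset, limit and batch_size relate, the sketch below splits a requested range into per-batch (offset, limit) pairs. `batch_ranges` is a hypothetical helper written for this example; how the library schedules its batches internally is not described here, so this is an assumption.

```python
def batch_ranges(offset=0, limit=100, batch_size=10):
    """Split [offset, offset + limit) into (offset, limit) pairs per batch."""
    if batch_size <= 0:
        raise ValueError("Batch size must be larger than 0.")
    end = offset + limit
    ranges = []
    for start in range(offset, end, batch_size):
        # The final batch may be smaller than batch_size.
        ranges.append((start, min(batch_size, end - start)))
    return ranges


print(batch_ranges(offset=0, limit=25, batch_size=10))
```

With the defaults (limit=100, batch_size=10) this yields ten batches of ten items each.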

search(query, *, platforms=None, offset=0, limit=10, search_field=None, get_all=True, version=None, data_format='pandas', asset_type)

Search the metadata of news assets using the Elasticsearch endpoint of the AIoD metadata catalogue.

Parameters:

Name Type Description Default
query str

The string to be matched against the search fields.

required
platforms list[str] | None

The platforms to filter the search results. If None, results from all platforms will be returned (default is None).

None
offset int

The offset for pagination (default is 0).

0
limit int

The maximum number of results to retrieve (default is 10).

10
search_field None | Literal['name', 'issn', 'description_html', 'description_plain']

The specific fields to search within. If None, the query will be matched against all fields (default is None).

None
get_all bool

If True, a request to the database is made to retrieve all data. If False, only the indexed information is returned (default is True).

True
version str | None

The version of the endpoint to use (default is None).

None
data_format Literal['pandas', 'json']

The desired format for the response (default is "pandas"). For "json" formats, the returned type is a json decoded type, in this case a list of dicts.

'pandas'

Returns:

Type Description
DataFrame | list[dict]

The retrieved metadata in the specified format.
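The search_field parameter accepts only the four literal values listed above, or None. A caller might check the value before sending a request. The following is a sketch of such a client-side check; `validate_search_field` is a hypothetical helper, and whether the library itself rejects other values is not covered by this reference.

```python
SEARCH_FIELDS = ("name", "issn", "description_html", "description_plain")


def validate_search_field(search_field=None):
    """Return search_field unchanged if it is None or a documented value."""
    if search_field is not None and search_field not in SEARCH_FIELDS:
        allowed = ", ".join(SEARCH_FIELDS)
        raise ValueError(
            f"search_field must be None or one of: {allowed}; "
            f"got {search_field!r}"
        )
    return search_field
```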