unesco_reader.core

Core functions for the unesco_reader package

This module contains the core functions for the unesco_reader package. These functions interact with the UIS API to retrieve data, metadata, and the available indicators, geo units, themes, and data versions. The module also converts indicator and geo unit names to their codes, normalizes data for easy conversion to DataFrames, handles errors, and logs hints from the API responses.

Attributes

logger

Functions

_log_hints(→ None)

If there are any hints in the response, log them

_convert_codes(→ str | list[str])

Convert names to their respective codes

_convert_indicator_codes_to_code(→ str | list[str])

Convert indicators to their respective codes

_convert_geo_units_to_code(→ str | list[str])

Convert geo units to their respective codes

_normalize_footnotes(→ list[dict])

Normalize the footnotes column

_add_indicator_labels(→ list[dict])

Add indicator labels to the data

_add_geo_unit_labels(→ list[dict])

Add geo unit labels to the data. For regions, add both the region name and the region group

get_data(→ pandas.DataFrame | list[dict])

Get UIS data

get_metadata(→ list[dict])

Get the metadata for indicators

_indicators_df(→ pandas.DataFrame)

Return available indicators as a DataFrame. This function flattens the data for easy DataFrame conversion then returns the DataFrame.

available_indicators(→ pandas.DataFrame | list[dict])

Get available indicators

available_geo_units(→ pandas.DataFrame | list[dict])

Get available geo units

available_themes(→ pandas.DataFrame | dict)

Get the available themes and basic information including latest update and description

default_version(→ str)

Get the default data version

available_versions(→ pandas.DataFrame | list[dict])

Get available data versions and basic information including publication date and description

Module Contents

unesco_reader.core.logger
unesco_reader.core._log_hints(response: dict) None

If there are any hints in the response, log them

If a response from the API contains hints, there are issues with the request or the data. This function logs the hints as warnings. A response may contain multiple hints, so they are logged one by one.

Parameters:

response – The response from the API
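
As an illustration, the hint logging described above could look roughly like the following sketch. The "hints" key, the "message" field inside each hint, and the logger name are assumptions made for this example and are not confirmed by the API or the package. The sketch also returns the number of hints purely so that it is easy to check, whereas _log_hints returns None.

```python
from __future__ import annotations

import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("uis_hints_sketch")


def log_hints(response: dict) -> int:
    """Log every hint found in the response as a separate warning.

    Assumes hints live under a "hints" key as a list, where each item is
    either a plain string or a dict with a "message" field.
    """
    hints = response.get("hints") or []
    for hint in hints:
        # Fall back to the raw hint if it has no "message" field.
        message = hint.get("message", hint) if isinstance(hint, dict) else hint
        logger.warning("API hint: %s", message)
    return len(hints)
```

Logging each hint on its own line keeps individual problems easy to find in the log output.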

unesco_reader.core._convert_codes(indicators: str | list[str], mapper: dict) str | list[str]

Convert names to their respective codes

This function is used to convert geo units or indicators from names to their respective codes. If the name is already a code, it is left as is.

Parameters:
  • indicators – The name or list of names (indicators or geo units) to convert to codes

  • mapper – The dictionary mapping names to codes

Returns:

The code or list of codes
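
The conversion rule (map known names to codes, leave everything else untouched) can be sketched as follows. This is a minimal illustration and not the package's implementation; the function name, the exact-match lookup, and the sample mapper entries are assumptions.

```python
from __future__ import annotations


def convert_codes(names: str | list[str], mapper: dict[str, str]) -> str | list[str]:
    """Replace names with codes using `mapper`; unknown values pass through."""

    def to_code(name: str) -> str:
        # Existing codes and unknown names are not keys of the mapper,
        # so they are returned unchanged.
        return mapper.get(name, name)

    if isinstance(names, str):
        return to_code(names)
    return [to_code(name) for name in names]


# Illustrative mapper; real mappings would come from the API.
geo_mapper = {"Zimbabwe": "ZWE", "Kenya": "KEN"}
single = convert_codes("Zimbabwe", geo_mapper)
mixed = convert_codes(["Zimbabwe", "KEN", "Atlantis"], geo_mapper)
```

Passing unknown values through unchanged matches the documented behavior of leaving unrecognized input for the API to reject.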

unesco_reader.core._convert_indicator_codes_to_code(indicators: str | list[str]) str | list[str]

Convert indicators to their respective codes

This function converts the indicator names to their respective codes. If the indicator is already a code, it is left as is. If the indicator is not found, it is left as is and will be handled as an error by the API.

Parameters:

indicators – The indicator name or list of indicator names to convert to codes

Returns:

The indicator code or list of indicator codes

unesco_reader.core._convert_geo_units_to_code(geo_units: str | list[str]) str | list[str]

Convert geo units to their respective codes

This function converts the geo unit names to their respective codes. If the geo unit is already a code, it is left as is. If the geo unit is not found, it is left as is and will be handled as an error by the API.

Parameters:

geo_units – The geo unit name or list of geo unit names to convert to codes

Returns:

The geo unit code or list of geo unit codes

unesco_reader.core._normalize_footnotes(data: list[dict]) list[dict]

Normalize the footnotes column

The footnotes can have one or more records with the keys "type", "subtype", and "value", e.g.: {… 'footnotes': [{'type': 'Source', 'subtype': 'Data sources', 'value': "Country's submission to UIS Survey of Formal Education Questionnaire A"}], … }

This function normalizes the footnotes into a single string for a DataFrame column with the structure "type, subtype: value", e.g.: "Source, Data sources: Country's submission to UIS Survey of Formal Education Questionnaire A"

Multiple footnotes are normalized individually and joined with semicolons.
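
A minimal sketch of this normalization is shown below. It is an illustration only: the function name, the "; " separator (a semicolon followed by a space), and the empty string used for records without footnotes are assumptions.

```python
from __future__ import annotations


def normalize_footnotes(data: list[dict]) -> list[dict]:
    """Return copies of the records with "footnotes" collapsed to one string."""
    normalized = []
    for record in data:
        row = dict(record)  # shallow copy so the input records stay untouched
        notes = row.get("footnotes") or []
        row["footnotes"] = "; ".join(
            f"{note['type']}, {note['subtype']}: {note['value']}" for note in notes
        )
        normalized.append(row)
    return normalized


sample = [
    {
        "value": 1,
        "footnotes": [
            {
                "type": "Source",
                "subtype": "Data sources",
                "value": "Country's submission to UIS Survey of Formal Education Questionnaire A",
            }
        ],
    }
]
result = normalize_footnotes(sample)
```

Collapsing the nested list into one string keeps each observation on a single DataFrame row.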

unesco_reader.core._add_indicator_labels(data: list[dict]) list[dict]

Add indicator labels to the data

Parameters:

data – The data to which to add the indicator labels

Returns:

The data with the indicator labels added

unesco_reader.core._add_geo_unit_labels(data: list[dict]) list[dict]

Add geo unit labels to the data. For regions, add both the region name and the region group

Parameters:

data – The data to which to add the geo unit labels

Returns:

The data with the geo unit labels added

unesco_reader.core.get_data(indicator: str | list[str] | None = None, geoUnit: str | list[str] | None = None, start: int | None = None, end: int | None = None, labels: bool = False, geoUnitType: unesco_reader.config.GeoUnitType | None = None, footnotes: bool = False, *, raw: bool = False, version: str | None = None) pandas.DataFrame | list[dict]

Get UIS data

Query the UIS API for data based on the given parameters. At least one indicator or one geoUnit must be provided. If only indicators are provided, data for all geographies is returned, and vice versa. To see the available indicators or geographies, use the available_indicators or available_geo_units functions respectively. If both geoUnit and geoUnitType are provided, geoUnitType is ignored.

Parameters:
  • indicator – The indicator code or name to request data for. If None, data for all indicators is returned. By default, None. To see all available indicators, use the available_indicators function.

  • geoUnit – The geo unit code or name to request data for. If None, data for all geo units is returned. By default, None. To see all available geo units, use the available_geo_units function.

  • start – The start year to request data for. Includes the year itself. Default is None, which returns the earliest available year.

  • end – The end year to request data for. Includes the year itself. Default is None, which returns the latest available year.

  • labels – If True, adds indicator and geo unit labels to the data. Default is False.

  • geoUnitType – The type of geography to request data for. Allowed values are NATIONAL and REGIONAL. If geoUnit is provided, this parameter is ignored. Default is None, which returns both national and regional data.

  • footnotes – If True, includes footnotes in the response. Default is False.

  • raw – If True, returns the data as a list of dictionaries in the original format from the API. Default is False.

  • version – The data version to use. Default is None, which uses the current default version.

Returns:

A pandas DataFrame with the data or a list of dictionaries if raw=True.
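
The argument rules described above (at least one of indicator or geoUnit, and geoUnitType ignored when geoUnit is given) can be illustrated with a small pre-flight helper. The helper name and the ValueError are hypothetical and are not part of unesco_reader; the sketch also assumes that geoUnitType values are plain strings such as "NATIONAL".

```python
from __future__ import annotations

from typing import Any


def build_get_data_kwargs(
    indicator: str | list[str] | None = None,
    geoUnit: str | list[str] | None = None,
    start: int | None = None,
    end: int | None = None,
    geoUnitType: str | None = None,
) -> dict[str, Any]:
    """Hypothetical helper mirroring the documented get_data argument rules."""
    if indicator is None and geoUnit is None:
        # The exception type is an assumption; the rule itself is documented.
        raise ValueError("Provide at least one indicator or one geoUnit")
    if geoUnit is not None:
        # Documented rule: geoUnitType is ignored when geoUnit is provided.
        geoUnitType = None
    kwargs = {
        "indicator": indicator,
        "geoUnit": geoUnit,
        "start": start,
        "end": end,
        "geoUnitType": geoUnitType,
    }
    # Drop unset values so the library defaults apply.
    return {key: value for key, value in kwargs.items() if value is not None}
```

The resulting dictionary could then be passed on as get_data(**kwargs). The codes used when calling it, for example "CR.1" or "ZWE", should be checked against available_indicators and available_geo_units.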

unesco_reader.core.get_metadata(indicator: str | list[str] | None = None, disaggregations: bool = False, glossaryTerms: bool = False, *, version: str | None = None) list[dict]

Get the metadata for indicators

Get the metadata for the given indicators. If no indicator is provided, metadata for all indicators is returned. Optionally include disaggregations and glossary terms in the response.

Parameters:
  • indicator – The indicator code or name to get metadata for. Default is None, which returns metadata for all indicators. To see all available indicators, use the available_indicators function.

  • disaggregations – Include disaggregations in the response. Default is False.

  • glossaryTerms – Include glossary terms in the response. Default is False.

  • version – The data version to use. Default is None, which uses the current default version.

Returns:

A list of dictionaries with the metadata for the indicators

unesco_reader.core._indicators_df(indicators: list[dict]) pandas.DataFrame

Return available indicators as a DataFrame. This function flattens the data for easy DataFrame conversion then returns the DataFrame.

Parameters:

indicators – The list of indicators to convert to a DataFrame

Returns:

A pandas DataFrame with the available indicators
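
The flattening step can be pictured with the following sketch, which turns nested dictionaries into single-level records. It is written under assumptions: the field names in the sample record and the underscore-joined column names are hypothetical and may differ from what the API and _indicators_df actually use.

```python
from __future__ import annotations

from typing import Any


def flatten(record: dict[str, Any], parent_key: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining nested keys with `sep`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical raw indicator record; the field names are illustrative only.
raw_indicator = {
    "indicatorCode": "CR.1",
    "theme": "EDUCATION",
    "dataAvailability": {
        "totalRecordCount": 5,
        "timeLine": {"min": 2000, "max": 2020},
    },
}
flat_indicator = flatten(raw_indicator)
```

A list of such flat records can be passed directly to pandas.DataFrame, which creates one column per key.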

unesco_reader.core.available_indicators(theme: str | list[str] | None = None, minStart: int | None = None, geoUnitType: unesco_reader.config.GeoUnitType | Literal['ALL'] | None = None, *, raw: bool = False, version: str | None = None) pandas.DataFrame | list[dict]

Get available indicators

This function returns the available indicators from the UIS API with some basic information, including theme, time range, last data update, and total records. The data is filtered based on the given parameters.

Parameters:
  • theme – Filter indicators for specific themes. Can be a single theme or a list of themes. Default returns all themes. Use the available_themes function to see all available themes.

  • minStart – The earliest start year for the indicator data. Includes the start year itself. Default is None, which returns all available data.

  • geoUnitType – The type of geography for which data is available. Allowed values are "NATIONAL" (country-level data), "REGIONAL" (regional-level data), and "ALL" (both national and regional data). Default is None, which applies no filter and returns any available type.

  • raw – If True, returns the data as a list of dictionaries in the original format from the API. Default is False.

  • version – The data version to use. Default is None, which uses the current default version.

Returns:

A pandas DataFrame with the available indicators or a list of dictionaries if raw=True.

unesco_reader.core.available_geo_units(geoUnitType: unesco_reader.config.GeoUnitType | None = None, *, raw: bool = False, version: str | None = None) pandas.DataFrame | list[dict]

Get available geo units

Get all available geo units for a given API data version (or the current default version if no explicit version is provided), along with some basic information like the region group and type of geography.

Parameters:
  • geoUnitType – The type of geography to request data for. Allowed values are NATIONAL and REGIONAL. Default is None which returns all available types.

  • raw – If True, returns the data as a list of dictionaries in the original format from the API. Default is False.

  • version – The data version to use. Default is None, which uses the current default version.

Returns:

A pandas DataFrame with the available geo units or a list of dictionaries if raw=True.

unesco_reader.core.available_themes(*, raw: bool = False) pandas.DataFrame | dict

Get the available themes and basic information including latest update and description

Parameters:

raw – If True, returns the data as a dictionary in the original format from the API. Default is False.

unesco_reader.core.default_version() str

Get the default data version

Returns:

The default data version string

unesco_reader.core.available_versions(*, raw: bool = False) pandas.DataFrame | list[dict]

Get available data versions and basic information including publication date and description

Parameters:

raw – If True, returns the data as a list of dictionaries in the original format from the API. Default is False.

Returns:

A pandas DataFrame with the available versions or a list of dictionaries if raw=True.
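
Because several functions in this module accept a version argument, one way to keep results reproducible is to record the version string alongside the data. The sketch below assumes that unesco_reader is installed and that the UIS API is reachable. The wrapper name and its parameters are hypothetical; only default_version and get_data are taken from this module.

```python
from __future__ import annotations


def fetch_with_pinned_version(indicator, geo_unit, version=None):
    """Hypothetical wrapper: fetch data and report which version was used."""
    # Imported lazily so that defining this function has no side effects.
    from unesco_reader import core

    if version is None:
        # Resolve the current default once so it can be stored with the data.
        version = core.default_version()
    data = core.get_data(indicator=indicator, geoUnit=geo_unit, version=version)
    return version, data
```

Storing the returned version string and passing it back in later should request the same data release again, provided that version is still listed by available_versions.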