How to…

Access other SDMX data sources

sdmx ships with a file, sources.json, that includes information about the capabilities of many data sources. However, any data source that generates SDMX 2.1 messages is supported. There are multiple ways to access these:

  1. Create a sdmx.Client without a named data source, then call the get() method using the url argument:

    import sdmx
    c = sdmx.Client()
    c.get(url='https://sdmx.example.org/path/to/webservice', ...)
    
  2. Call add_source() with a JSON snippet describing the data provider.

  3. Create a subclass of Source, providing attribute values and optional implementations of hooks.
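As a rough sketch of option 2, the snippet below builds a JSON description of a provider and passes it to add_source(). The provider ID, name and URL are placeholders, and the field names ("id", "name", "url") are assumed to match the entries in sources.json; check that file for the fields your version expects.

```python
import json

# Hypothetical provider: the ID, name and URL are placeholders, and the field
# names are assumed to match the entries in sources.json.
SOURCE_INFO = json.dumps(
    {
        "id": "EXAMPLE",
        "name": "Example data provider",
        "url": "https://sdmx.example.org/path/to/webservice",
    }
)


def register_and_connect(info: str = SOURCE_INFO):
    """Register the source described by `info` and return a Client for it."""
    import sdmx  # imported lazily so the JSON above can be inspected without sdmx

    sdmx.add_source(info)
    return sdmx.Client("EXAMPLE")
```

After registration, the new source can be used by name in the same way as the built-in ones.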

Control logging

sdmx.log is a standard Python logging.Logger object. For debugging, set this to a permissive level:

import logging

sdmx.log.setLevel(logging.DEBUG)

Log messages include the web service query details (URL and headers) used by Client.
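If debug output is only wanted around a few queries, the level can be raised temporarily and then restored. The helper below is a sketch using only the standard logging module. debug_logging() is a hypothetical name, and the sketch assumes that sdmx.log is the logger registered under the name "sdmx".

```python
import logging
from contextlib import contextmanager


@contextmanager
def debug_logging(name: str = "sdmx"):
    """Temporarily set the named logger to DEBUG, then restore its old level.

    Assumption: sdmx.log is the logger returned by logging.getLogger("sdmx").
    """
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(logging.DEBUG)
    try:
        yield logger
    finally:
        logger.setLevel(previous)


with debug_logging():
    # Queries made through a Client inside this block are logged at DEBUG level.
    pass
```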

Use the ‘references’ query parameter

SDMX web services support a references parameter in HTTP requests which can take values such as ‘all’, ‘descendants’, etc. This parameter instructs the web service to include, when generating a Data- or StructureMessage, the objects implicitly designated by the references parameter alongside the explicit resource. For example, for the request:

>>> response = some_agency.dataflow('SOME_ID', params={'references': 'all'})

the response will include:

  • the dataflow ‘SOME_ID’ explicitly specified,
  • the DSD referenced by the dataflow’s structure attribute,
  • the code lists referenced by the DSD, and
  • any content-constraints which reference the dataflow or the DSD.

Requesting many objects in a single request is much more efficient than requesting them one at a time. sdmx therefore provides default values for references in common queries. For instance, when a single dataflow is requested by its ID, sdmx sets references to ‘all’. When the dataflow ID is wild-carded, on the other hand, it is more practical not to request all referenced objects: the response would likely be excessively large, and the user is presumed to want an overview (the list of data flows) before focusing on a particular dataflow and its descendants and ancestors. The default value for the references parameter can be overridden.
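The defaulting rule described above can be sketched in a few lines of standard-library Python. This is an illustration of the rule only, not sdmx's internal code. default_references() and build_query() are hypothetical helpers, and the sketch assumes that no references parameter is sent when the ID is wild-carded.

```python
from typing import Optional
from urllib.parse import urlencode


def default_references(resource_id: Optional[str]) -> Optional[str]:
    """Hypothetical helper mirroring the rule in the text (not sdmx's own code)."""
    # A specific ID: fetch everything related in one round trip.
    # No ID (wild-carded): send no parameter, keeping the response small.
    return "all" if resource_id else None


def build_query(resource_id: Optional[str] = None, **params: str) -> str:
    """Return the query string, letting explicit parameters override the default."""
    query = {}
    default = default_references(resource_id)
    if default is not None:
        query["references"] = default
    query.update(params)
    return urlencode(query)


print(build_query("SOME_ID"))                         # references=all
print(build_query("SOME_ID", references="children"))  # references=children
print(repr(build_query()))                            # ''
```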

Some web services, for instance ESTAT, differ in how they handle references. See Data sources for details.

Use category schemes to explore data

SDMX supports category schemes, which categorize dataflow definitions and other objects and so help to locate, for example, a dataflow of interest. Not all agencies support category schemes. The ECB is a good example of one that does; however, as the ECB’s SDMX service offers fewer than 100 data flows, using category schemes is not strictly necessary. A counter-example is Eurostat, which offers more than 6000 data flows yet does not categorize them, so the user must search through the flat list of data flows.

To search the list of data flows by category, we request the category scheme from the ECB’s SDMX service and explore the response:

In [1]: import sdmx

In [2]: ecb = sdmx.Client('ecb')

In [3]: cat_response = ecb.categoryscheme()

Like any other scheme, a category scheme is essentially a dict mapping IDs to the actual SDMX objects. To display the categorized items, in our case the dataflow definitions contained in the category on exchange rates, we iterate over the Category instance:

In [4]: sdmx.to_pandas(cat_response.category_scheme.MOBILE_NAVI)
Out[4]: 
MOBILE_NAVI
01                                  Monetary operations
02             Prices, output, demand and labour market
03                    Monetary and financial statistics
04                                   Euro area accounts
05                                   Government finance
06                  External transactions and positions
07                                       Exchange rates
08    Payments and securities trading, clearing, set...
09                                  Banknotes and Coins
10                  Indicators of Financial Integration
11               Real Time Database (research database)
Name: Economic concepts, dtype: object

In [5]: cat_response.category_scheme.MOBILE_NAVI
Out[5]: <CategoryScheme ECB:MOBILE_NAVI(1.0) (11 items): Economic concepts>

Added in version 0.5.

Select data frame layouts returned by to_pandas()

to_pandas() provides multiple ways to customize the type and layout of pandas objects returned for DataMessage input. One is the datetime argument; see Convert dimensions to pandas.DatetimeIndex or PeriodIndex. The other is the rtype argument.

To select the same behaviour as pandaSDMX 0.9, give rtype = ‘compat’, or set DEFAULT_RTYPE to ‘compat’:

In [6]: sdmx.writer.DEFAULT_RTYPE = 'compat'

With ‘compat’, the returned layout varies with the concept of “dimension at the observation level,” as follows:

Return type, by dimension at observation level:

AllDimensions:

  • Series, without attributes, or
  • DataFrame, with any attributes.

TimeDimension: same as datetime = True, that is, a DataFrame with:

  • index: DatetimeIndex or PeriodIndex, and
  • columns: MultiIndex with all other dimensions.

Any other dimension: a DataFrame with:

  • index: the dimension at observation level, and
  • columns: MultiIndex with all other dimensions.

Limitations:

  • sdmx can only obey rtype = ‘compat’ when reading or converting an entire DataMessage, not a DataSet. While the concept of “dimension at observation level” is mentioned in the SDMX Information Model (IM) in relation to data sets, it is not formally included as an attribute of any class, or with any default value. (For instance, it is not included in the DimensionDescriptor of a DataStructureDefinition.) It can only be determined from the header of an SDMX-ML or -JSON data message.

  • Except for AllDimensions, each row and column of the returned data frame contains multiple observations, so attributes cannot be included without ambiguity about which observation(s) have the attribute. In these cases, attributes are omitted; use rtype = ‘rows’ to retrieve them.

With the argument rtype = ‘rows’ (the default), data are always returned with one row per observation.
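To make the difference between these layouts concrete, the following pandas-only sketch builds a small Series with one row per observation and reshapes it so that the time dimension forms the index. The dimension names and values are invented for illustration. The reshaping is done by hand with unstack() and is not a claim about how to_pandas() works internally.

```python
import pandas as pd

# Made-up observations keyed by three dimensions (illustrative values only).
index = pd.MultiIndex.from_tuples(
    [
        ("USD", "A", "2020"),
        ("USD", "A", "2021"),
        ("JPY", "A", "2020"),
        ("JPY", "A", "2021"),
    ],
    names=["CURRENCY", "FREQ", "TIME_PERIOD"],
)

# One row per observation, comparable to the layout described for 'rows'.
rows = pd.Series([1.14, 1.18, 121.8, 129.9], index=index, name="value")

# Move every dimension except TIME_PERIOD into the columns, comparable to the
# layout described for a time dimension at observation level.
wide = rows.unstack(["CURRENCY", "FREQ"])
wide.index = pd.PeriodIndex(wide.index, freq="Y")

print(rows)
print(wide)
```

In the wide layout each row holds several observations, which is why attributes of individual observations cannot be attached without ambiguity.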

Convert SDMX data to other formats

Pandas supports output to many popular file formats. Call these methods on the objects returned by to_pandas(). For instance:

import sdmx

msg = sdmx.read_sdmx('data.xml')
sdmx.to_pandas(msg).to_excel('data.xlsx')

pandaSDMX 0.9 could be used with odo by registering methods for discovery and conversion:

import odo
from odo.utils import keywords
import pandas as pd
import sdmx
from toolz import keyfilter
import toolz.curried.operator as op

class PandaSDMX(object):
    def __init__(self, uri):
        self.uri = uri

@odo.resource.register(r'.*\.sdmx')
def _resource(uri, **kwargs):
    return PandaSDMX(uri)

@odo.discover.register(PandaSDMX)
def _discover(obj):
    return odo.discover(sdmx.to_pandas(sdmx.read_sdmx(obj.uri)))

@odo.convert.register(pd.DataFrame, PandaSDMX)
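def _convert(obj, **kwargs):
    msg = sdmx.read_sdmx(obj.uri)
    # Pass on only the keyword arguments accepted by `write`. NB `write` is not
    # defined in this snippet; it stands for the writer function being called.
    return sdmx.to_pandas(
        msg, **keyfilter(op.contains(keywords(write)), kwargs)
    )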

Deprecated since version 1.0: odo appears unmaintained since about 2016, so sdmx no longer provides built-in registration.

Added in version 0.4: sdmx.odo_register() was added, providing automatic registration.