Cato XML Reference

Last updated: 2013-04-30

This document describes CatoXML, a set of inline semantic metadata extensions to “HouseXML”. The extensions are defined in the http://namespaces.cato.org/catoxml namespace.

“HouseXML” is an unofficial term for the XML schema of legislation drafted by the United States Congress (House and Senate) and documented at xml.house.gov.

These metadata extensions are collectively called “CatoXML”.

Definitions

Prefix cato: is bound to namespace http://namespaces.cato.org/catoxml

Attribute names are prefixed with @; e.g. @foobar for an attribute named foobar.

A metadata element is an element that expresses metadata about a span of text. CatoXML defines four metadata elements: cato:entity, cato:property, cato:funds-and-year, and cato:entity-ref. Certain HouseXML elements can also express metadata equivalent to CatoXML elements.

cato:entity Element

Used to contain text that creates an entity. Any child metadata elements are properties of the immediate parent entity.

Attributes of cato:entity

@entity-type: Required. States the type of the entity. Valid values are:

Note: the auth-authorization, auth-regulation, and auth-interpretation types are currently unused, but these names are reserved for future use.

cato:property Element

Used to contain text which is constitutive of an entity but which is not itself an entity or reference to an entity.

A cato:property element must be contained by a cato:entity element.

Attributes of cato:property

  1. @name: Required. States the name of this property. Property names are specific to a certain entity type. Two property names are defined:

  2. @value: States the machine-readable value of this property. If the property element contains text, then this attribute contains a normalized, machine-readable version of that text. If this attribute is omitted, then the value of this property is the text content of this element, and that text is not required to be machine-readable.
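The @value fallback rule can be sketched in a few lines of Python. This is only an illustration: the function name property_value is hypothetical, and the sample markup reuses the amount and year property names from the funds-and-year example in this document.

```python
import xml.etree.ElementTree as ET

CATO_NS = "http://namespaces.cato.org/catoxml"


def property_value(prop: ET.Element) -> str:
    """Return @value when present, otherwise the element's text content."""
    value = prop.get("value")
    if value is not None:
        return value
    # No @value: the text content is the value, and it need not be machine-readable.
    return "".join(prop.itertext())


# Sample markup for illustration only.
doc = ET.fromstring(
    f'<cato:entity xmlns:cato="{CATO_NS}" entity-type="funds-and-year">'
    '<cato:property name="amount" value="1000">$1,000</cato:property>'
    '<cato:property name="year">2011</cato:property>'
    "</cato:entity>"
)
values = [property_value(p) for p in doc.findall(f"{{{CATO_NS}}}property")]
print(values)  # ['1000', '2011']
```

The first property yields its normalized @value, while the second has no @value and so yields its text.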

cato:funds-and-year Element

Used to contain text that indicates the amount of funds made available and the year during which those funds are made available by an authority entity. An authority entity may have multiple cato:funds-and-year elements.

This element exists as a shorthand for document markup to avoid the need for id references and empty elements for one or another of its property values. It expresses the same information as the following set of cato:entity and cato:property elements:

<entity entity-type="funds-and-year"><property name="amount" value="1000">$1000</property> in <property name="year" value="2011">2011</property></entity>
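As a rough sketch of this equivalence, the Python below expands a cato:funds-and-year element into the long form. The function name expand_funds_and_year is hypothetical, and the sketch simply copies the two attribute values. It does not try to split the enclosed text between the two properties, so the generated properties are empty elements that carry only @value, which is the situation the shorthand exists to avoid.

```python
import xml.etree.ElementTree as ET

CATO_NS = "http://namespaces.cato.org/catoxml"
ET.register_namespace("cato", CATO_NS)  # serialize with the cato: prefix


def expand_funds_and_year(elem: ET.Element) -> ET.Element:
    """Build the long-form cato:entity equivalent of one cato:funds-and-year."""
    entity = ET.Element(f"{{{CATO_NS}}}entity", {"entity-type": "funds-and-year"})
    for name in ("amount", "year"):
        # Both attributes are required, so a missing one raises KeyError.
        ET.SubElement(
            entity,
            f"{{{CATO_NS}}}property",
            {"name": name, "value": elem.attrib[name]},
        )
    return entity


short = ET.fromstring(
    f'<cato:funds-and-year xmlns:cato="{CATO_NS}" amount="1000" year="2011">'
    "$1000 in 2011</cato:funds-and-year>"
)
long_form = expand_funds_and_year(short)
print(ET.tostring(long_form, encoding="unicode"))
```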

Attributes of cato:funds-and-year

  1. @amount: Required. States the amount of money in US dollars that the authority proposes to be set aside. This attribute’s value is a positive integer or the special value indefinite, indicating that no specific amount was named.
  2. @year: Required. States the fiscal years during which the stated amount may be spent. This attribute’s value is a set of fiscal years expressed as one of the following:

cato:entity-ref Element

Used to contain text that refers to but does not create an entity.

Attributes of cato:entity-ref

In addition to @entity-type, exactly one of the @entity-id, @entity-parent-id, or @value attributes is required.

  1. @entity-type: Required. States the type of entity that the enclosed text references. Valid values are:

  2. @entity-id: States the id of the entity that the enclosed text references. Entity ids must be unique among all others with the same entity-type.

  3. @entity-parent-id: States the id of the parent entity of the entity that the enclosed text references. This attribute is used when the entity does not have an id or its id is not known but a parent entity is known.
  4. @value: Expresses the content of the text of the entity-ref (not of the entity) in a consistent, documented, machine-parsable format specific to its entity-type. Different value attribute values may refer to the same entity.
  5. @proposed: States whether the current entity reference is to an existing or a proposed entity. The value of this attribute is true or false. If this attribute is absent, then the value of this attribute is false. This attribute may be found on uscode or act entities.
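The attribute rules above lend themselves to a small validation check. The following is a minimal sketch, not a complete validator: the function names check_entity_ref and is_proposed are hypothetical, and the set of valid @entity-type values is not checked.

```python
from typing import List, Mapping

# Exactly one of these must accompany @entity-type.
TARGET_ATTRS = ("entity-id", "entity-parent-id", "value")


def check_entity_ref(attrs: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in a cato:entity-ref's attributes."""
    problems = []
    if "entity-type" not in attrs:
        problems.append("missing @entity-type")
    present = [name for name in TARGET_ATTRS if name in attrs]
    if len(present) != 1:
        problems.append(f"expected exactly one of {TARGET_ATTRS}, found {present}")
    if attrs.get("proposed", "false") not in ("true", "false"):
        problems.append("@proposed must be true or false")
    return problems


def is_proposed(attrs: Mapping[str, str]) -> bool:
    """An absent @proposed means false."""
    return attrs.get("proposed", "false") == "true"


print(check_entity_ref({"entity-type": "uscode", "value": "usc/1/234"}))  # []
```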

Notes on entity-refs

The act, uscode, public-law, and statute-at-large entity-types lack an @entity-id or @entity-parent-id attribute because:

  1. There is no universally-agreed-upon unique identifier for the entities they cite.
  2. Different @value values may reference the same entity. This is unlike an @entity-id, where every entity has exactly one id.

Values for CatoXML cato:entity-ref value

All entity-ref value attributes use a series of slash-delimited segments. For example, usc/1/234 cites title 1, section 234 of the U.S. Code. This is equivalent to "1 U.S.C. 234" in the common citation format. The meaning and parsing of individual segments are determined by the value of the first segment.
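A minimal Python sketch of the segment split follows. The helper name split_value is hypothetical, and interpreting the segments is left to format-specific code.

```python
from typing import List, Tuple


def split_value(value: str) -> Tuple[str, List[str]]:
    """Split an entity-ref @value into its first segment and the remaining segments."""
    first, *rest = value.split("/")
    return first, rest


print(split_value("usc/1/234"))  # ('usc', ['1', '234'])
```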

uscode

statute-at-large

A reference to a page in a volume of the Statutes at Large. The normal citation "90 Stat. 2541" would be expressed as statute-at-large/90/2541. Segments are:

  1. Fixed string "statute-at-large". (Note: for compatibility with HouseXML, "statute" is singular.)
  2. Statutes at Large volume number.
  3. Statutes at Large page number. The page number may be an inclusive range if two numbers are joined by a double-period, e.g. 2541..2543 indicates pages 2541 through 2543.
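The three segments can be parsed as sketched below. The names StatCite and parse_statute_at_large are hypothetical, and the sketch assumes that volume and page numbers are plain decimal integers.

```python
from typing import NamedTuple


class StatCite(NamedTuple):
    volume: int
    first_page: int
    last_page: int  # equals first_page when the value names a single page


def parse_statute_at_large(value: str) -> StatCite:
    """Parse a value such as statute-at-large/90/2541 or statute-at-large/90/2541..2543."""
    prefix, volume, pages = value.split("/")
    if prefix != "statute-at-large":
        raise ValueError(f"not a statute-at-large value: {value!r}")
    # A double period joins the two ends of an inclusive page range.
    first, _, last = pages.partition("..")
    return StatCite(int(volume), int(first), int(last or first))


print(parse_statute_at_large("statute-at-large/90/2541..2543"))
```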

act

A reference to an act by its popular name. There is very little uniformity among act citations, so machine-parsable act citation values use a system of prefixes to indicate segment types. The normal citation "1861(s)(2) of the Social Security Act" would be expressed as Social Security Act/s:1861/ss:s/p:2. Segments are:

  1. A popular name for an act taken verbatim from the Office of the Law Revision Counsel’s table of popular names, or from the text contained by a HouseXML act-name element in the current document that names the current document, or the compact FDsys name of the bill with its version suffix (e.g., "113hconres2ih"). The latter two values are only used if the reference is to the current bill. A single act may have multiple popular names, and no attempt is made to establish one unique canonical popular name per act. The act name may contain any character except / (forward slash).
  2. Further optional segments are citations reflecting the parts of the document explicitly mentioned by the text of the citation:

  3. The final segment may contain the special value note or etseq, as with U.S.C. Section citations.
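The act value format can be parsed along these lines. This is a sketch under stated assumptions: the names ActCite and parse_act are hypothetical, and the prefixes (s, ss, p in the example above) are returned as opaque strings rather than interpreted.

```python
from typing import List, NamedTuple, Optional, Tuple


class ActCite(NamedTuple):
    name: str
    parts: List[Tuple[str, str]]  # (prefix, label) pairs in citation order
    suffix: Optional[str]         # "note", "etseq", or None


def parse_act(value: str) -> ActCite:
    """Parse a value such as Social Security Act/s:1861/ss:s/p:2."""
    name, *segments = value.split("/")
    suffix = None
    # Only the final segment may carry one of the special values.
    if segments and segments[-1] in ("note", "etseq"):
        suffix = segments.pop()
    parts = []
    for segment in segments:
        prefix, sep, label = segment.partition(":")
        if not sep:
            raise ValueError(f"segment without a prefix: {segment!r}")
        parts.append((prefix, label))
    return ActCite(name, parts, suffix)


print(parse_act("Social Security Act/s:1861/ss:s/p:2"))
```

Because the act name is always the first segment and may not contain a forward slash, splitting on / is enough to isolate it even when the name contains a colon.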

public-law

A reference to a Public Law. The normal citation "P.L. 111-12" would be expressed as public-law/111/12. Segments are:

  1. Fixed string public-law
  2. Congress number
  3. Law number
  4. Following the third segment, a public law citation value may use part-prefixed segments exactly as described in item 2 of the "act" section above. For example, public-law/111/12/t:I indicates "title I of P.L. 111-12".
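A public-law value can be handled in the same spirit. In this sketch the function names are hypothetical, Congress and law numbers are assumed to be decimal integers, and any part-prefixed segments are kept as uninterpreted (prefix, label) pairs.

```python
from typing import Any, Dict


def parse_public_law(value: str) -> Dict[str, Any]:
    """Parse a value such as public-law/111/12 or public-law/111/12/t:I."""
    prefix, congress, law, *parts = value.split("/")
    if prefix != "public-law":
        raise ValueError(f"not a public-law value: {value!r}")
    return {
        "congress": int(congress),
        "law": int(law),
        "parts": [tuple(part.split(":", 1)) for part in parts],
    }


def common_citation(cite: Dict[str, Any]) -> str:
    """Render the law itself in the P.L. form used in this document."""
    return f"P.L. {cite['congress']}-{cite['law']}"


cite = parse_public_law("public-law/111/12/t:I")
print(common_citation(cite))  # P.L. 111-12
```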

Mapping HouseXML metadata elements to CatoXML metadata elements

Certain elements in HouseXML can express the same information as a CatoXML element. If a HouseXML element is present in a document and would express the same information as a CatoXML element, no CatoXML element is added. This section defines rules for determining the semantically equivalent CatoXML for a HouseXML element.

Entity type | HouseXML | CatoXML
Committee | <committee-name committee-id="CID"> | <cato:entity-ref entity-type="committee" entity-id="CID">
Person | <sponsor name-id="BIOID"> | <cato:entity-ref entity-type="person" entity-id="BIOID">
Person | <cosponsor name-id="BIOID"> | <cato:entity-ref entity-type="person" entity-id="BIOID">
Act (Popular Name) | <act-name>Name of Act</act-name> | <cato:entity-ref entity-type="act" value="Name of Act">Name of Act</cato:entity-ref> (see Note)
U.S. Code Section | <external-xref legal-doc="uscode" parseable-cite="Citation Value"> | <cato:entity-ref entity-type="uscode" value="Citation Value">
U.S. Code Chapter | <external-xref legal-doc="usc-chapter" parseable-cite="Citation Value"> | <cato:entity-ref entity-type="uscode" value="Citation Value">
U.S. Code Appendix | <external-xref legal-doc="usc-appendix" parseable-cite="Citation Value"> | <cato:entity-ref entity-type="uscode" value="Citation Value">
Public Law | <external-xref legal-doc="public-law" parseable-cite="Citation Value"> | <cato:entity-ref entity-type="public-law" value="Citation Value">
Statutes at Large | <external-xref legal-doc="statute-at-large" parseable-cite="Citation Value"> | <cato:entity-ref entity-type="statute-at-large" value="Citation Value">

Note: the @parseable-cite attribute of act-name is ignored because its vocabulary is unpublished. If that vocabulary is ever released, its value may be used in a cato:entity-ref @entity-id attribute.
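The mapping rules in this section can be sketched as a single Python function. Treat it as an illustration only: the function name equivalent_entity_ref is hypothetical, the attribute names are spelled as they appear in this document, all three U.S. Code rows are mapped to the uscode entity type, and the person entity type for sponsors and cosponsors is an assumption drawn from the lookup table section.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# legal-doc values as written in this document, mapped to entity types.
LEGAL_DOC_TO_ENTITY_TYPE = {
    "uscode": "uscode",
    "usc-chapter": "uscode",
    "usc-appendix": "uscode",
    "public-law": "public-law",
    "statute-at-large": "statute-at-large",
}
CITE_ATTR = "parseable-cite"  # spelling used in this document


def equivalent_entity_ref(elem: ET.Element) -> Optional[Dict[str, str]]:
    """Return the cato:entity-ref attributes equivalent to a HouseXML element."""
    if elem.tag == "committee-name" and "committee-id" in elem.attrib:
        return {"entity-type": "committee", "entity-id": elem.attrib["committee-id"]}
    if elem.tag in ("sponsor", "cosponsor") and "name-id" in elem.attrib:
        # Assumption: sponsors and cosponsors are "person" entities.
        return {"entity-type": "person", "entity-id": elem.attrib["name-id"]}
    if elem.tag == "act-name":
        # The act name itself is the value; any cite attribute is ignored.
        return {"entity-type": "act", "value": "".join(elem.itertext()).strip()}
    if elem.tag == "external-xref":
        entity_type = LEGAL_DOC_TO_ENTITY_TYPE.get(elem.get("legal-doc", ""))
        cite = elem.get(CITE_ATTR)
        if entity_type and cite:
            return {"entity-type": entity_type, "value": cite}
    return None  # no CatoXML equivalent


xref = ET.fromstring(
    '<external-xref legal-doc="public-law" parseable-cite="public-law/111/12">'
    "P.L. 111-12</external-xref>"
)
print(equivalent_entity_ref(xref))
```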

Entity Lookup Tables

Entity lookup tables are references for entities indexed by entity-id. They have the following structure shared by all entity types:

  1. entities root element.

  2. entity child elements of entities contain information regarding a particular entity. They have a basic structure shared by all entity types which may be extended by particular entity types.

Entity-Type specific extensions

Certain entity types make use of the various extension points provided by the lookup table format and described in the previous section. These entity-type specific extensions are documented below.

Committees ("committee" entities)

These committee and subcommittee id values are consistent with those found in the @committee-id attribute of the committee-name element of HouseXML.

Subcommittees indicate their parent Committee with the @parent-id attribute.
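As an illustration of how @parent-id might be used, the sketch below groups subcommittee ids under their parent committee. It assumes a lookup table with an entities root, entity children carrying @id and @parent-id, and no XML namespace. The ids and names in the sample are invented placeholders, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

# Placeholder lookup table; ids and names are made up for the example.
SAMPLE = """
<entities>
  <entity id="C1"><name>Example Committee</name></entity>
  <entity id="C1-S1" parent-id="C1"><name>First Subcommittee</name></entity>
  <entity id="C1-S2" parent-id="C1"><name>Second Subcommittee</name></entity>
</entities>
"""


def subcommittees_by_parent(root: ET.Element) -> Dict[str, List[str]]:
    """Map each parent committee id to the ids of its subcommittees."""
    children = defaultdict(list)
    for entity in root.iter("entity"):
        parent = entity.get("parent-id")
        if parent is not None:  # only subcommittees carry @parent-id
            children[parent].append(entity.get("id"))
    return dict(children)


table = ET.fromstring(SAMPLE)
print(subcommittees_by_parent(table))  # {'C1': ['C1-S1', 'C1-S2']}
```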

People ("person" entities)

Person @id values are Bioguide ids.

The @version attribute on the entity element indicates a congressional session. The lookup table is expected to contain a comprehensive list of every member of Congress who served during that session.

The entity element may have the following additional attributes:

The name element includes the full name of the member, with title, party, and state. E.g.: Rep. Gary Ackerman (D, NY-5).

The name element may have the following optional attributes:

Agencies and Bureaus (“federal-body” entities)

The entity element may have the following additional attributes:

Additionally, the @role attribute of the name element may have the value leadership, which indicates that the name is the position of the senior director of the named federal body. This role is included because bills often direct an agency to do something using language that names the highest position in that agency. For example, "The Happiness Czar shall expend $5 million in fiscal year 2013 to promote happiness abroad". Here, "Happiness Czar" would be a <name role="leadership"> entry for the fictional "Bureau of Happiness".
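A lookup by leadership title could be sketched as follows, using the fictional example above. The table layout (entities root, entity children with @id, name children with an optional @role, no namespace), the id value, and the function name are all assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Fictional lookup table entry based on the example in the text; the id is made up.
SAMPLE = """
<entities>
  <entity id="bureau-of-happiness">
    <name>Bureau of Happiness</name>
    <name role="leadership">Happiness Czar</name>
  </entity>
</entities>
"""


def body_for_leadership_title(root: ET.Element, title: str) -> Optional[str]:
    """Return the id of the federal body whose leadership name matches title."""
    for entity in root.iter("entity"):
        for name in entity.findall("name"):
            if name.get("role") == "leadership" and (name.text or "").strip() == title:
                return entity.get("id")
    return None  # no federal body lists this title


table = ET.fromstring(SAMPLE)
print(body_for_leadership_title(table, "Happiness Czar"))  # bureau-of-happiness
```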