zcloud.avro module

This module provides utilities for loading, validating, and writing Avro schemas and records, specifically tailored for zip design operations.

Warning

Schemas are currently loaded from package resources. This is acceptable while the company is small and the schemas are well documented, but migrating to a schema registry is recommended in the future.

Classes

class zcloud.avro.AvroDataHandler(schema_name, output_file=None, ignore_metadata=False, **kwargs)[source]

Bases: object

A handler for Avro data operations, including loading schemas, validating records, and writing to Avro files.

Parameters:
  • schema_name (str) – The name of the schema to load.

  • output_file (str, optional) – The file to write records to. If None, "records.avro" is used.

  • **kwargs (dict) – Additional metadata required by the schema. Presence or absence of required metadata is checked against hint fields in the schema.
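The metadata check described above can be illustrated with a minimal sketch. The layout of the hint fields is not documented here, so the "zip_metadata" key, its "required" and "optional" lists, and the check_metadata helper are all hypothetical names chosen for illustration, not the module's actual implementation.

```python
# Hypothetical schema: the "zip_metadata" key and its "required"/"optional"
# hint lists are illustrative assumptions, not the module's real layout.
schema = {
    "type": "record",
    "name": "Design",
    "fields": [{"name": "sequence", "type": "string"}],
    "zip_metadata": {
        "required": ["project", "run_id"],
        "optional": ["notes"],
    },
}


def check_metadata(schema, **kwargs):
    """Return the recognised metadata, or raise if a required key is missing."""
    hints = schema.get("zip_metadata") or {}
    required = hints.get("required", [])
    optional = hints.get("optional", [])

    missing = [key for key in required if key not in kwargs]
    if missing:
        raise ValueError(f"missing required metadata: {missing}")

    # Keep only keys named by the hints; values are stringified for simplicity.
    return {k: str(v) for k, v in kwargs.items() if k in required or k in optional}


meta = check_metadata(schema, project="demo", run_id=7, notes="first pass")
```

Raising on a missing key at construction time, rather than at write time, is one possible design choice; the real handler may behave differently.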

schema

The loaded Avro schema.

Type:

dict

zip_metadata

Metadata specific to zip design operations. It includes a hint field that is not part of the Avro specification and is used to indicate which metadata is required when writing the Avro file.

Type:

dict or None

required_metadata_keys

List of required metadata keys.

Type:

list or None

optional_metadata_keys

List of optional metadata keys.

Type:

list or None

meta

Metadata for fastavro.

Type:

dict

records

List of records to be written to the Avro file.

Type:

list

output_file

The file to write records to.

Type:

str

keys

The keys of the schema. Update this if evolving the schema. Avoid modifying schema["fields"] directly.

Type:

set

validate_record(record)[source]

Simple validation against the schema, checking that all required fields exist.

Parameters:

record (dict) – The record to validate.

Returns:

True if the record is valid, False otherwise.

Return type:

bool
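As a rough illustration of this kind of check, the sketch below treats a field as required when its schema entry has no "default". That rule, the free-standing function signature, and the example schema are assumptions made for the example; the method's actual logic may differ.

```python
def required_field_names(schema):
    """Names of fields with no default; treated here as the required ones."""
    return {field["name"] for field in schema["fields"] if "default" not in field}


def validate_record(record, schema):
    """Return True if every required field name is present in the record."""
    return required_field_names(schema).issubset(record)


# Hypothetical schema used only for this example.
schema = {
    "type": "record",
    "name": "Design",
    "fields": [
        {"name": "sequence", "type": "string"},
        {"name": "length", "type": "int"},
        {"name": "notes", "type": ["null", "string"], "default": None},
    ],
}

ok = validate_record({"sequence": "ACGT", "length": 4}, schema)
bad = validate_record({"sequence": "ACGT"}, schema)
```

Note that a presence check like this does not verify value types; a record with "length" set to a string would still pass.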

write_to_avro(output_file=None)[source]

Write the collected records to an Avro file.

Parameters:

output_file (BinaryIO, optional) – The file to write records to. If None, writes to self.output_file.
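Since the destination may be either a path or an open binary stream, one way to handle both is sketched below. The open_destination helper and its default_path parameter are hypothetical, and the write call is only a stand-in for the real Avro writer.

```python
import io
import os
from contextlib import contextmanager


@contextmanager
def open_destination(output_file=None, default_path="records.avro"):
    """Yield a writable binary stream for a path, a stream, or the default path."""
    if output_file is None:
        output_file = default_path  # fall back, as with self.output_file
    if isinstance(output_file, (str, os.PathLike)):
        # We opened the file, so we are responsible for closing it.
        with open(output_file, "wb") as fo:
            yield fo
    else:
        # An already-open stream stays under the caller's control.
        yield output_file


buffer = io.BytesIO()
with open_destination(buffer) as fo:
    # Stand-in for the real writer call; b"Obj\x01" is the magic header
    # that begins every Avro object container file.
    fo.write(b"Obj\x01")
```

Writing to an in-memory io.BytesIO as shown is convenient in tests, because nothing touches the filesystem.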