ISCC - Utilities
json_canonical(obj)
Canonical, deterministic serialization of ISCC metadata.
We serialize ISCC metadata in a deterministic/reproducible manner by using JCS (RFC 8785) canonicalization.
Source code in iscc_core\utils.py
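A minimal sketch of what deterministic serialization means in practice, using only the standard library. It approximates JCS for simple metadata (strings, integers, booleans, nested dicts and lists); it is not a full RFC 8785 implementation, which also prescribes number formatting and key ordering by UTF-16 code units. The name `canonical_sketch` and the choice to return UTF-8 bytes are assumptions made for illustration.

```python
import json


def canonical_sketch(obj):
    """Approximate canonical JSON: sorted keys, no insignificant whitespace.

    Hypothetical helper, not part of iscc_core. Only an approximation of
    JCS (RFC 8785) that holds for simple string/integer metadata.
    """
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


# Key order of the input does not affect the serialized result.
a = canonical_sketch({"name": "Hello", "description": "World"})
b = canonical_sketch({"description": "World", "name": "Hello"})
print(a == b)  # True
```

Because the byte output is identical for equivalent inputs, it can be hashed reproducibly.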
sliding_window(seq, width)
Generate a sequence of equal "width" slices, each advancing by one element.
All types that have a length and can be sliced are supported (list, tuple, str ...). The result type matches the type of the input sequence. Fragment slices smaller than the width at the end of the sequence are not produced. If the input sequence is shorter than "width", then a single element is returned that is shorter than the requested width.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
seq | Sequence | Sequence of values to slide over | required |
width | int | Width of sliding window in number of items | required |
Returns:
Type | Description |
---|---|
Generator | A generator of window sized items |
Source code in iscc_core\utils.py
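The behaviour described above can be sketched as follows. This is an illustrative re-implementation based only on the description, not necessarily the library's code; the name `sliding_window_sketch` is hypothetical.

```python
def sliding_window_sketch(seq, width):
    """Yield slices of `width` items, advancing one item at a time.

    Hypothetical helper written from the documented behaviour. If `seq` is
    shorter than `width`, exactly one shorter slice is produced.
    """
    count = max(len(seq) - width + 1, 1)  # always produce at least one slice
    return (seq[i : i + width] for i in range(count))


print(list(sliding_window_sketch("abcde", 3)))  # ['abc', 'bcd', 'cde']
print(list(sliding_window_sketch([1, 2], 4)))   # [[1, 2]]
```

Slicing preserves the input type, so a str yields str windows and a list yields list windows.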
iscc_compare(a, b)
Calculate separate hamming distances of compatible components of two ISCCs
Returns:
Type | Description |
---|---|
dict | A dict with keys meta_dist, semantic_dist, content_dist, data_dist, instance_match |
Source code in iscc_core\utils.py
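To illustrate the idea of per-component comparison, here is a rough sketch that works on already-decomposed units. Everything about the input is an assumption: the function name `compare_units_sketch`, the dict layout mapping a unit name to its raw digest, skipping units of unequal length, and treating `instance_match` as a plain equality check. Decoding real ISCC strings is left out entirely.

```python
def compare_units_sketch(units_a, units_b):
    """Compare matching units of two hypothetical pre-decoded ISCCs.

    `units_a` / `units_b` map a unit name ("meta", "semantic", "content",
    "data", "instance") to digest bytes. This layout is an assumption made
    for illustration, not the iscc_core data model.
    """
    result = {}
    for name, digest_a in units_a.items():
        digest_b = units_b.get(name)
        if digest_b is None or len(digest_a) != len(digest_b):
            continue  # unit missing on one side or not comparable
        if name == "instance":
            # Assumed semantics: instance units either match or they do not.
            result["instance_match"] = digest_a == digest_b
        else:
            diff = int.from_bytes(digest_a, "big") ^ int.from_bytes(digest_b, "big")
            result[name + "_dist"] = bin(diff).count("1")
    return result


first = {"meta": b"\x00\x00", "data": b"\xff\x00", "instance": b"\x01\x02"}
second = {"meta": b"\x00\x01", "instance": b"\x01\x02", "content": b"\xaa"}
print(compare_units_sketch(first, second))  # {'meta_dist': 1, 'instance_match': True}
```

Only units present on both sides appear in the result, which is why `data_dist` and `content_dist` are absent here.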
iscc_similarity(a, b)
Calculate similarity of ISCC codes as a percentage value (0-100).
MainType, SubType, Version and Length of the codes must be the same.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | | ISCC a | required |
b | | ISCC b | required |
Returns:
Type | Description |
---|---|
int | Similarity of ISCC a and b in percent (based on hamming distance) |
Source code in iscc_core\utils.py
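The relation between hamming distance and the percentage value can be sketched on raw digests. The helper name `similarity_percent_sketch` is hypothetical, it takes already-unpacked bytes rather than ISCC strings, and truncating to an integer (rather than rounding) is an assumption.

```python
def similarity_percent_sketch(a: bytes, b: bytes) -> int:
    """Share of matching bits between two equal-length digests, as 0-100.

    Hypothetical helper operating on raw digest bytes; the truncating
    integer division is an assumption, not confirmed library behaviour.
    """
    if len(a) != len(b) or not a:
        raise ValueError("digests must be non-empty and of equal length")
    nbits = len(a) * 8
    distance = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return (nbits - distance) * 100 // nbits


print(similarity_percent_sketch(b"\x00\x00", b"\x00\x0f"))  # 75
```

Here 4 of 16 bits differ, so 12 of 16 bits match, which gives 75.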
iscc_distance(a, b)
Calculate hamming distance of ISCC codes.
MainType, SubType, Version and Length of the codes must be the same.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | | ISCC a | required |
b | | ISCC b | required |
Returns:
Type | Description |
---|---|
int | Hamming distance in number of bits. |
Source code in iscc_core\utils.py
iscc_distance_bytes(a, b)
Calculate hamming distance for binary hash digests of equal length.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | bytes | binary hash digest | required |
b | bytes | binary hash digest | required |
Returns:
Type | Description |
---|---|
int | Hamming distance in number of bits. |
Source code in iscc_core\utils.py
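A bitwise hamming distance over equal-length digests can be computed as in the sketch below. It is one common way to do it (XOR, then count set bits) and is not claimed to be the library's implementation; the name `hamming_distance_bytes_sketch` and the explicit length check are assumptions.

```python
def hamming_distance_bytes_sketch(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings.

    Hypothetical helper: XOR leaves a 1 wherever the inputs differ,
    so the number of set bits is the hamming distance.
    """
    if len(a) != len(b):
        raise ValueError("digests must have equal length")
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")


print(hamming_distance_bytes_sketch(b"\x00\xff", b"\x0f\xff"))  # 4
```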
iscc_pair_unpack(a, b)
Unpack two ISCC codes and return their body hash digests if their headers match.
Headers match if their MainType, SubType, and Version are identical.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
a | | ISCC a | required |
b | | ISCC b | required |
Returns:
Type | Description |
---|---|
Tuple[bytes, bytes] | Tuple with hash digests of a and b |
Raises:
Type | Description |
---|---|
ValueError | If ISCC headers don't match |