ord_schema.orm
Base ORM objects.
ord_schema.orm.add_datasets
ord_schema.orm.conftest
ord_schema.orm.database
Functions for creating/managing the PostgreSQL database.
- ord_schema.orm.database.add_dataset(dataset: Dataset, session: Session, rdkit_cartridge: bool = True) None
Adds a dataset to the database.
- ord_schema.orm.database.delete_dataset(dataset_id: str, session: Session) None
Deletes a dataset from the database.
- ord_schema.orm.database.get_connection_string(database: str, username: str, password: str, host: str = 'localhost', port: int = 5432) str
Creates an SQLAlchemy connection string.
- ord_schema.orm.database.get_dataset_md5(dataset_id: str, session: Session) str | None
Returns the MD5 hash of the current version of a dataset, if it exists in the database.
- ord_schema.orm.database.get_dataset_size(dataset_id: str, session: Session) int
Returns the number of reactions in a dataset.
- ord_schema.orm.database.prepare_database(engine: Engine) bool
Prepares the database and creates the ORM table structure.
- Parameters:
engine – SQLAlchemy Engine.
- Returns:
Whether the RDKit PostgreSQL cartridge is installed.
- ord_schema.orm.database.update_rdkit_ids(dataset_id: str, session: Session) None
Updates RDKit reaction and mol ID associations in the ORD tables.
- ord_schema.orm.database.update_rdkit_tables(dataset_id: str, session: Session) None
Updates RDKit PostgreSQL cartridge data.
ord_schema.orm.mappers
Table mappings for Reaction protos.
Notes
Foreign keys to the reaction table are done using the id column, not the ORD reaction ID (reaction_id). However, the reaction_id column is used when the reaction ID is specifically called for, as with crude inputs.
We use inheritance to handle messages that appear in more than one context; see https://docs.sqlalchemy.org/en/14/orm/inheritance.html. The possible constraints are:
Some message types are used more than once in a parent (such as Time in ReactionInput), which forces the use of either joined or single table inheritance.
Some messages can be repeated, while others are unique to their parent. These uniqueness constraints force the use of joined table inheritance since they are specific to their polymorphic type.
For convenience and consistency, we use single table inheritance for all message types, regardless of whether they are used in one context or more than one context. This means that we do not enforce the second constraint in the database.
- ord_schema.orm.mappers.build_mapper(message_type: Type[Message], parents: dict[Type[Message], list[tuple[Type[Message], str, bool]]]) Type
Creates a mapper class for a specific protocol buffer message type.
- Parameters:
message_type – Protocol buffer message type.
parents – Dict mapping message types to lists of (parent message type, field name, unique) tuples.
- Returns:
Generated mapper class.
- ord_schema.orm.mappers.build_mappers() dict[Type[Message], Type]
Creates ORM mapper classes for protocol buffer message types.
- Returns:
Dict mapping protocol buffer message types to mapper classes.
- ord_schema.orm.mappers.from_proto(message: Message, mapper: Type[Base] | None = None, key: str | None = None) Base
Converts a protobuf message into an ORM object.
- Parameters:
message – Protobuf message.
mapper – ORM mapper class. For top-level protos like Dataset and Reaction this can be left as None; it must be provided for Child subclasses to properly handle polymorphism.
key – Map key (we store maps as rows of (key, value) tuples).
- Returns:
ORM object.
- ord_schema.orm.mappers.get_message_type(full_name: str) Any
Fetches the class for a protocol buffer message type.
- ord_schema.orm.mappers.get_parents(message_type: Type[Message]) dict[Type[Message], list[tuple[Type[Message], str, bool]]]
Returns the parent message types for each message type.
- ord_schema.orm.mappers.to_proto(base: Base) Message
Converts an ORM object into a protobuf message.
- Parameters:
base – ORM object.
- Returns:
Protobuf message.
ord_schema.orm.rdkit_mappers
RDKit PostgreSQL cartridge functionality.
Notes
These tables live in a separate “rdkit” schema to avoid name conflicts between tables and extension types.
The RDKit-specific columns are populated by ord_schema.orm.database.add_rdkit; this allows the ORM to function normally even if if the RDKit PostgreSQL cartridge is not installed (the smiles column will be populated and the other columns will be empty).
Objects with this type are added to the ORM in from_proto() using the rdkit field.
- class ord_schema.orm.rdkit_mappers.CString
Bases:
UserDefinedType
PostgreSQL cstring.
- cache_ok: bool | None = True
Indicate if statements using this
ExternalType
are “safe to cache”.The default value
None
will emit a warning and then not allow caching of a statement which includes this type. Set toFalse
to disable statements using this type from being cached at all without a warning. When set toTrue
, the object’s class and selected elements from its state will be used as part of the cache key. For example, using aTypeDecorator
:class MyType(TypeDecorator): impl = String cache_ok = True def __init__(self, choices): self.choices = tuple(choices) self.internal_only = True
The cache key for the above type would be equivalent to:
>>> MyType(["a", "b", "c"])._static_cache_key (<class '__main__.MyType'>, ('choices', ('a', 'b', 'c')))
The caching scheme will extract attributes from the type that correspond to the names of parameters in the
__init__()
method. Above, the “choices” attribute becomes part of the cache key but “internal_only” does not, because there is no parameter named “internal_only”.The requirements for cacheable elements is that they are hashable and also that they indicate the same SQL rendered for expressions using this type every time for a given cache value.
To accommodate for datatypes that refer to unhashable structures such as dictionaries, sets and lists, these objects can be made “cacheable” by assigning hashable structures to the attributes whose names correspond with the names of the arguments. For example, a datatype which accepts a dictionary of lookup values may publish this as a sorted series of tuples. Given a previously un-cacheable type as:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. this is the non-cacheable version, as "self.lookup" is not hashable. """ def __init__(self, lookup): self.lookup = lookup def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self.lookup" ...
Where “lookup” is a dictionary. The type will not be able to generate a cache key:
>>> type_ = LookupType({"a": 10, "b": 20}) >>> type_._static_cache_key <stdin>:1: SAWarning: UserDefinedType LookupType({'a': 10, 'b': 20}) will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning. symbol('no_cache')
If we did set up such a cache key, it wouldn’t be usable. We would get a tuple structure that contains a dictionary inside of it, which cannot itself be used as a key in a “cache dictionary” such as SQLAlchemy’s statement cache, since Python dictionaries aren’t hashable:
>>> # set cache_ok = True >>> type_.cache_ok = True >>> # this is the cache key it would generate >>> key = type_._static_cache_key >>> key (<class '__main__.LookupType'>, ('lookup', {'a': 10, 'b': 20})) >>> # however this key is not hashable, will fail when used with >>> # SQLAlchemy statement cache >>> some_cache = {key: "some sql value"} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'dict'
The type may be made cacheable by assigning a sorted tuple of tuples to the “.lookup” attribute:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. The dictionary is stored both as itself in a private variable, and published in a public variable as a sorted tuple of tuples, which is hashable and will also return the same value for any two equivalent dictionaries. Note it assumes the keys and values of the dictionary are themselves hashable. """ cache_ok = True def __init__(self, lookup): self._lookup = lookup # assume keys/values of "lookup" are hashable; otherwise # they would also need to be converted in some way here self.lookup = tuple((key, lookup[key]) for key in sorted(lookup)) def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self._lookup" ...
Where above, the cache key for
LookupType({"a": 10, "b": 20})
will be:>>> LookupType({"a": 10, "b": 20})._static_cache_key (<class '__main__.LookupType'>, ('lookup', (('a', 10), ('b', 20))))
Added in version 1.4.14: - added the
cache_ok
flag to allow some configurability of caching forTypeDecorator
classes.Added in version 1.4.28: - added the
ExternalType
mixin which generalizes thecache_ok
flag to both theTypeDecorator
andUserDefinedType
classes.See also
sql_caching
- get_col_spec(**kwargs)
Returns the column type.
- property python_type
Return the Python type object expected to be returned by instances of this type, if known.
Basically, for those types which enforce a return type, or are known across the board to do such for all common DBAPIs (like
int
for example), will return that type.If a return type is not defined, raises
NotImplementedError
.Note that any type also accommodates NULL in SQL which means you can also get back
None
from any type in practice.
- class ord_schema.orm.rdkit_mappers.FingerprintType(value)
Bases:
Enum
RDKit PostgreSQL fingerprint types.
- MORGAN_BFP = <sqlalchemy.sql.functions._FunctionGenerator object>
- MORGAN_SFP = <sqlalchemy.sql.functions._FunctionGenerator object>
- class ord_schema.orm.rdkit_mappers.RDKitBfp
Bases:
UserDefinedType
https://github.com/rdkit/rdkit/blob/master/Code/PgSQL/rdkit/rdkit.sql.in#L81.
- cache_ok: bool | None = True
Indicate if statements using this
ExternalType
are “safe to cache”.The default value
None
will emit a warning and then not allow caching of a statement which includes this type. Set toFalse
to disable statements using this type from being cached at all without a warning. When set toTrue
, the object’s class and selected elements from its state will be used as part of the cache key. For example, using aTypeDecorator
:class MyType(TypeDecorator): impl = String cache_ok = True def __init__(self, choices): self.choices = tuple(choices) self.internal_only = True
The cache key for the above type would be equivalent to:
>>> MyType(["a", "b", "c"])._static_cache_key (<class '__main__.MyType'>, ('choices', ('a', 'b', 'c')))
The caching scheme will extract attributes from the type that correspond to the names of parameters in the
__init__()
method. Above, the “choices” attribute becomes part of the cache key but “internal_only” does not, because there is no parameter named “internal_only”.The requirements for cacheable elements is that they are hashable and also that they indicate the same SQL rendered for expressions using this type every time for a given cache value.
To accommodate for datatypes that refer to unhashable structures such as dictionaries, sets and lists, these objects can be made “cacheable” by assigning hashable structures to the attributes whose names correspond with the names of the arguments. For example, a datatype which accepts a dictionary of lookup values may publish this as a sorted series of tuples. Given a previously un-cacheable type as:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. this is the non-cacheable version, as "self.lookup" is not hashable. """ def __init__(self, lookup): self.lookup = lookup def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self.lookup" ...
Where “lookup” is a dictionary. The type will not be able to generate a cache key:
>>> type_ = LookupType({"a": 10, "b": 20}) >>> type_._static_cache_key <stdin>:1: SAWarning: UserDefinedType LookupType({'a': 10, 'b': 20}) will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning. symbol('no_cache')
If we did set up such a cache key, it wouldn’t be usable. We would get a tuple structure that contains a dictionary inside of it, which cannot itself be used as a key in a “cache dictionary” such as SQLAlchemy’s statement cache, since Python dictionaries aren’t hashable:
>>> # set cache_ok = True >>> type_.cache_ok = True >>> # this is the cache key it would generate >>> key = type_._static_cache_key >>> key (<class '__main__.LookupType'>, ('lookup', {'a': 10, 'b': 20})) >>> # however this key is not hashable, will fail when used with >>> # SQLAlchemy statement cache >>> some_cache = {key: "some sql value"} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'dict'
The type may be made cacheable by assigning a sorted tuple of tuples to the “.lookup” attribute:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. The dictionary is stored both as itself in a private variable, and published in a public variable as a sorted tuple of tuples, which is hashable and will also return the same value for any two equivalent dictionaries. Note it assumes the keys and values of the dictionary are themselves hashable. """ cache_ok = True def __init__(self, lookup): self._lookup = lookup # assume keys/values of "lookup" are hashable; otherwise # they would also need to be converted in some way here self.lookup = tuple((key, lookup[key]) for key in sorted(lookup)) def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self._lookup" ...
Where above, the cache key for
LookupType({"a": 10, "b": 20})
will be:>>> LookupType({"a": 10, "b": 20})._static_cache_key (<class '__main__.LookupType'>, ('lookup', (('a', 10), ('b', 20))))
Added in version 1.4.14: - added the
cache_ok
flag to allow some configurability of caching forTypeDecorator
classes.Added in version 1.4.28: - added the
ExternalType
mixin which generalizes thecache_ok
flag to both theTypeDecorator
andUserDefinedType
classes.See also
sql_caching
- get_col_spec(**kwargs)
Returns the column type.
- property python_type
Return the Python type object expected to be returned by instances of this type, if known.
Basically, for those types which enforce a return type, or are known across the board to do such for all common DBAPIs (like
int
for example), will return that type.If a return type is not defined, raises
NotImplementedError
.Note that any type also accommodates NULL in SQL which means you can also get back
None
from any type in practice.
- class ord_schema.orm.rdkit_mappers.RDKitMol
Bases:
UserDefinedType
https://github.com/rdkit/rdkit/blob/master/Code/PgSQL/rdkit/rdkit.sql.in#L4.
- cache_ok: bool | None = True
Indicate if statements using this
ExternalType
are “safe to cache”.The default value
None
will emit a warning and then not allow caching of a statement which includes this type. Set toFalse
to disable statements using this type from being cached at all without a warning. When set toTrue
, the object’s class and selected elements from its state will be used as part of the cache key. For example, using aTypeDecorator
:class MyType(TypeDecorator): impl = String cache_ok = True def __init__(self, choices): self.choices = tuple(choices) self.internal_only = True
The cache key for the above type would be equivalent to:
>>> MyType(["a", "b", "c"])._static_cache_key (<class '__main__.MyType'>, ('choices', ('a', 'b', 'c')))
The caching scheme will extract attributes from the type that correspond to the names of parameters in the
__init__()
method. Above, the “choices” attribute becomes part of the cache key but “internal_only” does not, because there is no parameter named “internal_only”.The requirements for cacheable elements is that they are hashable and also that they indicate the same SQL rendered for expressions using this type every time for a given cache value.
To accommodate for datatypes that refer to unhashable structures such as dictionaries, sets and lists, these objects can be made “cacheable” by assigning hashable structures to the attributes whose names correspond with the names of the arguments. For example, a datatype which accepts a dictionary of lookup values may publish this as a sorted series of tuples. Given a previously un-cacheable type as:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. this is the non-cacheable version, as "self.lookup" is not hashable. """ def __init__(self, lookup): self.lookup = lookup def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self.lookup" ...
Where “lookup” is a dictionary. The type will not be able to generate a cache key:
>>> type_ = LookupType({"a": 10, "b": 20}) >>> type_._static_cache_key <stdin>:1: SAWarning: UserDefinedType LookupType({'a': 10, 'b': 20}) will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning. symbol('no_cache')
If we did set up such a cache key, it wouldn’t be usable. We would get a tuple structure that contains a dictionary inside of it, which cannot itself be used as a key in a “cache dictionary” such as SQLAlchemy’s statement cache, since Python dictionaries aren’t hashable:
>>> # set cache_ok = True >>> type_.cache_ok = True >>> # this is the cache key it would generate >>> key = type_._static_cache_key >>> key (<class '__main__.LookupType'>, ('lookup', {'a': 10, 'b': 20})) >>> # however this key is not hashable, will fail when used with >>> # SQLAlchemy statement cache >>> some_cache = {key: "some sql value"} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'dict'
The type may be made cacheable by assigning a sorted tuple of tuples to the “.lookup” attribute:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. The dictionary is stored both as itself in a private variable, and published in a public variable as a sorted tuple of tuples, which is hashable and will also return the same value for any two equivalent dictionaries. Note it assumes the keys and values of the dictionary are themselves hashable. """ cache_ok = True def __init__(self, lookup): self._lookup = lookup # assume keys/values of "lookup" are hashable; otherwise # they would also need to be converted in some way here self.lookup = tuple((key, lookup[key]) for key in sorted(lookup)) def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self._lookup" ...
Where above, the cache key for
LookupType({"a": 10, "b": 20})
will be:>>> LookupType({"a": 10, "b": 20})._static_cache_key (<class '__main__.LookupType'>, ('lookup', (('a', 10), ('b', 20))))
Added in version 1.4.14: - added the
cache_ok
flag to allow some configurability of caching forTypeDecorator
classes.Added in version 1.4.28: - added the
ExternalType
mixin which generalizes thecache_ok
flag to both theTypeDecorator
andUserDefinedType
classes.See also
sql_caching
- get_col_spec(**kwargs)
Returns the column type.
- property python_type
Return the Python type object expected to be returned by instances of this type, if known.
Basically, for those types which enforce a return type, or are known across the board to do such for all common DBAPIs (like
int
for example), will return that type.If a return type is not defined, raises
NotImplementedError
.Note that any type also accommodates NULL in SQL which means you can also get back
None
from any type in practice.
- class ord_schema.orm.rdkit_mappers.RDKitMols(**kwargs)
Bases:
Base
Table for storing compound structures and associated RDKit cartridge data.
- classmethod contains_substructure(pattern: str) ColumnElement[bool]
- id
- classmethod matches_smarts(pattern: str) ColumnElement[bool]
- mol
- morgan_bfp
- morgan_sfp
- smiles
- classmethod tanimoto(other: str, fp_type: FingerprintType = FingerprintType.MORGAN_BFP) ColumnElement[float]
- class ord_schema.orm.rdkit_mappers.RDKitReaction
Bases:
UserDefinedType
https://github.com/rdkit/rdkit/blob/master/Code/PgSQL/rdkit/rdkit.sql.in#L129.
- cache_ok: bool | None = True
Indicate if statements using this
ExternalType
are “safe to cache”.The default value
None
will emit a warning and then not allow caching of a statement which includes this type. Set toFalse
to disable statements using this type from being cached at all without a warning. When set toTrue
, the object’s class and selected elements from its state will be used as part of the cache key. For example, using aTypeDecorator
:class MyType(TypeDecorator): impl = String cache_ok = True def __init__(self, choices): self.choices = tuple(choices) self.internal_only = True
The cache key for the above type would be equivalent to:
>>> MyType(["a", "b", "c"])._static_cache_key (<class '__main__.MyType'>, ('choices', ('a', 'b', 'c')))
The caching scheme will extract attributes from the type that correspond to the names of parameters in the
__init__()
method. Above, the “choices” attribute becomes part of the cache key but “internal_only” does not, because there is no parameter named “internal_only”.The requirements for cacheable elements is that they are hashable and also that they indicate the same SQL rendered for expressions using this type every time for a given cache value.
To accommodate for datatypes that refer to unhashable structures such as dictionaries, sets and lists, these objects can be made “cacheable” by assigning hashable structures to the attributes whose names correspond with the names of the arguments. For example, a datatype which accepts a dictionary of lookup values may publish this as a sorted series of tuples. Given a previously un-cacheable type as:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. this is the non-cacheable version, as "self.lookup" is not hashable. """ def __init__(self, lookup): self.lookup = lookup def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self.lookup" ...
Where “lookup” is a dictionary. The type will not be able to generate a cache key:
>>> type_ = LookupType({"a": 10, "b": 20}) >>> type_._static_cache_key <stdin>:1: SAWarning: UserDefinedType LookupType({'a': 10, 'b': 20}) will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning. symbol('no_cache')
If we did set up such a cache key, it wouldn’t be usable. We would get a tuple structure that contains a dictionary inside of it, which cannot itself be used as a key in a “cache dictionary” such as SQLAlchemy’s statement cache, since Python dictionaries aren’t hashable:
>>> # set cache_ok = True >>> type_.cache_ok = True >>> # this is the cache key it would generate >>> key = type_._static_cache_key >>> key (<class '__main__.LookupType'>, ('lookup', {'a': 10, 'b': 20})) >>> # however this key is not hashable, will fail when used with >>> # SQLAlchemy statement cache >>> some_cache = {key: "some sql value"} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'dict'
The type may be made cacheable by assigning a sorted tuple of tuples to the “.lookup” attribute:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. The dictionary is stored both as itself in a private variable, and published in a public variable as a sorted tuple of tuples, which is hashable and will also return the same value for any two equivalent dictionaries. Note it assumes the keys and values of the dictionary are themselves hashable. """ cache_ok = True def __init__(self, lookup): self._lookup = lookup # assume keys/values of "lookup" are hashable; otherwise # they would also need to be converted in some way here self.lookup = tuple((key, lookup[key]) for key in sorted(lookup)) def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self._lookup" ...
Where above, the cache key for
LookupType({"a": 10, "b": 20})
will be:>>> LookupType({"a": 10, "b": 20})._static_cache_key (<class '__main__.LookupType'>, ('lookup', (('a', 10), ('b', 20))))
Added in version 1.4.14: - added the
cache_ok
flag to allow some configurability of caching forTypeDecorator
classes.Added in version 1.4.28: - added the
ExternalType
mixin which generalizes thecache_ok
flag to both theTypeDecorator
andUserDefinedType
classes.See also
sql_caching
- get_col_spec(**kwargs)
Returns the column type.
- property python_type
Return the Python type object expected to be returned by instances of this type, if known.
Basically, for those types which enforce a return type, or are known across the board to do such for all common DBAPIs (like
int
for example), will return that type.If a return type is not defined, raises
NotImplementedError
.Note that any type also accommodates NULL in SQL which means you can also get back
None
from any type in practice.
- class ord_schema.orm.rdkit_mappers.RDKitReactions(**kwargs)
Bases:
Base
Table for storing reaction objects and associated RDKit cartridge data.
- id
- classmethod matches_smarts(pattern: str) ColumnElement[bool]
- reaction
- reaction_smiles
- class ord_schema.orm.rdkit_mappers.RDKitSfp
Bases:
UserDefinedType
https://github.com/rdkit/rdkit/blob/master/Code/PgSQL/rdkit/rdkit.sql.in#L105.
- cache_ok: bool | None = True
Indicate if statements using this
ExternalType
are “safe to cache”.The default value
None
will emit a warning and then not allow caching of a statement which includes this type. Set toFalse
to disable statements using this type from being cached at all without a warning. When set toTrue
, the object’s class and selected elements from its state will be used as part of the cache key. For example, using aTypeDecorator
:class MyType(TypeDecorator): impl = String cache_ok = True def __init__(self, choices): self.choices = tuple(choices) self.internal_only = True
The cache key for the above type would be equivalent to:
>>> MyType(["a", "b", "c"])._static_cache_key (<class '__main__.MyType'>, ('choices', ('a', 'b', 'c')))
The caching scheme will extract attributes from the type that correspond to the names of parameters in the
__init__()
method. Above, the “choices” attribute becomes part of the cache key but “internal_only” does not, because there is no parameter named “internal_only”.The requirements for cacheable elements is that they are hashable and also that they indicate the same SQL rendered for expressions using this type every time for a given cache value.
To accommodate for datatypes that refer to unhashable structures such as dictionaries, sets and lists, these objects can be made “cacheable” by assigning hashable structures to the attributes whose names correspond with the names of the arguments. For example, a datatype which accepts a dictionary of lookup values may publish this as a sorted series of tuples. Given a previously un-cacheable type as:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. this is the non-cacheable version, as "self.lookup" is not hashable. """ def __init__(self, lookup): self.lookup = lookup def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self.lookup" ...
Where “lookup” is a dictionary. The type will not be able to generate a cache key:
>>> type_ = LookupType({"a": 10, "b": 20}) >>> type_._static_cache_key <stdin>:1: SAWarning: UserDefinedType LookupType({'a': 10, 'b': 20}) will not produce a cache key because the ``cache_ok`` flag is not set to True. Set this flag to True if this type object's state is safe to use in a cache key, or False to disable this warning. symbol('no_cache')
If we did set up such a cache key, it wouldn’t be usable. We would get a tuple structure that contains a dictionary inside of it, which cannot itself be used as a key in a “cache dictionary” such as SQLAlchemy’s statement cache, since Python dictionaries aren’t hashable:
>>> # set cache_ok = True >>> type_.cache_ok = True >>> # this is the cache key it would generate >>> key = type_._static_cache_key >>> key (<class '__main__.LookupType'>, ('lookup', {'a': 10, 'b': 20})) >>> # however this key is not hashable, will fail when used with >>> # SQLAlchemy statement cache >>> some_cache = {key: "some sql value"} Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'dict'
The type may be made cacheable by assigning a sorted tuple of tuples to the “.lookup” attribute:
class LookupType(UserDefinedType): """a custom type that accepts a dictionary as a parameter. The dictionary is stored both as itself in a private variable, and published in a public variable as a sorted tuple of tuples, which is hashable and will also return the same value for any two equivalent dictionaries. Note it assumes the keys and values of the dictionary are themselves hashable. """ cache_ok = True def __init__(self, lookup): self._lookup = lookup # assume keys/values of "lookup" are hashable; otherwise # they would also need to be converted in some way here self.lookup = tuple((key, lookup[key]) for key in sorted(lookup)) def get_col_spec(self, **kw): return "VARCHAR(255)" def bind_processor(self, dialect): ... # works with "self._lookup" ...
Where above, the cache key for
LookupType({"a": 10, "b": 20})
will be:>>> LookupType({"a": 10, "b": 20})._static_cache_key (<class '__main__.LookupType'>, ('lookup', (('a', 10), ('b', 20))))
Added in version 1.4.14: - added the
cache_ok
flag to allow some configurability of caching forTypeDecorator
classes.Added in version 1.4.28: - added the
ExternalType
mixin which generalizes thecache_ok
flag to both theTypeDecorator
andUserDefinedType
classes.See also
sql_caching
- get_col_spec(**kwargs)
Returns the column type.
- property python_type
Return the Python type object expected to be returned by instances of this type, if known.
Basically, for those types which enforce a return type, or are known across the board to do such for all common DBAPIs (like
int
for example), will return that type.If a return type is not defined, raises
NotImplementedError
.Note that any type also accommodates NULL in SQL which means you can also get back
None
from any type in practice.
- ord_schema.orm.rdkit_mappers.rdkit_cartridge() bool
Returns whether to use RDKit PostgreSQL cartridge functionality.