Context:
I am building an application prototype using the Python Flask framework that is intended to store stratigraphical (geological) information inside a postgresql JSON column.
Each row represents a location where the data is coming from and some other metadata. The essential stratigraphic information can contain an arbitrary number of attributes,and it is not wanted to restrict the set of attributes in any way. Each of them is connected with a stratigrapic unit described by its upper and lower bound in terms of depth.
Approach:
A stratigraphical record could look like this in JSON: (note that the units can overlap in depth)
{
"strat_units": [
{
"id": 1,
"top": 0,
"bottom": 20,
"attribute": "parameter 1",
"value": 0.1
},
{
"id": 2,
"top": 10,
"bottom": 30,
"attribute": "parameter 2",
"value": 0.2
}
]
}
From this example it is obvious that the structure of the data makes it neccessary to store it in a non-relational fashion. But as the application will also deal with geographic analysis, it seems to be favourable to stick to PostgreSQL with PostGIS enabled.
In order to store the required data inside Postgres, I use the following SQLAlchemy Model:
class Site(Base):
site_name = db.Column(db.String(128), nullable=False)
geom = db.Column(Geometry(geometry_type='POINT', srid=4326))
stratigraphy = db.Column(JSON)
Question:
Given the above data structure, how can I query, access and manipulate the data, stored inside the JSON column, using plain SQL and SQLAlchemy (ORM)?
For example:
- How can I
SELECT
all rows that have a stratigrahic unit with"parameter x" > y
- How can I
SET
or manipulate the data using SQLAlchemy, targeting with the row ID and the strat_unit ID?
UPDATE:
The data model has changed, but the question stays the same. The model now looks like this:
class Record(Base):
"""represents single stratigraphic units"""
__tablename__ = 'records'
__table_args__ = {"schema":"public"}
# ID of corresponding site:
site_id = db.Column(db.Integer, db.ForeignKey('public.sites.id'))
# depth values:
depth = db.Column(db.Numeric, nullable=True)
upper_boundary = db.Column(db.Numeric, nullable=True)
lower_boundary = db.Column(db.Numeric, nullable=True)
# stratigraphic properties, represented as key/value store
properties = db.Column(JSONB)
With the properties JSON looking like:
{"key": "value", "loss_on_ignition": "0.1"}
How can I
- get all unique keys in all rows?
- get all rows containing 'key'?
- get all rows having 'key' = 'value'?
- update a row/key?
cast
? – Anzel Dec 29 '14 at 2:30key
orvalue
, use index operation perhaps withas_text
. – Anzel Apr 4 at 22:59