3

Context:

I am building an application prototype using the Python Flask framework that is intended to store stratigraphical (geological) information inside a postgresql JSON column.

Each row represents a location where the data is coming from and some other metadata. The essential stratigraphic information can contain an arbitrary number of attributes,and it is not wanted to restrict the set of attributes in any way. Each of them is connected with a stratigrapic unit described by its upper and lower bound in terms of depth.

Approach:

A stratigraphical record could look like this in JSON: (note that the units can overlap in depth)

{
  "strat_units": [
    {
      "id": 1,
      "top": 0,
      "bottom": 20,
      "attribute": "parameter 1",
      "value": 0.1
    },
    {
      "id": 2,
      "top": 10,
      "bottom": 30,
      "attribute": "parameter 2",
      "value": 0.2
    }
  ]
}

From this example it is obvious that the structure of the data makes it neccessary to store it in a non-relational fashion. But as the application will also deal with geographic analysis, it seems to be favourable to stick to PostgreSQL with PostGIS enabled.

In order to store the required data inside Postgres, I use the following SQLAlchemy Model:

class Site(Base):
    site_name       = db.Column(db.String(128), nullable=False)
    geom            = db.Column(Geometry(geometry_type='POINT', srid=4326))
    stratigraphy    = db.Column(JSON)

Question:

Given the above data structure, how can I query, access and manipulate the data, stored inside the JSON column, using plain SQL and SQLAlchemy (ORM)?

For example:

  • How can I SELECT all rows that have a stratigrahic unit with "parameter x" > y
  • How can I SET or manipulate the data using SQLAlchemy, targeting with the row ID and the strat_unit ID?

UPDATE:

The data model has changed, but the question stays the same. The model now looks like this:

class Record(Base):

    """represents single stratigraphic units"""

    __tablename__ = 'records'
    __table_args__ = {"schema":"public"}

    # ID of corresponding site:
    site_id         = db.Column(db.Integer, db.ForeignKey('public.sites.id'))

    # depth values:
    depth           = db.Column(db.Numeric, nullable=True)
    upper_boundary  = db.Column(db.Numeric, nullable=True)
    lower_boundary  = db.Column(db.Numeric, nullable=True)

    # stratigraphic properties, represented as key/value store
    properties      = db.Column(JSONB)

With the properties JSON looking like:

{"key": "value", "loss_on_ignition": "0.1"}

How can I

  • get all unique keys in all rows?
  • get all rows containing 'key'?
  • get all rows having 'key' = 'value'?
  • update a row/key?
5
  • 1
    Have you tried using cast? Commented Dec 29, 2014 at 2:30
  • mongodb supports gis too, check this out: docs.mongodb.org/manual/applications/geospatial-indexes. Commented Dec 29, 2014 at 18:40
  • @Anzel I tried some things I found all over the place, but the JSON datastore and its methods as well are not very well documented yet. I am even asking myself if it is generally a good idea at all to store redundant keys in the document store, as i dont yet understand how I can access a specific element of the array. Commented Jan 10, 2015 at 11:13
  • @Anzel I updated the question to match the new (simpler) data model. The question remains the same. Commented Apr 4, 2015 at 21:35
  • @leak_my_brain, that's a hell of questions to answer (and it has been quite some time). As you're using JSONB I take you're on psql 9.4+ and as I mentioned above, to deal with psql querying comparison (especially on numbers), use cast. For checking key or value, use index operation perhaps with as_text. Commented Apr 4, 2015 at 22:59

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.