0

I have two tables:

The table LINKS has:

LINK_ID --- integer, unique ID
FROM_NODE_X -- numbers/floats, indicating a geographical position
FROM_NODE_Y --
FROM_NODE_Z --
TO_NODE_X --
TO_NODE_Y --
TO_NODE_Z --

The table LINK_COORDS has:

LINK_ID --- integer, refers to above UID
ORDER --- integer, indicating order
X ---
Y ---
Z ---

Logically each LINK consists of a number of waypoints. The final order is:

FROM_NODE , 1 , 2 , 3 , ... , TO_NODE

A link has at least two waypoints (FROM_NODE, TO_NODE), but can have a variable number of waypoints in between (0 to 100+).

I now would need a way to aggregate, sort and store the waypoints of each link in an array which later will be used to draw a line.

I'm struggling with the LINK_COORDS being available as individual rows. Having the start and end positions in the other (LINKS) table doesn't help either. If I had a way to at least get all the LINK_COORDS joined/updated to the LINKS table I probably could work out the rest myself again. So if you have an idea on how to get that far, it'd be much appreciated already.

Considering performance would be nice (the tables have somewhere between 500k to 1mio entries now and will have multiples of that later), but is not essential for now.

EDIT: Thanks for the suggestion, a-horse-with-no-name. I chose to create the point geometries (PostGIS) for each XYZ before this step, so in the end there's only an array of points to create from the individual points. The adapted SQL

UPDATE "Link"
SET "POINTS" = 
    array_append(
        (array_prepend(
            "FROM_POINT",
            (SELECT array_agg(lc."POINT" ORDER BY lc."COUNT")
                FROM "LinkCoordinate" lc 
                WHERE lc."LINK_ID" = "Link"."LINK_ID")))
        , "TO_POINT")

however is running extremely slow: Running it exemplary on 10 links required ~120 seconds. Running it for all the 1,3mio links and plenty more linkcoords would probably take somewhere around half a year. Not really ideal.

How can I figure out where this immense slowness originates from?

If I get the source data in a pre-ordered format (so linkcoordinates of each link_ID), would this allow me to significantly speed up the SQL query?

EDIT: It appears the main slowdown originates from the SELECT subquery used in the array_agg() function. Everything else (incl. ordering) does not really cause any slowdown.

My current guess is that the SELECT query iterates over the entirety of "LinkCoordinate" for each and every link, making it work much harder than it has to, as all LinkCoordinates belonging to a Link are always stored in 'blocks' of rows. A single, sequential processing of the LinkCoordinates would be sufficient, really.

1 Answer 1

0

something like this maybe:

select l.link_id, 
       min(l.from_node_x) as from_node_x, 
       min(l.from_node_y) as from_node_y, 
       min(l.from_node_z) as from_node_z,
       array_agg(lc.x order by lc."ORDER") as points_x,
       array_agg(lc.y order by lc."ORDER") as points_y,
       array_agg(lc.z order by lc."ORDER") as points_z,
       min(l.to_node_x) as to_node_x, 
       min(l.to_node_y) as to_node_y, 
       min(l.to_node_z) as to_node_z
from links l 
  join link_coords lc on lc.link_id = l.link_id
group by l.link_id;

The min() is necessary due to the group by but won't change the result as all values from the links are the same anyway.

Another possibility is to use a scalar subquery. I'm unsure which of them is faster though - but the join/group by is probably more efficient.

select l.link_id, 
       l.from_node_x, 
       l.from_node_y, 
       l.from_node_z,
       (select array_agg(lc.x order by lc."ORDER") from link_coords lc where lc.link_id = l.link_id) as points_x,
       (select array_agg(lc.y order by lc."ORDER") from link_coords lc where lc.link_id = l.link_id) as points_y,
       (select array_agg(lc.z order by lc."ORDER") from link_coords lc where lc.link_id = l.link_id) as points_z,
       l.to_node_x, 
       l.to_node_y, 
       l.to_node_z
from links l 

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.