3

The table: Flight (flight_num, src_city, dest_city, dep_time, arr_time, airfare, mileage)

I need to find the cheapest fare for unlimited stops from any given source city to any given destination city. The catch is that this can involve multiple flights, so for example if I'm flying from Montreal->KansasCity I can go from Montreal->Washington and then from Washington->KansasCity and so on. How would I go about generating this using a Postgres query?

Sample Data:

create table flight(
  flight_num BIGSERIAL PRIMARY KEY,
  source_city varchar,
  dest_city varchar,
  dep_time int,
  arr_time int,
  airfare int,
  mileage int
);


insert into flight VALUES
  (101,    'Montreal', 'NY',         0530,     0645,    180,      170),
  (102,    'Montreal', 'Washington',     0100,     0235,    100,      180),
  (103,    'NY',   'Chicago',        0800,     1000,    150,      300),
  (105,    'Washington', 'KansasCity',    0600,     0845,    200,      600),
  (106,    'Washington', 'NY',         1200,     1330,     50,       80),
  (107,    'Chicago',  'SLC',        1100,     1430,    220,      750),
  (110,    'KansasCity',  'Denver',         1400,     1525,    180,      300),
  (111,    'KansasCity',  'SLC',        1300,     1530,    200,      500),
  (112,    'SLC',    'SanFran',        1800,     1930,     85,      210),
  (113,    'SLC',    'LA',         1730,     1900,    185,      230),
  (115,    'Denver', 'SLC',        1500,     1600,     75,      300),
  (116,    'SanFran',  'LA',         2200,     2230,     50,       75),
  (118,    'LA',   'Seattle',        2000,     2100,    150,      450);
6
  • Are you limited in the # stops? Assuming there can only be a reasonable # connections (1-3 say), you could just write the three different queries and do a union. Commented Nov 18, 2013 at 3:04
  • I have a lot of latitude here. Can you show me how you might go about that? Commented Nov 18, 2013 at 3:05
  • You might be able to express this problem using pgrouting. In general, finding shortest/cheapest/etc path from A to B is a combinatorial problem - you need to restrict the search domain, otherwise you'll be considering (n)*(n-1)*(n-2)*(n-3).... combinations of flights, which is clearly not going to be viable. You can pre-filter to exclude obviously unacceptable candidates, but even so - routing is hard. Model "cost" as "distance" and use a proper routing algorithm, this is well explored. Commented Nov 18, 2013 at 3:34
  • Also, please add sample data in the form of CREATE TABLE and insert statements. Then we have something to work with when demoing examples. Comment here if you add it and I'll take a look; not interested in making up mock data for this problem. Commented Nov 18, 2013 at 3:36
  • Alright, updated with sample data per @CraigRinger's request Commented Nov 18, 2013 at 4:39

3 Answers 3

5

[this answer is based on Gordon's]

I changed arr_time and dep_time to TIME datatypes, which makes calculations easier. Also added result columns for total_time and waiting_time. Note: if there are any loops possible in the graph, you will need to avoid them (possibly using an array to store the path)

WITH RECURSIVE segs AS (
  SELECT f0.flight_num::text as flight
            , src_city, dest_city
            , dep_time AS departure
            , arr_time AS arrival
            , airfare, mileage
            , 1 as hops
            , (arr_time - dep_time)::interval AS total_time
            , '00:00'::interval as waiting_time
  FROM flight f0
  WHERE src_city = 'SLC' -- <SRC_CITY>
  UNION ALL
  SELECT s.flight || '-->' || f1.flight_num::text as flight
            , s.src_city, f1.dest_city
            , s.departure AS departure
            , f1.arr_time AS arrival
            , s.airfare + f1.airfare as airfare
            , s.mileage + f1.mileage as mileage
            , s.hops + 1 AS hops
            , s.total_time + (f1.arr_time - f1.dep_time)::interval AS total_time
            , s.waiting_time + (f1.dep_time - s.arrival)::interval AS waiting_time
  FROM segs s
     JOIN flight f1
       ON f1.src_city = s.dest_city
       AND f1.dep_time > s.arrival -- you can't leave until you are there
)
SELECT *
FROM segs
WHERE dest_city = 'LA' -- <DEST_CITY>
ORDER BY airfare desc
    ;

FYI: the changes to the table structure:

create table flight
  ( flight_num BIGSERIAL PRIMARY KEY
  , src_city varchar
  , dest_city varchar
  , dep_time TIME
  , arr_time TIME
  , airfare INTEGER
  , mileage INTEGER
);

And to the data:

insert into flight VALUES
  (101,    'Montreal',          'NY',                   '05:30',     '06:45',    180,      170),
  (102,    'Montreal',          'Washington',           '01:00',     '02:35',    100,      180),
  (103,    'NY',                'Chicago',              '08:00',     '10:00',    150,      300),
  (105,    'Washington',        'KansasCity',           '06:00',     '08:45',    200,      600),
  (106,    'Washington',        'NY',                   '12:00',     '13:30',     50,       80),
  (107,    'Chicago',           'SLC',                  '11:00',     '14:30',    220,      750),
  (110,    'KansasCity',        'Denver',               '14:00',     '15:25',    180,      300),
  (111,    'KansasCity',        'SLC',                  '13:00',     '15:30',    200,      500),
  (112,    'SLC',               'SanFran',              '18:00',     '19:30',     85,      210),
  (113,    'SLC',               'LA',                   '17:30',     '19:00',    185,      230),
  (115,    'Denver',            'SLC',                  '15:00',     '16:00',     75,      300),
  (116,    'SanFran',           'LA',                   '22:00',     '22:30',     50,       75),
  (118,    'LA',                'Seattle',              '20:00',     '21:00',    150,      450);
3
  • ERROR: cannot cast type integer to interval LINE 8: , (arr_time - dep_time)::interval AS total_time Commented Nov 18, 2013 at 16:33
  • Weel, it works here (pg 9.3.1) Note: I did change the times to the datatype TIME (and changed the all the literals to '18:45', etc) , which makes more sense to me. Commented Nov 18, 2013 at 17:47
  • Any idea how I would exclude certain cities from being considered in the query? Say, for example, that I wouldn't want to go through SLC on a particular flight. I tried inserting some logic into the WHERE clauses, but to no avail. Commented Nov 19, 2013 at 17:10
1

You want to use a recursive CTE for this. However, you will have to make a decision about how many flights to include. The following (untested) query shows how to do this, limiting the number of flight segments to 5:

with recursive segs as (
      select cast(f.flight_num as varchar(255)) as flight, src_city, dest_city, dept_time,
             arr_time, airfare, mileage, 1 as numsegs
      from flight f
      where src_city = <SRC_CITY>
      union all
      select cast(s.flight||'-->'||cast(f.flight_num as varchar(255)) as varchar(255)) as flight, s.src_city, f.dest_city,
             s.dept_time, f.arr_time, s.airfare + f.airfare as airfare,
             s.mileage + f.mileage as milage, s.numsegs + 1
      from segs s join
           flight f
           on s.src_city = f.dest_city
      where s.numsegs < 5
)
select *
from segs
where dest_city = <DEST_CITY>
order by airfare desc
limit 1;
2
  • Getting an error: ERROR: recursive query "segs" column 1 has type character varying(255) in non-recursive term but type character varying overall LINE 2: select cast(f.flight_num as varchar(255)) as flight, s... ^ HINT: Cast the output of the non-recursive term to the correct type. Commented Nov 18, 2013 at 3:19
  • Thanks. Seems as though this is only working for one flight. For example, if Montreal goes directly to Chicago, that works. But if Montreal goes to LAX and LAX goes to Chicago, then nothing appears. Commented Nov 18, 2013 at 3:28
0

Something like this:

select * from 
(select flight_num, airfare from flight where src_city = ? and  dest_city = ?
union
select f1.flight_num || f2.flight_num, f1.airfare+f2.airfare 
from flight f1, flight f2 where f1.src_city = ? and  f2.dest_city = ? and f1.dest_city = f2.src_city
union
...
) s order by airfare desc

I didn't test that as I'm leaving that for you so there might be subtle problems that require testing. This is clearly homework since no airline plans things this way. So I don't mind leaving you extra work.

1
  • This is overly complex and inefficient. A recursive query (as shown by Gordon or and joop) is a much better solution Commented Nov 18, 2013 at 15:31

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.