Take the 2-minute tour ×
Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I have been using Python to insert parsed data from a JSON server response into a MySQL database.

The JSON data consists of around 10,000 responses, each holding around 40 nested pieces of information.

For illustration - it holds information on 10,000 journeys. For each of these journeys, there are the constitutional parts of the journeys. Each journey can have up to 25 legs (for example, take road A10 for 3km). Within these legs, there can be multiple parts (walk to corner of B231).

What I have been doing is iterating over the entire set of results at each level. So, my python begins by inserting all of the journey information for the 10,000 journeys. It then moves to the next level and inserts this tier of information (the legs). For example, the constitutional, leg parts of the journey. At this point, to maintain integrity, I have to associate this information with its parent, such that the correct leg is associated with its correct parent journey. I do this by searching the DB for the journey information and identifying the foreign key using the unique combination of origin, destination and time.

query = ('SELECT request_no FROM Journey WHERE Orig_lat = %s AND Orig_lng = %s AND Dest_lat = %s AND Dest_lng = %s AND _sent_time_stamp = %s')
                Orig_lat = leg['start_location']['lat']
                Orig_lng = leg['start_location']['lng']
                Dest_lat = leg['end_location']['lat']
                Dest_lng = leg['end_location']['lng']
                _sent_time_stamp = leg['_sent_time_stamp']
                cursor.execute(query,(Orig_lat, Orig_lng, Dest_lat, Dest_lng, _sent_time_stamp))
                request_no = cursor.fetchone()[0]

This is rather clunky, but works.

However, as I move to the next level of information, the legs of the legs data, it has become too complex.

The database effectively captures the same origin and destination information over different time periods. As such, in order to uniquely identify the correct leg, they must feature a time column. This would require me to add time information to all of the leg tables, which would be significant and seems rather wasteful to me

Therefore my question is - is it possible for me to enter all of data for a journey, in one go and thus associate the legs with their parents, in one go - rather than inserting each level independently and then searching the database to relate it to its correct parent

Apologies for the poor explanation - I am a civil engineer & finding my way here. Guidance, nudges etc all welcome. I am learning by doing

EDIT -

How I parse and loop insert the first and second layers of information seperately, for all of the results

for result in results:
    if result["status"] == "OK":
        for leg in result['routes'][0]['legs']:
            try:
                params = {
                "_sent_time_stamp": leg['_sent_time_stamp'],
                "distance": leg['distance']['value'],
                "duration": leg['duration']['value'],
                "Orig_lat": leg['start_location']['lat'],
                "Orig_lng": leg['start_location']['lng'],
                "Orig_LSOA": leg['Orig_LSOA'],
                "Dest_lat": leg['end_location']['lat'],
                "Dest_lng": leg['end_location']['lng'],
                "arrival_time": leg['arrival_time']['value'],
                "departure_time": leg['departure_time']['value']
                }
                cursor.execute(add_overall_data, params)
                query = ('SELECT request_no FROM Journey WHERE Orig_lat = %s AND Orig_lng = %s AND Dest_lat = %s AND Dest_lng = %s AND _sent_time_stamp = %s')
                Orig_lat = leg['start_location']['lat']
                Orig_lng = leg['start_location']['lng']
                Dest_lat = leg['end_location']['lat']
                Dest_lng = leg['end_location']['lng']
                _sent_time_stamp = leg['_sent_time_stamp']
                cursor.execute(query,(Orig_lat, Orig_lng, Dest_lat, Dest_lng, _sent_time_stamp))
                request_no = cursor.fetchone()[0]
            except KeyError:
                print("Key_error")
for steps in result['routes'][0]['legs'][0]['steps']:
            try:
                params = {
                "request_no": request_no,
                "leg_distance": steps['distance']['value'],
                "leg_duration": steps['duration']['value'],
                "leg_Orig_lat": steps['start_location']['lat'],
                "leg_Orig_lng": steps['start_location']['lng'],
                "leg_Dest_lat": steps['end_location']['lat'],
                "leg_Dest_lng": steps['end_location']['lng'],
                "leg_html_inst": steps['html_instructions'],
                "leg_polyline": steps['polyline']['points'],
                "leg_travel_mode": steps['travel_mode'],
                "leg_sent_time_stamp": leg['_sent_time_stamp']
                }
                cursor.execute(add_leg_data, params)
share|improve this question

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.