I have been using Python to insert parsed data from a JSON server response into a MySQL database.
The JSON data consists of around 10,000 responses, each holding around 40 nested pieces of information.
For illustration - it holds information on 10,000 journeys. For each of these journeys, there are the constitutional parts of the journeys. Each journey can have up to 25 legs (for example, take road A10 for 3km). Within these legs, there can be multiple parts (walk to corner of B231).
What I have been doing is iterating over the entire set of results at each level. So, my python begins by inserting all of the journey information for the 10,000 journeys. It then moves to the next level and inserts this tier of information (the legs). For example, the constitutional, leg parts of the journey. At this point, to maintain integrity, I have to associate this information with its parent, such that the correct leg is associated with its correct parent journey. I do this by searching the DB for the journey information and identifying the foreign key using the unique combination of origin, destination and time.
query = ('SELECT request_no FROM Journey WHERE Orig_lat = %s AND Orig_lng = %s AND Dest_lat = %s AND Dest_lng = %s AND _sent_time_stamp = %s')
Orig_lat = leg['start_location']['lat']
Orig_lng = leg['start_location']['lng']
Dest_lat = leg['end_location']['lat']
Dest_lng = leg['end_location']['lng']
_sent_time_stamp = leg['_sent_time_stamp']
cursor.execute(query,(Orig_lat, Orig_lng, Dest_lat, Dest_lng, _sent_time_stamp))
request_no = cursor.fetchone()[0]
This is rather clunky, but works.
However, as I move to the next level of information, the legs of the legs data, it has become too complex.
The database effectively captures the same origin and destination information over different time periods. As such, in order to uniquely identify the correct leg, they must feature a time column. This would require me to add time information to all of the leg tables, which would be significant and seems rather wasteful to me
Therefore my question is - is it possible for me to enter all of data for a journey, in one go and thus associate the legs with their parents, in one go - rather than inserting each level independently and then searching the database to relate it to its correct parent
Apologies for the poor explanation - I am a civil engineer & finding my way here. Guidance, nudges etc all welcome. I am learning by doing
EDIT -
How I parse and loop insert the first and second layers of information seperately, for all of the results
for result in results:
if result["status"] == "OK":
for leg in result['routes'][0]['legs']:
try:
params = {
"_sent_time_stamp": leg['_sent_time_stamp'],
"distance": leg['distance']['value'],
"duration": leg['duration']['value'],
"Orig_lat": leg['start_location']['lat'],
"Orig_lng": leg['start_location']['lng'],
"Orig_LSOA": leg['Orig_LSOA'],
"Dest_lat": leg['end_location']['lat'],
"Dest_lng": leg['end_location']['lng'],
"arrival_time": leg['arrival_time']['value'],
"departure_time": leg['departure_time']['value']
}
cursor.execute(add_overall_data, params)
query = ('SELECT request_no FROM Journey WHERE Orig_lat = %s AND Orig_lng = %s AND Dest_lat = %s AND Dest_lng = %s AND _sent_time_stamp = %s')
Orig_lat = leg['start_location']['lat']
Orig_lng = leg['start_location']['lng']
Dest_lat = leg['end_location']['lat']
Dest_lng = leg['end_location']['lng']
_sent_time_stamp = leg['_sent_time_stamp']
cursor.execute(query,(Orig_lat, Orig_lng, Dest_lat, Dest_lng, _sent_time_stamp))
request_no = cursor.fetchone()[0]
except KeyError:
print("Key_error")
for steps in result['routes'][0]['legs'][0]['steps']:
try:
params = {
"request_no": request_no,
"leg_distance": steps['distance']['value'],
"leg_duration": steps['duration']['value'],
"leg_Orig_lat": steps['start_location']['lat'],
"leg_Orig_lng": steps['start_location']['lng'],
"leg_Dest_lat": steps['end_location']['lat'],
"leg_Dest_lng": steps['end_location']['lng'],
"leg_html_inst": steps['html_instructions'],
"leg_polyline": steps['polyline']['points'],
"leg_travel_mode": steps['travel_mode'],
"leg_sent_time_stamp": leg['_sent_time_stamp']
}
cursor.execute(add_leg_data, params)