I have a pandas DataFrame and a (MySQL) database with the same columns. The database is not managed by me.
I want to update the values in the database in an "UPDATE... WHERE" style, updating only some columns wherever some other columns match.
Here's my code:
import sqlalchemy as sqla
def save_to_db(final_df, passwd):
engine_str = 'mysql+mysqldb://username:{}@localhost/mydb'.format(passwd)
engine = sqla.create_engine(engine_str)
sm = sessionmaker(bind=engine)
session = sm()
metadata = sqla.MetaData(bind=engine)
datatable = sqla.Table('AdcsLogForProduct', metadata, autoload=True)
for ind, row in final_df.iterrows():
u = sqla.sql.update(datatable) \
.values({"q_ECI_B_x": row.q_ECI_B_x,
"q_ECI_B_y": row.q_ECI_B_y,
"q_ECI_B_z": row.q_ECI_B_z,
"q_ECI_B_s": row.q_ECI_B_s}) \
.where(sqla.and_(datatable.c.year == row.year,
datatable.c.month == row.month,
datatable.c.day == row.day,
datatable.c.hours == row.hours,
datatable.c.minutes == row.minutes,
datatable.c.seconds == row.seconds,
datatable.c.milliseconds == row.milliseconds,
datatable.c.microseconds == row.microseconds))
session.execute(u)
session.flush()
session.commit()
I'm doing this with plain sqlalchemy because apparently pandas' built-in SQL functions can't handle "UPDATE... WHERE" scenarios. However, this is really slow.
Isn't there a more efficient way to do this?