-
- Notifications
You must be signed in to change notification settings - Fork 19.2k
Closed
Labels
Description
Suppose you have an existing SQL table called person_age, where id is the primary key:
age id 118 242and you also have new data in a DataFrame called extra_data
age id 244 395then it would be useful to have an option on extra_data.to_sql() that allows to pass the DataFrame to SQL with an INSERT or UPDATE option on the rows, based on the primary key.
In this case, the id=2 row would get updated to age=44 and the id=3 row would get added
Expected Output
age id 118 244 395(Maybe) helpful code references
- Use
mergefrom SQLAlchemy? - The query: '''INSERT or REPLACE into person_age (id, age) values (?,?,?) ''' in this question
I looked at pandas sql.py sourcecode to come up with a solution, but I couldn't follow.
Code to replicate the example above
(Apologies for mixing sqlalchemy and sqlite
import pandas as pd from sqlalchemy import create_engine import sqlite3 conn = sqlite3.connect('example.db') c = conn.cursor() c.execute('''DROP TABLE IF EXISTS person_age;''') c.execute(''' CREATE TABLE person_age (id INTEGER PRIMARY KEY ASC, age INTEGER NOT NULL) ''') conn.commit() conn.close() ##### Create original table engine = create_engine("sqlite:///example.db") sql_df = pd.DataFrame({'id' : [1, 2], 'age' : [18, 42]}) sql_df.to_sql('person_age', engine, if_exists='append', index=False) #### Extra data to insert/update extra_data = pd.DataFrame({'id' : [2, 3], 'age' : [44, 95]}) extra_data.set_index('id', inplace=True) #### extra_data.to_sql() with row update or insert option expected_df = pd.DataFrame({'id': [1, 2, 3], 'age': [18, 44, 95]}) expected_df.set_index('id', inplace=True)slarrain, Casyfill, lematt1991, luchko, jprockbelly and 136 moreMahmudRana, philipheu, Atlaoui, andybyte, nigma1985 and 5 more