Skip to content

Conversation

ruslandoga
Copy link
Contributor

@ruslandoga ruslandoga commented Oct 14, 2024

This PR adds a helper NIF that can be used to bulk-insert "unlimited" number of rows.

Should be enough to clоse #129 once it gets into ecto_sqlite3

Regarding performance: XQLite uses this approach in insert_all benchmark in https://github.com/ruslandoga/elixir-sqlite-bench

ruslandoga and others added 12 commits October 14, 2024 20:21
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
@warmwaffles
Copy link
Member

This looks good.

Co-authored-by: Matthew Johnston <warmwaffles@gmail.com>
@warmwaffles
Copy link
Member

I do not like how often the windows test flaps.

@warmwaffles
Copy link
Member

Ugh

c_src\sqlite3_nif.c(703): error C2057: expected constant expression c_src\sqlite3_nif.c(703): error C2466: cannot allocate an array of constant size 0 c_src\sqlite3_nif.c(703): error C2133: 'types_array': unknown size 

// TODO: MSVC can't do VLA
// int types_array[stmt_param_count];
int* types_array = (int*)malloc(stmt_param_count * sizeof(int));
Copy link
Member

@warmwaffles warmwaffles Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use enif_alloc for this. This way the VM will know about the memory allocated and will count it correctly. Also enif_free to free the memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I did it in the very next commit: cf69cd5

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I was walking the changes.

Comment on lines +797 to +798
if (rc != SQLITE_DONE) {
return make_sqlite3_error_tuple(env, rc, sqlite3_db_handle(statement->statement));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are missing cleanup code here. We need to enif_free(types_array) before all of these error returns.

Copy link
Contributor Author

@ruslandoga ruslandoga Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. And not just here, but for every return.

Copy link
Contributor Author

@ruslandoga ruslandoga Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll need to find another way to pass the types information. The current approach is getting complicated without VLA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry!

Copy link
Member

@warmwaffles warmwaffles Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just the nature of C.

One way would be to use a common cleanup statement with goto like this.

{ // ... if (!enif_get_list_cell(env, head, &param, &head)) { goto bad_arg_exception; } // .. enif_free(types_array); return am_done; bad_arg_exception: enif_free(types_array); return raise_badarg(env, param); }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just the nature of C.

I think it should be possible to implement insert_all or a similar function for bulk inserts with no manual memory allocation. It's only used in a small part of the function -- just to know what types to insert -- but ends up affecting all control flow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how often this feature would be used and if it is worth it to solve in the NIF.

Copy link
Contributor Author

@ruslandoga ruslandoga Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Repo.insert_all calls would go through this NIF and the performance improvement is quite significant (seems close to x2 from my benchmarks, but potentially higher because it allows reusing a single prepared statement for all inserts into a table) so the I think it's worth it.

@ruslandoga ruslandoga closed this Nov 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants