Using --schema flag for having dump schema level granularity and make COPY less error prone #14
base: master
Conversation
Like in pg_dump, and introduce --sample_schema, which takes over the old functionality.

Using the --schema flag for schema-level dump granularity; improved per-schema filtering to make COPY less error-prone:

- changed DELIMITER, QUOTE, and ESCAPE for better handling of complicated data-type fields (JSON in particular); see the example below
- added the table columns explicitly to the COPY ... FROM statement, to handle situations where the column order differs between the source and destination databases
- added the --password flag to explicitly ask for a password when pg_dump is needed (i.e. each run without --data-only)
- commented out standard_conforming_strings = off, so the setting stays at its default value (on), which is needed for import with the \t delimiter
- introduced the requirement that --sample_schema, if specified, must include _pg_sample in its name, to avoid the risk of accidental deletion when used with --force
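For illustration, the data sections the patched script emits take roughly this shape (schema, table, and column names are hypothetical; E'\b' and E'\\' are spelled out here so the statement is valid standalone SQL, whereas the script emits the backspace character directly):

```sql
-- Hypothetical example of an emitted data section: the column list is
-- explicit, so a different column order in the destination doesn't matter.
COPY public.users (id, email, profile) FROM stdin WITH CSV DELIMITER E'\t' QUOTE E'\b' ESCAPE E'\\';
-- ...tab-delimited rows...
\.
```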
@mla I am not much into Perl. Any kind of feedback will be welcome ✊
These look like great changes, @andilabs. Why was the --password option added to pg_dump?
I saw your comment on the change. Just wondering if it's actually necessary since pg_dump says:
I found this flag necessary, at least for my production db. I was also wondering about that when I read in the docs that it isn't needed 🤷🏼♂️
On Wed, 5 Feb 2020 at 17:31, mla wrote:

I saw your comment on the change. Just wondering if it's actually necessary since pg_dump says:

--password

Force pg_dump to prompt for a password before connecting to a database.

This option is never essential, since pg_dump will automatically prompt for a password if the server demands password authentication. However, pg_dump will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W to avoid the extra connection attempt.
--
Regards,
Andrzej
Hi @mla, how are you doing in coronavirus times? I hope you are doing well! Cheers!
Hey @andilabs. Doing okay, thanks. Hope you are too. Yes, sorry for the delay on this. Will target this weekend.
@andilabs can you explain what issue you hit with the COPY being error-prone? Or even better, an example test case? From the project dir, you should be able to run: prove -v

On the standard_conforming_strings setting, I see you commented it out with a comment that we'll use the Pg default. But isn't it more robust to know what the setting is? I don't particularly care whether setting it on or off is better, but it seems like we should try to minimize any configuration differences between environments.

I've been working in the dev branch, working through your changes, if you want to see the current state. The splitting of schema and table name shouldn't be necessary. The values in @table are Table instances, so we should be able to inspect the schema() and table() accessors directly.

If you could just explain the string_agg stuff, and why the CSV DELIMITER specifications and such were added to the COPY, I can better understand what problem was being triggered. Thanks!
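A minimal sketch of the accessor-based approach described above, assuming @table holds Table instances with schema() and table() accessors as stated; the bind-parameter query is an illustrative variant of the PR's string_agg query:

```perl
# Sketch only: assumes $dbh is the script's DBI handle and @table holds
# Table objects exposing schema() and table(), per the comment above.
for my $t (@table) {
    my $schema_name = $t->schema;   # no string splitting / quote stripping
    my $table_name  = $t->table;

    # Same idea as the PR's query, but with bind parameters; ORDER BY
    # ordinal_position is added because the aggregation order is otherwise
    # not guaranteed by PostgreSQL.
    my $q = "SELECT string_agg(quote_ident(column_name), ',' ORDER BY ordinal_position)"
          . " FROM information_schema.columns"
          . " WHERE table_schema = ? AND table_name = ?";
    my ($columns) = $dbh->selectrow_array($q, undef, $schema_name, $table_name);
}
```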
Also, I would rather we not add the --password option to the pg commands. The pg_dump docs state, for example:
So what is happening for you? It is not prompting you for a password? We already support the connection params as part of the main script, so probably those should just be propagated to the pg_dump call if supplied.
I guess there's no great way to pass the password securely. There is a PGPASSWORD env var, but its use is not recommended. There's a PGPASSFILE var that we could use, but it seems like a bit of a pain: we could create a correctly formatted pgpass file and point to that. https://www.postgresql.org/docs/9.6/libpq-pgpass.html Or we could pass the --password option to pg_dump if it was supplied to pg_sample. But I thought that would be triggered automatically.
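A minimal sketch of the pgpass-file idea, assuming illustrative connection variables ($host, $port, $dbname, $user, $password); the file format is the documented hostname:port:database:username:password, and libpq requires mode 0600, which File::Temp's tempfile() already uses:

```perl
use File::Temp qw(tempfile);

# Sketch only: $host, $port, $dbname, $user, $password are illustrative.
# tempfile() creates the file with mode 0600, which libpq requires.
my ($fh, $pgpass) = tempfile(UNLINK => 1);

# pgpass format: hostname:port:database:username:password
# (a ':' or '\' inside a field would need backslash-escaping)
print {$fh} join(':', $host, $port, $dbname, $user, $password), "\n";
close $fh or die "close: $!";

local $ENV{PGPASSFILE} = $pgpass;   # point libpq at this file instead of ~/.pgpass
system('pg_dump', '-h', $host, '-p', $port, '-U', $user, $dbname) == 0
    or die "pg_dump failed: $?";
```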
my ($schema_name, $table_name) = split('\.', $table, 2);   # split "schema"."table" at the first dot
my $cleaned_table_name = substr $table_name, 1, -1;        # strip the surrounding double quotes
my $cleaned_schema_name = substr $schema_name, 1, -1;
# build a comma-separated, quote_ident()-quoted column list for this table
my ($q) = "SELECT string_agg(quote_ident(column_name), ',') FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '$cleaned_table_name' AND TABLE_SCHEMA = '$cleaned_schema_name'";
@mla this was needed to explicitly preserve the order of columns, which may differ between the prod and dev DBs.
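For illustration (schema, table, and column names are hypothetical), the query yields a ready-made, quoted column list:

```sql
-- Hypothetical run of the column-list query used above:
SELECT string_agg(quote_ident(column_name), ',')
  FROM information_schema.columns
 WHERE table_schema = 'public' AND table_name = 'users';
-- => id,email,profile
```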
my $cleaned_schema_name = substr $schema_name, 1, -1;      # strip the surrounding double quotes
my ($q) = "SELECT string_agg(quote_ident(column_name), ',') FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = '$cleaned_table_name' AND TABLE_SCHEMA = '$cleaned_schema_name'";
my ($column_names_to_keep_order) = $dbh->selectrow_array($q);   # e.g. "id,email,profile"
# "\b" interpolates to a literal backspace, used as a CSV quote char that never occurs in the data
print "COPY $table ($column_names_to_keep_order) FROM stdin WITH CSV DELIMITER E'\\t' QUOTE '\b' ESCAPE '\\';\n";
@mla this was for the proper escaping of the JSONB fields mentioned above.
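To illustrate why the default text format is fragile here (the table name below is hypothetical): in COPY's text format the backslash is an escape character, and JSON string values routinely contain backslashes, so the dump and the restore must agree on settings such as standard_conforming_strings. The CSV form emitted above pins the delimiter, quote, and escape down explicitly:

```sql
-- Hypothetical illustration: a JSONB value holding one literal backslash
-- is stored as  {"dir": "C:\\temp"}  (JSON itself escapes the backslash),
-- and text-format COPY doubles each backslash again on output:
--   {"dir": "C:\\\\temp"}
-- The PR's variant makes the escaping rules explicit instead (E'\b' and
-- E'\\' are spelled out here so this is valid standalone SQL; the script
-- emits the backspace character directly):
COPY public.events (id, payload) FROM stdin WITH CSV DELIMITER E'\t' QUOTE E'\b' ESCAPE E'\\';
```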
@mla Will this one get merged?
What features from this are you looking for? Are you hitting a problem?