path: root/src/include/common
Commit message | Author | Age | Files | Lines
* Support long distance matching for zstd compressionTomas Vondra2023-04-061-0/+2
  zstd compression supports a special mode for finding matches in the distant past, which may result in a better compression ratio at the expense of using more memory (the window size is 128MB). To enable this optional mode, use the "long" keyword when specifying the compression method (--compress=zstd:long).
  Author: Justin Pryzby
  Reviewed-by: Tomas Vondra, Jacob Champion
  Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
  Discussion: https://postgr.es/m/20220327205020.GM28503@telsasoft.com
* Make SCRAM iteration count configurableDaniel Gustafsson2023-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the hardcoded value with a GUC such that the iteration count can be raised in order to increase protection against brute-force attacks. The hardcoded value for SCRAM iteration count was defined to be 4096, which is taken from RFC 7677, so set the default for the GUC to 4096 to match. In RFC 7677 the recommendation is at least 15000 iterations but 4096 is listed as a SHOULD requirement given that it's estimated to yield a 0.5s processing time on a mobile handset of the time of RFC writing (late 2015). Raising the iteration count of SCRAM will make stored passwords more resilient to brute-force attacks at a higher computational cost during connection establishment. Lowering the count will reduce computational overhead during connections at the tradeoff of reducing strength against brute-force attacks. There are however platforms where even a modest iteration count yields a too high computational overhead, with weaker password encryption schemes chosen as a result. In these situations, SCRAM with a very low iteration count still gives benefits over weaker schemes like md5, so we allow the iteration count to be set to one at the low end. The new GUC is intentionally generically named such that it can be made to support future SCRAM standards should they emerge. At that point the value can be made into key:value pairs with an undefined key as a default which will be backwards compatible with this. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org> Discussion: https://postgr.es/m/F72E7BC7-189F-4B17-BF47-9735EB72C364@yesql.se
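To make the cost of the iteration count concrete, here is a minimal Python sketch of the salted-password derivation that SCRAM-SHA-256 performs (the Hi() function of RFC 5802). It is not the PostgreSQL C code: the function name is hypothetical, and password normalization (SASLprep) is left out. Each extra iteration adds one HMAC computation, which is why a higher count slows both logins and brute-force attempts.

```python
import hashlib
import hmac


def scram_salted_password(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Hi(password, salt, i) from RFC 5802, using HMAC-SHA-256 (hypothetical helper)."""
    if iterations < 1:
        raise ValueError("iteration count must be at least 1")
    # U1 = HMAC(password, salt || INT(1)); each later Ui = HMAC(password, U(i-1)).
    u = hmac.new(password, salt + b"\x00\x00\x00\x01", hashlib.sha256).digest()
    folded = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        folded ^= int.from_bytes(u, "big")  # Hi is the XOR of all Ui
    return folded.to_bytes(hashlib.sha256().digest_size, "big")
```

With a single 32-byte output block this is the same computation as `hashlib.pbkdf2_hmac("sha256", password, salt, iterations)`, so the standard-library call can stand in for the explicit loop.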
* Revise pg_pwrite_zeros()Michael Paquier2023-03-061-1/+1
  The following changes are made to pg_pwrite_zeros(), the API able to write a series of zeros using vectored I/O:
  - Addition of an "offset" parameter, to write the size from this position (the 'p' of "pwrite" seems to mean position, though POSIX does not outline that directly), hence the name of the routine is incorrect if it is not able to handle offsets.
  - Avoid memset() of "zbuffer" on every call.
  - Avoid initialization of the whole IOV array if not needed.
  - Group the trailing write() call with the main write() call, simplifying the function logic.
  Author: Andres Freund
  Reviewed-by: Michael Paquier, Bharath Rupireddy
  Discussion: https://postgr.es/m/20230215005525.mrrlmqrxzjzhaipl@awork3.anarazel.de
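The idea of writing zeros at an explicit offset while reusing one zero buffer can be sketched in Python as follows. This is an illustration under stated assumptions, not the C routine: the name `pwrite_zeros`, the 8192-byte block size, and the use of plain `os.pwrite` (POSIX only) instead of vectored I/O are all choices made for this sketch.

```python
import os

ZERO_BLOCK = bytes(8192)  # built once at import time, reused by every call


def pwrite_zeros(fd: int, size: int, offset: int) -> int:
    """Write `size` zero bytes to `fd` starting at `offset`; return bytes written."""
    view = memoryview(ZERO_BLOCK)
    written = 0
    while written < size:
        chunk = min(len(ZERO_BLOCK), size - written)
        # os.pwrite may write fewer bytes than requested, so keep looping.
        n = os.pwrite(fd, view[:chunk], offset + written)
        if n <= 0:
            raise OSError("pwrite made no progress")
        written += n
    return written
```

Because the position is passed explicitly, the file descriptor's own seek position is never consulted or moved, which is what the "p" variants of the write calls provide.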
* Introduce a generic pg_dump compression APITomas Vondra2023-02-231-0/+4
| | | | | | | | | | | | | | | | | Switch pg_dump to use the Compression API, implemented by bf9aa490db. The CompressFileHandle replaces the cfp* family of functions with a struct of callbacks for accessing (compressed) files. This allows adding new compression methods simply by introducing a new struct instance with appropriate implementation of the callbacks. Archives compressed using custom compression methods store an identifier of the compression algorithm in their header instead of the compression level. The header version is bumped. Author: Georgios Kokolatos Reviewed-by: Michael Paquier, Rachel Heaton, Justin Pryzby, Tomas Vondra Discussion: https://postgr.es/m/faUNEOpts9vunEaLnmxmG-DldLSg_ql137OC3JYDmgrOMHm1RvvWY2IdBkv_CRxm5spCCb_OmKNk2T03TMm0fBEWveFF9wA1WizPuAgB7Ss%3D%40protonmail.com
* Revert refactoring of restore command code to shell_restore.cMichael Paquier2023-02-061-0/+21
  This reverts commits 24c35ec and 57169ad. PreRestoreCommand() and PostRestoreCommand() need to be put closer to the system() call calling a restore_command, as they enable in_restore_command for the startup process, which would in turn trigger an immediate proc_exit() in the SIGTERM handler. Perhaps we could get rid of this behavior entirely, but 24c35ec has made the window where the flag is enabled much larger than it was, and any Postgres-like actions (palloc, etc.) taken by code paths while the flag is enabled could lead to more severe issues in the shutdown processing.
  Note that curculio has shown that there are many more problems in this area, unrelated to this change, so those issues had better be addressed first. Keeping the code of HEAD in line with the stable branches should make that a bit easier.
  Per discussion with Andres Freund and Nathan Bossart.
  Discussion: https://postgr.es/m/Y979NR3U5VnWrTwB@paquier.xyz
* Refactor code for restoring files via shell commandsMichael Paquier2023-01-181-21/+0
  Presently, restore_command uses a different code path than archive_cleanup_command and recovery_end_command. These code paths are similar and can be easily combined, as long as it is possible to identify if a command should:
  - Issue a FATAL on signal.
  - Exit immediately on SIGTERM.
  While on it, this removes src/common/archive.c and its associated header. Since the introduction of c96de2c, BuildRestoreCommand() has become a simple wrapper of replace_percent_placeholders() able to call make_native_path(). This simplifies shell_restore.c as long as RestoreArchivedFile() includes a call to make_native_path().
  Author: Nathan Bossart
  Reviewed-by: Andres Freund, Michael Paquier
  Discussion: https://postgr.es/m/20221227192449.GA3672473@nathanxps13
* Common function for percent placeholder replacementPeter Eisentraut2023-01-111-0/+18
  There are a number of places where a shell command is constructed with percent-placeholders (like %x). It's cumbersome to have to open-code this several times. This factors out this logic into a separate function. This also allows us to ensure consistency for and document some subtle behaviors, such as what to do with unrecognized placeholders.
  The unified handling is now that incorrect and unknown placeholders are an error, where previously in most cases they were skipped or ignored. This affects the following settings:
  - archive_cleanup_command
  - archive_command
  - recovery_end_command
  - restore_command
  - ssl_passphrase_command
  The following settings are part of this refactoring but already had stricter error handling and should be unchanged in their behavior:
  - basebackup_to_shell.command
  Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
  Discussion: https://www.postgresql.org/message-id/flat/5238bbed-0b01-83a6-d4b2-7eb0562a054e%40enterprisedb.com
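The strict behavior described above (unknown or incomplete placeholders are errors rather than being skipped) can be sketched in Python. The function name, its signature, and the error texts are hypothetical and only illustrate the rule; treating "%%" as a literal percent sign is an assumption of this sketch.

```python
from typing import Mapping


def expand_placeholders(template: str, values: Mapping[str, str]) -> str:
    """Replace %<letter> using `values`; reject anything unrecognized."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(template):
            raise ValueError('string ends unexpectedly after "%"')
        letter = template[i + 1]
        if letter == "%":
            out.append("%")  # assumed: "%%" stands for a literal percent sign
        elif letter in values:
            out.append(values[letter])
        else:
            raise ValueError(f'unrecognized placeholder "%{letter}"')
        i += 2
    return "".join(out)
```

For example, `expand_placeholders('cp /archive/%f "%p"', {"f": "seg", "p": "pg_wal/seg"})` yields `cp /archive/seg "pg_wal/seg"`, while a template containing `%x` raises instead of passing the text through unchanged.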
* Invent random_normal() to provide normally-distributed random numbers.Tom Lane2023-01-091-0/+1
  There is already a version of this in contrib/tablefunc, but it seems sufficiently widely useful to justify having it in core.
  Paul Ramsey
  Discussion: https://postgr.es/m/CACowWR0DqHAvOKUCNxTrASFkWsDLqKMd6WiXvVvaWg4pV1BMnQ@mail.gmail.com
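The commit message does not say how the normally-distributed values are produced. As an illustration only, one standard technique is the Box-Muller transform, sketched here in Python with a hypothetical signature; it is an assumption of this sketch, not a statement about the committed implementation.

```python
import math
import random


def random_normal(rng: random.Random, mean: float = 0.0, stddev: float = 1.0) -> float:
    """Draw one normally-distributed value via the Box-Muller transform."""
    # 1 - random() lies in (0, 1], which keeps log() away from zero.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + stddev * z
```

Passing the generator explicitly keeps the sketch reproducible: seeding `random.Random` with a fixed value gives the same sequence on every run.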
* Update copyright for 2023Bruce Momjian2023-01-0236-36/+36
  Backpatch-through: 11
* Remove hardcoded dependency to cryptohash type in the internals of SCRAMMichael Paquier2022-12-201-10/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SCRAM_KEY_LEN was a variable used in the internal routines of SCRAM to size a set of fixed-sized arrays used in the SHA and HMAC computations during the SASL exchange or when building a SCRAM password. This had a hard dependency on SHA-256, reducing the flexibility of SCRAM when it comes to the addition of more hash methods. A second issue was that SHA-256 is assumed as the cryptohash method to use all the time. This commit renames SCRAM_KEY_LEN to a more generic SCRAM_KEY_MAX_LEN, which is used as the size of the buffers used by the internal routines of SCRAM. This is aimed at tracking centrally the maximum size necessary for all the hash methods supported by SCRAM. A global variable has the advantage of keeping the code in its simplest form, reducing the need of more alloc/free logic for all the buffers used in the hash calculations. A second change is that the key length (SHA digest length) and hash types are now tracked by the state data in the backend and the frontend, the common portions being extended to handle these as arguments by the internal routines of SCRAM. There are a few RFC proposals floating around to extend the SCRAM protocol, including some to use stronger cryptohash algorithms, so this lifts some of the existing restrictions in the code. The code in charge of parsing and building SCRAM secrets is extended to rely on the key length and on the cryptohash type used for the exchange, assuming currently that only SHA-256 is supported for the moment. Note that the mock authentication simply enforces SHA-256. Author: Michael Paquier Reviewed-by: Peter Eisentraut, Jonathan Katz Discussion: https://postgr.es/m/Y5k3Qiweo/1g9CG6@paquier.xyz
* Static assertions cleanupPeter Eisentraut2022-12-151-1/+1
| | | | | | | | | | | | | | | | | | | | | Because we added StaticAssertStmt() first before StaticAssertDecl(), some uses as well as the instructions in c.h are now a bit backwards from the "native" way static assertions are meant to be used in C. This updates the guidance and moves some static assertions to better places. Specifically, since the addition of StaticAssertDecl(), we can put static assertions at the file level. This moves a number of static assertions out of function bodies, where they might have been stuck out of necessity, to perhaps better places at the file level or in header files. Also, when the static assertion appears in a position where a declaration is allowed, then using StaticAssertDecl() is more native than StaticAssertStmt(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com
* Remove SHA256_HMAC_B from scram-common.hMichael Paquier2022-12-141-3/+0
  This referred to the size of the buffers for k_ipad and k_opad in HMAC computations. This is unused since e6bdfd9, where SCRAM has switched to the cryptohash routines for its HMAC calculations rather than its own maths.
  Reviewed-by: Jacob Champion
  Discussion: https://postgr.es/m/Y5gGMjXhyp0oK0mH@paquier.xyz
* Convert json_in and jsonb_in to report errors softly.Tom Lane2022-12-111-0/+1
| | | | | | | | | | | | | | | | This requires a bit of further infrastructure-extension to allow trapping errors reported by numeric_in and pg_unicode_to_server, but otherwise it's pretty straightforward. In the case of jsonb_in, we are only capturing errors reported during the initial "parse" phase. The value-construction phase (JsonbValueToJsonb) can also throw errors if assorted implementation limits are exceeded. We should improve that, but it seems like a separable project. Andrew Dunstan and Tom Lane Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net
* Change JsonSemAction to allow non-throw error reporting.Tom Lane2022-12-111-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly, semantic action functions for the JSON parser returned void, so that there was no way for them to affect the parser's behavior. That means in particular that they can't force an error exit except by longjmp'ing. That won't do in the context of our project to make input functions return errors softly. Hence, change them to return the same JsonParseErrorType enum value as the parser itself uses. If an action function returns anything besides JSON_SUCCESS, the parse is abandoned and that error code is returned. Action functions can thus easily return the same error conditions that the parser already knows about. As an escape hatch for expansion, also invent a code JSON_SEM_ACTION_FAILED that the core parser does not know the exact meaning of. When returning this code, an action function must use some out-of-band mechanism for reporting the error details. This commit simply makes the API change and causes all the existing action functions to return JSON_SUCCESS, so that there is no actual change in behavior here. This is long enough and boring enough that it seemed best to commit it separately from the changes that make real use of the new mechanism. In passing, remove a duplicate assignment of transform_string_values_scalar. Discussion: https://postgr.es/m/1436686.1670701118@sss.pgh.pa.us
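The control flow described here, where an action callback returns a status code and any non-success code abandons the parse, can be sketched in Python. The enum members mirror the JSON_SUCCESS and JSON_SEM_ACTION_FAILED codes named above; everything else (the toy token walker, the callback, the `errors` list used as the out-of-band channel) is hypothetical.

```python
from enum import Enum, auto
from typing import Callable, Iterable, List


class ParseResult(Enum):
    SUCCESS = auto()
    INVALID_TOKEN = auto()
    SEM_ACTION_FAILED = auto()  # details are reported out of band


def parse_scalars(tokens: Iterable[object],
                  on_scalar: Callable[[object], ParseResult]) -> ParseResult:
    """Toy stand-in for a parser that calls a semantic action per scalar."""
    for token in tokens:
        if not isinstance(token, (str, int, float)):
            return ParseResult.INVALID_TOKEN  # error the parser itself detects
        result = on_scalar(token)
        if result is not ParseResult.SUCCESS:
            return result  # abandon the parse and pass the code up
    return ParseResult.SUCCESS


errors: List[str] = []  # out-of-band channel for action-specific details


def reject_long_strings(token: object) -> ParseResult:
    if isinstance(token, str) and len(token) > 8:
        errors.append(f"string too long: {token!r}")
        return ParseResult.SEM_ACTION_FAILED
    return ParseResult.SUCCESS
```

No exception crosses the parser: the caller inspects the returned code and, for the generic failure code, reads the details from wherever the action stored them.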
* Refactor code parsing compression option values (-Z/--compress)Michael Paquier2022-11-301-0/+2
  This commit moves the code in charge of deparsing the method and detail strings fed later to parse_compress_specification() to a common routine, where the backward-compatible case of only an integer being found (N = 0 => "none", N > 0 => gzip at level N) is handled.
  Note that this has a side-effect for pg_basebackup, as we now attempt to detect "server-" and "client-" before checking for the integer-only pre-14 grammar, where values like server-N and client-N (without the follow-up detail string) are now valid rather than failing because of an unsupported method name. Past grammars are still handled the same way, but these flavors are now authorized, and would now switch to consider N = 0 as no compression and N > 0 as gzip with the compression level used as N, with the caller still controlling if the compression method should be done server-side, client-side or is unspecified. The documentation of pg_basebackup is updated to reflect that.
  This benefits other code paths that would like to rely on the same logic as pg_basebackup and pg_receivewal with option values used for compression specifications, one area discussed lately being pg_dump.
  Author: Georgios Kokolatos, Michael Paquier
  Discussion: https://postgr.es/m/O4mutIrCES8ZhlXJiMvzsivT7ztAMja2lkdL1LJx6O5f22I2W8PBIeLKz7mDLwxHoibcnRAYJXm1pH4tyUNC4a8eDzLn22a6Pb1S74Niexg=@pm.me
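The backward-compatible split of an option value into a method and a detail string can be sketched in Python. The function name and return shape are hypothetical, and the sketch assumes that any nonzero integer selects gzip with that integer as the detail; the "server-" and "client-" prefixes handled by pg_basebackup are deliberately left out.

```python
from typing import Optional, Tuple


def split_compress_option(option: str) -> Tuple[str, Optional[str]]:
    """Split a -Z/--compress value into (method, detail)."""
    if option.isascii() and option.isdigit():
        # Backward-compatible integer-only form.
        if int(option) == 0:
            return ("none", None)
        return ("gzip", option)
    # Otherwise the value is METHOD or METHOD:DETAIL.
    method, sep, detail = option.partition(":")
    return (method, detail if sep else None)
```

A value such as `zstd:level=3` comes back as `("zstd", "level=3")`, leaving the interpretation of the detail string to a later parsing step.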
* Introduce pg_pwrite_zeros() in fileutils.cMichael Paquier2022-11-081-0/+2
  This routine is designed to write zeros to a file using vectored I/O, for a size given by its caller, being useful when it comes to initializing a file with a final size already known. XLogFileInitInternal() in xlog.c is changed to use this new routine when initializing WAL segments with zeros (wal_init_zero enabled). Note that the aligned buffers used for the vectored I/O writes have a size of BLCKSZ, and not XLOG_BLCKSZ anymore, as pg_pwrite_zeros() relies on PGAlignedBlock while xlog.c originally used PGAlignedXLogBlock.
  This routine will be used in a follow-up patch to do the pre-padding of WAL segments for pg_receivewal and pg_basebackup when these are not compressed.
  Author: Bharath Rupireddy
  Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael Paquier
  Discussion: https://www.postgresql.org/message-id/CALj2ACUq7nAb7%3DbJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA%40mail.gmail.com
* Move pg_pwritev_with_retry() to src/common/file_utils.cMichael Paquier2022-10-271-0/+7
  This commit moves pg_pwritev_with_retry(), a convenience wrapper of pg_pwritev() able to handle partial writes, to common/file_utils.c so that the frontend code is able to use it. A first use-case targeted for this routine is pg_basebackup and pg_receivewal, for the zero-padding of a newly-initialized WAL segment. This is used currently in the backend when the GUC wal_init_zero is enabled (default).
  Author: Bharath Rupireddy
  Reviewed-by: Nathan Bossart, Thomas Munro
  Discussion: https://postgr.es/m/CALj2ACUq7nAb7=bJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA@mail.gmail.com
* Mark sigint_interrupt_enabled as sig_atomic_tMichael Paquier2022-09-291-1/+3
  This is a continuation of 78fdb1e, where this flag is set in the psql callback handler used for SIGINT. This was previously a boolean but the C standard recommends the use of sig_atomic_t. Note that this influences PromptInterruptContext in string.h, where the same flag is tracked.
  Author: Hayato Kuroda
  Discussion: https://postgr.es/m/TYAPR01MB58669A9EC96AA3078C2CD938F5549@TYAPR01MB5866.jpnprd01.prod.outlook.com
* Revert 56-bit relfilenode change and follow-up commits.Robert Haas2022-09-281-5/+2
  There are still some alignment-related failures in the buildfarm, which might or might not be able to be fixed quickly, but I've also just realized that it increased the size of many WAL records by 4 bytes because a block reference contains a RelFileLocator. The effect of that hasn't been studied or discussed, so revert for now.
* Increase width of RelFileNumbers from 32 bits to 56 bits.Robert Haas2022-09-271-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | RelFileNumbers are now assigned using a separate counter, instead of being assigned from the OID counter. This counter never wraps around: if all 2^56 possible RelFileNumbers are used, an internal error occurs. As the cluster is limited to 2^64 total bytes of WAL, this limitation should not cause a problem in practice. If the counter were 64 bits wide rather than 56 bits wide, we would need to increase the width of the BufferTag, which might adversely impact buffer lookup performance. Also, this lets us use bigint for pg_class.relfilenode and other places where these values are exposed at the SQL level without worrying about overflow. This should remove the need to keep "tombstone" files around until the next checkpoint when relations are removed. We do that to keep RelFileNumbers from being recycled, but now that won't happen anyway. However, this patch doesn't actually change anything in this area; it just makes it possible for a future patch to do so. Dilip Kumar, based on an idea from Andres Freund, who also reviewed some earlier versions of the patch. Further review and some wordsmithing by me. Also reviewed at various points by Ashutosh Sharma, Vignesh C, Amul Sul, Álvaro Herrera, and Tom Lane. Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
* Move RelFileNumber declarations to common/relpath.h.Robert Haas2022-09-271-0/+7
  Previously, these were declared in postgres_ext.h, but they are not needed nearly so widely as the OID declarations, so that doesn't necessarily make sense. Also, because postgres_ext.h is included before most of c.h has been processed, the previous location creates some problems for a pending patch.
  Patch by me, reviewed by Dilip Kumar.
  Discussion: http://postgr.es/m/CA+TgmoYc8oevMqRokZQ4y_6aRn-7XQny1JBr5DyWR_jiFtONHw@mail.gmail.com
* Harmonize more parameter names in bulk.Peter Geoghegan2022-09-203-4/+4
| | | | | | | | | | | | | | | | Make sure that function declarations use names that exactly match the corresponding names from function definitions in optimizer, parser, utility, libpq, and "commands" code, as well as in remaining library code. Do the same for all code related to frontend programs (with the exception of pg_dump/pg_dumpall related code). Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will handle ecpg and pg_dump/pg_dumpall. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
* Update Unicode data to Unicode 15.0.0Peter Eisentraut2022-09-195-2268/+2444
* Simplify handling of compression level with compression specificationsMichael Paquier2022-09-141-2/+1
  PG_COMPRESSION_OPTION_LEVEL is removed from the compression specification logic, and instead the compression level is always assigned with each library's default if nothing is directly given. This centralizes the checks on the compression methods supported by a given build, and always assigns a default compression level when parsing a compression specification. As a result, a build that does not support a compression method now complains at an earlier stage than previously: when parsing a specification in the backend or the frontend, and not when processing it. zstd, lz4 and zlib are able to handle a default value in their respective routines setting up the compression level, hence the backend or frontend code (pg_receivewal or pg_basebackup) no longer needs to know what the default compression level should be if nothing is specified: the specification parsing assigns it. It can also be enforced by passing down a "level" set to the default value, which the backend will accept (the replication protocol is for example able to handle a command like BASE_BACKUP (COMPRESSION_DETAIL 'gzip:level=-1')).
  This code simplification fixes an issue with pg_basebackup --gzip introduced by ffd5365, where the tarball of the streamed WAL segments would be created as pg_wal.tar.gz with uncompressed contents, while the intention is to compress the segments with gzip at a default level. The confusion comes from the handling of the default compression level of gzip (-1 or Z_DEFAULT_COMPRESSION): the value of 0 was getting assigned, which is what walmethods.c would consider as equivalent to no compression when streaming WAL segments with its tar methods. Always assigning the compression level removes the confusion of some code paths considering a value of 0 set in a specification as either no compression or a default compression level.
  Note that 010_pg_basebackup.pl has to be adjusted to skip a few tests where the shape of the compression detail string for client and server-side compression was checked using gzip. This is a result of the code simplification, as gzip specifications cannot be used if a build does not support it.
  Reported-by: Tom Lane
  Reviewed-by: Tom Lane
  Discussion: https://postgr.es/m/1400032.1662217889@sss.pgh.pa.us
  Backpatch-through: 15
* pg_clean_ascii(): escape bytes rather than lose themPeter Eisentraut2022-09-131-1/+1
  Rather than replace each unprintable byte with a '?' character, replace it with a hex escape instead. The API now allocates a copy rather than modifying the input in place.
  Author: Jacob Champion <jchampion@timescale.com>
  Discussion: https://www.postgresql.org/message-id/CAAWbhmgsvHrH9wLU2kYc3pOi1KSenHSLAHBbCVmmddW6-mc_=w@mail.gmail.com
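The escaping behavior can be illustrated with a short Python sketch. The function name is hypothetical, and two details are assumptions of this sketch rather than facts from the commit message: that "printable" means bytes 32 through 126, and that the escape is written as a backslash, "x", and two lowercase hex digits.

```python
def clean_ascii(data: bytes) -> str:
    """Return a new string in which non-printable bytes are hex-escaped."""
    parts = []
    for byte in data:
        if 32 <= byte <= 126:
            parts.append(chr(byte))
        else:
            parts.append(f"\\x{byte:02x}")  # keep the byte's value visible
    return "".join(parts)
```

Unlike replacement with '?', the result can grow beyond the input length, which fits the switch from in-place modification to returning a freshly allocated copy.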
* Treat Unicode codepoints of category "Format" as non-spacingJohn Naylor2022-09-131-12/+21
  Commit d8594d123 updated the list of non-spacing codepoints used for calculating display width, but in doing so inadvertently removed some, since the script used for that commit only considered combining characters. For complete coverage for zero-width characters, include codepoints in the category Cf (Format). To reflect the wider purpose, also rename files and update comments that referred specifically to combining characters.
  Some of these ranges have been missing since v12, but due to lack of field complaints it was determined not important enough to justify adding special-case logic to the back branches.
  Kyotaro Horiguchi
  Report by Pavel Stehule
  Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRBE8yvpQ0FSkPCoe0Ny1jAAsAQ6j3qMgVwWvkqAoaaNmQ%40mail.gmail.com
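The effect of counting category Cf as zero-width can be sketched with Python's `unicodedata` module. This is a simplified illustration, not the generated lookup table PostgreSQL uses: the function name is hypothetical, the set of combining categories (Mn, Me) is an assumption, and wide East Asian characters are ignored.

```python
import unicodedata

# Combining marks plus the "Format" category discussed above.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def display_width(text: str) -> int:
    """Count one column per codepoint, except zero-width categories."""
    return sum(
        0 if unicodedata.category(ch) in ZERO_WIDTH_CATEGORIES else 1
        for ch in text
    )
```

With this rule, a string containing U+200B ZERO WIDTH SPACE (category Cf) between two letters occupies two columns rather than three.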
* Expand palloc/pg_malloc API for more type safetyPeter Eisentraut2022-09-121-0/+28
| | | | | | | | | | | | | | | | | This adds additional variants of palloc, pg_malloc, etc. that encapsulate common usage patterns and provide more type safety. Specifically, this adds palloc_object(), palloc_array(), and repalloc_array(), which take the type name of the object to be allocated as its first argument and cast the return as a pointer to that type. There are also palloc0_object() and palloc0_array() variants for initializing with zero, and pg_malloc_*() variants of all of the above. Inspired by the talloc library. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/bb755632-2a43-d523-36f8-a1e7a389a907@enterprisedb.com
* Remove replacement code for getaddrinfo.Thomas Munro2022-08-141-1/+3
  SUSv3, all targeted Unixes and modern Windows have getaddrinfo() and related interfaces. Drop the replacement implementation, and adjust some headers slightly to make sure that the APIs are visible everywhere using standard POSIX headers and names.
  Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
  Discussion: https://postgr.es/m/CA%2BhUKG%2BL_3brvh%3D8e0BW_VfX9h7MtwgN%3DnFHP5o7X2oZucY9dg%40mail.gmail.com
* Fix mismatched file identificationsJohn Naylor2022-08-091-1/+1
  Masahiko Sawada
  Discussion: https://www.postgresql.org/message-id/CAD21AoASq93KPiNxipPaTCzEdsnxT9665UesOnWcKhmX9Qfx6A@mail.gmail.com
* Change internal RelFileNode references to RelFileNumber or RelFileLocator.Robert Haas2022-07-061-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. 
Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
* Remove PGDLLIMPORT marker from __pg_log_levelMichael Paquier2022-05-131-1/+1
  Per discussion with Tom Lane and Andres Freund. I had misunderstood the intention behind the choice made in 9a374b7.
  Discussion: https://postgr.es/m/20220512153737.6kbbcf4qyvwgq4s2@alap3.anarazel.de
* Add some missing PGDLLIMPORT markingsMichael Paquier2022-05-121-1/+1
  Three variables in pqsignal.h (UnBlockSig, BlockSig and StartupBlockSig) were not marked with PGDLLIMPORT, as they are declared in a way that prevents mark_pgdllimport.pl from detecting them. These variables are redefined in a style more consistent with the other headers, allowing the script to find and mark them.
  PGDLLIMPORT was missing for __pg_log_level in logging.h, so add it back. The marking got accidentally removed in 9a374b77, just after its addition in 8ec5694.
  While on it, add a comment in mark_pgdllimport.pl explaining which arguments the script needs (a list of header paths).
  Reported-by: Andres Freund
  Discussion: https://postgr.es/m/20220506234924.6mxxotl3xl63db3l@alap3.anarazel.de
* Remove not-very-useful early checks of __pg_log_level in logging.h.Tom Lane2022-04-121-38/+19
| | | | | | | | | | | | | | | | | Enforce __pg_log_level message filtering centrally in logging.c, instead of relying on the calling macros to do it. This is more reliable (e.g. it works correctly for direct calls to pg_log_generic) and it saves a percent or so of total code size because we get rid of so many duplicate checks of __pg_log_level. This does mean that argument expressions in a logging macro will be evaluated even if we end up not printing anything. That seems of little concern for INFO and higher levels as those messages are printed by default, and most of our frontend programs don't even offer a way to turn them off. I left the unlikely() checks in place for DEBUG messages, though. Discussion: https://postgr.es/m/3993549.1649449609@sss.pgh.pa.us
* Rename backup_compression.{c,h} to compression.{c,h}Michael Paquier2022-04-122-46/+46
  Compression option handling (level, algorithm or even workers) can be used across several parts of the system and not only base backups. Structures, objects and routines are renamed accordingly, to remove the concept of base backups from this part of the code, which makes this change straightforward.
  pg_receivewal, which has gained support for LZ4 since babbbb5, will make use of this infrastructure for its set of compression options, bringing more consistency with pg_basebackup. This cleanup needs to be done before releasing a beta of 15. pg_dump is a potential future target as well, and adding more compression options to it may happen in 16~.
  Author: Michael Paquier
  Reviewed-by: Robert Haas, Georgios Kokolatos
  Discussion: https://postgr.es/m/YlPQGNAAa04raObK@paquier.xyz
* Improve frontend error logging style.Tom Lane2022-04-081-18/+97
| | | | | | | | | | | | | | | | | | | | | | | | Get rid of the separate "FATAL" log level, as it was applied so inconsistently as to be meaningless. This mostly involves s/pg_log_fatal/pg_log_error/g. Create a macro pg_fatal() to handle the common use-case of pg_log_error() immediately followed by exit(1). Various modules had already invented either this or equivalent macros; standardize on pg_fatal() and apply it where possible. Invent the ability to add "detail" and "hint" messages to a frontend message, much as we have long had in the backend. Except where rewording was needed to convert existing coding to detail/hint style, I have (mostly) resisted the temptation to change existing message wording. Patch by me. Design and patch reviewed at various stages by Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and Daniel Gustafsson. Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
* Apply PGDLLIMPORT markings broadly.Robert Haas2022-04-085-8/+8
  Up until now, we've had a policy of only marking certain variables in the PostgreSQL header files with PGDLLIMPORT, but now we've decided to mark them all. This means that extensions running on Windows should no longer operate at a disadvantage as compared to extensions running on Linux: if the variable is present in a header file, it should be accessible.
  Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com
* Allow parallel zstd compression when taking a base backup.Robert Haas2022-03-301-0/+2
| | | | | | | | | | | | | | | | | | | libzstd allows transparent parallel compression just by setting an option when creating the compression context, so permit that for both client and server-side backup compression. To use this, use something like pg_basebackup --compress WHERE-zstd:workers=N where WHERE is "client" or "server" and N is an integer. When compression is performed on the server side, this will spawn threads inside the PostgreSQL backend. While there is almost no PostgreSQL server code which is thread-safe, the threads here are used internally by libzstd and touch only data structures controlled by libzstd. Patch by me, based in part on earlier work by Dipesh Pandit and Jeevan Ladhe. Reviewed by Justin Pryzby. Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com
* Replace BASE_BACKUP COMPRESSION_LEVEL option with COMPRESSION_DETAIL.Robert Haas2022-03-231-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are more compression parameters that can be specified than just an integer compression level, so rename the new COMPRESSION_LEVEL option to COMPRESSION_DETAIL before it gets released. Introduce a flexible syntax for that option to allow arbitrary options to be specified without needing to adjust the main replication grammar, and common code to parse it that is shared between the client and the server. This commit doesn't actually add any new compression parameters, so the only user-visible change is that you can now type something like pg_basebackup --compress gzip:level=5 instead of writing just pg_basebackup --compress gzip:5. However, it should make it easy to add new options. If for example gzip starts offering fries, we can support pg_basebackup --compress gzip:level=5,fries=true for the benefit of users who want fries with that. Along the way, this fixes a few things in pg_basebackup so that the pg_basebackup can be used with a server-side compression algorithm that pg_basebackup itself does not understand. For example, pg_basebackup --compress server-lz4 could still succeed even if only the server and not the client has LZ4 support, provided that the other options to pg_basebackup don't require the client to decompress the archive. Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker. Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com
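A minimal sketch of how a specification such as gzip:level=5 or gzip:5 might be split into an algorithm name and keyword=value items. The type and function names (demo_compress_spec, demo_parse_int, demo_parse_spec) are hypothetical and this is not the shared parsing code the commit adds; the workers keyword is the one introduced by the parallel zstd commit listed above.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_compress_spec
{
	char		algorithm[16];	/* text before the first ':' */
	int			level;			/* -1 when not specified */
	int			workers;		/* -1 when not specified */
	bool		ok;				/* false if the specification was rejected */
} demo_compress_spec;

/* Parse a non-negative decimal integer from the first len bytes of s. */
static bool
demo_parse_int(const char *s, size_t len, int *out)
{
	char		buf[12];
	char	   *end;
	long		value;

	if (len == 0 || len >= sizeof(buf))
		return false;
	memcpy(buf, s, len);
	buf[len] = '\0';
	value = strtol(buf, &end, 10);
	if (*end != '\0' || value < 0 || value > 1000000)
		return false;
	*out = (int) value;
	return true;
}

static demo_compress_spec
demo_parse_spec(const char *spec_string)
{
	demo_compress_spec spec = {"", -1, -1, false};
	const char *colon = strchr(spec_string, ':');
	size_t		alg_len = colon ? (size_t) (colon - spec_string) : strlen(spec_string);
	const char *p;

	if (alg_len == 0 || alg_len >= sizeof(spec.algorithm))
		return spec;
	memcpy(spec.algorithm, spec_string, alg_len);
	spec.algorithm[alg_len] = '\0';

	if (colon == NULL)
	{
		spec.ok = true;			/* algorithm only, defaults for the rest */
		return spec;
	}
	p = colon + 1;

	/* Short form: the detail is just an integer level, e.g. "gzip:5". */
	if (demo_parse_int(p, strlen(p), &spec.level))
	{
		spec.ok = true;
		return spec;
	}

	/* Otherwise expect comma-separated keyword=value items. */
	while (*p != '\0')
	{
		size_t		item_len = strcspn(p, ",");
		const char *eq = memchr(p, '=', item_len);
		size_t		key_len;
		bool		parsed = false;

		if (eq == NULL)
			return spec;		/* ok is still false */
		key_len = (size_t) (eq - p);
		if (key_len == 5 && memcmp(p, "level", 5) == 0)
			parsed = demo_parse_int(eq + 1, item_len - key_len - 1, &spec.level);
		else if (key_len == 7 && memcmp(p, "workers", 7) == 0)
			parsed = demo_parse_int(eq + 1, item_len - key_len - 1, &spec.workers);
		if (!parsed)
			return spec;		/* unknown keyword or bad value */
		p += item_len;
		if (*p == ',')
			p++;
	}
	spec.ok = true;
	return spec;
}
```

Because unknown keywords are rejected in one place, supporting a new option in this sketch only means adding another branch to the keyword chain, which is the kind of extensibility the commit message is after.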
* Remove IS_AF_UNIX macroPeter Eisentraut2022-02-151-6/+0
| | | | | | | | | | | | The AF_UNIX macro was being used unprotected by HAVE_UNIX_SOCKETS, apparently since 2008. So the redirection through IS_AF_UNIX() is apparently no longer necessary. (More generally, all supported platforms are now HAVE_UNIX_SOCKETS, but even if there were a new platform in the future, it seems plausible that it would define the AF_UNIX symbol even without kernel support.) So remove the IS_AF_UNIX() macro and make the code a bit more consistent. Discussion: https://www.postgresql.org/message-id/flat/f2d26815-9832-e333-d52d-72fbc0ade896%40enterprisedb.com
* Improve error handling of HMAC computationsMichael Paquier2022-01-132-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is similar to b69aba7, except that this completes the work for HMAC with a new routine called pg_hmac_error() that would provide more context about the type of error that happened during a HMAC computation: - The fallback HMAC implementation in hmac.c relies on cryptohashes, so in some code paths it is necessary to return the error generated by cryptohashes. - For the OpenSSL implementation (hmac_openssl.c), the logic is very similar to cryptohash_openssl.c, where the error context comes from OpenSSL if one of its internal routines failed, with different error codes if something internal to hmac_openssl.c failed or was incorrect. Any in-core code paths that use the centralized HMAC interface are related to SCRAM, for errors that are unlikely to happen, with only SHA-256. It would be possible to see errors when computing some HMACs with MD5 for example and OpenSSL FIPS enabled, and this commit would help in reporting the correct errors but nothing in core uses that. So, in the end, no backpatch to v14 is done, at least for now. Errors in SCRAM related to the computation of the server key, stored key, etc. need to pass down the potential error context string across more layers of their respective call stacks for the frontend and the backend, so each surrounding routine is adapted for this purpose. Reviewed-by: Sergey Shinderuk Discussion: https://postgr.es/m/Yd0N9tSAIIkFd+qi@paquier.xyz
* Improve error handling of cryptohash computationsMichael Paquier2022-01-112-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing cryptohash facility was causing problems in some code paths related to MD5 (frontend and backend) that relied on the fact that the only type of error that could happen would be an OOM, as the MD5 implementation used in PostgreSQL ~13 (the in-core implementation is used when compiling with or without OpenSSL in those older versions), could fail only under this circumstance. The new cryptohash facilities can fail for reasons other than OOMs, like attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to 1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this would cause incorrect reports to show up. This commit extends the cryptohash APIs so that callers of those routines can fetch more context when an error happens, by using a new routine called pg_cryptohash_error(). The error states are stored within each implementation's internal context data, so that it is possible to extend the logic depending on what's suited for an implementation. The default implementation requires few error states, but OpenSSL could report various issues depending on its internal state so more is needed in cryptohash_openssl.c, and the code is shaped so that we are always able to grab the necessary information. The core code is changed to adapt to the new error routine, painting more "const" across the call stack where the static errors are stored, particularly in authentication code paths on variables that provide log details. This way, any future changes would warn if attempting to free these strings. The MD5 authentication code was also a bit blurry about the handling of "logdetail" (LOG sent to the postmaster), so improve the comments related to that, while at it. The origin of the problem is 87ae969, which introduced the centralized cryptohash facility.
Extra changes are done for pgcrypto in v14 for the non-OpenSSL code path to cope with the improvements done by this commit. Reported-by: Michael Mühlbeyer Author: Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com Backpatch-through: 14
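The idea of keeping an error state inside the hashing context and exposing it through a separate error routine can be sketched as follows. All names here (demo_hash_ctx, demo_hash_final, demo_hash_error) are hypothetical, the digest computation is a placeholder, and the actual pg_cryptohash_error() implementation and its error codes may look different. Treating a NULL context as an out-of-memory report is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_DIGEST_LEN 16

typedef enum demo_hash_errno
{
	DEMO_HASH_ERROR_NONE = 0,
	DEMO_HASH_ERROR_DEST_LEN,	/* caller's output buffer is too small */
	DEMO_HASH_ERROR_LIBRARY		/* failure reported by an underlying library */
} demo_hash_errno;

typedef struct demo_hash_ctx
{
	demo_hash_errno error;		/* last error seen by this context */
} demo_hash_ctx;

/* Finish a computation; on failure, record why in the context and return -1. */
static int
demo_hash_final(demo_hash_ctx *ctx, unsigned char *dest, size_t len)
{
	if (len < DEMO_DIGEST_LEN)
	{
		ctx->error = DEMO_HASH_ERROR_DEST_LEN;
		return -1;
	}
	memset(dest, 0, DEMO_DIGEST_LEN);	/* placeholder, not a real digest */
	return 0;
}

/*
 * Map the stored error state to a static string.  Returning const char *
 * signals that callers must not free the result.
 */
static const char *
demo_hash_error(const demo_hash_ctx *ctx)
{
	if (ctx == NULL)
		return "out of memory";	/* context allocation failed */

	switch (ctx->error)
	{
		case DEMO_HASH_ERROR_NONE:
			return "success";
		case DEMO_HASH_ERROR_DEST_LEN:
			return "destination buffer too small";
		case DEMO_HASH_ERROR_LIBRARY:
			return "library failure";
	}
	return "success";
}
```

A caller would check the return code of the computation first and only then ask the context for the message, passing the const string up its call stack for logging.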
* Update copyright for 2022Bruce Momjian2022-01-0735-35/+35
| | | | Backpatch-through: 10
* Simplify declaring variables exported from libpgcommon and libpgport.Tom Lane2021-11-292-10/+0
| | | | | | | | | | | | | | | | This reverts commits c2d1eea9e and 11b500072, as well as similar hacks elsewhere, in favor of setting up the PGDLLIMPORT macro so that it can just be used unconditionally. That can work because in frontend code, we need no marking in either the defining or consuming files for a variable exported from these libraries; and frontend code has no need to access variables exported from the core backend, either. While at it, write some actual documentation about the PGDLLIMPORT and PGDLLEXPORT macros. Patch by me, based on a suggestion from Robert Haas. Discussion: https://postgr.es/m/1160385.1638165449@sss.pgh.pa.us
* Portability hack for pg_global_prng_state.Tom Lane2021-11-291-0/+4
| | | | | | | | | PGDLLIMPORT is only appropriate for variables declared in the backend, not when the variable is coming from a library included in frontend code. (This isn't a particularly nice fix, but for now, use the same method employed elsewhere.) Discussion: https://postgr.es/m/E1mrWUD-000235-Hq@gemulon.postgresql.org
* Replace random(), pg_erand48(), etc with a better PRNG API and algorithm.Tom Lane2021-11-281-0/+60
| | | | | | | | | | | | | | | | | | | Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128** is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
* Provide a variant of simple_prompt() that can be interrupted by ^C.Tom Lane2021-11-171-2/+13
| | | | | | | | | | | | | | | | | | | | | Up to now, you couldn't escape out of psql's \password command by typing control-C (or other local spelling of SIGINT). This is pretty user-unfriendly, so improve it. To do so, we have to modify the functions provided by pg_get_line.c; but we don't want to mess with psql's SIGINT handler setup, so provide an API that lets that handler cause the cancel to occur. This relies on the assumption that we won't do any major harm by longjmp'ing out of fgets(). While that's obviously a little shaky, we've long had the same assumption in the main input loop, and few issues have been reported. psql has some other simple_prompt() calls that could usefully be improved the same way; for now, just deal with \password. Nathan Bossart, minor tweaks by me Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us
* Update Unicode data to Unicode 14.0.0Peter Eisentraut2021-09-155-3208/+3426
|
* Extend collection of Unicode combining characters to beyond the BMPJohn Naylor2021-08-261-0/+102
| | | | | | | | The former limit was perhaps a carryover from an older hand-coded table. Since commit bab982161 we have enough space in mbinterval to store larger codepoints, so collect all combining characters. Discussion: https://www.postgresql.org/message-id/49ad1fa0-174e-c901-b14c-c484b60907f1%40enterprisedb.com
* Update display widths as part of updating UnicodeJohn Naylor2021-08-261-0/+120
| | | | | | | | | | | | | | | | | | The hardcoded "wide character" set in ucs_wcwidth() was last updated around the Unicode 5.0 era. This led to misalignment when printing emojis and other codepoints that have since been designated wide or full-width. To fix and keep up to date, extend update-unicode to download the list of wide and full-width codepoints from the official sources. In passing, remove some comments about non-spacing characters that haven't been accurate since we removed the former hardcoded logic. Jacob Champion Reported and reviewed by Pavel Stehule Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRCeX21O69YHxmykYySYyprZAqrKWWg0KoGKdjgqcGyygg@mail.gmail.com
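A table-driven width lookup of the kind described here can be sketched as a binary search over sorted codepoint intervals. The tables below are a tiny hand-picked excerpt for illustration only, not the generated data, and the demo_* names are hypothetical. Checking the combining table before the wide table is an assumption of this sketch, and the handling of control characters is left out.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct demo_interval
{
	uint32_t	first;
	uint32_t	last;
} demo_interval;

/* Illustrative excerpt; entries must stay sorted and non-overlapping. */
static const demo_interval demo_combining[] = {
	{0x0300, 0x036F},			/* Combining Diacritical Marks */
	{0xE0100, 0xE01EF},			/* Variation Selectors Supplement (beyond the BMP) */
};

static const demo_interval demo_wide[] = {
	{0x1100, 0x115F},			/* Hangul Jamo initial consonants */
	{0x4E00, 0x9FFF},			/* CJK Unified Ideographs */
	{0x1F600, 0x1F64F},			/* Emoticons */
};

/* Binary search: is ucs inside any interval of the table? */
static int
demo_in_table(uint32_t ucs, const demo_interval *table, size_t n)
{
	size_t		lo = 0;
	size_t		hi = n;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (ucs > table[mid].last)
			lo = mid + 1;
		else if (ucs < table[mid].first)
			hi = mid;
		else
			return 1;
	}
	return 0;
}

/* Display width in columns: 0 for combining, 2 for wide, 1 otherwise. */
static int
demo_wcwidth(uint32_t ucs)
{
	if (ucs == 0)
		return 0;
	if (demo_in_table(ucs, demo_combining, sizeof(demo_combining) / sizeof(demo_combining[0])))
		return 0;
	if (demo_in_table(ucs, demo_wide, sizeof(demo_wide) / sizeof(demo_wide[0])))
		return 2;
	return 1;
}
```

Generating both tables from the Unicode data files, rather than editing them by hand, is what keeps a lookup like this aligned with each new Unicode release.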
* Revert "Rename unicode_combining_table to unicode_width_table"John Naylor2021-08-261-1/+1
| | | | | | | | | | | | | This reverts commit eb0d0d2c7300c9c5c22b35975c11265aa4becc84. After I had committed eb0d0d2c7 and 78ab944cd, I decided to add a sanity check for a "can't happen" scenario just to be cautious. It turned out that it already happened in the official Unicode source data, namely that a character can be both wide and a combining character. This fact renders the aforementioned commits unnecessary, so revert both of them. Discussion: https://www.postgresql.org/message-id/CAFBsxsH5ejH4-1xaTLpSK8vWoK1m6fA1JBtTM6jmBsLfmDki1g%40mail.gmail.com