path: root/src/config_parse.c
Commit message | Author | Age | Files | Lines
* str: introduce `git_str` for internal, `git_buf` is external (ethomson/gitstr) | Edward Thomson | 2021-10-17 | 1 | -19/+19
  libgit2 has two distinct requirements that were previously solved by `git_buf`. We require:

  1. A general purpose string class that provides a number of utility APIs for manipulating data (e.g., concatenating, truncating, etc).
  2. A structure that we can use to return strings to callers that they can take ownership of.

  By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult.

  Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr").

  The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.)

  Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
* Fix multiline strip_comments logic | Basile Henry | 2021-09-09 | 1 | -1/+1
  The strip_comments function uses the count of quotes to know if a comment char (';' or '#') is the start of a comment or part of the multiline as a string.

  Unfortunately converting the count of quotes from previous lines to a boolean meant that it would only work as expected in some cases (0 quotes or an odd number of quotes).
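The parity issue described in this message can be illustrated with a minimal sketch. The name `strip_comments_sketch` and its signature are hypothetical, not libgit2's actual code, and escape handling is simplified to backslash counting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: truncate `line` at the first comment character that
 * sits outside a quoted string. `quotes_before` is the number of unescaped
 * quotes seen on earlier lines of the same multiline value. Returns the
 * updated quote count so the caller can carry it to the next line.
 */
static int strip_comments_sketch(char *line, int quotes_before)
{
	int quote_count = quotes_before;
	int backslashes = 0;
	char *p;

	assert(line != NULL);

	for (p = line; *p; p++) {
		/* A quote only counts when it is not escaped by a backslash. */
		if (*p == '"' && (backslashes % 2) == 0)
			quote_count++;

		/* An even quote count means we are outside any quoted string. */
		if ((*p == ';' || *p == '#') &&
		    (quote_count % 2) == 0 && (backslashes % 2) == 0) {
			*p = '\0';
			break;
		}

		backslashes = (*p == '\\') ? backslashes + 1 : 0;
	}

	return quote_count;
}
```

Passing the real count matters: if a previous count of 2 were collapsed to the boolean 1, the parity would flip and a `;` outside any string would be kept as if it were quoted.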
* buf: bom enum is in the buf namespace | Edward Thomson | 2021-05-11 | 1 | -2/+2
  Instead of a `git_bom_t` that a `git_buf` function returns, let's keep it `git_buf_bom_t`.
* buf: remove internal `git_buf_text` namespace | Edward Thomson | 2021-05-11 | 1 | -3/+1
  The `git_buf_text` namespace is unnecessary and strange. Remove it, just keep the functions prefixed with `git_buf`.
* config: use GIT_ASSERT | Edward Thomson | 2020-11-27 | 1 | -1/+1
* Fix config file parsing with multi line values containing quoted parts | Sven Strickroth | 2020-09-18 | 1 | -1/+1
  Signed-off-by: Sven Strickroth <email@cs-ware.de>
* config_parse: provide parser init and dispose functions | Patrick Steinhardt | 2019-07-11 | 1 | -0/+11
  Right now, all configuration file backends are expected to directly mess with the configuration parser's internals in order to set it up. Let's avoid doing that by implementing both a `git_config_parser_init` and `git_config_parser_dispose` function to clearly define the interface between configuration backends and the parser.

  Ideally, we would make the `git_config_parser` structure definition private to its implementation. But as that would require an additional memory allocation that was not required before, we just live with it being visible to others.
* config_parse: remove use of `git_config_file` | Patrick Steinhardt | 2019-07-11 | 1 | -4/+2
  The config parser code needs to keep track of the current parsed file's name so that we are able to provide proper error messages to the user. Right now, we do that by storing a `git_config_file` in the parser structure, but as that is a specific backend and the parser aims to be generic, it is a layering violation.

  Switch over to use a simple string to fix that.
* config_parse: rename `data` parameter to `payload` for clarity | Patrick Steinhardt | 2019-07-11 | 1 | -5/+5
  By convention, parameters that get passed to callbacks are usually named `payload` in our codebase. Rename the `data` parameters in the configuration parser callbacks to `payload` to avoid confusion.
* config parse: safely cast to int | Edward Thomson | 2019-06-24 | 1 | -1/+6
* config: rename subsection header parser func (ethomson/config_section_validity) | Edward Thomson | 2019-05-22 | 1 | -2/+2
  The `parse_section_header_ext` name suggests that it is an extended function for parsing the section header. It is not. Rename it to `parse_subsection_header` to better reflect its true mission.
* config: validate quoted section value | Edward Thomson | 2019-05-22 | 1 | -10/+9
  When we reach a whitespace after a section name, we assume that what will follow will be a quoted subsection name. Pass the current position of the line being parsed to the subsection parser, so that it can validate that subsequent characters are additional whitespace or a single quote.

  Previously we would begin parsing after the section name, looking for the first quotation mark. This allows invalid characters to embed themselves between the end of the section name and the first quotation mark, e.g. `[section foo "subsection"]`, which is illegal.
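The validation described in this message can be sketched roughly as follows. `find_subsection_quote` is a hypothetical helper written for illustration, not the function in `config_parse.c`; it assumes the caller passes the position just past the section name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: `pos` points just past the section name in a header
 * such as `[section "subsection"]`. Returns a pointer to the opening
 * quotation mark when only whitespace precedes it, or NULL when any other
 * character appears first.
 */
static const char *find_subsection_quote(const char *pos)
{
	assert(pos != NULL);

	while (*pos && isspace((unsigned char)*pos))
		pos++;

	return (*pos == '"') ? pos : NULL;
}
```

Under these assumptions, the remainder of `[section foo "subsection"]` after the section name is rejected because `foo` appears before the quotation mark.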
* config: don't write invalid column | Edward Thomson | 2019-05-22 | 1 | -2/+9
  When we don't specify a particular column, don't write it in the error message. (column "0" is unhelpful.)
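A minimal sketch of the idea, assuming a column of 0 means "not specified". The function `format_parse_error` and the exact message wording are illustrative assumptions, not taken from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: format a parse error, mentioning the column only
 * when the caller supplied one. Returns the value of snprintf().
 */
static int format_parse_error(char *out, size_t outlen, const char *msg,
	const char *file, size_t line, size_t col)
{
	assert(out != NULL && msg != NULL && file != NULL);

	if (col)
		return snprintf(out, outlen,
			"failed to parse config file: %s (in %s:%zu, column %zu)",
			msg, file, line, col);

	return snprintf(out, outlen,
		"failed to parse config file: %s (in %s:%zu)",
		msg, file, line);
}
```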
* config: lowercase error messages | Edward Thomson | 2019-05-22 | 1 | -10/+10
  Update the configuration parsing error messages to be lower-cased for consistency with the rest of the library.
* git_error: use new names in internal APIs and usage | Edward Thomson | 2019-01-22 | 1 | -8/+8
  Move to the `git_error` name in the internal API for error-related functions.
* config: variables might appear on the same line as a section header (cmn/config-nonewline) | Carlos Martín Nieto | 2018-10-15 | 1 | -6/+26
  While rare and a machine would typically not generate such a configuration file, it is nevertheless valid to write

      [foo "bar"] baz = true

  and we need to deal with that instead of assuming everything is on its own line.
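One way to picture the requirement is a helper that returns whatever follows the closing bracket, so the caller can keep parsing it as a variable. `rest_after_header` is a hypothetical name and this is a simplified sketch, not the parser's actual logic.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch: given a line starting with '[', return a pointer to
 * the first non-whitespace character after the closing ']' (possibly the
 * terminating NUL). A ']' inside a quoted subsection name is ignored.
 * Returns NULL when the line is not a complete section header.
 */
static const char *rest_after_header(const char *line)
{
	int in_quotes = 0;
	const char *p;

	assert(line != NULL);

	if (*line != '[')
		return NULL;

	for (p = line + 1; *p; p++) {
		if (*p == '\\' && in_quotes && p[1]) {
			p++; /* skip the escaped character */
		} else if (*p == '"') {
			in_quotes = !in_quotes;
		} else if (*p == ']' && !in_quotes) {
			p++;
			while (*p && isspace((unsigned char)*p))
				p++;
			return p;
		}
	}

	return NULL;
}
```

For the line quoted in the message, the returned pointer would reference `baz = true`; for a header alone on its line, it would reference an empty string.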
* config: introduce new read-only in-memory backend | Patrick Steinhardt | 2018-09-28 | 1 | -1/+2
  Now that we have abstracted away how to store and retrieve config entries, it became trivial to implement a new in-memory backend by making use of this. And thus we do so.

  This commit implements a new read-only in-memory backend that can parse a chunk of memory into a `git_config_backend` structure.
* config_parse: avoid unused static declared values | Patrick Steinhardt | 2018-09-21 | 1 | -0/+3
  The variables `git_config_escaped` and `git_config_escapes` are both defined as static const character pointers in "config_parse.h". In cases where "config_parse.h" is included but those two variables are not being used, the compiler will thus complain about defined but unused variables. Fix this by declaring them as external and moving the actual initialization to the C file.

  Note that it is not possible to simply make this a #define, as we are indexing into those arrays.
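The pattern can be sketched in a single file as below. The `_sketch` names and the string contents are assumptions for illustration; the real variables are `git_config_escapes` and `git_config_escaped`, whose values are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What would live in the header: declarations only, no storage. */
extern const char *config_escapes_sketch;
extern const char *config_escaped_sketch;

/* What would live in the C file: the single definition of each. */
const char *config_escapes_sketch = "ntb\"\\";
const char *config_escaped_sketch = "\n\t\b\"\\";

/*
 * Hypothetical helper: translate the character following a backslash into
 * the character it stands for. The two strings are parallel arrays, so the
 * index found in one is used to index the other. Returns 0 on success and
 * -1 when `c` is not a known escape.
 */
static int unescape_char(char c, char *out)
{
	const char *hit;

	assert(out != NULL);

	if (c == '\0' || (hit = strchr(config_escapes_sketch, c)) == NULL)
		return -1;

	*out = config_escaped_sketch[hit - config_escapes_sketch];
	return 0;
}
```

The index lookup is why a plain `#define` would not be a drop-in replacement here: both strings are addressed as arrays at runtime.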
* config_parse: refactor error handling when parsing multiline variables | Patrick Steinhardt | 2018-09-03 | 1 | -19/+25
  The current error handling for the multiline variable parser is a bit fragile, as each error condition has its own code to clear memory. Instead, unify error handling as far as possible to avoid this repetitive code. While at it, make use of `GITERR_CHECK_ALLOC` to correctly handle OOM situations and verify that the buffer we print into does not run out of memory either.
* config: Fix a leak parsing multi-line config entries | Nelson Elhage | 2018-09-01 | 1 | -0/+1
* config: convert unbounded recursion into a loop | Nelson Elhage | 2018-08-25 | 1 | -35/+30
* Fix a double-free in config parsing | Nelson Elhage | 2018-08-05 | 1 | -0/+1
* config_parse: always sanitize out-parameters in `parse_variable` | Patrick Steinhardt | 2018-06-22 | 1 | -20/+23
  The `parse_variable` function has two out parameters `var_name` and `var_value`. Currently, those are not being sanitized to `NULL` when any error happens inside of the `parse_variable` function. Fix that. While at it, the coding style is improved to match our usual coding practices more closely.
* config_parse: have `git_config_parse` own entry value and name | Patrick Steinhardt | 2018-06-22 | 1 | -1/+4
  The function `git_config_parse` uses several callbacks to pass data along to the caller as it parses the file. One design shortcoming here is that strings passed to those callbacks are expected to be freed by them, which is really confusing.

  Fix the issue by changing memory ownership here. Instead of expecting the `on_variable` callbacks to free memory for `git_config_parse`, just do it inside of `git_config_parse`. While this obviously requires a bit more memory allocation churn due to having to copy both name and value at some places, this shouldn't be too much of a burden.
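The ownership change can be pictured with a small sketch in which the parser side allocates, lends the strings to the callback, and frees them itself. All names here (`on_variable_cb`, `emit_variable`, `count_variables`) are hypothetical and the callback signature is an assumption, not libgit2's.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: strings are borrowed, never freed by it. */
typedef int (*on_variable_cb)(const char *name, const char *value, void *payload);

static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy)
		memcpy(copy, s, n);
	return copy;
}

/*
 * Hypothetical sketch of the parser side: it owns the name and value for
 * the duration of the callback and releases them afterwards, whatever the
 * callback returns.
 */
static int emit_variable(const char *name, const char *value,
	on_variable_cb cb, void *payload)
{
	char *name_copy, *value_copy;
	int error = -1;

	assert(name != NULL && value != NULL && cb != NULL);

	name_copy = dup_string(name);
	value_copy = dup_string(value);

	if (name_copy && value_copy)
		error = cb(name_copy, value_copy, payload);

	free(name_copy);
	free(value_copy);
	return error;
}

/* Example callback: counts variables; copies nothing and frees nothing. */
static int count_variables(const char *name, const char *value, void *payload)
{
	(void)name;
	(void)value;
	(*(int *)payload)++;
	return 0;
}
```

A callback that wants to keep a name or value past its own return has to copy it, which is the extra allocation churn the message mentions.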
* Convert usage of `git_buf_free` to new `git_buf_dispose` | Patrick Steinhardt | 2018-06-10 | 1 | -3/+3
* buf_text: remove `offset` parameter of BOM detection function | Patrick Steinhardt | 2018-02-08 | 1 | -1/+1
  The function to detect a BOM takes an offset where it shall look for a BOM. No caller uses that, and searching for the BOM in the middle of a buffer seems to be very unlikely, as a BOM should only ever exist at file start.

  Remove the parameter, as it has already caused confusion due to its weirdness.
* config_parse: fix reading files with BOM | Patrick Steinhardt | 2018-02-08 | 1 | -1/+1
  The function `skip_bom` is being used to detect and skip BOM marks prior to parsing a configuration file. To do so, it simply uses `git_buf_text_detect_bom`. But since the refactoring to use the parser interface in commit 9e66590bd (config_parse: use common parser interface, 2017-07-21), the BOM detection was actually broken.

  The issue stems from a misunderstanding of `git_buf_text_detect_bom`. It was assumed that its third parameter limits the length of the character sequence that is to be analyzed, while in fact it was an offset at which we want to detect the BOM. Fix the parameter to be `0` instead of the buffer length, as we always want to check the beginning of the configuration file.
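As a rough illustration of checking only the start of the buffer, the sketch below detects a UTF-8 BOM at offset 0. `skip_utf8_bom` is a hypothetical helper that handles UTF-8 only; it is not `git_buf_text_detect_bom`, whose signature is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: return the number of bytes to skip when `buf`
 * starts with a UTF-8 byte order mark (EF BB BF), or 0 otherwise. Only the
 * beginning of the buffer is ever inspected.
 */
static size_t skip_utf8_bom(const char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	if (len >= 3 &&
	    (unsigned char)buf[0] == 0xEF &&
	    (unsigned char)buf[1] == 0xBB &&
	    (unsigned char)buf[2] == 0xBF)
		return 3;

	return 0;
}
```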
* config_parse: handle empty lines with CRLF | Patrick Steinhardt | 2018-02-08 | 1 | -0/+1
  Currently, the configuration parser will fail reading empty lines with just a CRLF-style line ending. Special-case the '\r' character in order to handle it the same as Unix-style line endings. Add tests to spot this regression in the future.
* config_parse: add comment to clarify logic getting next character | Patrick Steinhardt | 2018-02-08 | 1 | -0/+5
  Upon each line, the configuration parser tries to get either the first non-whitespace character or the first whitespace character, in case there is no non-whitespace character. The logic handling this looks rather odd and doesn't immediately convey this meaning, so add a comment to clarify what happens.
* config_parse: use common parser interface | Patrick Steinhardt | 2017-11-11 | 1 | -169/+30
  As the config parser is now cleanly separated from the config file code, we can easily refactor the code and make use of the common parser module. This removes quite a lot of duplicated functionality previously used for handling the actual parser state and replaces it with the generic interface provided by the parser context.
* config_file: split out module to parse config files | Patrick Steinhardt | 2017-11-11 | 1 | -0/+658
  The configuration file code grew quite big and intermingles both actual configuration logic as well as the parsing logic of the configuration syntax. This makes it hard to refactor the parsing logic on its own and convert it to make use of our new parsing context module.

  Refactor the code and split it up into two parts. The config file code will only handle actual handling of configuration files, includes and writing new files. The newly created config parser module is then only responsible for parsing the actual contents of a configuration file, leaving everything else to callbacks passed to its function `git_config_parse`.