From c8aa9fdf5dc15e2c508acb22df03d431983569ed Mon Sep 17 00:00:00 2001
From: Junio C Hamano
Date: Wed, 28 Oct 2015 13:17:29 -0700
Subject: strbuf: make strbuf_getline_crlf() global

Often we read "text" files that are supplied by the end user
(e.g. commit log message that was edited with $GIT_EDITOR upon
'git commit -e'), and in some environments lines in a text file
are terminated with CRLF. Existing strbuf_getline() knows to read
a single line and then strip the terminating byte from the result,
but it is handy to have a version that is more tailored for a "text"
input that takes both '\n' and '\r\n' as line terminator (aka
<newline> in POSIX lingo) and returns the body of the line after
stripping <newline>.

Recently reimplemented "git am" uses such a function implemented
privately; move it to strbuf.[ch] and make it available for others.

Note that we do not blindly replace calls to strbuf_getline() that
uses LF as the line terminator with calls to strbuf_getline_crlf()
and this is very much deliberate. Some callers may want to treat an
incoming line that ends with CR (and terminated with LF) to have a
payload that includes the final CR, and such a blind replacement
will result in misconversion when done without code audit.

Signed-off-by: Junio C Hamano
---
 strbuf.h | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'strbuf.h')

diff --git a/strbuf.h b/strbuf.h
index 7123fca7af..d84c866ab1 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -388,6 +388,13 @@ extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
  */
 extern int strbuf_getline(struct strbuf *, FILE *, int);
 
+/*
+ * Similar to strbuf_getline(), but uses '\n' as the terminator,
+ * and additionally treats a '\r' that comes immediately before '\n'
+ * as part of the terminator.
+ */
+extern int strbuf_getline_crlf(struct strbuf *, FILE *);
+
 /**
  * Like `strbuf_getline`, but keeps the trailing terminator (if
  * any) in the buffer.
--
cgit v1.2.1


From 8f309aeb8225a9c26f20c0dbc031f1ea8df75d49 Mon Sep 17 00:00:00 2001
From: Junio C Hamano
Date: Wed, 13 Jan 2016 15:31:17 -0800
Subject: strbuf: introduce strbuf_getline_{lf,nul}()

The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.

By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.

This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:

 * introduction of these two functions in strbuf.[ch]

 * mechanical conversion of all callers to strbuf_getline() with
   either '\n' or '\0' as the third parameter to instead call the
   respective thin wrapper.

After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().

Signed-off-by: Junio C Hamano
---
 strbuf.h | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

(limited to 'strbuf.h')

diff --git a/strbuf.h b/strbuf.h
index d84c866ab1..e56ec77e2b 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -388,13 +388,25 @@ extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
  */
 extern int strbuf_getline(struct strbuf *, FILE *, int);
 
+/**
+ * The strbuf_getline*() family of functions share this signature, but
+ * have different line termination conventions.
+ */
+typedef int (*strbuf_getline_fn)(struct strbuf *, FILE *);
+
+/* Uses LF as the line terminator */
+extern int strbuf_getline_lf(struct strbuf *sb, FILE *fp);
+
+/* Uses NUL as the line terminator */
+extern int strbuf_getline_nul(struct strbuf *sb, FILE *fp);
+
 /*
- * Similar to strbuf_getline(), but uses '\n' as the terminator,
- * and additionally treats a '\r' that comes immediately before '\n'
- * as part of the terminator.
+ * Similar to strbuf_getline_lf(), but additionally treats a CR that
+ * comes immediately before the LF as part of the terminator.
  */
 extern int strbuf_getline_crlf(struct strbuf *, FILE *);
 
+
 /**
  * Like `strbuf_getline`, but keeps the trailing terminator (if
  * any) in the buffer.
--
cgit v1.2.1


From 1a0c8dfd89475d6bb09ddee8c019cf0ae5b3bdc2 Mon Sep 17 00:00:00 2001
From: Junio C Hamano
Date: Wed, 13 Jan 2016 18:32:23 -0800
Subject: strbuf: give strbuf_getline() to the "most text friendly" variant

Now there is no direct caller to strbuf_getline(), we can demote it
to file-scope static that is private to strbuf.c and rename it to
strbuf_getdelim(). Rename strbuf_getline_crlf(), which is designed
to be the most "text friendly" variant, and allow it to take over
this simplest name, strbuf_getline(), so we can add more uses of it
without having to type _crlf over and over again in the coming
steps.

Signed-off-by: Junio C Hamano
---
 strbuf.h | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

(limited to 'strbuf.h')

diff --git a/strbuf.h b/strbuf.h
index e56ec77e2b..970c24ab43 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -354,8 +354,8 @@ extern void strbuf_addftime(struct strbuf *sb, const char *fmt, const struct tm
  *
  * NOTE: The buffer is rewound if the read fails. If -1 is returned,
  * `errno` must be consulted, like you would do for `read(3)`.
- * `strbuf_read()`, `strbuf_read_file()` and `strbuf_getline()` has the
- * same behaviour as well.
+ * `strbuf_read()`, `strbuf_read_file()` and `strbuf_getline_*()`
+ * family of functions have the same behaviour as well.
  */
 extern size_t strbuf_fread(struct strbuf *, size_t, FILE *);
 
@@ -379,19 +379,14 @@ extern ssize_t strbuf_read_file(struct strbuf *sb, const char *path, size_t hint
 extern int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
 
 /**
- * Read a line from a FILE *, overwriting the existing contents
- * of the strbuf. The second argument specifies the line
- * terminator character, typically `'\n'`.
+ * Read a line from a FILE *, overwriting the existing contents of
+ * the strbuf. The strbuf_getline*() family of functions share
+ * this signature, but have different line termination conventions.
+ *
  * Reading stops after the terminator or at EOF. The terminator
  * is removed from the buffer before returning. Returns 0 unless
  * there was nothing left before EOF, in which case it returns `EOF`.
  */
-extern int strbuf_getline(struct strbuf *, FILE *, int);
-
-/**
- * The strbuf_getline*() family of functions share this signature, but
- * have different line termination conventions.
- */
 typedef int (*strbuf_getline_fn)(struct strbuf *, FILE *);
 
 /* Uses LF as the line terminator */
@@ -403,8 +398,11 @@ extern int strbuf_getline_nul(struct strbuf *sb, FILE *fp);
 /*
  * Similar to strbuf_getline_lf(), but additionally treats a CR that
  * comes immediately before the LF as part of the terminator.
+ * This is the most friendly version to be used to read "text" files
+ * that can come from platforms whose native text format is CRLF
+ * terminated.
  */
-extern int strbuf_getline_crlf(struct strbuf *, FILE *);
+extern int strbuf_getline(struct strbuf *, FILE *);
 
 
 /**
--
cgit v1.2.1
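
The termination conventions documented in the strbuf.h comments above
can be illustrated with a small standalone sketch. This is not the code
from strbuf.c: `struct linebuf` and every `linebuf_*` name below are
hypothetical stand-ins invented for this illustration, and the sketch
only assumes the behaviour the header comments describe (the terminator
is stripped, a CR immediately before the LF counts as part of the
terminator in the text-friendly variant, and EOF is returned only when
nothing was left to read).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct strbuf; not git's real type. */
struct linebuf {
	char *buf;
	size_t len;
	size_t alloc;
};

/* Empty the buffer, allocating it on first use; buf stays NUL-terminated. */
static void linebuf_reset(struct linebuf *lb)
{
	if (!lb->buf) {
		lb->alloc = 64;
		lb->buf = malloc(lb->alloc);
		if (!lb->buf)
			abort();
	}
	lb->len = 0;
	lb->buf[0] = '\0';
}

/* Append one byte, growing the buffer as needed. */
static void linebuf_addch(struct linebuf *lb, int c)
{
	if (lb->len + 2 > lb->alloc) {
		char *p = realloc(lb->buf, lb->alloc * 2);
		if (!p)
			abort();
		lb->buf = p;
		lb->alloc *= 2;
	}
	lb->buf[lb->len++] = (char)c;
	lb->buf[lb->len] = '\0';
}

/* Read up to and including `term`; EOF only if nothing was read. */
static int linebuf_getwholeline(struct linebuf *lb, FILE *fp, int term)
{
	int c;

	assert(lb && fp);
	linebuf_reset(lb);
	while ((c = fgetc(fp)) != EOF) {
		linebuf_addch(lb, c);
		if (c == term)
			break;
	}
	return lb->len ? 0 : EOF;
}

/* Generic reader: strip the single terminating byte, if present. */
static int linebuf_getdelim(struct linebuf *lb, FILE *fp, int term)
{
	if (linebuf_getwholeline(lb, fp, term))
		return EOF;
	if (lb->buf[lb->len - 1] == term)
		lb->buf[--lb->len] = '\0';
	return 0;
}

/* All public readers share this signature (cf. strbuf_getline_fn). */
typedef int (*linebuf_getline_fn)(struct linebuf *, FILE *);

/* Uses LF as the line terminator; a trailing CR stays in the payload. */
int linebuf_getline_lf(struct linebuf *lb, FILE *fp)
{
	return linebuf_getdelim(lb, fp, '\n');
}

/* Uses NUL as the record terminator. */
int linebuf_getline_nul(struct linebuf *lb, FILE *fp)
{
	return linebuf_getdelim(lb, fp, '\0');
}

/* Text-friendly variant: a CR immediately before the LF is also stripped. */
int linebuf_getline(struct linebuf *lb, FILE *fp)
{
	if (linebuf_getwholeline(lb, fp, '\n'))
		return EOF;
	if (lb->buf[lb->len - 1] == '\n') {
		lb->buf[--lb->len] = '\0';
		if (lb->len && lb->buf[lb->len - 1] == '\r')
			lb->buf[--lb->len] = '\0';
	}
	return 0;
}
```

With this sketch, reading the bytes "one\r\n" through linebuf_getline()
leaves "one" in the buffer, while linebuf_getline_lf() leaves "one\r".
That difference is why the first commit message warns against blindly
converting LF callers: a caller that wants the CR as payload would
silently lose it. The shared function-pointer type lets a caller pick
the convention once (for example, NUL-terminated records under a -z
style option) and then loop over the input without caring which reader
was chosen.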