| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
|
| |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
|
|
|
|
|
|
| |
When `GIT_BLOB_FILTER_ATTTRIBUTES_FROM_HEAD` is passed to
`git_blob_filter`, read attributes from `gitattributes` files that
are checked in to the repository at the HEAD revision. This passes
the flag `GIT_FILTER_ATTRIBUTES_FROM_HEAD` to the filter functions.
|
|
|
|
|
|
|
|
| |
Introduce `GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES`, which tells
`git_blob_filter` to ignore the system-wide attributes file, usually
`/etc/gitattributes`.
This simply passes the appropriate flag to the attribute loading code.
|
|
|
|
| |
Users should now use `git_blob_filter`.
|
|
|
|
|
| |
Provide a function to filter blobs that allows for more functionality
than the existing `git_blob_filtered_content` function.
|
|
|
|
|
|
| |
The majority of functions are named `from_something` (with an
underscore) instead of `fromsomething`. Update the blob functions for
consistency with the rest of the library.
|
|
|
|
|
|
| |
Our blob size is a `git_off_t`, which is a signed 64 bit int. This may
be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob
size fits into a `size_t` before casting.
|
|
|
|
|
| |
Move to the `git_error` name in the internal API for error-related
functions.
|
|
|
|
| |
Use the new object_type enumeration names within the codebase.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.
In some occasions parsing raw objects without touching the ODB
is actually recuired, though. One use case is for example object
verification, where we want to assure that an object is valid before
inserting it into the ODB or writing it into the git repository.
Asa first step towards that, introduce a distinction between raw and ODB
objects for blobs. Creation of ODB objects stays the same by simply
using `git_blob__parse`, but a new function `git_blob__parse_raw` has
been added that creates a blob from a pair of data and size. By setting
a new flag inside of the blob, we can now distinguish whether it is a
raw or ODB object now and treat it accordingly in several places.
Note that the blob data passed in is not being copied. Because of that,
callers need to make sure to keep it alive during the blob's life time.
This is being used to avoid unnecessarily increasing the memory
footprint when parsing largish blobs.
|
|
|
|
|
|
|
|
|
| |
Going forward, we will have to change how blob sizes are calculated
based on whether the blob is a cahed object part of the ODB or not. In
order to not have to distinguish between those two object types
repeatedly when accessing the blob's data or size, encapsulate all
existing direct uses of those fields by instead using
`git_blob_rawcontent` and `git_blob_rawsize`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent introduction of the commondir variable of a repository
requires callers to distinguish whether their files are part of
the dot-git directory or the common directory shared between
multpile worktrees. In order to take the burden from callers and
unify knowledge on which files reside where, the
`git_repository_item_path` function has been introduced which
encapsulate this knowledge.
Modify most existing callers of `git_repository_path` to use
`git_repository_item_path` instead, thus making them implicitly
aware of the common directory.
|
|
|
|
|
|
|
|
| |
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
|
|
|
|
|
|
| |
The callback mechanism makes it awkward to write data from an IO
source; move to `_fromstream()` which lets the caller remain in control,
in the same vein as we prefer iterators over foreach callbacks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pair of `git_blob_create_frombuffer()` and
`git_blob_create_frombuffer_commit()` is meant to replace
`git_blob_create_fromchunks()` by providing a way for a user to write a
new blob when they want filtering or they do not know the size.
This approach allows the caller to retain control over when to add data
to this buffer and a more natural fit into higher-level language's own
stream abstractions instead of having to handle IO wait in the callback.
The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a
round multiple of usual page sizes and a value where most blobs seem
likely to be either going to be way below or way over that size. It's
also a round number of pages.
This implementation re-uses the helper we have from `_fromchunks()` so
we end up writing everything to disk, but hopefully more efficiently
than with a default filebuf. A later optimisation can be to avoid
writing the in-memory contents to disk, with some extra complexity.
|
|
|
|
|
| |
This also affects `git_index_add_bypath()` by providing a better error
message and a specific error code when a directory is passed.
|
|
|
|
|
|
|
|
|
|
| |
Restricting files to size_t is a silly limitation. The loose backend
writes to a file directly, so there is no issue in using 63 bits for the
size.
We still assume that the header is going to fit in 64 bytes, which does
mean quite a bit smaller files due to the run-length encoding, but it's
still a much larger size than you would want Git to handle.
|
| |
|
|
|
|
|
| |
For consistency with the rest of the library, where an opt is an
options *structure*.
|
|
|
|
|
|
| |
Provide a convenience function that creates a buffer that can be provided
to callers but will not be freed via `git_buf_free`, so the buffer
creator maintains the allocation lifecycle of the buffer's contents.
|
| |
|
|
|
|
|
|
|
|
|
| |
Diff and status do not want core.safecrlf to actually raise an
error regardless of the setting, so this extends the filter API
with an additional options flags parameter and adds a flag so that
filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating
that unsafe filter application should be downgraded from a failure
to a warning.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
The callback to supply data chunks could return a negative value
to stop creation of the blob, but we were neither using GIT_EUSER
nor propagating the return value. This makes things use the new
behavior of returning the negative value back to the user.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the git_buf struct that was used internally into an
externally available structure and eliminates the git_buffer.
As part of that, some of the special cases that arose with the
externally used git_buffer were blended into the git_buf, such as
being careful about git_buf objects that may have a NULL ptr and
allowing for bufs with a valid ptr and size but zero asize as a
way of referring to externally owned data.
|
|
|
|
|
|
|
| |
This adds the ident filter (that knows how to replace $Id$) and
tweaks the filter APIs and code so that git_filter_source objects
actually have the updated OID of the object being filtered when
it is a known value.
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the git_filter_list into the public API so that users
can create, apply, and dispose of filter lists. This allows more
granular application of filters to user data outside of libgit2
internals.
This also converts all the internal usage of filters to the public
APIs along with a few small tweaks to make it easier to use the
public git_buffer stuff alongside the internal git_buf.
|
|
|
|
|
|
|
| |
This creates include/sys/filter.h with a basic definition of a
git_filter and then converts the internal code to use it. There
are related internal objects (git_filter_list) that we will want
to publish at some point, but this is a first step.
|
|
|
|
|
|
|
|
|
|
| |
This begins the process of exposing git_filter objects to the
public API. This includes:
* new public type and API for `git_buffer` through which an
allocated buffer can be passed to the user
* new API `git_blob_filtered_content`
* make the git_filter type and GIT_FILTER_TO_... constants public
|
|
|
|
|
|
| |
This is in preparation for moving the hashing to the frontend, which
requires us to handle the incoming data before passing it to the
backend's stream.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The size data in the index may not reflect the actual size of the
blob data from the ODB when content filtering comes into play.
This commit fixes rename detection to use the actual blob size when
calculating data signatures instead of the value from the index.
Because of a misunderstanding on my part, I first converted the
git_index_add_bypath API to use the post-filtered blob data size
in creating the index entry. I backed that change out, but I
kept the overall refactoring of that routine and the new internal
git_blob__create_from_paths API because it eliminates an extra
stat() call from the code that adds a file to the index.
The existing tests actually cover this code path, at least when
running on Windows, so at this point I'm not adding new tests to
cover the changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a significant reorganization of the diff code to break it
into a set of more clearly distinct files and to document the new
organization. Hopefully this will make the diff code easier to
understand and to extend.
This adds a new `git_diff_driver` object that looks of diff driver
information from the attributes and the config so that things like
function content in diff headers can be provided. The full driver
spec is not implemented in the commit - this is focused on the
reorganization of the code and putting the driver hooks in place.
This also removes a few #includes from src/repository.h that were
overbroad, but as a result required extra #includes in a variety
of places since including src/repository.h no longer results in
pulling in the whole world.
|
| |
|
|
|
|
|
| |
Removed useless prototype and renamed object typecast functions
declaration macro.
|
|
|
|
|
|
| |
This removes the GIT_INLINE versions of the simple git_object
accessors and standardizes them with a helper macro in src/object.h
to build the function bodies.
|
|
|
|
|
| |
This unifies the object parse functions into one signature that
takes an odb_object.
|
|
|
|
|
|
|
|
|
|
| |
This adds create and free callback to the git_objects_table so
that more of the creation and destruction of objects can be table
driven instead of using switch statements. This also makes the
semantics of certain object creation functions consistent so that
we can make better use of function pointers. This also fixes a
theoretical error case where an object allocation fails and we
end up storing NULL into the cache.
|
|
|
|
|
| |
This uses the odb object accessors so we can change the internals
more easily...
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves some of the odb_backend stuff that is related to the
internals of an odb_backend implementation into include/git2/sys.
Some of the stuff related to streaming I left in include/git2
because it seemed like it would be reasonably needed by a normal
user who wanted to stream objects into and out of the ODB.
Also, I added APIs for traversing the list of backends so that
some of the tests would not need to access ODB internals.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds crlf/lf conversion functions into buf_text with more
efficient implementations that bypass the high level buffer
functions. They attempt to minimize the number of reallocations
done and they directly write the buffer data as needed if they
know that there is enough memory allocated to memcpy data.
Tests are added for these new functions. The crlf.c code is
updated to use the new functions.
Removed the include of buf_text.h from filter.h and just include
it more narrowly in the places that need it.
|