| Commit message | Author | Age | Files | Lines |
|
| |
According to Debian code search, it's accumulated 0 users outside of
tracker-miners. If everyone is relying on either tracker-miner-fs/rss
or implementing their own minimal abstraction (TrackerMinerFS is a
complex object, but the others are all fairly shallow), it does not
make sense to keep dragging this along as public API.
This code moves into tracker-miners, and every user is expected to
consume and insert data using the sparql library.
|
| |
I find this tracing useful sometimes when debugging test failures.
However the existing code had bitrotted quite a lot. Now it works again.
|
| |
|
| |
The priv pointer has been removed from object structs for all private
types. For public types it's got to be kept there for ABI stability.
|
| |
Let's stick to correct SPARQL 1.1 syntax.
|
| |
It is not even clear that this can happen in real-life cases, but the
standalone tracker-file-notifier tests do run into it (due to the IRI never
being set; still, this is an async op). Two consecutive CREATED events on
the same file would be dealt with in TrackerMinerFS as CREATED+UPDATED.
This was already harmless, but we can do better and swallow one of these
events.
|
| |
Commit cef502e668a640 ("Add TrackerMinerFS::move-file vmethod")
introduced a regression which sometimes led to errors like this:
Tracker-FATAL-WARNING: Parent 'file:///tmp/tracker-miner-fs-test-77E2LZ/recursive/4' not indexed yet
This was causing tracker-miner-fs-test to fail in some cases.
TrackerTaskPool assumes that there is only one task in the pool per
GFile. When processing item_move() operations this wasn't true because
we'd create one task for removing the existing dest_file, and another
task for updating the URL of source_file to point to dest_file. Both
tasks would be associated with dest_file.
If the SPARQL buffer was flushed after the first task was created and
before the second task was created, the second task would overwrite
the first task in the ->priv->tasks hash table, so when the first
task completed, the second task would be removed from the task pool
without ever executing.
This would mean that the URL of source_file never got updated to
point at dest_file, which triggered the "Parent not indexed yet" error
later on.
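
The failure mode described above can be reproduced with a small model.
This is only a sketch in Python, not the TrackerTaskPool code: TaskPool,
add() and complete() are hypothetical names, the URI is made up, and a
plain dict keyed by file URI stands in for the ->priv->tasks hash table.

```python
class TaskPool:
    """Toy pool that keeps at most one task per file, as described above."""

    def __init__(self):
        self.tasks = {}  # file URI -> task description (stand-in for the hash table)

    def add(self, file_uri, task):
        # A second task for the same file silently replaces the first one.
        self.tasks[file_uri] = task

    def complete(self, file_uri):
        # Completion removes whatever is currently stored for the file,
        # even if that is not the task that actually finished.
        return self.tasks.pop(file_uri, None)


pool = TaskPool()
dest = "file:///tmp/dest"  # hypothetical destination of a move

pool.add(dest, "remove existing dest_file")
# ... the SPARQL buffer is flushed here, so the first task is in flight ...
pool.add(dest, "update URL of source_file")

# The first task finishes, but the pool drops the second one instead.
dropped = pool.complete(dest)
print(dropped)     # update URL of source_file
print(pool.tasks)  # {}
```

In this model the URL update is removed from the pool without ever
running, which matches the sequence of events given in the message.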
|
| |
|
| |
First, it seems cleaner to do it this way, as GObject data has an undefined
lifetime (yes, as long as the object lives, but the TrackerFileSystem may
cache those).
But this also fixes an unintended side effect where "attribute only" updates
took precedence over full updates: the events themselves may be coalesced
away, but the data would remain. It should actually be the other way around:
if we get a full update and an attributes-only update, we may discard the
second.
|
| |
https://bugzilla.gnome.org/show_bug.cgi?id=793061
|
| |
This is doubly cached in the TrackerFileSystem and as GObject qdata (and
obtained from the former in that case). Let's just rely on the former for
all.
|
| |
This is library code, so let's use g_debug(), which obeys G_MESSAGES_DEBUG,
instead of g_message(), which is printed unless there is a special
log handler that filters those out.
This code may run inside 3rd party code, where we can't trust that there
will be a log handler keeping those from going to stdout.
|
| |
|
| |
Replace all 4 queues for the different create/update/delete/move
events with a single queue that contains generic QueueEvent
structs. The GList node of the last event is stored as GFile
qdata, in order to perform fast lookups when coalescing events.
queue_event_coalesce() will attempt to convert any two events
into fewer than that. It relies on merging two events with
no related events in between; those should be coalesced (or
attempted to) when they arrive.
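
The single-queue design can be illustrated with a short sketch. It is
written in Python rather than C, and everything in it is an assumption
made for illustration: the EventQueue class, the RULES table and the
specific coalescing rules are not taken from queue_event_coalesce(). A
dict from file to event id plays the role of the GList node kept as
GFile qdata, and move events are left out.

```python
import itertools
from collections import OrderedDict

CREATED, UPDATED, DELETED = "created", "updated", "deleted"
DROP = object()  # marker: the two events cancel each other out

# Hypothetical rules: (previous event, new event) -> what remains.
RULES = {
    (CREATED, CREATED): CREATED,
    (CREATED, UPDATED): CREATED,
    (UPDATED, UPDATED): UPDATED,
    (CREATED, DELETED): DROP,
    (UPDATED, DELETED): DELETED,
}


class EventQueue:
    def __init__(self):
        self._ids = itertools.count()
        self.events = OrderedDict()  # event id -> (kind, file), arrival order
        self.last = {}               # file -> id of its last queued event

    def push(self, kind, file):
        prev_id = self.last.get(file)
        if prev_id is not None:
            rule = RULES.get((self.events[prev_id][0], kind))
            if rule is DROP:
                del self.events[prev_id]
                del self.last[file]
                return
            if rule is not None:
                # Collapse into one event, keeping the earlier queue position.
                self.events[prev_id] = (rule, file)
                return
        new_id = next(self._ids)
        self.events[new_id] = (kind, file)
        self.last[file] = new_id

    def pop(self):
        event_id, (kind, file) = self.events.popitem(last=False)
        if self.last.get(file) == event_id:
            del self.last[file]
        return kind, file


q = EventQueue()
q.push(CREATED, "a.txt")
q.push(UPDATED, "a.txt")  # folded into the pending CREATED
q.push(CREATED, "b.txt")
q.push(DELETED, "b.txt")  # cancels the pending CREATED
q.push(UPDATED, "c.txt")
q.push(DELETED, "c.txt")  # only the DELETED survives
pending = list(q.events.values())
print(pending)  # [('created', 'a.txt'), ('deleted', 'c.txt')]
```

Coalescing happens at push time against the last event of the same file
only, which mirrors the requirement that no related events sit in
between the two being merged.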
|
| |
TrackerFileNotifier guarantees that parent files are emitted before any
children. The other use case that this used to cover is explicit
tracker_miner_fs_check_file() calls, which used to also index parent
directories, but that has not been the case for quite some time.
So the only remaining case where we could end up with a file whose parent
is neither being currently processed nor indexed is actual bugs. In that
case, the bug likely won't go away by trying harder, which leads to logging
for every child file, as they'll fail in cascade.
Let's be less stubborn here, warn (once!) about the missing file and ditch
all pending events happening on/inside it. People love filing bugs about
tracker logging stuff, so I don't think bugs will go unnoticed anyway.
|
| |
Tracker may end up with nfo:FileDataObject prior to handling monitor
events. Be it leftover data from previous bugs, explicit "tracker index"
calls, or data from some other application.
As we can't be really sure of the data consistency, always fall back to
a URN query so we don't break nie:url UNIQUE constraint (inverse functional
property in SPARQL parlance).
https://bugzilla.gnome.org/show_bug.cgi?id=786132
|
| |
This prevents us from hitting the main loop too hard.
|
| |
The checks to notify about indexing having finished on TrackerIndexingTree
roots were mistaking ItemMovedData* for GFile*, which led to warnings.
This should be harmless; the signal might possibly be emitted before the
move op is dispatched, that's all.
|
| |
Fixes handling of moved files, since the subclass vmethod wouldn't
be triggered.
|
| |
The only user that might ever care does already implement it itself.
There is no need to provide this infrastructure that will be scarcely
used in libtracker-miner API.
|
| |
This is not a signal because external users of a TrackerMiner have no
business in modifying behavior at this level, this is reserved for
subclasses that presumably know what they are doing.
This vmethod is invoked for every event that gets received from the
TrackerFileNotifier, before the file gets to hit any processing queue.
|
| |
So it can be specified from the miner.
|
| |
Only check_file() remains, with an extra priority argument. The default
G_PRIORITY_HIGH in the older check_file() was unintuitive, and is now
explicitly specified in the org.freedesktop.Tracker1.Miner.Files.Index
interface calls.
|
| |
More unused API that is a thin wrapper to TrackerIndexingTree, just
remove it.
|
| |
The whole set of tracker_miner_fs_add_directory_without_parent(),
tracker_miner_fs_directory_add(), directory_remove() and
directory_remove_full() are all covered by TrackerIndexingTree and
basically unused, except for code in examples/.
|
| |
It did nothing at the libtracker-miner level, and can be safely removed.
|
| |
This is just used to set the TRACKER_DIRECTORY_FLAG_CHECK_MTIME flag on
the TrackerIndexingTree for all files. Given libtracker-miner has this
fine grained switch and all use of it happens in src/miners/fs, just move
the global toggle there and remove it from libtracker-miner API.
The only usage of this flag inside libtracker-miner happened inside
tracker_miner_fs_directory_add(), which was superseded by
TrackerIndexingTree too and is scheduled for removal.
|
| |
It makes no sense to have that at the library level, just move thumbnail
handling to TrackerMinerFiles.
Coincidentally, this removes further queries that required knowledge about
the ontology in TrackerMinerFS.
|
| |
It's unused and unneeded, just set the TRACKER_DIRECTORY_FLAG_CHECK_MTIME
flag on the TrackerIndexingTree.
|
| |
It's been a no-op for years.
|
| |
It is cached once to be used once. Besides, the parent GFile is
obviously guaranteed to be a folder, and folders are (not so
obviously) guaranteed to be cached. Thus looking up the URN should
be fast enough.
|
| |
We delegate the SPARQL generation to this vmethod, in order to
keep TrackerMinerFS as agnostic of the ontology as possible.
|
| |
The only_children argument is a bit awkward as we emit ::remove-file
on a file that is not removed at all. The TrackerSparqlBuilder argument
has also been removed (the signals just have a gchar* return value
containing the SPARQL for the delete op) so it's up to the caller to
decide how to compose the SPARQL.
This allows removing some more knowledge about specific ontologies
from TrackerMinerFS, the ontology-dependent upper layers will know
better how to delete the corresponding entities.
|
| |
There are a few changes here:
- The 2 vmethods are now given a GTask; its cancellable is to be
  used if the handling goes async.
- tracker_miner_fs_file_notify() has changed into a more generic
  tracker_miner_fs_notify_finish() method, that takes such GTask
  and completes it.
- The vmethods are no longer given a TrackerSparqlBuilder, instead
  they are expected to create the SPARQL through whatever means is
  most fit. The SPARQL is given in the tracker_miner_fs_notify_finish()
  func. This opens the door to TrackerMinerFS implementations using
  TrackerResource.
The intent is 1) Pass something to these vmethods that the user
can't forge or mess with, as matching on GFile relies on it being the
same pointer that was given in the vmethods. And 2) Make the finish()
function more generic, so it fits other methods going async.
|
| |
It's been deprecated for a long time, it stands in the middle of
detaching TrackerMiner from DBus, and it's one less piece of
ontology-dependent libtracker-miner code. Enough reasons to
finally remove this.
|
| |
The libmediaart dependency was disabled in commit 6a05068624bfa, it
doesn't make sense to drag this code around.
|
| |
The file might or might not be inserted to the queue, which meant that
the extra ref created outside the call might never be dropped if the file
didn't end up inserted again. Fix this by doing the refcount increase
when actually inserting the file back in the queue.
Reported by Jose M. Arroyo <jose.m.arroyo.se@gmail.com>.
|
| |
The QUEUE_UPDATED elements were being additionally checked against
the QUEUE_WRITEBACK queue. This was harmless, but potentially confusing.
Spotted through Coverity.
|
| |
Noticed this when executing functional tests for write-back:
(tracker-miner-fs:21288): Tracker-CRITICAL **: Could not execute sparql:
Subject `(null)' is not in domain `nfo:FileDataObject' of property
`nfo:fileName'
This warning happens in item_move() when the source just didn't have
time to be indexed. One example:
copy ("file.txt", "temp_XYZ.file.txt")
- received G_FILE_MONITOR_EVENT_CREATED ("temp.file.txt")
- received G_FILE_MONITOR_EVENT_CHANGED ("temp.file.txt")
- received G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT ("temp.file.txt")
modify ("temp_XYZ.file.txt")
- received G_FILE_MONITOR_EVENT_CHANGED ("temp.file.txt")
- received G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT ("temp.file.txt")
mv ("temp_XYZ.file.txt", "file.txt")
- received G_FILE_MONITOR_EVENT_MOVED ("temp.file.txt", "file.txt")
- emitted ITEM_MOVED ("temp.file.txt", "file.txt")
It was already handled in item_move() in the past, but removed with eef0e7f
(libtracker-miner: Remove useless code) after being previously misidentified
as useless in the scope of ee58e67 (libtracker-miner: Add compat layer for
tracker_miner_fs_directory_*).
The comment from ee58e67 """FIXME: This situation shouldn't happen from
a TrackerFileNotifier event""" simply cannot be satisfied: there is no way
to get "temp.file.txt" indexed before ITEM_MOVED is processed, as the file
disappears too fast.
https://bugzilla.gnome.org/show_bug.cgi?id=678986
|
| |
Fixes warnings when moving indexing roots around. This query expects this
property to be bound, resulting in a no-op if that's not the case (e.g.
indexing roots). Later reinsertions of nie:url and other properties with
max cardinality=1 then make the whole update fail, because those weren't
properly removed.
|
| |
The query to update parent-dependent data was using the source file
urn as a graph urn.
|
| |
If we receive a tracker_miner_fs_check_file() request for a file out of
indexing trees, it'd usually end up recursing until it ran out of parents
(that is, up to file:///). This is quite pointless, if only one file was
requested to be indexed.
|
| |
If should_wait() returns TRUE for an element, we end up putting the file in
the queue again and incrementing its reentry counter. This situation should
be deemed normal, so we can just peek the element, and only pop it if we
should not wait.
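
The peek-then-pop approach might look like the sketch below. It is a
Python illustration under stated assumptions, not the TrackerMinerFS
code: next_ready() and the busy set are hypothetical, and should_wait is
passed in as a plain callable.

```python
from collections import deque


def next_ready(queue, should_wait):
    """Return the head item if it can be processed now, else None."""
    if not queue:
        return None
    head = queue[0]  # peek: the queue is left untouched
    if should_wait(head):
        return None  # no pop, no reinsertion, no reentry counter bump
    return queue.popleft()


queue = deque(["dir/child.txt", "other.txt"])
busy = {"dir/child.txt"}  # hypothetical: items that still have to wait

first = next_ready(queue, lambda item: item in busy)
length_after_first = len(queue)
busy.clear()
second = next_ready(queue, lambda item: item in busy)
print(first, length_after_first, second)  # None 2 dir/child.txt
```

Since a waiting item never leaves the queue, its position is preserved
and waiting no longer counts as a reentry.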
|
| |
REENTRY_MAX is 2 currently, so counter<2 is effectively 1, which
doesn't make for much of a "reentry".
|
| |
If a folder being deleted affects operations in the currently issued
tasks (e.g. those we emitted ::process-file on) and writeback buffers,
those operations would still attempt to proceed, with different degrees
of success.
|
| |
Moreover, an infinite loop may occur if the process-file signal always fails.
|
| |
Directories must get all children invalidated, because already queued tasks
might contain new instances of those same files, in which case they would
still find the previous URN.
|
| |
This will be useful for delete operations.
|
| |
These leaks had a huge impact, as each TrackerTask held a reference
to a GFile, which prevented it from being removed from TrackerFileSystem
when calling tracker_file_system_forget_files(). Due to this behavior,
adding/removing/re-adding folders resulted in some folders/files not
being indexed.
|
| |
The check for these errors was done specifically so we could still
insert (even if incomplete) data on tracker-extract failures, when
we used to communicate with it directly from tracker-miner-fs.
Nowadays, tracker-extract is a TrackerDecorator, and tracker-miner-fs
should most likely receive only errors here on ENOENT and other errors
that affect the file and its info as a whole. In these situations
we end up with a task with a completely empty sparql string, which
doesn't help much here.
|