| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
| |
This register sets the output width/height of the CRTC, which will be
read in by the PV. This gets us to the point of mode sets to a new
resolution working sometimes, though occasionally I end up with just a
black screen.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
| |
This should be the fastest screen update path we have, but it doesn't
seem to have really improved performance with vblank_mode=0
fullscreen.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
They proved valuable on i915, and I expect them to do so on vc4 as
well.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
| |
In order to support the pageflip ioctl, we have to be able to defer
the body of the mode set to a workqueue, so that the pageflip ioctl
can return immediately. Pageflipping gets us ~3x performance on
fullscreen GL rendering in X.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
| |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
| |
It's unlikely that we'd hit another vblank during
drm_crtc_handle_vblank() time, but we might as well handle it if we
do.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
|
|
|
|
| |
This is just drm_atomic_helper_commit(), with wait_for_fence replaced
with V3D seqno waits based on msm's atomic commit's wait loop.
This means we now synchronize to our own rendering. The old
wait_for_fence loop did nothing since we never attached a dmabuf fence
to our plane state.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
All of our current interfaces want to be interruptible, but
modesetting doesn't want that.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
| |
We want to have support for this from the beginning.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
|
|
|
|
|
|
| |
This fixes WARN_ONs and bad plane configuration when the X cursor hit
the top/left of the screen.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
|
| |
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes a security hole of letting userspace map and modify the
buffers, keeps each userspace client from needing to hang on to a set
of them, reduces kernel validation overhead, and improves stability
(possibly due to reducing the severity of an addressing issue in the
hardware's tile state buffer).
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Validation was expensive, and there aren't enough variations of RCLs
to warrant it. This also may let us cache generated RCLs, to reduce
overhead even further.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| | |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| | |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
My surprise about having to set INTENA for OUTOMEM when I hadn't
cleared anything in it now makes sense, having found that there is
just one set of underlying enable bits behind the INTENA/INTDIS
registers.
Signed-off-by: Eric Anholt <eric@anholt.net>
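The INTENA/INTDIS behaviour described above can be modelled in a few lines of C. This is only a sketch of the idea that both registers are views onto one set of enable bits. The struct, the function names, and the bit positions are hypothetical and are not taken from the driver or the hardware documentation.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for illustration. */
#define FAKE_INT_FRDONE  (1u << 0)
#define FAKE_INT_OUTOMEM (1u << 2)

/* One underlying set of enable bits, shared by both "registers". */
struct fake_irq_regs {
	uint32_t enabled;
};

/* Writing 1s to the enable register sets the shared bits. */
static void write_intena(struct fake_irq_regs *r, uint32_t bits)
{
	r->enabled |= bits;
}

/* Writing 1s to the disable register clears the same shared bits. */
static void write_intdis(struct fake_irq_regs *r, uint32_t bits)
{
	r->enabled &= ~bits;
}

/* Reading either register reports the same shared state. */
static uint32_t read_enabled(const struct fake_irq_regs *r)
{
	return r->enabled;
}
```

Under this model, a bit masked through the disable register stays masked until it is written through the enable register again, which would explain needing to set INTENA for OUTOMEM without ever having cleared anything in INTENA itself.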
|
| |
| |
| |
| |
| |
| | |
The 256k was left over from early bringup.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We get validation that direct texturing doesn't address off the start
of the array, some more symbolic #defines, and a fix for the dithering
bits.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
This helps make sure that we don't have any state left around (like
the binner overflow allocation).
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is policy that we've often left up to userland in various DRM
drivers. However, given how thoroughly memory-constrained we are on
VC4, letting userspace run away with things and over-inflate their
caches is definitely not helping.
This hasn't reduced maximum throughput in my testing, since we're only
waiting until the exec just before the one we queued is finished, so
there will be a job to submit the moment the GPU sends us the frame
done interrupt. We would only actually idle the GPU if assembling
some scenes takes more CPU time than the time it takes to complete
some scenes on the GPU.
Signed-off-by: Eric Anholt <eric@anholt.net>
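The throttling rule above (block only until the exec just before the one we queued has finished) can be sketched as a seqno comparison. The helper names are hypothetical, and the sketch assumes seqnos increase by one per submitted job, which the message does not state.

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * Given the seqno of the job just queued, return the seqno we throttle
 * on: the job immediately before it. Assumes seqnos increase by one
 * per job (an assumption made for this sketch).
 */
static uint64_t throttle_target(uint64_t queued_seqno)
{
	return queued_seqno > 0 ? queued_seqno - 1 : 0;
}

/*
 * The submitter has to block only while the previous job is still
 * outstanding, so one job is always ready when the GPU goes idle.
 */
static bool must_block(uint64_t finished_seqno, uint64_t queued_seqno)
{
	return finished_seqno < throttle_target(queued_seqno);
}
```

With job 10 queued, the submitter blocks while the GPU has finished only up to job 8, and is released once job 9 completes, leaving job 10 ready for the frame done interrupt.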
|
| |
| |
| |
| |
| |
| |
| | |
I was returning the seqno before the increment, so userspace's waits
on job completion could be too early.
Signed-off-by: Eric Anholt <eric@anholt.net>
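A minimal sketch of the off-by-one described above. The struct and function names are hypothetical. Only the pattern (returning the counter before versus after the increment) reflects the message.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical device state: the last seqno handed out to a job. */
struct fake_dev {
	uint64_t emit_seqno;
};

/* Buggy: reports the seqno of the *previous* job to userspace. */
static uint64_t queue_job_buggy(struct fake_dev *d)
{
	return d->emit_seqno++;
}

/* Fixed: reports the seqno actually assigned to this job. */
static uint64_t queue_job_fixed(struct fake_dev *d)
{
	return ++d->emit_seqno;
}

/* A wait completes once the finished seqno reaches the waited one. */
static bool seqno_passed(uint64_t finished, uint64_t waited)
{
	return finished >= waited;
}
```

If jobs up to seqno 5 have finished and a new job is queued, the buggy version returns 5, so a wait on it completes immediately even though the new job (seqno 6) has not run.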
|
| |
| |
| |
| |
| |
| |
| | |
This avoids generating a uidiv, and saves us another ~.6% in
check_tex_size as a result of loadstore_tile_buffer_general.
Signed-off-by: Eric Anholt <eric@anholt.net>
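For readers unfamiliar with the uidiv cost: on ARM cores without a hardware divider, dividing by a value the compiler cannot see as constant becomes a library call. One generic way to avoid it, when the divisor is known to be a power of two, is to carry its log2 and shift instead. The sketch below illustrates that general technique and is not necessarily what this change did. The function names are hypothetical.

```c
#include <stdint.h>

/* Round-up division by a runtime divisor: may compile to a uidiv call. */
static uint32_t tiles_div(uint32_t size, uint32_t tile_size)
{
	return (size + tile_size - 1) / tile_size;
}

/*
 * Same result when tile_size == 1 << tile_shift, using only an add
 * and a shift. Assumes tile_shift < 32 and no overflow in the add.
 */
static uint32_t tiles_shift(uint32_t size, uint32_t tile_shift)
{
	return (size + (1u << tile_shift) - 1) >> tile_shift;
}
```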
|
| |
| |
| |
| |
| |
| |
| | |
This avoids generating a uidiv, and saves us about a percent in
check_tex_size as a result of loadstore_tile_buffer_general.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If we're going to insist that the relocation target is the tile alloc
BO that we saw during binner config, then just ignore the relocations
and point at the tile alloc BO. Saves an immediate .6% or so of CPU,
and will let us ditch the relocations we've been processing for it,
too.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We typically get bin_tiles_x * bin_tiles_y of these, so it makes sense
to do a tiny bit of up front work to make this easier (particularly,
avoid a divide, since those are terrible). Saves about half a percent
of the CPU.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
We use ~0 (584 years) to indicate "wait forever", so don't actually
set up a timer for 584 years from now. Saves about 1% CPU.
Signed-off-by: Eric Anholt <eric@anholt.net>
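The 584-year figure follows from reading ~0 as a 64-bit count of nanoseconds. The sketch below checks the arithmetic and shows the shape of the fast path. It assumes the timeout is in nanoseconds, and the macro and function names are hypothetical rather than the driver's.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sentinel meaning "wait forever" (hypothetical name). */
#define WAIT_FOREVER_NS (~0ull)

/* Nanoseconds in a 365.25-day year: 31,557,600 seconds * 1e9. */
#define NS_PER_YEAR (1000000000ull * 60 * 60 * 24 * 36525 / 100)

/* Whole years represented by a nanosecond timeout. */
static uint64_t timeout_years(uint64_t timeout_ns)
{
	return timeout_ns / NS_PER_YEAR;
}

/* Only arm a timer when the caller asked for a finite wait. */
static bool needs_timer(uint64_t timeout_ns)
{
	return timeout_ns != WAIT_FOREVER_NS;
}
```

Here timeout_years(WAIT_FOREVER_NS) evaluates to 584, and needs_timer() lets the wait path skip timer setup entirely for that sentinel.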
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
All of our BOs are mapped write-combining both in the kernel and
userspace by drm_gem_cma_helper.c, so anything we've written will end
up in main memory, anyway. The only coherency trouble we might have
would be if we had content in the CPU's write combining buffers that
hadn't been flushed, but a CPU cache flush is unlikely to help with
that, anyway.
Improves no-swapbuffers drawarrays-mode isosurf performance by around 20%.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
Improves no-swapbuffers drawarrays-mode isosurf performance by
0.219463% +/- 0.166978% (n=4/5)
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| | |
Fixes another deadlock error from lockdep.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We often return -EPROBE_DEFER from the various components, and we
would have left a bunch of GEM stuff running and allocated. Cleans up
a boot-time warning about how we leaked the 256k of overflow_mem.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| | |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| | |
This gets us a bit farther when running low on CMA memory.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
The overflow handler notably walks through our BO cache structs, which
are protected by this lock.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
The hard IRQ handler also takes this lock, so we need interrupts
disabled or we'll deadlock.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
vc4_gem_init() initializes the BO cache, but v3d probe was setting up
the interrupt handler, and potentially taking an interrupt and trying
to allocate an overflow BO out of the cache before then.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
If there were stray IRQs, the core kernel would never have had a chance
to shut it down.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| | |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This is useful for doing blit operations using the render engine,
which can perform reads from raster textures into a tiled output, when
the texture fetch engine has proved incapable of fetching correctly.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| | |
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
This is more like how the simulator writes the alignment, and is
consistent with the other tiling formats.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
executed."
This reverts commit cd95e2a4feb4cf7a5e47da3e320f9fc3336899e4.
This appears to be broken on 3.18 on rpi2, and I haven't figured out
why yet. It causes a system lockup with sometimes an angry RCU
complaint beforehand.
|
| |
| |
| |
| |
| |
| |
| | |
I've been relying on the dumb APIs until now, but that's against the
rules.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
danvet's talk recommended avoiding relative waits entirely, but his
example was of relative waits with units (vblank numbers) that were
potentially longer than the interval between signals coming in, so you
couldn't adjust the ioctl arguments appropriately.
This change also has a wait unit that might be too large (jiffies),
but given that the other argument (an emitted seqno) is guaranteed to
eventually result in a nonblocking successful return, this isn't
really a big deal.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This was one of danvet's recommendations from his ioctls talk, and it
makes sense. It also happens to occupy a 32-bit padding slot in the
struct.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| | |
This is a flag day for the kernel ABI, but it's a requirement for
merging the code.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If the user could rewrite shader code while the GPU was executing it,
then all of vc4_validate_shaders.c's safety checks would be
invalidated, notably by using the direct-addressing TMU fetches to
read arbitrary system memory.
This hopefully closes our last remaining root hole.
Signed-off-by: Eric Anholt <eric@anholt.net>
|
| |
| |
| |
| |
| |
| | |
This is a step in eliminating the rewrite-the-shaders root hole.
Signed-off-by: Eric Anholt <eric@anholt.net>
|