author | Gabriel Mazetto <brodock@gmail.com> | 2019-05-20 22:18:59 +0200 |
---|---|---|
committer | Gabriel Mazetto <brodock@gmail.com> | 2019-05-29 16:25:15 +0200 |
commit | bed5ae425bc4695d20cd4ad1980266a575472ab1 (patch) | |
tree | a4b4d12e17d7bf788abc5e6d5f440564410657f2 | |
parent | 785e55c23ee2480ef90510dcf63b5a7a2c6fe9ce (diff) | |
download | gitlab-ce-bed5ae425bc4695d20cd4ad1980266a575472ab1.tar.gz |
Added FDW development documentation (ee-10587-geo-development-fdw-docs)

Improve existing FDW documentation in `doc/development` to include
additional information about the architecture and how to use FDW as
a developer.
-rw-r--r-- | doc/development/geo.md | 78 |
1 files changed, 72 insertions, 6 deletions
diff --git a/doc/development/geo.md b/doc/development/geo.md
index 87ec34ec5c4..6e59fab34c7 100644
--- a/doc/development/geo.md
+++ b/doc/development/geo.md
@@ -10,6 +10,7 @@ the diagram below and are described in more detail within this document.
 ## Replication layer
 
 Geo handles replication for different components:
+
 - [Database](#database-replication): includes the entire application, except cache and jobs.
 - [Git repositories](#repository-replication): includes both projects and wikis.
 - [Uploaded blobs](#uploads-replication): includes anything from images attached on issues
@@ -209,20 +210,38 @@ bundle exec rake geo:db:migrate
 
 ### Foreign Data Wrapper
 
-The use of [FDW](#fdw) was introduced in GitLab 10.1.
+> Introduced in GitLab 10.1.
+
+Foreign Data Wrapper ([FDW](#fdw)) is used by the [Geo Log Cursor](#geo-log-cursor) and improves
+the performance of many synchronization operations.
 
-This is useful for the [Geo Log Cursor](#geo-log-cursor) and improves
-the performance of some synchronization operations.
+FDW is a PostgreSQL extension ([`postgres_fdw`](https://www.postgresql.org/docs/current/postgres-fdw.html)) that is enabled within
+the Geo Tracking Database (on a **secondary** node), which allows it
+to connect to the readonly database replica and perform queries and filter
+data from both instances.
 
 While FDW is available in older versions of PostgreSQL, we needed to
 raise the minimum required version to 9.6 as this includes many
 performance improvements to the FDW implementation.
 
+This persistent connection is configured as an FDW server
+named `gitlab_secondary`. This configuration exists within the database's user
+context only. To access the `gitlab_secondary`, GitLab needs to use the
+same database user that had previously been configured.
+
+The Geo Tracking Database accesses the readonly database replica via FDW as a regular user,
+limited by its own restrictions. The credentials are configured as a
+`USER MAPPING` associated with the `SERVER` mapped previously
+(`gitlab_secondary`).
+
+FDW configuration and credentials definition are managed automatically by the
+Omnibus GitLab `gitlab-ctl reconfigure` command.
+
 #### Refeshing the Foreign Tables
 
-Whenever the database schema changes on the **primary** node, the
-**secondary** node will need to refresh its foreign tables by running
-the following:
+Whenever a new Geo node is configured or the database schema changes on the
+**primary** node, you must refresh the foreign tables on the **secondary** node
+by running the following:
 
 ```sh
 bundle exec rake geo:db:refresh_foreign_tables
@@ -243,6 +262,53 @@ STATEMENT: SELECT a.attname, format_type(a.atttypid, a.atttypmod)
 ORDER BY a.attnum
 ```
 
+#### Accessing data from a Foreign Table
+
+At the SQL level, all you have to do is `SELECT` data from `gitlab_secondary.*`.
+
+Here's an example of how to access all projects from the Geo Tracking Database's FDW:
+
+```sql
+SELECT * FROM gitlab_secondary.projects;
+```
+
+As a more real-world example, this is how you filter for unarchived projects
+on the Tracking Database:
+
+```sql
+SELECT project_registry.*
+ FROM project_registry
+ JOIN gitlab_secondary.projects
+ ON (project_registry.project_id = gitlab_secondary.projects.id
+ AND gitlab_secondary.projects.archived IS FALSE)
+```
+
+At the ActiveRecord level, we have additional Models that represent the
+foreign tables. They must be mapped in a slightly different way, and they are read-only.
+
+Check the existing FDW models in `ee/app/models/geo/fdw` for reference.
+
+From a developer's perspective, it's no different than creating a model that
+represents a Database View.
+
+With the examples above, you can access the projects with:
+
+```ruby
+Geo::Fdw::Project.all
+```
+
+and to access the `ProjectRegistry` filtering by unarchived projects:
+
+```ruby
+# We have to use Arel here:
+project_registry_table = Geo::ProjectRegistry.arel_table
+fdw_project_table = Geo::Fdw::Project.arel_table
+
+project_registry_table.join(fdw_project_table)
+ .on(project_registry_table[:project_id].eq(fdw_project_table[:id]))
+ .where((fdw_project_table[:archived]).eq(true)) # if you append `.to_sql` you can check generated query
+```
+
 ## Finders
 
 Geo uses [Finders](https://gitlab.com/gitlab-org/gitlab-ee/tree/master/app/finders),