summaryrefslogtreecommitdiff
path: root/doc/TODO
blob: d19669b50e054e82f2c24554917ed7372be801df (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
TODO list for PostgreSQL
========================
Last updated:		Wed Apr  7 15:14:17 EDT 2004

Current maintainer:	Bruce Momjian (pgman@candle.pha.pa.us)

The most recent version of this document can be viewed at
the PostgreSQL web site, http://www.PostgreSQL.org.

A dash (-) marks changes that will appear in the upcoming 7.5 release.

Bracketed items "[]" have more detailed.


Urgent
======

* Add replication of distributed databases
	o Automatic failover
	o Load balancing
	o Master/slave replication
	o Multi-master replication
	o Partition data across servers
	o Queries across databases or servers (two-phase commit)
	o Allow replication over unreliable or non-persistent links
* Point-in-time data recovery using backup and write-ahead log,
  http://momjian.postgresql.org/main/writings/pgsql/project/pitr.html
* Create native Win32 port, http://momjian.postgresql.org/main/writings/pgsql/project/win32.html


Administration
==============

* Incremental backups
* Remove unreferenced table files and temp tables during database vacuum
  or postmaster startup (Bruce)
* Remove behavior of postmaster -o after making postmaster/postgres
  flags unique
* Allow configuration files to be specified in a different directory
* Allow limits on per-db/user connections
* Add group object ownership, so groups can rename/drop/grant on objects,
  so we can implement roles
* Add the concept of dataspaces/tablespaces [tablespaces]
* -Allow logging of only data definition(DDL), or DDL and modification statements
* -Allow log lines to include session-level information, like database and user
* Allow server log information to be output as INSERT statements
* Prevent default re-use of sysids for dropped users and groups
* Prevent dropping user that still owns objects, or auto-drop the objects
* Allow pooled connections to query prepared queries
* Allow pooled connections to close all open WITH HOLD cursors
* Allow major upgrades without dump/reload, perhaps using pg_upgrade
* Have SHOW ALL and pg_settings show descriptions for server-side variables(Joe)
* Allow external interfaces to extend the GUC variable set


Data Types
==========

* Remove Money type, add money formatting for decimal type
* -Change factorial to return a numeric (Gavin)
* Change NUMERIC to enforce the maximum precision, and increase it
* Add function to return compressed length of TOAST data values (Tom)
* Allow INET subnet tests using non-constants to be indexed
* Add transaction_timestamp(), statement_timestamp(), clock_timestamp() functionality
* Have sequence dependency track use of DEFAULT sequences, seqname.nextval
* Disallow changing default expression of a SERIAL column
* Allow infinite dates just like infinite timestamps
* Allow pg_dump to dump sequences using NO_MAXVALUE and NO_MINVALUE
* Allow backend to output result sets in XML
* Prevent whole-row references from leaking memory, e.g. SELECT COUNT(tab.*)
* Have initdb set DateStyle based on locale?
* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
* Add ALTER DOMAIN, AGGREGATE, CONVERSION, SEQUENCE ... OWNER TO
* Allow to_char to print localized month names (Karel)
* Allow functions to have a search path specified at creation time
* Make LENGTH() of CHAR() not count trailing spaces
* Allow substring/replace() to get/set bit values
* Add GUC variable to allow output of interval values in ISO8601 format


* ARRAYS
	o Allow nulls in arrays
	o Allow MIN()/MAX() on arrays
	o Delay resolution of array expression type so assignment coercion 
	  can be performed on empty array expressions (Joe)
	o Modify array literal representation to handle array index lower bound
	  of other than one


* BINARY DATA
	o Improve vacuum of large objects, like /contrib/vacuumlo
	o Add security checking for large objects
	o Make file in/out interface for TOAST columns, similar to large object
	  interface (force out-of-line storage and no compression)
	o Auto-delete large objects when referencing row is deleted


Multi-Language Support
======================

* Add NCHAR (as distinguished from ordinary varchar),
* Allow locale to be set at database creation
* Allow locale on a per-column basis, default to ASCII
* Optimize locale to have minimal performance impact when not used (Peter E)
* Support multiple simultaneous character sets, per SQL92
* Improve Unicode combined character handling
* Add octet_length_server() and octet_length_client() (Thomas, Tatsuo)
* Make octet_length_client the same as octet_length() (?)
* Prevent mismatch of frontend/backend encodings from converting bytea
  data from being interpreted as encoded strings
* Fix upper()/lower() to work for multibyte encodings


Views / Rules
=============

* Automatically create rules on views so they are updateable, per SQL92 [view]
* Add the functionality for WITH CHECK OPTION clause of CREATE VIEW
* Allow NOTIFY in rules involving conditionals
* Have views on temporary tables exist in the temporary namespace
* Allow temporary views on non-temporary tables
* Allow RULE recompilation


Indexes
=======

* -Order duplicate index entries on creation by tid for faster heap lookups
* Allow inherited tables to inherit index, UNIQUE constraint, and primary
  key, foreign key  [inheritance]
* UNIQUE INDEX on base column not honored on inserts from inherited table
  INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail
  [inheritance]
* Add UNIQUE capability to non-btree indexes
* Add rtree index support for line, lseg, path, point
  columns and indexes with many duplicate keys
* Use indexes for min() and max() or convert to SELECT col FROM tab ORDER
  BY col DESC LIMIT 1 if appropriate index exists and WHERE clause acceptible
* Use index to restrict rows returned by multi-key index when used with
  non-consecutive keys or OR clauses, so fewer heap accesses
* Be smarter about insertion of already-ordered data into btree index
* Prevent index uniqueness checks when UPDATE does not modify the column
* Use bitmaps to fetch heap pages in sequential order [performance]
* Use bitmaps to combine existing indexes [performance]
* Allow use of indexes to search for NULLs
* Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8,
  float4, numeric/decimal too
* Add FILLFACTOR to btree index creation
* Add concurrency to GIST
* Allow a single index to index multiple tables (for inheritance and subtables)


Commands
========

* Add BETWEEN ASYMMETRIC/SYMMETRIC (Christopher)
* Change LIMIT/OFFSET to use int8
* CREATE TABLE AS can not determine column lengths from expressions [atttypmod]
* Allow UPDATE to handle complex aggregates [update]
* Allow command blocks to ignore certain types of errors
* Allow backslash handling in quoted strings to be disabled for portability
* Allow UPDATE, DELETE to handle table aliases for self-joins [delete]
* Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
* Allow REINDEX to rebuild all indexes, remove /contrib/reindex
* Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
* Add schema option to createlang
* Allow savepoints / nested transactions [transactions] (Bruce)
* Use nested transactions to prevent syntax errors from aborting a transaction
* Allow UPDATE tab SET ROW (col, ...) = (...) for updating multiple columns
* Allow SET CONSTRAINTS to be qualified by schema/table
* Prevent COMMENT ON DATABASE from using a database name
* Add NO WAIT option to various SQL commands
* Allow TRUNCATE ... CASCADE/RESTRICT
* Allow PREPARE of cursors
* Allow LISTEN/NOTIFY to store info in memory rather than tables
* -COMMENT ON [ CAST | CONVERSION | OPERATOR CLASS | LARGE OBJECT | LANGUAGE ] 
  (Christopher) 
* Dump large object comments in custom dump format
* Add optional textual message to NOTIFY
* Allow more ISOLATION LEVELS to be accepted, but issue a warning for them
* Allow CREATE TABLE foo (f1 INT CHECK (f1 > 0) CHECK (f1 < 10)) to work
  by searching for non-conflicting constraint names, and prefix with table name
* Use more reliable method for CREATE DATABASE to get a consistent copy of db
* -Have psql \dn show only visible temp schemas using current_schemas()
* -Have psql '\i ~/<tab><tab>' actually load files it displays from home dir
* Ignore temporary tables from other session when processing inheritance
* Add GUC setting to make created tables default to WITHOUT OIDS
* Have pg_ctl look at PGHOST in case it is a socket directory
* Allow column-level privileges
* Add a session mode to warn about non-standard SQL usage

* ALTER
	o ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT
	o ALTER TABLE ADD COLUMN column DEFAULT should fill existing
	  rows with DEFAULT value
	o ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence because
          of the item above
	o Have ALTER TABLE rename SERIAL sequences
	o Allow columns to be reordered using ALTER ... POSITION i col1 [,col2];
	  have SELECT * and INSERT honor such ordering
	o Allow ALTER TABLE to modify column lengths and change to binary
	  compatible types
	o Add ALTER DATABASE ... OWNER TO newowner
	o Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME
	o Allow ALTER TABLE to change constraint deferrability and actions
	o Disallow dropping of an inherited constraint
	o Allow the schema of objects to be changed

* CLUSTER
	o Automatically maintain clustering on a table
	o Add way to remove cluster specification on a table

* COPY
	o Allow dump/load of CSV format
	o Allow COPY to report error lines and continue;  optionally
	  allow error codes to be specified; requires savepoints or can
	  not be run in a multi-statement transaction
	o Allow COPY to understand \x as hex
	o Have COPY return number of rows loaded/unloaded

* CURSOR
	o Allow UPDATE/DELETE WHERE CURRENT OF cursor using per-cursor tid
	  stored in the backend (Gavin)
	o Prevent DROP of table being referenced by our own open cursor

* INSERT
	o Allow INSERT/UPDATE of system-generated oid value for a row
	o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..)
	o Allow INSERT/UPDATE ... RETURNING new.col or old.col; handle
	  RULE cases (Philip)

* SHOW/SET
	o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
	  ANALYZE, and CLUSTER
	o Add SET PATH for schemas

* SERVER-SIDE LANGUAGES
	o Allow PL/PgSQL's RAISE function to take expressions
	o Change PL/PgSQL to use palloc() instead of malloc()
	o Allow Java server-side programming [java]
	o Fix problems with complex temporary table creation/destruction
	  without using PL/PgSQL EXECUTE, needs cache prevention/invalidation
        o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW
	o Improve PL/PgSQL exception handling
	o Allow parameters to be specified by name and type during definition
	o Allow function parameters to be passed by name,
	  get_employee_salary(emp_id => 12345, tax_year => 2001)
	o Add PL/PgSQL packages
	o Add table function support to pltcl, plperl, plpython
	o Add PL/PHP (Joe)
	o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
	o Allow PL/pgSQL EXECUTE query_var INTO record_var;
	o Add capability to create and call PROCEDURES
	o Allow PL/pgSQL to handle %TYPE arrays, e.g. tab.col%TYPE[]


Clients
=======

* Add XML capability to pg_dump and COPY, when backend XML capability
* -Allow psql \du to show groups, and add \dg for groups
* Allow clients to query a list of WITH HOLD cursors and prepared statements
* Add a libpq function to support Parse/DescribeStatement capability
* Prevent libpq's PQfnumber() from lowercasing the column name
* -Allow pg_dump to dump CREATE CONVERSION (Christopher)
* Allow fastpast to pass values in portable format
* Allow libpq to return information about prepared queries
* Make pg_restore continue after errors, so it acts more like pg_dump scripts
* Have psql show more information about sequences
* Allow pg_dumpall to use non-text output formats
* Have pg_dump use multi-statement transactions for INSERT dumps
* Move psql backslash database information into the backend, use nmumonic
  commands? [psql]


* JDBC
	o Comprehensive test suite. This may be available already.
	o JDBC-standard BLOB support
	o Error Codes (pending backend implementation)
	o Support both 'make' and 'ant'
	o Fix LargeObject API to handle OIDs as unsigned ints
	o Use cursors implicitly to avoid large results (see setCursorName())
	o Add LISTEN/NOTIFY support to the JDBC driver (Barry)

* ECPG
	o Docs
	o Implement set descriptor, using descriptor
	o Solve cardinality > 1 for input descriptors / variables
	o Improve error handling
	o Add a semantic check level, e.g. check if a table really exists
	o fix handling of DB attributes that are arrays
	o Use backend prepare/execute facility for ecpg where possible
	o Implement SQLDA
	o Fix nested C comments
	o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified
	o Make SET CONNECTION thread-aware, non-standard?
	o Allow multidimensional arrays

* Python
	o Allow users to register their own types with pg_
	o Allow SELECT to return a dictionary of dictionaries
	o Allow COPY BINARY FROM


Referential Integrity
=====================

* Add MATCH PARTIAL referential integrity
* Add deferred trigger queue file (Jan)
* Implement dirty reads or shared row locks and use them in RI triggers
* Enforce referential integrity for system tables
* Change foreign key constraint for array -> element to mean element
  in array
* Allow DEFERRABLE UNIQUE constraints
* Allow triggers to be disabled [trigger]
* With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY
* Allow statement-level triggers to access modified rows
* Support triggers on columns (Neil)
* Have AFTER triggers execute after the appropriate SQL statement in a 
  function, not at the end of the function
* Print table names with constraint names in error messages, or make constraint
  names unique within a schema
* -Issue NOTICE if foreign key data requires costly test to match primary key
* Remove CREATE CONSTRAINT TRIGGER


Dependency Checking
===================

* Flush cached query plans when their underlying catalog data changes
* -Use dependency information to dump data in proper order
* Have pg_dump -c clear the database using dependency information


Exotic Features
===============

* Add SQL99 WITH clause to SELECT (Tom, Fernando)
* Add SQL99 WITH RECURSIVE to SELECT (Tom, Fernando)
* Add pre-parsing phase that converts non-ANSI features to supported features
* Allow plug-in modules to emulate features from other databases
* SQL*Net listener that makes PostgreSQL appear as an Oracle database
  to clients
* Add two-phase commit to all distributed transactions with 
  offline/readonly server status or administrator notification for failure
* Allow cross-db queries with transaction semantics


PERFORMANCE
===========


Fsync
=====

* Delay fsync() when other backends are about to commit too
	o Determine optimal commit_delay value
* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
	o Allow multiple blocks to be written to WAL with one write()


Cache
=====
* Shared catalog cache, reduce lseek()'s by caching table size in shared area
* Add free-behind capability for large sequential scans [fadvise]
* Consider use of open/fcntl(O_DIRECT) to minimize OS caching
* Cache last known per-tuple offsets to speed long tuple access
* Automatically place fixed-width, NOT NULL columns first in a table
* Use a fixed row count and a +/- count with MVCC visibility rules 
  to allow fast COUNT(*) queries with no WHERE clause(?) [count]


Vacuum
======

* Improve speed with indexes (perhaps recreate index instead)
* Reduce lock time by moving tuples with read lock, then write
  lock and truncate table
* Provide automatic running of vacuum in the background in backend
  rather than in /contrib
* Allow free space map to be auto-sized or warn when it is too small
* Maintain a map of recently-expired of pages so vacuum can reclaim
  free space without a sequential scan
* Have VACUUM FULL use REINDEX rather than index vacuum


Locking
=======

* Make locking of shared data structures more fine-grained
* Add code to detect an SMP machine and handle spinlocks accordingly
  from distributted.net, http://www1.distributed.net/source, 
  in client/common/cpucheck.cpp
* Research use of sched_yield() for spinlock acquisition failure


Startup Time
============

* Experiment with multi-threaded backend [thread]
* Add connection pooling [pool]
* Allow persistent backends [pool]
* Create a transaction processor to aid in persistent connections and
  connection pooling [pool]
* Do listen() in postmaster and accept() in pre-forked backend
* Have pre-forked backend pre-connect to last requested database or pass
  file descriptor to backend pre-forked for matching database


Write-Ahead Log
===============

* Have after-change WAL write()'s write only modified data to kernel
* Reduce number of after-change WAL writes; they exist only to gaurd against
  partial page writes [wal]
* Turn off after-change writes if fsync is disabled (?)
* Add WAL index reliability improvement to non-btree indexes
* Find proper defaults for postgresql.conf WAL entries
* Allow xlog directory location to be specified during initdb, perhaps
  using symlinks
* Allow WAL information to recover corrupted pg_controldata
* Find a way to reduce rotational delay when repeatedly writing
  last WAL page


Optimizer / Executor
====================

* Missing optimizer selectivities for date, r-tree, etc
* Allow ORDER BY ... LIMIT to select top values without sort or index
  using a sequential scan for highest/lowest values (Oleg)
* Precompile SQL functions to avoid overhead (Neil)
* Add utility to compute accurate random_page_cost value
* Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
* Use CHECK constraints to improve optimizer decisions
* Check GUC geqo_threshold to see if it is still accurate
* Allow sorting, temp files, temp tables to use multiple work directories
* Improve the planner to use CHECK constraints to prune the plan (for subtables)
* Have EXPLAIN ANALYZE highlight poor optimizer estimates


Miscellaneous
=============

* Do async I/O for faster random read-ahead of data
* Use mmap() rather than SYSV shared memory or to write WAL files (?) [mmap]
* Improve caching of attribute offsets when NULLs exist in the row
* Add a script to ask system configuration questions and tune postgresql.conf
* Allow partitioning of table into multiple subtables
* -Use background process to write dirty shared buffers to disk


Source Code
===========

* Add use of 'const' for variables in source tree
* Rename some /contrib modules from pg* to pg_*
* Move some things from /contrib into main tree
* Remove warnings created by -Wcast-align
* Move platform-specific ps status display info from ps_status.c to ports
* Improve access-permissions check on data directory in Cygwin (Tom)
* Add documentation for perl, including mention of DBI/DBD perl location
* Create improved PostgreSQL introductory documentation for the PHP
  manuals
* Add optional CRC checksum to heap and index pages
* Change representation of whole-tuple parameters to functions
* Clarify use of 'application' and 'command' tags in SGML docs
* Better document ability to build only certain interfaces (Marc)
* Remove or relicense modules that are not under the BSD license, if possible
* Remove memory/file descriptor freeing before ereport(ERROR)  (Bruce)
* Acquire lock on a relation before building a relcache entry for it
* Research interaction of setitimer() and sleep() used by statement_timeout
* -Add checks for fclose() failure (Tom)
* -Change CVS ID to PostgreSQL
* Exit postmaster if postgresql.conf can not be opened
* Rename /scripts directory because they are all C programs now
* Allow the regression tests to start postmaster with -i so the tests
  can be run on systems that don't support unix-domain sockets
* Allow creation of a libpq-only tarball
* Promote debug_query_string into a server-side function current_query()
* Allow the identifier length to be increased via a configure option
* Improve CREATE SCHEMA regression test


* Wire Protocol Changes
	o Dynamic character set handling
	o Add decoded type, length, precision
	o Compression?
	o Update clients to use data types, typmod, schema.table.column names of
	  result sets using new query protocol

---------------------------------------------------------------------------


Developers who have claimed items are:
--------------------------------------
* Barry is Barry Lind <barry@xythos.com>
* Billy is Billy G. Allie <Bill.Allie@mug.org>
* Bruce is Bruce Momjian <pgman@candle.pha.pa.us> of Software Research Assoc.
* Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au> of
    Family Health Network
* D'Arcy is D'Arcy J.M. Cain <darcy@druid.net> of The Cain Gang Ltd.
* Dave is Dave Cramer <dave@fastcrypt.com>
* Edmund is Edmund Mergl <E.Mergl@bawue.de>
* Fernando is Fernando Nasser <fnasser@redhat.com> of Red Hat
* Gavin is Gavin Sherry <swm@linuxworld.com.au> of Alcove Systems Engineering
* Greg is Greg Sabino Mullane <greg@turnstep.com>
* Hiroshi is Hiroshi Inoue <Inoue@tpf.co.jp>
* Karel is Karel Zak <zakkr@zf.jcu.cz>
* Jan is Jan Wieck <JanWieck@Yahoo.com> of Afilias, Inc.
* Joe is Joe Conway <mail@joeconway.com>
* Liam is Liam Stewart <liams@redhat.com> of Red Hat
* Marc is Marc Fournier <scrappy@hub.org> of PostgreSQL, Inc.
* Mark is Mark Hollomon <mhh@mindspring.com>
* Michael is Michael Meskes <meskes@postgresql.org> of Credativ
* Neil is Neil Conway <neilc@samurai.com>
* Oleg is Oleg Bartunov <oleg@sai.msu.su>
* Peter M is Peter T Mount <peter@retep.org.uk> of Retep Software
* Peter E is Peter Eisentraut <peter_e@gmx.net>
* Philip is Philip Warner <pjw@rhyme.com.au> of Albatross Consulting Pty. Ltd.
* Rod is Rod Taylor <pg@rbt.ca>
* Ross is Ross J. Reedstrom <reedstrm@wallace.ece.rice.edu>
* Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
* Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp> of Software Research Assoc.
* Thomas is Thomas Lockhart <lockhart@fourpalms.org> of Jet Propulsion Labratory
* Tom is Tom Lane <tgl@sss.pgh.pa.us> of Red Hat