path: root/Tools/c-analyzer/cpython
authorEric Snow <ericsnowcurrently@gmail.com>2020-10-22 18:42:51 -0600
committerGitHub <noreply@github.com>2020-10-22 18:42:51 -0600
commit345cd37abe324ad4f60f80e2c3133b8849e54e9b (patch)
tree5d965e662dca9dcac19e7eddd63a3d9d0b816fed /Tools/c-analyzer/cpython
parentec388cfb4ede56dace2bb78851ff6f38fa2a6abe (diff)
downloadcpython-git-345cd37abe324ad4f60f80e2c3133b8849e54e9b.tar.gz
bpo-36876: Fix the C analyzer tool. (GH-22841)
The original tool wasn't working right, and it was simpler to create a new one, partially re-using some of the old code. At this point the tool runs properly on master. (Try: ./python Tools/c-analyzer/c-analyzer.py analyze.) It takes ~40 seconds on my machine to analyze the full CPython code base. Note that we'll need to iron out some OS-specific stuff (e.g. the preprocessor). We're okay though, since this tool isn't used yet in our workflow. We will also need to verify the analysis results in detail before activating the check in CI, though I'm pretty sure it's close. https://bugs.python.org/issue36876
Diffstat (limited to 'Tools/c-analyzer/cpython')
-rw-r--r--Tools/c-analyzer/cpython/README72
-rw-r--r--Tools/c-analyzer/cpython/__init__.py29
-rw-r--r--Tools/c-analyzer/cpython/__main__.py446
-rw-r--r--Tools/c-analyzer/cpython/_analyzer.py348
-rw-r--r--Tools/c-analyzer/cpython/_generate.py326
-rw-r--r--Tools/c-analyzer/cpython/_parser.py308
-rw-r--r--Tools/c-analyzer/cpython/files.py29
-rw-r--r--Tools/c-analyzer/cpython/find.py101
-rw-r--r--Tools/c-analyzer/cpython/ignored.tsv2
-rw-r--r--Tools/c-analyzer/cpython/known.py66
-rw-r--r--Tools/c-analyzer/cpython/known.tsv3
-rw-r--r--Tools/c-analyzer/cpython/supported.py398
12 files changed, 928 insertions, 1200 deletions
diff --git a/Tools/c-analyzer/cpython/README b/Tools/c-analyzer/cpython/README
deleted file mode 100644
index 772b8be270..0000000000
--- a/Tools/c-analyzer/cpython/README
+++ /dev/null
@@ -1,72 +0,0 @@
-#######################################
-# C Globals and CPython Runtime State.
-
-CPython's C code makes extensive use of global variables (whether static
-globals or static locals). Each such variable falls into one of several
-categories:
-
-* strictly const data
-* used exclusively in main or in the REPL
-* process-global state (e.g. managing process-level resources
- like signals and file descriptors)
-* Python "global" runtime state
-* per-interpreter runtime state
-
-The last one can be a problem as soon as anyone creates a second
-interpreter (AKA "subinterpreter") in a process. It is definitely a
-problem under subinterpreters if they are no longer sharing the GIL,
-since the GIL protects us from a lot of race conditions. Keep in mind
-that ultimately *all* objects (PyObject) should be treated as
-per-interpreter state. This includes "static types", freelists,
-_PyIdentifier, and singletons. Take that in for a second. It has
-significant implications on where we use static variables!
-
-Be aware that module-global state (stored in C statics) is a kind of
-per-interpreter state. There have been efforts across many years, and
-still going, to provide extension module authors mechanisms to store
-that state safely (see PEPs 3121, 489, etc.).
-
-(Note that there has been discussion around support for running multiple
-Python runtimes in the same process. That would end up with the same
-problems, relative to static variables, that subinterpreters have.)
-
-Historically we have been bad at keeping per-interpreter state out of
-static variables, mostly because until recently subinterpreters were
-not widely used nor even factored into solutions. However, the
-feature is growing in popularity and use in the community.
-
-Mandate: "Eliminate use of static variables for per-interpreter state."
-
-The "c-statics.py" script in this directory, along with its accompanying
-data files, are part of the effort to resolve existing problems with
-our use of static variables and to prevent future problems.
-
-#-------------------------
-## statics for actually-global state (and runtime state consolidation)
-
-In general, holding any kind of state in static variables
-increases maintenance burden and increases the complexity of code (e.g.
-we use TSS to identify the active thread state). So it is a good idea
-to avoid using statics for state, even for the "global" runtime or
-for process-global state.
-
-Relative to maintenance burden, one problem is where the runtime
-state is spread throughout the codebase in dozens of individual
-globals. Unlike the other globals, the runtime state represents a set
-of values that are constantly shifting in a complex way. When they are
-spread out it's harder to get a clear picture of what the runtime
-involves. Furthermore, when they are spread out it complicates efforts
-that change the runtime.
-
-Consequently, the globals for Python's runtime state have been
-consolidated under a single top-level _PyRuntime global. No new globals
-should be added for runtime state. Instead, they should be added to
-_PyRuntimeState or one of its sub-structs. The tools in this directory
-are run as part of the test suite to ensure that no new globals have
-been added. The script can be run manually as well:
-
- ./python Lib/test/test_c_statics/c-statics.py check
-
-If it reports any globals then they should be resolved. If the globals
-are runtime state then they should be folded into _PyRuntimeState.
-Otherwise they should be marked as ignored.
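The deleted README above describes scanning CPython's C sources for static globals. As a rough illustration of that kind of check (a naive sketch, not the real c-analyzer parser, which does full declaration parsing), a minimal scan for file-level `static` variable declarations might look like:

```python
import re

# Naive pattern for file-level "static" variable declarations:
# "static <type tokens> <name> [= ...];" -- a sketch only; the real
# tool parses declarations properly instead of pattern-matching.
STATIC_VAR = re.compile(
    r'^static\s+[A-Za-z_][\w\s\*]*?([A-Za-z_]\w*)\s*(?:=[^;]*)?;',
    re.MULTILINE)

def find_static_globals(source):
    """Return names of file-level static variables found in C source text."""
    return STATIC_VAR.findall(source)

sample = """
static int counter = 0;
static PyObject *cached_module;
int not_static;
"""
print(find_static_globals(sample))  # -> ['counter', 'cached_module']
```

A real pass over the tree would apply this per file and then classify each hit as const data, process-global state, or per-interpreter state, as the README describes.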
diff --git a/Tools/c-analyzer/cpython/__init__.py b/Tools/c-analyzer/cpython/__init__.py
index ae45b424e3..d0b3eff3c4 100644
--- a/Tools/c-analyzer/cpython/__init__.py
+++ b/Tools/c-analyzer/cpython/__init__.py
@@ -1,29 +1,20 @@
import os.path
-import sys
-TOOL_ROOT = os.path.abspath(
+TOOL_ROOT = os.path.normcase(
+ os.path.abspath(
os.path.dirname( # c-analyzer/
- os.path.dirname(__file__))) # cpython/
-DATA_DIR = TOOL_ROOT
+ os.path.dirname(__file__)))) # cpython/
REPO_ROOT = (
os.path.dirname( # ..
os.path.dirname(TOOL_ROOT))) # Tools/
INCLUDE_DIRS = [os.path.join(REPO_ROOT, name) for name in [
- 'Include',
- ]]
+ 'Include',
+]]
SOURCE_DIRS = [os.path.join(REPO_ROOT, name) for name in [
- 'Python',
- 'Parser',
- 'Objects',
- 'Modules',
- ]]
-
-#PYTHON = os.path.join(REPO_ROOT, 'python')
-PYTHON = sys.executable
-
-
-# Clean up the namespace.
-del sys
-del os
+ 'Python',
+ 'Parser',
+ 'Objects',
+ 'Modules',
+]]
diff --git a/Tools/c-analyzer/cpython/__main__.py b/Tools/c-analyzer/cpython/__main__.py
index 6b0f9bcb96..23a3de06f6 100644
--- a/Tools/c-analyzer/cpython/__main__.py
+++ b/Tools/c-analyzer/cpython/__main__.py
@@ -1,212 +1,280 @@
-import argparse
-import re
+import logging
import sys
-from c_analyzer.common import show
-from c_analyzer.common.info import UNKNOWN
+from c_common.fsutil import expand_filenames, iter_files_by_suffix
+from c_common.scriptutil import (
+ add_verbosity_cli,
+ add_traceback_cli,
+ add_commands_cli,
+ add_kind_filtering_cli,
+ add_files_cli,
+ process_args_by_key,
+ configure_logger,
+ get_prog,
+)
+from c_parser.info import KIND
+import c_parser.__main__ as c_parser
+import c_analyzer.__main__ as c_analyzer
+import c_analyzer as _c_analyzer
+from c_analyzer.info import UNKNOWN
+from . import _analyzer, _parser, REPO_ROOT
+
+
+logger = logging.getLogger(__name__)
+
+
+def _resolve_filenames(filenames):
+ if filenames:
+ resolved = (_parser.resolve_filename(f) for f in filenames)
+ else:
+ resolved = _parser.iter_filenames()
+ return resolved
+
+
+def fmt_summary(analysis):
+ # XXX Support sorting and grouping.
+ supported = []
+ unsupported = []
+ for item in analysis:
+ if item.supported:
+ supported.append(item)
+ else:
+ unsupported.append(item)
+ total = 0
+
+ def section(name, groupitems):
+ nonlocal total
+ items, render = c_analyzer.build_section(name, groupitems,
+ relroot=REPO_ROOT)
+ yield from render()
+ total += len(items)
+
+ yield ''
+ yield '===================='
+ yield 'supported'
+ yield '===================='
+
+ yield from section('types', supported)
+ yield from section('variables', supported)
+
+ yield ''
+ yield '===================='
+ yield 'unsupported'
+ yield '===================='
+
+ yield from section('types', unsupported)
+ yield from section('variables', unsupported)
+
+ yield ''
+ yield f'grand total: {total}'
+
-from . import SOURCE_DIRS
-from .find import supported_vars
-from .known import (
- from_file as known_from_file,
- DATA_FILE as KNOWN_FILE,
+#######################################
+# the checks
+
+CHECKS = dict(c_analyzer.CHECKS, **{
+ 'globals': _analyzer.check_globals,
+})
+
+#######################################
+# the commands
+
+FILES_KWARGS = dict(excluded=_parser.EXCLUDED, nargs='*')
+
+
+def _cli_parse(parser):
+ process_output = c_parser.add_output_cli(parser)
+ process_kind = add_kind_filtering_cli(parser)
+ process_preprocessor = c_parser.add_preprocessor_cli(
+ parser,
+ get_preprocessor=_parser.get_preprocessor,
+ )
+ process_files = add_files_cli(parser, **FILES_KWARGS)
+ return [
+ process_output,
+ process_kind,
+ process_preprocessor,
+ process_files,
+ ]
+
+
+def cmd_parse(filenames=None, **kwargs):
+ filenames = _resolve_filenames(filenames)
+ if 'get_file_preprocessor' not in kwargs:
+ kwargs['get_file_preprocessor'] = _parser.get_preprocessor()
+ c_parser.cmd_parse(filenames, **kwargs)
+
+
+def _cli_check(parser, **kwargs):
+ return c_analyzer._cli_check(parser, CHECKS, **kwargs, **FILES_KWARGS)
+
+
+def cmd_check(filenames=None, **kwargs):
+ filenames = _resolve_filenames(filenames)
+ kwargs['get_file_preprocessor'] = _parser.get_preprocessor(log_err=print)
+ c_analyzer.cmd_check(
+ filenames,
+ relroot=REPO_ROOT,
+ _analyze=_analyzer.analyze,
+ _CHECKS=CHECKS,
+ **kwargs
)
-from .supported import IGNORED_FILE
-
-
-def _check_results(unknown, knownvars, used):
- def _match_unused_global(variable):
- found = []
- for varid in knownvars:
- if varid in used:
- continue
- if varid.funcname is not None:
- continue
- if varid.name != variable.name:
- continue
- if variable.filename and variable.filename != UNKNOWN:
- if variable.filename == varid.filename:
- found.append(varid)
- else:
- found.append(varid)
- return found
-
- badknown = set()
- for variable in sorted(unknown):
- msg = None
- if variable.funcname != UNKNOWN:
- msg = f'could not find global symbol {variable.id}'
- elif m := _match_unused_global(variable):
- assert isinstance(m, list)
- badknown.update(m)
- elif variable.name in ('completed', 'id'): # XXX Figure out where these variables are.
- unknown.remove(variable)
- else:
- msg = f'could not find local symbol {variable.id}'
- if msg:
- #raise Exception(msg)
- print(msg)
- if badknown:
- print('---')
- print(f'{len(badknown)} globals in known.tsv, but may actually be local:')
- for varid in sorted(badknown):
- print(f'{varid.filename:30} {varid.name}')
- unused = sorted(varid
- for varid in set(knownvars) - used
- if varid.name != 'id') # XXX Figure out where these variables are.
- if unused:
- print('---')
- print(f'did not use {len(unused)} known vars:')
- for varid in unused:
- print(f'{varid.filename:30} {varid.funcname or "-":20} {varid.name}')
- raise Exception('not all known symbols used')
- if unknown:
- print('---')
- raise Exception('could not find all symbols')
-
-
-# XXX Move this check to its own command.
-def cmd_check_cache(cmd, *,
- known=KNOWN_FILE,
- ignored=IGNORED_FILE,
- _known_from_file=known_from_file,
- _find=supported_vars,
- ):
- known = _known_from_file(known)
-
- used = set()
- unknown = set()
- for var, supported in _find(known=known, ignored=ignored):
- if supported is None:
- unknown.add(var)
- continue
- used.add(var.id)
- _check_results(unknown, known['variables'], used)
-
-
-def cmd_check(cmd, *,
- known=KNOWN_FILE,
- ignored=IGNORED_FILE,
- _find=supported_vars,
- _show=show.basic,
- _print=print,
- ):
- """
-    Fail if there are unsupported global variables.
-
- In the failure case, the list of unsupported variables
- will be printed out.
- """
- unsupported = []
- for var, supported in _find(known=known, ignored=ignored):
- if not supported:
- unsupported.append(var)
-
- if not unsupported:
- #_print('okay')
- return
-
- _print('ERROR: found unsupported global variables')
- _print()
- _show(sorted(unsupported))
- _print(f' ({len(unsupported)} total)')
- sys.exit(1)
-
-
-def cmd_show(cmd, *,
- known=KNOWN_FILE,
- ignored=IGNORED_FILE,
- skip_objects=False,
- _find=supported_vars,
- _show=show.basic,
- _print=print,
- ):
- """
- Print out the list of found global variables.
-
- The variables will be distinguished as "supported" or "unsupported".
- """
- allsupported = []
- allunsupported = []
- for found, supported in _find(known=known,
- ignored=ignored,
- skip_objects=skip_objects,
- ):
- if supported is None:
- continue
- (allsupported if supported else allunsupported
- ).append(found)
-
- _print('supported:')
- _print('----------')
- _show(sorted(allsupported))
- _print(f' ({len(allsupported)} total)')
- _print()
- _print('unsupported:')
- _print('------------')
- _show(sorted(allunsupported))
- _print(f' ({len(allunsupported)} total)')
-
-
-#############################
-# the script
-COMMANDS = {
- 'check': cmd_check,
- 'show': cmd_show,
- }
-
-PROG = sys.argv[0]
-PROG = 'c-globals.py'
-
-
-def parse_args(prog=PROG, argv=sys.argv[1:], *, _fail=None):
- common = argparse.ArgumentParser(add_help=False)
- common.add_argument('--ignored', metavar='FILE',
- default=IGNORED_FILE,
- help='path to file that lists ignored vars')
- common.add_argument('--known', metavar='FILE',
- default=KNOWN_FILE,
- help='path to file that lists known types')
- #common.add_argument('dirs', metavar='DIR', nargs='*',
- # default=SOURCE_DIRS,
- # help='a directory to check')
- parser = argparse.ArgumentParser(
- prog=prog,
+def cmd_analyze(filenames=None, **kwargs):
+ formats = dict(c_analyzer.FORMATS)
+ formats['summary'] = fmt_summary
+ filenames = _resolve_filenames(filenames)
+ kwargs['get_file_preprocessor'] = _parser.get_preprocessor(log_err=print)
+ c_analyzer.cmd_analyze(
+ filenames,
+ _analyze=_analyzer.analyze,
+ formats=formats,
+ **kwargs
+ )
+
+
+def _cli_data(parser):
+ filenames = False
+ known = True
+ return c_analyzer._cli_data(parser, filenames, known)
+
+
+def cmd_data(datacmd, **kwargs):
+ formats = dict(c_analyzer.FORMATS)
+ formats['summary'] = fmt_summary
+ filenames = (file
+ for file in _resolve_filenames(None)
+ if file not in _parser.EXCLUDED)
+ kwargs['get_file_preprocessor'] = _parser.get_preprocessor(log_err=print)
+ if datacmd == 'show':
+ types = _analyzer.read_known()
+ results = []
+ for decl, info in types.items():
+ if info is UNKNOWN:
+ if decl.kind in (KIND.STRUCT, KIND.UNION):
+ extra = {'unsupported': ['type unknown'] * len(decl.members)}
+ else:
+ extra = {'unsupported': ['type unknown']}
+ info = (info, extra)
+ results.append((decl, info))
+ if decl.shortkey == 'struct _object':
+ tempinfo = info
+ known = _analyzer.Analysis.from_results(results)
+ analyze = None
+ elif datacmd == 'dump':
+ known = _analyzer.KNOWN_FILE
+ def analyze(files, **kwargs):
+ decls = []
+ for decl in _analyzer.iter_decls(files, **kwargs):
+ if not KIND.is_type_decl(decl.kind):
+ continue
+ if not decl.filename.endswith('.h'):
+ if decl.shortkey not in _analyzer.KNOWN_IN_DOT_C:
+ continue
+ decls.append(decl)
+ results = _c_analyzer.analyze_decls(
+ decls,
+ known={},
+ analyze_resolved=_analyzer.analyze_resolved,
)
- subs = parser.add_subparsers(dest='cmd')
+ return _analyzer.Analysis.from_results(results)
+ else:
+ known = _analyzer.read_known()
+ def analyze(files, **kwargs):
+ return _analyzer.iter_decls(files, **kwargs)
+ extracolumns = None
+ c_analyzer.cmd_data(
+ datacmd,
+ filenames,
+ known,
+ _analyze=analyze,
+ formats=formats,
+ extracolumns=extracolumns,
+ relroot=REPO_ROOT,
+ **kwargs
+ )
- check = subs.add_parser('check', parents=[common])
- show = subs.add_parser('show', parents=[common])
- show.add_argument('--skip-objects', action='store_true')
+# We do not define any other cmd_*() handlers here,
+# favoring those defined elsewhere.
- if _fail is None:
- def _fail(msg):
- parser.error(msg)
+COMMANDS = {
+ 'check': (
+ 'analyze and fail if the CPython source code has any problems',
+ [_cli_check],
+ cmd_check,
+ ),
+ 'analyze': (
+ 'report on the state of the CPython source code',
+ [(lambda p: c_analyzer._cli_analyze(p, **FILES_KWARGS))],
+ cmd_analyze,
+ ),
+ 'parse': (
+ 'parse the CPython source files',
+ [_cli_parse],
+ cmd_parse,
+ ),
+ 'data': (
+    'check/manage local data (e.g. known types, ignored vars, caches)',
+ [_cli_data],
+ cmd_data,
+ ),
+}
+
+
+#######################################
+# the script
+
+def parse_args(argv=sys.argv[1:], prog=None, *, subset=None):
+ import argparse
+ parser = argparse.ArgumentParser(
+ prog=prog or get_prog(),
+ )
+
+# if subset == 'check' or subset == ['check']:
+# if checks is not None:
+# commands = dict(COMMANDS)
+# commands['check'] = list(commands['check'])
+# cli = commands['check'][1][0]
+# commands['check'][1][0] = (lambda p: cli(p, checks=checks))
+ processors = add_commands_cli(
+ parser,
+ commands=COMMANDS,
+ commonspecs=[
+ add_verbosity_cli,
+ add_traceback_cli,
+ ],
+ subset=subset,
+ )
- # Now parse the args.
args = parser.parse_args(argv)
ns = vars(args)
cmd = ns.pop('cmd')
- if not cmd:
- _fail('missing command')
- return cmd, ns
+ verbosity, traceback_cm = process_args_by_key(
+ args,
+ processors[cmd],
+ ['verbosity', 'traceback_cm'],
+ )
+ if cmd != 'parse':
+ # "verbosity" is sent to the commands, so we put it back.
+ args.verbosity = verbosity
+
+ return cmd, ns, verbosity, traceback_cm
-def main(cmd, cmdkwargs=None, *, _COMMANDS=COMMANDS):
+def main(cmd, cmd_kwargs):
try:
- cmdfunc = _COMMANDS[cmd]
+ run_cmd = COMMANDS[cmd][-1]
except KeyError:
- raise ValueError(
- f'unsupported cmd {cmd!r}' if cmd else 'missing cmd')
-
- cmdfunc(cmd, **cmdkwargs or {})
+ raise ValueError(f'unsupported cmd {cmd!r}')
+ run_cmd(**cmd_kwargs)
if __name__ == '__main__':
- cmd, cmdkwargs = parse_args()
- main(cmd, cmdkwargs)
+ cmd, cmd_kwargs, verbosity, traceback_cm = parse_args()
+ configure_logger(verbosity)
+ with traceback_cm:
+ main(cmd, cmd_kwargs)
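The rewritten `__main__.py` above dispatches through a `COMMANDS` table mapping each sub-command name to a help string, a list of CLI processors, and a handler, with the handler always last (hence `COMMANDS[cmd][-1]`). A stripped-down sketch of that dispatch pattern, with stand-in handlers rather than the tool's real ones:

```python
# Minimal sketch of the COMMANDS-table dispatch used above.
# The handlers here are hypothetical stand-ins.
def cmd_check(**kwargs):
    return 'checked'

def cmd_analyze(**kwargs):
    return 'analyzed'

COMMANDS = {
    # name: (help text, CLI processors, handler); the handler is
    # always last, so it can be fetched with COMMANDS[cmd][-1].
    'check': ('analyze and fail on problems', [], cmd_check),
    'analyze': ('report on the current state', [], cmd_analyze),
}

def main(cmd, cmd_kwargs):
    try:
        run_cmd = COMMANDS[cmd][-1]
    except KeyError:
        raise ValueError(f'unsupported cmd {cmd!r}')
    return run_cmd(**cmd_kwargs)

print(main('check', {}))  # -> checked
```

Keeping the handler in a fixed slot lets `parse_args()` and `main()` stay generic while each command contributes its own argument processors.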
diff --git a/Tools/c-analyzer/cpython/_analyzer.py b/Tools/c-analyzer/cpython/_analyzer.py
new file mode 100644
index 0000000000..98f8888651
--- /dev/null
+++ b/Tools/c-analyzer/cpython/_analyzer.py
@@ -0,0 +1,348 @@
+import os.path
+import re
+
+from c_common.clsutil import classonly
+from c_parser.info import (
+ KIND,
+ DeclID,
+ Declaration,
+ TypeDeclaration,
+ TypeDef,
+ Struct,
+ Member,
+ FIXED_TYPE,
+ is_type_decl,
+ is_pots,
+ is_funcptr,
+ is_process_global,
+ is_fixed_type,
+ is_immutable,
+)
+import c_analyzer as _c_analyzer
+import c_analyzer.info as _info
+import c_analyzer.datafiles as _datafiles
+from . import _parser, REPO_ROOT
+
+
+_DATA_DIR = os.path.dirname(__file__)
+KNOWN_FILE = os.path.join(_DATA_DIR, 'known.tsv')
+IGNORED_FILE = os.path.join(_DATA_DIR, 'ignored.tsv')
+KNOWN_IN_DOT_C = {
+ 'struct _odictobject': False,
+ 'PyTupleObject': False,
+ 'struct _typeobject': False,
+ 'struct _arena': True, # ???
+ 'struct _frame': False,
+ 'struct _ts': True, # ???
+ 'struct PyCodeObject': False,
+ 'struct _is': True, # ???
+ 'PyWideStringList': True, # ???
+ # recursive
+ 'struct _dictkeysobject': False,
+}
+# These are loaded from the respective .tsv files upon first use.
+_KNOWN = {
+ # {(file, ID) | ID => info | bool}
+ #'PyWideStringList': True,
+}
+#_KNOWN = {(Struct(None, typeid.partition(' ')[-1], None)
+# if typeid.startswith('struct ')
+# else TypeDef(None, typeid, None)
+# ): ([], {'unsupported': None if supported else True})
+# for typeid, supported in _KNOWN_IN_DOT_C.items()}
+_IGNORED = {
+ # {ID => reason}
+}
+
+KINDS = frozenset((*KIND.TYPES, KIND.VARIABLE))
+
+
+def read_known():
+ if not _KNOWN:
+ # Cache a copy the first time.
+ extracols = None # XXX
+ #extracols = ['unsupported']
+ known = _datafiles.read_known(KNOWN_FILE, extracols, REPO_ROOT)
+ # For now we ignore known.values() (i.e. "extra").
+ types, _ = _datafiles.analyze_known(
+ known,
+ analyze_resolved=analyze_resolved,
+ )
+ _KNOWN.update(types)
+ return _KNOWN.copy()
+
+
+def write_known():
+ raise NotImplementedError
+ datafiles.write_known(decls, IGNORED_FILE, ['unsupported'], relroot=REPO_ROOT)
+
+
+def read_ignored():
+ if not _IGNORED:
+ _IGNORED.update(_datafiles.read_ignored(IGNORED_FILE))
+ return dict(_IGNORED)
+
+
+def write_ignored():
+ raise NotImplementedError
+ datafiles.write_ignored(variables, IGNORED_FILE)
+
+
+def analyze(filenames, *,
+ skip_objects=False,
+ **kwargs
+ ):
+ if skip_objects:
+ # XXX Set up a filter.
+ raise NotImplementedError
+
+ known = read_known()
+
+ decls = iter_decls(filenames)
+ results = _c_analyzer.analyze_decls(
+ decls,
+ known,
+ analyze_resolved=analyze_resolved,
+ )
+ analysis = Analysis.from_results(results)
+
+ return analysis
+
+
+def iter_decls(filenames, **kwargs):
+ decls = _c_analyzer.iter_decls(
+ filenames,
+ # We ignore functions (and statements).
+ kinds=KINDS,
+ parse_files=_parser.parse_files,
+ **kwargs
+ )
+ for decl in decls:
+ if not decl.data:
+ # Ignore forward declarations.
+ continue
+ yield decl
+
+
+def analyze_resolved(resolved, decl, types, knowntypes, extra=None):
+ if decl.kind not in KINDS:
+ # Skip it!
+ return None
+
+ typedeps = resolved
+ if typedeps is _info.UNKNOWN:
+ if decl.kind in (KIND.STRUCT, KIND.UNION):
+ typedeps = [typedeps] * len(decl.members)
+ else:
+ typedeps = [typedeps]
+ #assert isinstance(typedeps, (list, TypeDeclaration)), typedeps
+
+ if extra is None:
+ extra = {}
+ elif 'unsupported' in extra:
+ raise NotImplementedError((decl, extra))
+
+ unsupported = _check_unsupported(decl, typedeps, types, knowntypes)
+ extra['unsupported'] = unsupported
+
+ return typedeps, extra
+
+
+def _check_unsupported(decl, typedeps, types, knowntypes):
+ if typedeps is None:
+ raise NotImplementedError(decl)
+
+ if decl.kind in (KIND.STRUCT, KIND.UNION):
+ return _check_members(decl, typedeps, types, knowntypes)
+ elif decl.kind is KIND.ENUM:
+ if typedeps:
+ raise NotImplementedError((decl, typedeps))
+ return None
+ else:
+ return _check_typedep(decl, typedeps, types, knowntypes)
+
+
+def _check_members(decl, typedeps, types, knowntypes):
+ if isinstance(typedeps, TypeDeclaration):
+ raise NotImplementedError((decl, typedeps))
+
+ #members = decl.members or () # A forward decl has no members.
+ members = decl.members
+ if not members:
+        # A forward decl has no members, but that shouldn't surface here.
+ raise NotImplementedError(decl)
+ if len(members) != len(typedeps):
+ raise NotImplementedError((decl, typedeps))
+
+ unsupported = []
+ for member, typedecl in zip(members, typedeps):
+ checked = _check_typedep(member, typedecl, types, knowntypes)
+ unsupported.append(checked)
+ if any(None if v is FIXED_TYPE else v for v in unsupported):
+ return unsupported
+ elif FIXED_TYPE in unsupported:
+ return FIXED_TYPE
+ else:
+ return None
+
+
+def _check_typedep(decl, typedecl, types, knowntypes):
+ if not isinstance(typedecl, TypeDeclaration):
+ if hasattr(type(typedecl), '__len__'):
+ if len(typedecl) == 1:
+ typedecl, = typedecl
+ if typedecl is None:
+ # XXX Fail?
+ return 'typespec (missing)'
+ elif typedecl is _info.UNKNOWN:
+ # XXX Is this right?
+ return 'typespec (unknown)'
+ elif not isinstance(typedecl, TypeDeclaration):
+ raise NotImplementedError((decl, typedecl))
+
+ if isinstance(decl, Member):
+ return _check_vartype(decl, typedecl, types, knowntypes)
+ elif not isinstance(decl, Declaration):
+ raise NotImplementedError(decl)
+ elif decl.kind is KIND.TYPEDEF:
+ return _check_vartype(decl, typedecl, types, knowntypes)
+ elif decl.kind is KIND.VARIABLE:
+ if not is_process_global(decl):
+ return None
+ checked = _check_vartype(decl, typedecl, types, knowntypes)
+ return 'mutable' if checked is FIXED_TYPE else checked
+ else:
+ raise NotImplementedError(decl)
+
+
+def _check_vartype(decl, typedecl, types, knowntypes):
+ """Return failure reason."""
+ checked = _check_typespec(decl, typedecl, types, knowntypes)
+ if checked:
+ return checked
+ if is_immutable(decl.vartype):
+ return None
+ if is_fixed_type(decl.vartype):
+ return FIXED_TYPE
+ return 'mutable'
+
+
+def _check_typespec(decl, typedecl, types, knowntypes):
+ typespec = decl.vartype.typespec
+ if typedecl is not None:
+ found = types.get(typedecl)
+ if found is None:
+ found = knowntypes.get(typedecl)
+
+ if found is not None:
+ _, extra = found
+ if extra is None:
+ # XXX Under what circumstances does this happen?
+ extra = {}
+ unsupported = extra.get('unsupported')
+ if unsupported is FIXED_TYPE:
+ unsupported = None
+ return 'typespec' if unsupported else None
+ # Fall back to default known types.
+ if is_pots(typespec):
+ return None
+ elif _info.is_system_type(typespec):
+ return None
+ elif is_funcptr(decl.vartype):
+ return None
+ return 'typespec'
+
+
+class Analyzed(_info.Analyzed):
+
+ @classonly
+ def is_target(cls, raw):
+ if not super().is_target(raw):
+ return False
+ if raw.kind not in KINDS:
+ return False
+ return True
+
+ #@classonly
+ #def _parse_raw_result(cls, result, extra):
+ # typedecl, extra = super()._parse_raw_result(result, extra)
+ # if typedecl is None:
+ # return None, extra
+ # raise NotImplementedError
+
+ def __init__(self, item, typedecl=None, *, unsupported=None, **extra):
+ if 'unsupported' in extra:
+ raise NotImplementedError((item, typedecl, unsupported, extra))
+ if not unsupported:
+ unsupported = None
+ elif isinstance(unsupported, (str, TypeDeclaration)):
+ unsupported = (unsupported,)
+ elif unsupported is not FIXED_TYPE:
+ unsupported = tuple(unsupported)
+ self.unsupported = unsupported
+ extra['unsupported'] = self.unsupported # ...for __repr__(), etc.
+ if self.unsupported is None:
+ #self.supported = None
+ self.supported = True
+ elif self.unsupported is FIXED_TYPE:
+ if item.kind is KIND.VARIABLE:
+ raise NotImplementedError(item, typedecl, unsupported)
+ self.supported = True
+ else:
+ self.supported = not self.unsupported
+ super().__init__(item, typedecl, **extra)
+
+ def render(self, fmt='line', *, itemonly=False):
+ if fmt == 'raw':
+ yield repr(self)
+ return
+ rendered = super().render(fmt, itemonly=itemonly)
+ # XXX ???
+ #if itemonly:
+ # yield from rendered
+ supported = self._supported
+ if fmt in ('line', 'brief'):
+ rendered, = rendered
+ parts = [
+ '+' if supported else '-' if supported is False else '',
+ rendered,
+ ]
+ yield '\t'.join(parts)
+ elif fmt == 'summary':
+ raise NotImplementedError(fmt)
+ elif fmt == 'full':
+ yield from rendered
+ if supported:
+ yield f'\tsupported:\t{supported}'
+ else:
+ raise NotImplementedError(fmt)
+
+
+class Analysis(_info.Analysis):
+ _item_class = Analyzed
+
+ @classonly
+ def build_item(cls, info, result=None):
+ if not isinstance(info, Declaration) or info.kind not in KINDS:
+ raise NotImplementedError((info, result))
+ return super().build_item(info, result)
+
+
+def check_globals(analysis):
+ # yield (data, failure)
+ ignored = read_ignored()
+ for item in analysis:
+ if item.kind != KIND.VARIABLE:
+ continue
+ if item.supported:
+ continue
+ if item.id in ignored:
+ continue
+ reason = item.unsupported
+ if not reason:
+ reason = '???'
+ elif not isinstance(reason, str):
+ if len(reason) == 1:
+ reason, = reason
+ reason = f'({reason})'
+ yield item, f'not supported {reason:20}\t{item.storage or ""} {item.vartype}'
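The new `_analyzer.py` above loads `known.tsv` and `ignored.tsv` through `c_analyzer.datafiles`. As a hypothetical illustration of reading such a tab-separated ignore list (the column layout here is assumed, not the actual `ignored.tsv` schema), a minimal reader could be:

```python
import csv
import io

# Hypothetical reader for an "ignored variables" TSV; columns assumed
# to be filename, funcname, name, reason. The real format lives in
# c_analyzer.datafiles, not here.
def read_ignored(text):
    reader = csv.reader(io.StringIO(text), delimiter='\t')
    return {(filename, funcname or None, name): reason
            for filename, funcname, name, reason in reader}

sample = "Python/ceval.c\t\tlltrace\tdebugging aid\n"
print(read_ignored(sample))
```

A `check_globals()`-style pass can then skip any variable whose `(file, func, name)` key appears in the resulting mapping, as the real code does with `item.id in ignored`.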
diff --git a/Tools/c-analyzer/cpython/_generate.py b/Tools/c-analyzer/cpython/_generate.py
deleted file mode 100644
index 3456604b81..0000000000
--- a/Tools/c-analyzer/cpython/_generate.py
+++ /dev/null
@@ -1,326 +0,0 @@
-# The code here consists of hacks for pre-populating the known.tsv file.
-
-from c_analyzer.parser.preprocessor import _iter_clean_lines
-from c_analyzer.parser.naive import (
- iter_variables, parse_variable_declaration, find_variables,
- )
-from c_analyzer.common.known import HEADER as KNOWN_HEADER
-from c_analyzer.common.info import UNKNOWN, ID
-from c_analyzer.variables import Variable
-from c_analyzer.util import write_tsv
-
-from . import SOURCE_DIRS, REPO_ROOT
-from .known import DATA_FILE as KNOWN_FILE
-from .files import iter_cpython_files
-
-
-POTS = ('char ', 'wchar_t ', 'int ', 'Py_ssize_t ')
-POTS += tuple('const ' + v for v in POTS)
-STRUCTS = ('PyTypeObject', 'PyObject', 'PyMethodDef', 'PyModuleDef', 'grammar')
-
-
-def _parse_global(line, funcname=None):
- line = line.strip()
- if line.startswith('static '):
- if '(' in line and '[' not in line and ' = ' not in line:
- return None, None
- name, decl = parse_variable_declaration(line)
- elif line.startswith(('Py_LOCAL(', 'Py_LOCAL_INLINE(')):
- name, decl = parse_variable_declaration(line)
- elif line.startswith('_Py_static_string('):
- decl = line.strip(';').strip()
- name = line.split('(')[1].split(',')[0].strip()
- elif line.startswith('_Py_IDENTIFIER('):
- decl = line.strip(';').strip()
- name = 'PyId_' + line.split('(')[1].split(')')[0].strip()
- elif funcname:
- return None, None
-
- # global-only
- elif line.startswith('PyAPI_DATA('): # only in .h files
- name, decl = parse_variable_declaration(line)
- elif line.startswith('extern '): # only in .h files
- name, decl = parse_variable_declaration(line)
- elif line.startswith('PyDoc_VAR('):
- decl = line.strip(';').strip()
- name = line.split('(')[1].split(')')[0].strip()
- elif line.startswith(POTS): # implied static
- if '(' in line and '[' not in line and ' = ' not in line:
- return None, None
- name, decl = parse_variable_declaration(line)
- elif line.startswith(STRUCTS) and line.endswith(' = {'): # implied static
- name, decl = parse_variable_declaration(line)
- elif line.startswith(STRUCTS) and line.endswith(' = NULL;'): # implied static
- name, decl = parse_variable_declaration(line)
- elif line.startswith('struct '):
- if not line.endswith(' = {'):
- return None, None
- if not line.partition(' ')[2].startswith(STRUCTS):
- return None, None
- # implied static
- name, decl = parse_variable_declaration(line)
-
- # file-specific
- elif line.startswith(('SLOT1BINFULL(', 'SLOT1BIN(')):
- # Objects/typeobject.c
- funcname = line.split('(')[1].split(',')[0]
- return [
- ('op_id', funcname, '_Py_static_string(op_id, OPSTR)'),
- ('rop_id', funcname, '_Py_static_string(op_id, OPSTR)'),
- ]
- elif line.startswith('WRAP_METHOD('):
- # Objects/weakrefobject.c
- funcname, name = (v.strip() for v in line.split('(')[1].split(')')[0].split(','))
- return [
- ('PyId_' + name, funcname, f'_Py_IDENTIFIER({name})'),
- ]
-
- else:
- return None, None
- return name, decl
-
-
-def _pop_cached(varcache, filename, funcname, name, *,
- _iter_variables=iter_variables,
- ):
- # Look for the file.
- try:
- cached = varcache[filename]
- except KeyError:
- cached = varcache[filename] = {}
- for variable in _iter_variables(filename,
- parse_variable=_parse_global,
- ):
- variable._isglobal = True
- cached[variable.id] = variable
- for var in cached:
- print(' ', var)
-
- # Look for the variable.
- if funcname == UNKNOWN:
- for varid in cached:
- if varid.name == name:
- break
- else:
- return None
- return cached.pop(varid)
- else:
- return cached.pop((filename, funcname, name), None)
-
-
-def find_matching_variable(varid, varcache, allfilenames, *,
- _pop_cached=_pop_cached,
- ):
- if varid.filename and varid.filename != UNKNOWN:
- filenames = [varid.filename]
- else:
- filenames = allfilenames
- for filename in filenames:
- variable = _pop_cached(varcache, filename, varid.funcname, varid.name)
- if variable is not None:
- return variable
- else:
- if varid.filename and varid.filename != UNKNOWN and varid.funcname is None:
- for filename in allfilenames:
- if not filename.endswith('.h'):
- continue
- variable = _pop_cached(varcache, filename, None, varid.name)
- if variable is not None:
- return variable
- return None
-
-
-MULTILINE = {
- # Python/Python-ast.c
- 'Load_singleton': 'PyObject *',
- 'Store_singleton': 'PyObject *',
- 'Del_singleton': 'PyObject *',
- 'AugLoad_singleton': 'PyObject *',
- 'AugStore_singleton': 'PyObject *',
- 'Param_singleton': 'PyObject *',
- 'And_singleton': 'PyObject *',
- 'Or_singleton': 'PyObject *',
- 'Add_singleton': 'static PyObject *',
- 'Sub_singleton': 'static PyObject *',
- 'Mult_singleton': 'static PyObject *',
- 'MatMult_singleton': 'static PyObject *',
- 'Div_singleton': 'static PyObject *',
- 'Mod_singleton': 'static PyObject *',
- 'Pow_singleton': 'static PyObject *',
- 'LShift_singleton': 'static PyObject *',
- 'RShift_singleton': 'static PyObject *',
- 'BitOr_singleton': 'static PyObject *',
- 'BitXor_singleton': 'static PyObject *',
- 'BitAnd_singleton': 'static PyObject *',
- 'FloorDiv_singleton': 'static PyObject *',
- 'Invert_singleton': 'static PyObject *',
- 'Not_singleton': 'static PyObject *',
- 'UAdd_singleton': 'static PyObject *',
- 'USub_singleton': 'static PyObject *',
- 'Eq_singleton': 'static PyObject *',
- 'NotEq_singleton': 'static PyObject *',
- 'Lt_singleton': 'static PyObject *',
- 'LtE_singleton': 'static PyObject *',
- 'Gt_singleton': 'static PyObject *',
- 'GtE_singleton': 'static PyObject *',
- 'Is_singleton': 'static PyObject *',
- 'IsNot_singleton': 'static PyObject *',
- 'In_singleton': 'static PyObject *',
- 'NotIn_singleton': 'static PyObject *',
- # Python/symtable.c
- 'top': 'static identifier ',
- 'lambda': 'static identifier ',
- 'genexpr': 'static identifier ',
- 'listcomp': 'static identifier ',
- 'setcomp': 'static identifier ',
- 'dictcomp': 'static identifier ',
- '__class__': 'static identifier ',
- # Python/compile.c
- '__doc__': 'static PyObject *',
- '__annotations__': 'static PyObject *',
- # Objects/floatobject.c
- 'double_format': 'static float_format_type ',
- 'float_format': 'static float_format_type ',
- 'detected_double_format': 'static float_format_type ',
- 'detected_float_format': 'static float_format_type ',
- # Python/dtoa.c
- 'private_mem': 'static double private_mem[PRIVATE_mem]',
- 'pmem_next': 'static double *',
- # Modules/_weakref.c
- 'weakref_functions': 'static PyMethodDef ',
-}
-INLINE = {
- # Modules/_tracemalloc.c
- 'allocators': 'static struct { PyMemAllocatorEx mem; PyMemAllocatorEx raw; PyMemAllocatorEx obj; } ',
- # Modules/faulthandler.c
- 'fatal_error': 'static struct { int enabled; PyObject *file; int fd; int all_threads; PyInterpreterState *interp; void *exc_handler; } ',
- 'thread': 'static struct { PyObject *file; int fd; PY_TIMEOUT_T timeout_us; int repeat; PyInterpreterState *interp; int exit; char *header; size_t header_len; PyThread_type_lock cancel_event; PyThread_type_lock running; } ',
- # Modules/signalmodule.c
- 'Handlers': 'static volatile struct { _Py_atomic_int tripped; PyObject *func; } Handlers[NSIG]',
- 'wakeup': 'static volatile struct { SOCKET_T fd; int warn_on_full_buffer; int use_send; } ',
- # Python/dynload_shlib.c
- 'handles': 'static struct { dev_t dev; ino_t ino; void *handle; } handles[128]',
- # Objects/obmalloc.c
- '_PyMem_Debug': 'static struct { debug_alloc_api_t raw; debug_alloc_api_t mem; debug_alloc_api_t obj; } ',
- # Python/bootstrap_hash.c
- 'urandom_cache': 'static struct { int fd; dev_t st_dev; ino_t st_ino; } ',
- }
-FUNC = {
- # Objects/object.c
- '_Py_abstract_hack': 'Py_ssize_t (*_Py_abstract_hack)(PyObject *)',
- # Parser/myreadline.c
- 'PyOS_InputHook': 'int (*PyOS_InputHook)(void)',
- # Python/pylifecycle.c
- '_PyOS_mystrnicmp_hack': 'int (*_PyOS_mystrnicmp_hack)(const char *, const char *, Py_ssize_t)',
- # Parser/myreadline.c
- 'PyOS_ReadlineFunctionPointer': 'char *(*PyOS_ReadlineFunctionPointer)(FILE *, FILE *, const char *)',
- }
-IMPLIED = {
- # Objects/boolobject.c
- '_Py_FalseStruct': 'static struct _longobject ',
- '_Py_TrueStruct': 'static struct _longobject ',
- # Modules/config.c
- '_PyImport_Inittab': 'struct _inittab _PyImport_Inittab[]',
- }
-GLOBALS = {}
-GLOBALS.update(MULTILINE)
-GLOBALS.update(INLINE)
-GLOBALS.update(FUNC)
-GLOBALS.update(IMPLIED)
-
-LOCALS = {
- 'buildinfo': ('Modules/getbuildinfo.c',
- 'Py_GetBuildInfo',
- 'static char buildinfo[50 + sizeof(GITVERSION) + ((sizeof(GITTAG) > sizeof(GITBRANCH)) ? sizeof(GITTAG) : sizeof(GITBRANCH))]'),
- 'methods': ('Python/codecs.c',
- '_PyCodecRegistry_Init',
- 'static struct { char *name; PyMethodDef def; } methods[]'),
- }
-
-
-def _known(symbol):
- if symbol.funcname:
- if symbol.funcname != UNKNOWN or symbol.filename != UNKNOWN:
- raise KeyError(symbol.name)
- filename, funcname, decl = LOCALS[symbol.name]
- varid = ID(filename, funcname, symbol.name)
- elif not symbol.filename or symbol.filename == UNKNOWN:
- raise KeyError(symbol.name)
- else:
- varid = symbol.id
- try:
- decl = GLOBALS[symbol.name]
- except KeyError:
-
- if symbol.name.endswith('_methods'):
- decl = 'static PyMethodDef '
- elif symbol.filename == 'Objects/exceptions.c' and symbol.name.startswith(('PyExc_', '_PyExc_')):
- decl = 'static PyTypeObject '
- else:
- raise
- if symbol.name not in decl:
- decl = decl + symbol.name
- return Variable(varid, 'static', decl)
-
-
-def known_row(varid, decl):
- return (
- varid.filename,
- varid.funcname or '-',
- varid.name,
- 'variable',
- decl,
- )
-
-
-def known_rows(symbols, *,
- cached=True,
- _get_filenames=iter_cpython_files,
- _find_match=find_matching_variable,
- _find_symbols=find_variables,
- _as_known=known_row,
- ):
- filenames = list(_get_filenames())
- cache = {}
- if cached:
- for symbol in symbols:
- try:
- found = _known(symbol)
- except KeyError:
- found = _find_match(symbol, cache, filenames)
- if found is None:
- found = Variable(symbol.id, UNKNOWN, UNKNOWN)
- yield _as_known(found.id, found.vartype)
- else:
- raise NotImplementedError # XXX incorporate KNOWN
- for variable in _find_symbols(symbols, filenames,
- srccache=cache,
- parse_variable=_parse_global,
- ):
- #variable = variable._replace(
- # filename=os.path.relpath(variable.filename, REPO_ROOT))
- if variable.funcname == UNKNOWN:
- print(variable)
-        if variable.vartype == UNKNOWN:
- print(variable)
- yield _as_known(variable.id, variable.vartype)
-
-
-def generate(symbols, filename=None, *,
- _generate_rows=known_rows,
- _write_tsv=write_tsv,
- ):
- if not filename:
- filename = KNOWN_FILE + '.new'
-
- rows = _generate_rows(symbols)
- _write_tsv(filename, KNOWN_HEADER, rows)
-
-
-if __name__ == '__main__':
- from c_symbols import binary
- symbols = binary.iter_symbols(
- binary.PYTHON,
- find_local_symbol=None,
- )
- generate(symbols)
diff --git a/Tools/c-analyzer/cpython/_parser.py b/Tools/c-analyzer/cpython/_parser.py
new file mode 100644
index 0000000000..35fa296251
--- /dev/null
+++ b/Tools/c-analyzer/cpython/_parser.py
@@ -0,0 +1,308 @@
+import os.path
+import re
+
+from c_common.fsutil import expand_filenames, iter_files_by_suffix
+from c_parser.preprocessor import (
+ get_preprocessor as _get_preprocessor,
+)
+from c_parser import (
+ parse_file as _parse_file,
+ parse_files as _parse_files,
+)
+from . import REPO_ROOT, INCLUDE_DIRS, SOURCE_DIRS
+
+
+GLOB_ALL = '**/*'
+
+
+def clean_lines(text):
+ """Clear out comments, blank lines, and leading/trailing whitespace."""
+ lines = (line.strip() for line in text.splitlines())
+ lines = (line.partition('#')[0].rstrip()
+ for line in lines
+ if line and not line.startswith('#'))
+ glob_all = f'{GLOB_ALL} '
+ lines = (re.sub(r'^[*] ', glob_all, line) for line in lines)
+ lines = (os.path.join(REPO_ROOT, line) for line in lines)
+ return list(lines)
+
+
+'''
+@begin=sh@
+./python ../c-parser/cpython.py
+ --exclude '+../c-parser/EXCLUDED'
+ --macros '+../c-parser/MACROS'
+ --incldirs '+../c-parser/INCL_DIRS'
+ --same './Include/cpython/'
+ Include/*.h
+ Include/internal/*.h
+ Modules/**/*.c
+ Objects/**/*.c
+ Parser/**/*.c
+ Python/**/*.c
+@end=sh@
+'''
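The comment-stripping and glob-expansion behavior of `clean_lines()` above can be sketched standalone (the real function additionally joins each result onto `REPO_ROOT`; that step is omitted here):

```python
import re

GLOB_ALL = '**/*'

def clean_lines_sketch(text):
    """Strip comments/blank lines and expand a leading '* ' marker."""
    lines = (line.strip() for line in text.splitlines())
    # Drop full-line comments and blanks; trim trailing inline comments.
    lines = (line.partition('#')[0].rstrip()
             for line in lines
             if line and not line.startswith('#'))
    # A leading "* " is shorthand for "under every directory".
    return [re.sub(r'^[*] ', f'{GLOB_ALL} ', line) for line in lines]

cleaned = clean_lines_sketch('''
# a comment
* .
Modules/_tkinter.c /usr/include/tcl8.6  # inline note
''')
# cleaned == ['**/* .', 'Modules/_tkinter.c /usr/include/tcl8.6']
```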
+
+GLOBS = [
+ 'Include/*.h',
+ 'Include/internal/*.h',
+ 'Modules/**/*.c',
+ 'Objects/**/*.c',
+ 'Parser/**/*.c',
+ 'Python/**/*.c',
+]
+
+EXCLUDED = clean_lines('''
+# @begin=conf@
+
+# Rather than fixing the parser for this one, we manually make sure it's okay.
+Modules/_sha3/kcp/KeccakP-1600-opt64.c
+
+# OSX
+#Modules/_ctypes/darwin/*.c
+#Modules/_ctypes/libffi_osx/*.c
+Modules/_scproxy.c # SystemConfiguration/SystemConfiguration.h
+
+# Windows
+Modules/_winapi.c # windows.h
+Modules/overlapped.c # winsock.h
+Python/dynload_win.c # windows.h
+
+# other OS-dependent
+Python/dynload_dl.c # dl.h
+Python/dynload_hpux.c # dl.h
+Python/dynload_aix.c # sys/ldr.h
+
+# @end=conf@
+''')
+
+# XXX Fix the parser.
+EXCLUDED += clean_lines('''
+# The tool should be able to parse these...
+
+Modules/_dbmmodule.c
+Modules/cjkcodecs/_codecs_*.c
+Modules/expat/xmlrole.c
+Modules/expat/xmlparse.c
+Python/initconfig.c
+''')
+
+INCL_DIRS = clean_lines('''
+# @begin=tsv@
+
+glob dirname
+* .
+* ./Include
+* ./Include/internal
+
+Modules/_tkinter.c /usr/include/tcl8.6
+Modules/tkappinit.c /usr/include/tcl
+Modules/_decimal/**/*.c Modules/_decimal/libmpdec
+
+# @end=tsv@
+''')[1:]
+
+MACROS = clean_lines('''
+# @begin=tsv@
+
+glob name value
+
+Include/internal/*.h Py_BUILD_CORE 1
+Python/**/*.c Py_BUILD_CORE 1
+Parser/**/*.c Py_BUILD_CORE 1
+Objects/**/*.c Py_BUILD_CORE 1
+
+Modules/faulthandler.c Py_BUILD_CORE 1
+Modules/_functoolsmodule.c Py_BUILD_CORE 1
+Modules/gcmodule.c Py_BUILD_CORE 1
+Modules/getpath.c Py_BUILD_CORE 1
+Modules/_io/*.c Py_BUILD_CORE 1
+Modules/itertoolsmodule.c Py_BUILD_CORE 1
+Modules/_localemodule.c Py_BUILD_CORE 1
+Modules/main.c Py_BUILD_CORE 1
+Modules/posixmodule.c Py_BUILD_CORE 1
+Modules/signalmodule.c Py_BUILD_CORE 1
+Modules/_threadmodule.c Py_BUILD_CORE 1
+Modules/_tracemalloc.c Py_BUILD_CORE 1
+Modules/_asynciomodule.c Py_BUILD_CORE 1
+Modules/mathmodule.c Py_BUILD_CORE 1
+Modules/cmathmodule.c Py_BUILD_CORE 1
+Modules/_weakref.c Py_BUILD_CORE 1
+Modules/sha256module.c Py_BUILD_CORE 1
+Modules/sha512module.c Py_BUILD_CORE 1
+Modules/_datetimemodule.c Py_BUILD_CORE 1
+Modules/_ctypes/cfield.c Py_BUILD_CORE 1
+Modules/_heapqmodule.c Py_BUILD_CORE 1
+Modules/_posixsubprocess.c Py_BUILD_CORE 1
+
+Modules/_json.c Py_BUILD_CORE_BUILTIN 1
+Modules/_pickle.c Py_BUILD_CORE_BUILTIN 1
+Modules/_testinternalcapi.c Py_BUILD_CORE_BUILTIN 1
+
+Include/cpython/abstract.h Py_CPYTHON_ABSTRACTOBJECT_H 1
+Include/cpython/bytearrayobject.h Py_CPYTHON_BYTEARRAYOBJECT_H 1
+Include/cpython/bytesobject.h Py_CPYTHON_BYTESOBJECT_H 1
+Include/cpython/ceval.h Py_CPYTHON_CEVAL_H 1
+Include/cpython/code.h Py_CPYTHON_CODE_H 1
+Include/cpython/dictobject.h Py_CPYTHON_DICTOBJECT_H 1
+Include/cpython/fileobject.h Py_CPYTHON_FILEOBJECT_H 1
+Include/cpython/fileutils.h Py_CPYTHON_FILEUTILS_H 1
+Include/cpython/frameobject.h Py_CPYTHON_FRAMEOBJECT_H 1
+Include/cpython/import.h Py_CPYTHON_IMPORT_H 1
+Include/cpython/interpreteridobject.h Py_CPYTHON_INTERPRETERIDOBJECT_H 1
+Include/cpython/listobject.h Py_CPYTHON_LISTOBJECT_H 1
+Include/cpython/methodobject.h Py_CPYTHON_METHODOBJECT_H 1
+Include/cpython/object.h Py_CPYTHON_OBJECT_H 1
+Include/cpython/objimpl.h Py_CPYTHON_OBJIMPL_H 1
+Include/cpython/pyerrors.h Py_CPYTHON_ERRORS_H 1
+Include/cpython/pylifecycle.h Py_CPYTHON_PYLIFECYCLE_H 1
+Include/cpython/pymem.h Py_CPYTHON_PYMEM_H 1
+Include/cpython/pystate.h Py_CPYTHON_PYSTATE_H 1
+Include/cpython/sysmodule.h Py_CPYTHON_SYSMODULE_H 1
+Include/cpython/traceback.h Py_CPYTHON_TRACEBACK_H 1
+Include/cpython/tupleobject.h Py_CPYTHON_TUPLEOBJECT_H 1
+Include/cpython/unicodeobject.h Py_CPYTHON_UNICODEOBJECT_H 1
+
+# implied include of pyport.h
+Include/**/*.h PyAPI_DATA(RTYPE) extern RTYPE
+Include/**/*.h PyAPI_FUNC(RTYPE) RTYPE
+Include/**/*.h Py_DEPRECATED(VER) /* */
+Include/**/*.h _Py_NO_RETURN /* */
+Include/**/*.h PYLONG_BITS_IN_DIGIT 30
+Modules/**/*.c PyMODINIT_FUNC PyObject*
+Objects/unicodeobject.c PyMODINIT_FUNC PyObject*
+Python/marshal.c PyMODINIT_FUNC PyObject*
+Python/_warnings.c PyMODINIT_FUNC PyObject*
+Python/Python-ast.c PyMODINIT_FUNC PyObject*
+Python/import.c PyMODINIT_FUNC PyObject*
+Modules/_testcapimodule.c PyAPI_FUNC(RTYPE) RTYPE
+Python/getargs.c PyAPI_FUNC(RTYPE) RTYPE
+
+# implied include of exports.h
+#Modules/_io/bytesio.c Py_EXPORTED_SYMBOL /* */
+
+# implied include of object.h
+Include/**/*.h PyObject_HEAD PyObject ob_base;
+Include/**/*.h PyObject_VAR_HEAD PyVarObject ob_base;
+
+# implied include of pyconfig.h
+Include/**/*.h SIZEOF_WCHAR_T 4
+
+# implied include of <unistd.h>
+Include/**/*.h _POSIX_THREADS 1
+
+# from Makefile
+Modules/getpath.c PYTHONPATH 1
+Modules/getpath.c PREFIX ...
+Modules/getpath.c EXEC_PREFIX ...
+Modules/getpath.c VERSION ...
+Modules/getpath.c VPATH ...
+
+# from Modules/_sha3/sha3module.c
+Modules/_sha3/kcp/KeccakP-1600-inplace32BI.c PLATFORM_BYTE_ORDER 4321 # force big-endian
+Modules/_sha3/kcp/*.c KeccakOpt 64
+Modules/_sha3/kcp/*.c KeccakP200_excluded 1
+Modules/_sha3/kcp/*.c KeccakP400_excluded 1
+Modules/_sha3/kcp/*.c KeccakP800_excluded 1
+
+# See: setup.py
+Modules/_decimal/**/*.c CONFIG_64 1
+Modules/_decimal/**/*.c ASM 1
+Modules/expat/xmlparse.c HAVE_EXPAT_CONFIG_H 1
+Modules/expat/xmlparse.c XML_POOR_ENTROPY 1
+Modules/_dbmmodule.c HAVE_GDBM_DASH_NDBM_H 1
+
+# @end=tsv@
+''')[1:]
+
+# -pthread
+# -Wno-unused-result
+# -Wsign-compare
+# -g
+# -Og
+# -Wall
+# -std=c99
+# -Wextra
+# -Wno-unused-result -Wno-unused-parameter
+# -Wno-missing-field-initializers
+# -Werror=implicit-function-declaration
+
+SAME = [
+ './Include/cpython/',
+]
+
+
+def resolve_filename(filename):
+ orig = filename
+ filename = os.path.normcase(os.path.normpath(filename))
+ if os.path.isabs(filename):
+ if os.path.relpath(filename, REPO_ROOT).startswith('.'):
+ raise Exception(f'{orig!r} is outside the repo ({REPO_ROOT})')
+ return filename
+ else:
+ return os.path.join(REPO_ROOT, filename)
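The outside-the-repo guard in `resolve_filename()` relies on `os.path.relpath()` producing a leading `..` for paths not under the base directory. A minimal illustration, assuming POSIX paths (the paths here are made up):

```python
import os.path

def is_outside(path, root):
    # os.path.relpath() yields a leading '..' for paths not under root,
    # which is what resolve_filename() keys its rejection on.
    return os.path.relpath(path, root).startswith('.')

root = '/repo/cpython'
inside = is_outside('/repo/cpython/Python/ceval.c', root)   # False
outside = is_outside('/tmp/other.c', root)                  # True
```

Note that the check also trips on a top-level filename that itself starts with a dot, which is harmless for the globs used here.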
+
+
+def iter_filenames(*, search=False):
+ if search:
+ yield from iter_files_by_suffix(INCLUDE_DIRS, ('.h',))
+ yield from iter_files_by_suffix(SOURCE_DIRS, ('.c',))
+ else:
+ globs = (os.path.join(REPO_ROOT, file) for file in GLOBS)
+ yield from expand_filenames(globs)
+
+
+def get_preprocessor(*,
+ file_macros=None,
+ file_incldirs=None,
+ file_same=None,
+ **kwargs
+ ):
+ macros = tuple(MACROS)
+ if file_macros:
+ macros += tuple(file_macros)
+ incldirs = tuple(INCL_DIRS)
+ if file_incldirs:
+ incldirs += tuple(file_incldirs)
+ return _get_preprocessor(
+ file_macros=macros,
+ file_incldirs=incldirs,
+ file_same=file_same,
+ **kwargs
+ )
+
+
+def parse_file(filename, *,
+ match_kind=None,
+ ignore_exc=None,
+ log_err=None,
+ ):
+ get_file_preprocessor = get_preprocessor(
+ ignore_exc=ignore_exc,
+ log_err=log_err,
+ )
+ yield from _parse_file(
+ filename,
+ match_kind=match_kind,
+ get_file_preprocessor=get_file_preprocessor,
+ )
+
+
+def parse_files(filenames=None, *,
+ match_kind=None,
+ ignore_exc=None,
+ log_err=None,
+ get_file_preprocessor=None,
+ **file_kwargs
+ ):
+ if get_file_preprocessor is None:
+ get_file_preprocessor = get_preprocessor(
+ ignore_exc=ignore_exc,
+ log_err=log_err,
+ )
+ yield from _parse_files(
+ filenames,
+ match_kind=match_kind,
+ get_file_preprocessor=get_file_preprocessor,
+ **file_kwargs
+ )
diff --git a/Tools/c-analyzer/cpython/files.py b/Tools/c-analyzer/cpython/files.py
deleted file mode 100644
index 543097af7b..0000000000
--- a/Tools/c-analyzer/cpython/files.py
+++ /dev/null
@@ -1,29 +0,0 @@
-import os.path
-
-from c_analyzer.common.files import (
- C_SOURCE_SUFFIXES, walk_tree, iter_files_by_suffix,
- )
-
-from . import SOURCE_DIRS, REPO_ROOT
-
-# XXX need tests:
-# * iter_files()
-
-
-def iter_files(*,
- walk=walk_tree,
- _files=iter_files_by_suffix,
- ):
- """Yield each file in the tree for each of the given directory names."""
- excludedtrees = [
- os.path.join('Include', 'cpython', ''),
- ]
- def is_excluded(filename):
- for root in excludedtrees:
- if filename.startswith(root):
- return True
- return False
- for filename in _files(SOURCE_DIRS, C_SOURCE_SUFFIXES, REPO_ROOT,
- walk=walk,
- ):
- if is_excluded(filename):
- continue
- yield filename
diff --git a/Tools/c-analyzer/cpython/find.py b/Tools/c-analyzer/cpython/find.py
deleted file mode 100644
index a7bc0b477b..0000000000
--- a/Tools/c-analyzer/cpython/find.py
+++ /dev/null
@@ -1,101 +0,0 @@
-import os.path
-
-from c_analyzer.common import files
-from c_analyzer.common.info import UNKNOWN, ID
-from c_analyzer.variables import find as _common
-
-from . import SOURCE_DIRS, PYTHON, REPO_ROOT
-from .known import (
- from_file as known_from_file,
- DATA_FILE as KNOWN_FILE,
- )
-from .supported import (
- ignored_from_file, IGNORED_FILE, is_supported, _is_object,
- )
-
-# XXX need tests:
-# * vars_from_binary()
-# * vars_from_source()
-# * supported_vars()
-
-
-def _handle_id(filename, funcname, name, *,
- _relpath=os.path.relpath,
- ):
- filename = _relpath(filename, REPO_ROOT)
- return ID(filename, funcname, name)
-
-
-def vars_from_binary(*,
- known=KNOWN_FILE,
- _known_from_file=known_from_file,
- _iter_files=files.iter_files_by_suffix,
- _iter_vars=_common.vars_from_binary,
- ):
- """Yield a Variable for each found Symbol.
-
- Details are filled in from the given "known" variables and types.
- """
- if isinstance(known, str):
- known = _known_from_file(known)
- dirnames = SOURCE_DIRS
- suffixes = ('.c',)
- filenames = _iter_files(dirnames, suffixes)
- # XXX For now we only use known variables (no source lookup).
- filenames = None
- yield from _iter_vars(PYTHON,
- known=known,
- filenames=filenames,
- handle_id=_handle_id,
- check_filename=(lambda n: True),
- )
-
-
-def vars_from_source(*,
- preprocessed=None,
- known=KNOWN_FILE,
- _known_from_file=known_from_file,
- _iter_files=files.iter_files_by_suffix,
- _iter_vars=_common.vars_from_source,
- ):
- """Yield a Variable for each declaration in the raw source code.
-
- Details are filled in from the given "known" variables and types.
- """
- if isinstance(known, str):
- known = _known_from_file(known)
- dirnames = SOURCE_DIRS
- suffixes = ('.c',)
- filenames = _iter_files(dirnames, suffixes)
- yield from _iter_vars(filenames,
- preprocessed=preprocessed,
- known=known,
- handle_id=_handle_id,
- )
-
-
-def supported_vars(*,
- known=KNOWN_FILE,
- ignored=IGNORED_FILE,
- skip_objects=False,
- _known_from_file=known_from_file,
- _ignored_from_file=ignored_from_file,
- _iter_vars=vars_from_binary,
- _is_supported=is_supported,
- ):
- """Yield (var, is supported) for each found variable."""
- if isinstance(known, str):
- known = _known_from_file(known)
- if isinstance(ignored, str):
- ignored = _ignored_from_file(ignored)
-
- for var in _iter_vars(known=known):
- if not var.isglobal:
- continue
- elif var.vartype == UNKNOWN:
- yield var, None
- # XXX Support proper filters instead.
-        elif skip_objects and _is_object(var.vartype):
- continue
- else:
- yield var, _is_supported(var, ignored, known)
diff --git a/Tools/c-analyzer/cpython/ignored.tsv b/Tools/c-analyzer/cpython/ignored.tsv
new file mode 100644
index 0000000000..2c456db063
--- /dev/null
+++ b/Tools/c-analyzer/cpython/ignored.tsv
@@ -0,0 +1,2 @@
+filename funcname name reason
+#??? - somevar ???
diff --git a/Tools/c-analyzer/cpython/known.py b/Tools/c-analyzer/cpython/known.py
deleted file mode 100644
index c3cc2c0602..0000000000
--- a/Tools/c-analyzer/cpython/known.py
+++ /dev/null
@@ -1,66 +0,0 @@
-import csv
-import os.path
-
-from c_analyzer.parser.declarations import extract_storage
-from c_analyzer.variables import known as _common
-from c_analyzer.variables.info import Variable
-
-from . import DATA_DIR
-
-
-# XXX need tests:
-# * from_file()
-# * look_up_variable()
-
-
-DATA_FILE = os.path.join(DATA_DIR, 'known.tsv')
-
-
-def _get_storage(decl, infunc):
- # statics
- if decl.startswith(('Py_LOCAL(', 'Py_LOCAL_INLINE(')):
- return 'static'
- if decl.startswith(('_Py_IDENTIFIER(', '_Py_static_string(')):
- return 'static'
- if decl.startswith('PyDoc_VAR('):
- return 'static'
- if decl.startswith(('SLOT1BINFULL(', 'SLOT1BIN(')):
- return 'static'
- if decl.startswith('WRAP_METHOD('):
- return 'static'
- # public extern
- if decl.startswith('PyAPI_DATA('):
- return 'extern'
- # Fall back to the normal handler.
- return extract_storage(decl, infunc=infunc)
-
-
-def _handle_var(varid, decl):
-# if varid.name == 'id' and decl == UNKNOWN:
-# # None of these are variables.
-# decl = 'int id';
- storage = _get_storage(decl, varid.funcname)
- return Variable(varid, storage, decl)
-
-
-def from_file(infile=DATA_FILE, *,
- _from_file=_common.from_file,
- _handle_var=_handle_var,
- ):
- """Return the info for known declarations in the given file."""
- return _from_file(infile, handle_var=_handle_var)
-
-
-def look_up_variable(varid, knownvars, *,
- _lookup=_common.look_up_variable,
- ):
- """Return the known variable matching the given ID.
-
- "knownvars" is a mapping of ID to Variable.
-
- If no match is found then None is returned.
- """
- return _lookup(varid, knownvars)
diff --git a/Tools/c-analyzer/cpython/known.tsv b/Tools/c-analyzer/cpython/known.tsv
new file mode 100644
index 0000000000..a48ef02dc6
--- /dev/null
+++ b/Tools/c-analyzer/cpython/known.tsv
@@ -0,0 +1,3 @@
+filename funcname name kind declaration
+#filename funcname name kind is_supported declaration
+#??? - PyWideStringList typedef ???
diff --git a/Tools/c-analyzer/cpython/supported.py b/Tools/c-analyzer/cpython/supported.py
deleted file mode 100644
index 18786eefd8..0000000000
--- a/Tools/c-analyzer/cpython/supported.py
+++ /dev/null
@@ -1,398 +0,0 @@
-import os.path
-import re
-
-from c_analyzer.common.info import ID
-from c_analyzer.common.util import read_tsv, write_tsv
-
-from . import DATA_DIR
-
-# XXX need tests:
-# * generate / script
-
-
-IGNORED_FILE = os.path.join(DATA_DIR, 'ignored.tsv')
-
-IGNORED_COLUMNS = ('filename', 'funcname', 'name', 'kind', 'reason')
-IGNORED_HEADER = '\t'.join(IGNORED_COLUMNS)
-
-# XXX Move these to ignored.tsv.
-IGNORED = {
- # global
- 'PyImport_FrozenModules': 'process-global',
- 'M___hello__': 'process-global',
- 'inittab_copy': 'process-global',
- 'PyHash_Func': 'process-global',
- '_Py_HashSecret_Initialized': 'process-global',
- '_TARGET_LOCALES': 'process-global',
-
- # startup (only changed before/during)
- '_PyRuntime': 'runtime startup',
- 'runtime_initialized': 'runtime startup',
- 'static_arg_parsers': 'runtime startup',
- 'orig_argv': 'runtime startup',
- 'opt_ptr': 'runtime startup',
- '_preinit_warnoptions': 'runtime startup',
- '_Py_StandardStreamEncoding': 'runtime startup',
- 'Py_FileSystemDefaultEncoding': 'runtime startup',
- '_Py_StandardStreamErrors': 'runtime startup',
- 'Py_FileSystemDefaultEncodeErrors': 'runtime startup',
- 'Py_BytesWarningFlag': 'runtime startup',
- 'Py_DebugFlag': 'runtime startup',
- 'Py_DontWriteBytecodeFlag': 'runtime startup',
- 'Py_FrozenFlag': 'runtime startup',
- 'Py_HashRandomizationFlag': 'runtime startup',
- 'Py_IgnoreEnvironmentFlag': 'runtime startup',
- 'Py_InspectFlag': 'runtime startup',
- 'Py_InteractiveFlag': 'runtime startup',
- 'Py_IsolatedFlag': 'runtime startup',
- 'Py_NoSiteFlag': 'runtime startup',
- 'Py_NoUserSiteDirectory': 'runtime startup',
- 'Py_OptimizeFlag': 'runtime startup',
- 'Py_QuietFlag': 'runtime startup',
- 'Py_UTF8Mode': 'runtime startup',
- 'Py_UnbufferedStdioFlag': 'runtime startup',
- 'Py_VerboseFlag': 'runtime startup',
- '_Py_path_config': 'runtime startup',
- '_PyOS_optarg': 'runtime startup',
- '_PyOS_opterr': 'runtime startup',
- '_PyOS_optind': 'runtime startup',
- '_Py_HashSecret': 'runtime startup',
-
- # REPL
- '_PyOS_ReadlineLock': 'repl',
- '_PyOS_ReadlineTState': 'repl',
-
- # effectively const
- 'tracemalloc_empty_traceback': 'const',
- '_empty_bitmap_node': 'const',
- 'posix_constants_pathconf': 'const',
- 'posix_constants_confstr': 'const',
- 'posix_constants_sysconf': 'const',
- '_PySys_ImplCacheTag': 'const',
- '_PySys_ImplName': 'const',
- 'PyImport_Inittab': 'const',
- '_PyImport_DynLoadFiletab': 'const',
- '_PyParser_Grammar': 'const',
- 'Py_hexdigits': 'const',
- '_PyImport_Inittab': 'const',
- '_PyByteArray_empty_string': 'const',
- '_PyLong_DigitValue': 'const',
- '_Py_SwappedOp': 'const',
- 'PyStructSequence_UnnamedField': 'const',
-
- # signals are main-thread only
- 'faulthandler_handlers': 'signals are main-thread only',
- 'user_signals': 'signals are main-thread only',
- 'wakeup': 'signals are main-thread only',
-
- # hacks
- '_PySet_Dummy': 'only used as a placeholder',
- }
-
-BENIGN = 'races here are benign and unlikely'
-
-
-def is_supported(variable, ignored=None, known=None, *,
- _ignored=(lambda *a, **k: _is_ignored(*a, **k)),
- _vartype_okay=(lambda *a, **k: _is_vartype_okay(*a, **k)),
- ):
- """Return True if the given global variable is okay in CPython."""
- if _ignored(variable,
- ignored and ignored.get('variables')):
- return True
- elif _vartype_okay(variable.vartype,
-                       ignored and ignored.get('types')):
- return True
- else:
- return False
-
-
-def _is_ignored(variable, ignoredvars=None, *,
- _IGNORED=IGNORED,
- ):
- """Return the reason if the variable is a supported global.
-
- Return None if the variable is not a supported global.
- """
- if ignoredvars and (reason := ignoredvars.get(variable.id)):
- return reason
-
- if variable.funcname is None:
- if reason := _IGNORED.get(variable.name):
- return reason
-
- # compiler
- if variable.filename == 'Python/graminit.c':
- if variable.vartype.startswith('static state '):
- return 'compiler'
- if variable.filename == 'Python/symtable.c':
- if variable.vartype.startswith('static identifier '):
- return 'compiler'
- if variable.filename == 'Python/Python-ast.c':
- # These should be const.
- if variable.name.endswith('_field'):
- return 'compiler'
- if variable.name.endswith('_attribute'):
- return 'compiler'
-
- # other
- if variable.filename == 'Python/dtoa.c':
- # guarded by lock?
- if variable.name in ('p5s', 'freelist'):
- return 'dtoa is thread-safe?'
- if variable.name in ('private_mem', 'pmem_next'):
- return 'dtoa is thread-safe?'
- if variable.filename == 'Python/thread.c':
- # Threads do not become an issue until after these have been set
- # and these never get changed after that.
- if variable.name in ('initialized', 'thread_debug'):
- return 'thread-safe'
- if variable.filename == 'Python/getversion.c':
- if variable.name == 'version':
- # Races are benign here, as well as unlikely.
- return BENIGN
- if variable.filename == 'Python/fileutils.c':
- if variable.name == 'force_ascii':
- return BENIGN
- if variable.name == 'ioctl_works':
- return BENIGN
- if variable.name == '_Py_open_cloexec_works':
- return BENIGN
- if variable.filename == 'Python/codecs.c':
- if variable.name == 'ucnhash_CAPI':
- return BENIGN
- if variable.filename == 'Python/bootstrap_hash.c':
- if variable.name == 'getrandom_works':
- return BENIGN
- if variable.filename == 'Objects/unicodeobject.c':
- if variable.name == 'ucnhash_CAPI':
- return BENIGN
- if variable.name == 'bloom_linebreak':
- # *mostly* benign
- return BENIGN
- if variable.filename == 'Modules/getbuildinfo.c':
- if variable.name == 'buildinfo':
- # The static is used for pre-allocation.
- return BENIGN
- if variable.filename == 'Modules/posixmodule.c':
- if variable.name == 'ticks_per_second':
- return BENIGN
- if variable.name == 'dup3_works':
- return BENIGN
- if variable.filename == 'Modules/timemodule.c':
- if variable.name == 'ticks_per_second':
- return BENIGN
- if variable.filename == 'Objects/longobject.c':
- if variable.name == 'log_base_BASE':
- return BENIGN
- if variable.name == 'convwidth_base':
- return BENIGN
- if variable.name == 'convmultmax_base':
- return BENIGN
-
- return None
-
-
-def _is_vartype_okay(vartype, ignoredtypes=None):
- if _is_object(vartype):
- return None
-
- if vartype.startswith('static const '):
- return 'const'
- if vartype.startswith('const '):
- return 'const'
-
- # components for TypeObject definitions
- for name in ('PyMethodDef', 'PyGetSetDef', 'PyMemberDef'):
- if name in vartype:
- return 'const'
- for name in ('PyNumberMethods', 'PySequenceMethods', 'PyMappingMethods',
- 'PyBufferProcs', 'PyAsyncMethods'):
- if name in vartype:
- return 'const'
- for name in ('slotdef', 'newfunc'):
- if name in vartype:
- return 'const'
-
- # structseq
- for name in ('PyStructSequence_Desc', 'PyStructSequence_Field'):
- if name in vartype:
- return 'const'
-
-    # other definitions
- if 'PyModuleDef' in vartype:
- return 'const'
-
- # thread-safe
- if '_Py_atomic_int' in vartype:
- return 'thread-safe'
- if 'pthread_condattr_t' in vartype:
- return 'thread-safe'
-
- # startup
- if '_Py_PreInitEntry' in vartype:
- return 'startup'
-
- # global
-# if 'PyMemAllocatorEx' in vartype:
-# return True
-
- # others
-# if 'PyThread_type_lock' in vartype:
-# return True
-
- # XXX ???
- # _Py_tss_t
- # _Py_hashtable_t
- # stack_t
- # _PyUnicode_Name_CAPI
-
- # functions
- if '(' in vartype and '[' not in vartype:
- return 'function pointer'
-
- # XXX finish!
- # * allow const values?
- #raise NotImplementedError
- return None
-
-
-PYOBJECT_RE = re.compile(r'''
- ^
- (
- # must start with "static "
- static \s+
- (
- identifier
- )
- \b
- ) |
- (
- # may start with "static "
- ( static \s+ )?
- (
- .*
- (
- PyObject |
- PyTypeObject |
- _? Py \w+ Object |
- _PyArg_Parser |
- _Py_Identifier |
- traceback_t |
- PyAsyncGenASend |
- _PyAsyncGenWrappedValue |
- PyContext |
- method_cache_entry
- )
- \b
- ) |
- (
- (
- _Py_IDENTIFIER |
- _Py_static_string
- )
- [(]
- )
- )
- ''', re.VERBOSE)
-
-
-def _is_object(vartype):
- if 'PyDictKeysObject' in vartype:
- return False
- if PYOBJECT_RE.match(vartype):
- return True
- if vartype.endswith((' _Py_FalseStruct', ' _Py_TrueStruct')):
- return True
-
- # XXX Add more?
-
- #for part in vartype.split():
- # # XXX const is automatic True?
- # if part == 'PyObject' or part.startswith('PyObject['):
- # return True
- return False
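The classification done by `PYOBJECT_RE` and `_is_object()` above can be demonstrated with a deliberately trimmed re-creation; the real pattern covers many more type names, and this cut-down version is only meant to show the shape of the verbose regex:

```python
import re

# Trimmed re-creation of the PYOBJECT_RE idea: a declaration is
# "object-like" if it names an object type, optionally after "static".
OBJ_RE = re.compile(r'''
    ^
    ( static \s+ )?
    .*
    ( PyObject | PyTypeObject | _Py_Identifier )
    \b
    ''', re.VERBOSE)

def looks_like_object(vartype):
    return OBJ_RE.match(vartype) is not None
```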
-
-
-def ignored_from_file(infile, *,
- _read_tsv=read_tsv,
- ):
- """Yield a Variable for each ignored var in the file."""
- ignored = {
- 'variables': {},
- #'types': {},
- #'constants': {},
- #'macros': {},
- }
- for row in _read_tsv(infile, IGNORED_HEADER):
- filename, funcname, name, kind, reason = row
- if not funcname or funcname == '-':
- funcname = None
- id = ID(filename, funcname, name)
- if kind == 'variable':
- values = ignored['variables']
- else:
- raise ValueError(f'unsupported kind in row {row}')
- values[id] = reason
- return ignored
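The row handling in `ignored_from_file()` above keys each reason by (filename, funcname, name). The same shape can be sketched with the stdlib `csv` module standing in for `read_tsv()` (the sample row is illustrative):

```python
import csv
import io

# Rows shaped like ignored.tsv: filename, funcname, name, kind, reason.
TSV = ('filename\tfuncname\tname\tkind\treason\n'
       'Python/fileutils.c\t-\tforce_ascii\tvariable\tbenign\n')

ignored = {'variables': {}}
reader = csv.reader(io.StringIO(TSV), delimiter='\t')
next(reader)  # skip the header row
for filename, funcname, name, kind, reason in reader:
    if not funcname or funcname == '-':
        funcname = None  # '-' marks a variable not local to any function
    if kind != 'variable':
        raise ValueError(f'unsupported kind in row {(filename, name)}')
    ignored['variables'][(filename, funcname, name)] = reason
```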
-
-
-##################################
-# generate
-
-def _get_row(varid, reason):
- return (
- varid.filename,
- varid.funcname or '-',
- varid.name,
- 'variable',
- str(reason),
- )
-
-
-def _get_rows(variables, ignored=None, *,
- _as_row=_get_row,
- _is_ignored=_is_ignored,
- _vartype_okay=_is_vartype_okay,
- ):
- count = 0
- for variable in variables:
- reason = _is_ignored(variable,
- ignored and ignored.get('variables'),
- )
- if not reason:
- reason = _vartype_okay(variable.vartype,
- ignored and ignored.get('types'))
- if not reason:
- continue
-
- print(' ', variable, repr(reason))
- yield _as_row(variable.id, reason)
- count += 1
- print(f'total: {count}')
-
-
-def _generate_ignored_file(variables, filename=None, *,
- _generate_rows=_get_rows,
- _write_tsv=write_tsv,
- ):
- if not filename:
- filename = IGNORED_FILE + '.new'
- rows = _generate_rows(variables)
- _write_tsv(filename, IGNORED_HEADER, rows)
-
-
-if __name__ == '__main__':
- from cpython import SOURCE_DIRS
- from cpython.known import (
- from_file as known_from_file,
- DATA_FILE as KNOWN_FILE,
- )
- # XXX This is wrong!
- from . import find
- known = known_from_file(KNOWN_FILE)
- knownvars = (known or {}).get('variables')
- variables = find.globals_from_binary(knownvars=knownvars,
- dirnames=SOURCE_DIRS)
-
- _generate_ignored_file(variables)