Safe Haskell
Safe Haskell is an extension to the Haskell language that is implemented in
GHC as of version 7.2. It allows for unsafe code to be securely included in a
trusted code base by restricting the features of GHC Haskell the code is
allowed to use. Put simply, it makes the types of programs trustable. Safe
Haskell itself is aimed to be as minimal as possible while still providing
strong enough guarantees about compiled Haskell code for more advance secure
systems to be built on top of it. These include techniques such as
information flow control security or encrypted computations.
The design of Safe Haskell covers the following aspects:
A safe language dialect of
Haskell that provides guarantees about the code. It allows types and
module boundaries to be trusted.
A new safe import extension that specifies
that the module being imported must be trusted.
A definition of trust (or safety) and how it
operates, along with ways of defining and changing the trust of modules
and packages.
Uses of Safe Haskell
Safe Haskell has been designed with two use cases in mind:
Enforcing strict type safety at compile timeCompiling and executing untrusted codeStrict type-safety (good style)
Haskell offers a powerful type system and separation of pure and
effectual functions through the IO monad. There are
several loop holes in the type system though, the most obvious offender
being the unsafePerformIO :: IO a -> a function. The
safe language dialect of Safe Haskell disallows the use of such
functions. This can be useful for a variety of purposes as it makes
Haskell code easier to analyze and reason about. It also codifies an
existing culture in the Haskell community of trying to avoid using such
unsafe functions unless absolutely necessary. As such using the safe
language (through the flag) can be though of as a
way of enforcing good style, similar to the function of
.
Building secure systems (restricted IO Monads)
Safe Haskell is designed to give users enough guarantees about the safety
properties of compiled code so that secure systems can be built using
Haskell. A lot of work has been done with Haskell, building such systems
as information flow control security, capability based security, DSLs for
working with encrypted data... etc. These systems all rely on properties
of the Haskell language that aren't true in the general case where uses
of functions like unsafePerformIO are allowed.
As an example lets define an interface for a plugin system where the
plugin authors are untrusted, possibly malicious third-parties. We do
this by restricting the plugin interface to pure functions or to a
restricted IO monad that we have defined that only
allows a safe subset of IO actions to be executed. We
define the plugin interface here so that it requires the plugin module,
Danger, to export a single computation,
Danger.runMe, of type RIO (), where
RIO is a new monad defined as follows:
-- Either of the following Safe Haskell pragmas would do
{-# LANGUAGE Trustworthy #-}
{-# LANGUAGE Safe #-}
module RIO (RIO(), runRIO, rioReadFile, rioWriteFile) where
-- Notice that symbol UnsafeRIO is not exported from this module!
newtype RIO a = UnsafeRIO { runRIO :: IO a }
instance Monad RIO where
return = UnsafeRIO . return
(UnsafeRIO m) >>= k = UnsafeRIO $ m >>= runRIO . k
-- Returns True iff access is allowed to file name
pathOK :: FilePath -> IO Bool
pathOK file = {- Implement some policy based on file name -}
rioReadFile :: FilePath -> RIO String
rioReadFile file = UnsafeRIO $ do
ok <- pathOK file
if ok then readFile file else return ""
rioWriteFile :: FilePath -> String -> RIO ()
rioWriteFile file contents = UnsafeRIO $ do
ok <- pathOK file
if ok then writeFile file contents else return ()
We compile Danger using the new Safe Haskell flag:
{-# LANGUAGE Safe #-}
module Danger ( runMe ) where
runMe :: RIO ()
runMe = ...
Before going into the Safe Haskell details, lets point out some of
the reasons this design would fail without Safe Haskell:
The design attempts to restrict the operations that Danger
can perform by using types, specifically the RIO
type wrapper around IO. The author of Danger can
subvert this though by simply writing arbitrary
IO actions and using unsafePerformIO ::
IO a -> a to execute them as pure functions.
The design also relies on the Danger module not being able
to access the UnsafeRIO constructor.
Unfortunately Template Haskell can be used to subvert module
boundaries and so could be used gain access to this constructor.
There is no way to place restrictions on the modules that
the untrusted Danger module can import. This gives the author of
Danger a very large attack surface, essentially any package
currently installed on the system. Should any of these packages
have a vulnerability then the Danger module can exploit this. The
only way to stop this would be to patch or remove packages with
known vulnerabilities even if they should only be used by
trusted code such as the RIO module.
To stop these attacks Safe Haskell can be used. This is done by compiling
the RIO module with the flag and compiling
the Danger module with the flag.
The use of the flag to compile the Danger module
restricts the features of Haskell that can be used to a
safe subset. This includes
disallowing unsafePerfromIO, Template Haskell, pure
FFI functions, Generalized Newtype Deriving, RULES and restricting the
operation of Overlapping Instances. The flag also
restricts the modules can be imported by Danger to only those that are
considered trusted. Trusted modules are those compiled with
, where GHC provides a mechanical guarantee that
the code is safe. Or those modules compiled with
, where the module author claims that the
module is Safe.
This is why the RIO module is compiled with
, to allow the Danger module to import it.
The flag doesn't place any restrictions on
the module like does. Instead the module author
claims that while code may use unsafe features internally, it only
exposes an API that can used in a safe manner. There is an issue here as
may be used by an arbitrary module and
module author. Because of this for trustworthy modules to be considered
trusted, and so allowed to be used in compiled
code, the client C compiling the code must tell GHC that they trust the
package the trustworthy module resides in. This is essentially a way of
for C to say, while this package contains trustworthy modules that can be
used by untrusted modules compiled with , I trust
the author(s) of this package and trust the modules only expose a safe
API. The trust of a package can be changed at any time, so if a
vulnerability found in a package, C can declare that package untrusted so
that any future compilation against that package would fail. For a more
detailed overview of this mechanism see .
So Danger can import module RIO because RIO is marked trustworthy. Thus,
Danger can make use of the rioReadFile and rioWriteFile functions to
access permitted file names. The main application then imports both RIO
and Danger. To run the plugin, it calls RIO.runRIO Danger.runMe within
the IO monad. The application is safe in the knowledge that the only IO
to ensue will be to files whose paths were approved by the pathOK test.
Safe Language
The Safe Haskell safe language guarantees the
following properties:
Referential transparency — Functions
in the safe language are deterministic, evaluating them will not
cause any side effects. Functions in the IO monad
are still allowed and behave as usual. Any pure function though, as
according to its type, is guaranteed to indeed be pure. This property
allows a user of the safe language to trust the types. This means,
for example, that the unsafePerformIO :: IO a -> a
function is disallowed in the safe language.
Module boundary control — Haskell
code compiled using the safe language is guaranteed to only access
symbols that are publicly available to it through other modules export
lists. An important part of this is that safe compiled code is not
able to examine or create data values using data constructors
that it cannot import. If a module M establishes some invariants
through careful use of its export list then code compiled using the
safe language that imports M is guaranteed to respect those invariants.
Because of this, Template
Haskell and
GeneralizedNewtypeDeriving
are both disabled in the safe language as they can be used
to violate this property.
Semantic consistency — The safe
language is strictly a subset of Haskell as implemented by GHC. Any
expression that compiles in the safe language has the same meaning as
it does when compiled in normal Haskell. In addition, in any module
that imports a safe language module, expressions that compile both
with and without the safe import have the same meaning in both cases.
That is, importing a module using the safe language cannot change the
meaning of existing code that isn't dependent on that module. So for
example, there are some restrictions placed on the
Overlapping Instances
extension as it can violate this property.
These three properties guarantee that you can trust the types in the safe
language, can trust that module export lists are respected in the safe
language and can trust that code that successfully compiles using the safe
language has the same meaning as it normally would.
Lets now look at the details of the safe language. In the safe language
dialect (enabled by ) we disable completely the
following features:
GeneralizedNewtypeDeriving — It can
be used to violate constructor access control, by allowing untrusted
code to manipulate protected data types in ways the data type author
did not intend. For example can be used to break invariants of data
structures.TemplateHaskell — Is particularly
dangerous, as it can cause side effects even at compilation time and
can be used to access abstract data types. It is very easy to break
module boundaries with TH.
In the safe language dialect we restrict the following features:
ForeignFunctionInterface — This is
mostly safe, but foreign import declarations that import a function
with a non-IO type are disallowed. All FFI imports must reside in the
IO Monad.RULES — As they can change the
behaviour of trusted code in unanticipated ways, violating semantic
consistency they are restricted in function. Specifically any RULES
defined in a module M compiled with are
dropped. RULES defined in trustworthy modules that M imports are still
valid and will fire as usual.OverlappingInstances — This
extension can be used to violate semantic consistency, because
malicious code could redefine a type instance (by containing a more
specific instance definition) in a way that changes the behaviour of
code importing the untrusted module. The extension is not disabled for
a module M compiled with but restricted. While M
can define overlapping instance declarations, they can only overlap
other instance declaration defined in M. If in a module N that imports
M, at a call site that uses a type-class function there is a choice of
which instance to use (i.e. an overlap) and the most specific instances
is from M, then all the other choices must also be from M. If not, a
compilation error will occur. A simple way to think of this is a
same origin policy for overlapping instances
defined in Safe compiled modules.Safe Imports
Safe Haskell enables a small extension to the usual import syntax of
Haskell, adding a safe keyword:
impdecl -> import [safe] [qualified] modid [as modid] [impspec]
When used, the module being imported with the safe keyword must be a
trusted module, otherwise a compilation error will occur. The safe import
extension is enabled by either of the ,
, or
flags and corresponding PRAGMA's. When the flag
is used, the safe keyword is allowed but meaningless, every import
is required to be safe regardless.
Trust
The Safe Haskell extension introduces the following two new language flags:
-XSafe — Enables the safe language
dialect, asking GHC to guarantee trust. The safe language dialect
requires that all imports be trusted or a compilation error will
occur.-XTrustworthy — Means that while
this module may invoke unsafe functions internally, the module's
author claims that it exports an API that can't be used in an unsafe
way. This doesn't enable the safe language or place any restrictions
on the allowed Haskell code. The trust guarantee is provided by the
module author, not GHC. An import statement with the safe keyword
results in a compilation error if the imported module is not trusted.
An import statement without the keyword behaves as usual and can
import any module whether trusted or not.
Whether or not a module is trusted depends on a notion of trust for
packages, which is determined by the client C invoking GHC (i.e. you). A
package P is trusted when either C's package database
records that P is trusted (and no command-line
arguments override this), or C's command-line flags say to trust it
regardless of what is recorded in the package database. In either case, C
is the only authority on package trust. It is up to the client to decide
which packages they trust.
So a module M in a package P is trusted by a client C
if and only if:
Both of these hold:
The module was compiled with
All of M's direct imports are trusted by COR all of these hold:
The module was compiled with
All of M's direct safe imports are trusted by CPackage P is trusted by C
For the first trust definition the trust guarantee is provided by GHC
through the restrictions imposed by the safe language. For the second
definition of trust, the guarantee is provided initially by the
module author. The client C then establishes that they trust the
module author by indicating they trust the package the module resides
in. This trust chain is required as GHC provides no guarantee for
-XTrustworthy compiled modules.
Example
Package Wuggle:
{-# LANGUAGE Safe #-}
module Buggle where
import Prelude
f x = ...blah...
Package P:
{-# LANGUAGE Trustworthy #-}
module M where
import System.IO.Unsafe
import safe Buggle
Suppose a client C decides to trust package P. Then does C trust module
M? To decide, GHC must check M's imports — M imports
System.IO.Unsafe. M was compiled with , so
P's author takes responsibility for that import. C trusts P's author, so
C trusts M to only use its unsafe imports in a safe and consistent
manner with respect to the API M exposes. M also has a safe import of
Buggle, so for this import P's author takes no responsibility for the
safety, so GHC must check whether Buggle is trusted by C. Is it? Well,
it is compiled with , so the code in Buggle
itself is machine-checked to be OK, but again under the assumption that
all of Buggle's imports are trusted by C. Prelude comes from base, which
C trusts, and is compiled with (While
Prelude is typically imported implicitly, it still obeys the same rules
outlined here). So Buggle is considered trusted.
Notice that C didn't need to trust package Wuggle; the machine checking
is enough. C only needs to trust packages that contain
modules.
Package Trust
Safe Haskell gives packages a new Boolean property, that of trust.
Several new options are available at the GHC command-line to specify the
trust property of packages:
-trust P — Exposes package P if it
was hidden and considers it a trusted package regardless of the
package database.-distrust P — Exposes package P if
it was hidden and considers it an untrusted package regardless of the
package database.-distrust-all-packages — Considers
all packages distrusted unless they are explicitly set to be trusted
by subsequent command-line options.
To set a package's trust property in the package database please refer to
.
Safe Imports without Trust
If you are writing a module and want to import a module from an untrusted
author, then you would use the following syntax:
import safe Untrusted.Module
As the safe import keyword is a feature of Safe Haskell and not Haskell98
this would fail though unless you enabled Safe imports through on the of
the Safe Haskell language flags. Three flags enable safe imports,
and
. However and
do more then just enable the keyword which
may be undesirable. Using the language
flag allows you to enable safe imports and nothing more.
Safe Haskell Flag Summary
In summary, Safe Haskell consists of the following language flags:
-XSafeTo be trusted, all of the module's direct imports must be
trusted, but the module itself need not reside in a trusted
package, because the compiler vouches for its trustworthiness. The
"safe" keyword is allowed but meaningless in import statements,
every import is required to be safe regardless.
Module Trusted — YesHaskell Language — Restricted to Safe
LanguageImported Modules — All forced to be
safe imports, all must be trusted.-XTrustworthyThis establishes that the module is trusted, but the
guarantee is provided by the module's author. A client of this
module then specifies that they trust the module author by
specifying they trust the package containing the module.
has no effect on the accepted range
of Haskell programs or their semantics, except that they allow the
safe import keyword.
Module Trusted — Yes but only if the
package the module resides in is also trusted.Haskell Language — Unrestricted
Imported Modules — Under control of
module author which ones must be trusted.-XSafeImportEnable the Safe Import extension so that a module can
require a dependency to be trusted without asserting any trust
about itself.
Module Trusted — NoHaskell Language —
UnrestrictedImported Modules — Under control of
module author which ones must be trusted.