{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "NumPy Support\n",
    "=============\n",
    "\n",
    "The magnitude of a Pint quantity can be of any numerical scalar type, and you are free\n",
    "to choose it according to your needs. For numerical applications requiring arrays, it is\n",
    "quite convenient to use [NumPy ndarray](http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html) (or [ndarray-like types supporting NEP-18](https://numpy.org/neps/nep-0018-array-function-protocol.html)),\n",
    "and therefore these are the array types supported by Pint.\n",
    "\n",
    "Pint follows Numpy's recommendation ([NEP29](https://numpy.org/neps/nep-0029-deprecation_policy.html)) for minimal Numpy/Python versions support across the Scientific Python ecosystem.\n",
    "This ensures compatibility with other third party libraries (matplotlib, pandas, scipy).\n",
    "\n",
    "First, we import the relevant packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import NumPy\n",
    "import numpy as np\n",
    "\n",
    "# Import Pint\n",
    "import pint\n",
    "ureg = pint.UnitRegistry()\n",
    "Q_ = ureg.Quantity\n",
    "\n",
    "# Silence NEP 18 warning\n",
    "import warnings\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    Q_([])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and then we create a quantity the standard way"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "legs1 = Q_(np.asarray([3., 4.]), 'meter')\n",
    "print(legs1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "legs1 = [3., 4.] * ureg.meter\n",
    "print(legs1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All usual Pint methods can be used with this quantity. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(legs1.to('kilometer'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(legs1.dimensionality)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    legs1.to('joule')\n",
    "except pint.DimensionalityError as exc:\n",
    "    print(exc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "NumPy functions are supported by Pint. For example if we define:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "legs2 = [400., 300.] * ureg.centimeter\n",
    "print(legs2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "we can calculate the hypotenuse of the right triangles with legs1 and legs2."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hyps = np.hypot(legs1, legs2)\n",
    "print(hyps)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that before the `np.hypot` was used, the numerical value of legs2 was\n",
    "internally converted to the units of legs1 as expected.\n",
    "\n",
    "Similarly, when you apply a function that expects angles in radians, a conversion\n",
    "is applied before the requested calculation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "angles = np.arccos(legs2/hyps)\n",
    "print(angles)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can convert the result to degrees using usual unit conversion:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(angles.to('degree'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Applying a function that expects angles to a quantity with a different dimensionality\n",
    "results in an error:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    np.arccos(legs2)\n",
    "except pint.DimensionalityError as exc:\n",
    "    print(exc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Function/Method Support\n",
    "-----------------------\n",
    "\n",
    "The following [ufuncs](http://docs.scipy.org/doc/numpy/reference/ufuncs.html) can be applied to a Quantity object:\n",
    "\n",
    "- **Math operations**: `add`, `subtract`, `multiply`, `divide`, `logaddexp`, `logaddexp2`, `true_divide`, `floor_divide`, `negative`, `remainder`, `mod`, `fmod`, `absolute`, `rint`, `sign`, `conj`, `exp`, `exp2`, `log`, `log2`, `log10`, `expm1`, `log1p`, `sqrt`, `square`, `cbrt`, `reciprocal`\n",
    "- **Trigonometric functions**: `sin`, `cos`, `tan`, `arcsin`, `arccos`, `arctan`, `arctan2`, `hypot`, `sinh`, `cosh`, `tanh`, `arcsinh`, `arccosh`, `arctanh`\n",
    "- **Comparison functions**: `greater`, `greater_equal`, `less`, `less_equal`, `not_equal`, `equal`\n",
    "- **Floating functions**: `isreal`, `iscomplex`, `isfinite`, `isinf`, `isnan`, `signbit`, `sign`, `copysign`, `nextafter`, `modf`, `ldexp`, `frexp`, `fmod`, `floor`, `ceil`, `trunc`\n",
    "\n",
    "And the following NumPy functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pint.numpy_func import HANDLED_FUNCTIONS\n",
    "print(sorted(list(HANDLED_FUNCTIONS)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And the following [NumPy ndarray methods](http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#array-methods):\n",
    "\n",
    "- `argmax`, `argmin`, `argsort`, `astype`, `clip`, `compress`, `conj`, `conjugate`, `cumprod`, `cumsum`, `diagonal`, `dot`, `fill`, `flatten`, `flatten`, `item`, `max`, `mean`, `min`, `nonzero`, `prod`, `ptp`, `put`, `ravel`, `repeat`, `reshape`, `round`, `searchsorted`, `sort`, `squeeze`, `std`, `sum`, `take`, `trace`, `transpose`, `var`\n",
    "\n",
    "Pull requests are welcome for any NumPy function, ufunc, or method that is not currently\n",
    "supported.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Array Type Support\n",
    "------------------\n",
    "\n",
    "### Overview\n",
    "\n",
    "When not wrapping a scalar type, a Pint `Quantity` can be considered a [\"duck array\"](https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html), that is, an array-like type that implements (all or most of) NumPy's API for `ndarray`. Many other such duck arrays exist in the Python ecosystem, and Pint aims to work with as many of them as reasonably possible. To date, the following are specifically tested and known to work:\n",
    "\n",
    "- xarray: `DataArray`, `Dataset`, and `Variable`\n",
    "- Sparse: `COO`\n",
    "\n",
    "and the following have partial support, with full integration planned:\n",
    "\n",
    "- NumPy masked arrays (NOTE: Masked Array compatibility has changed with Pint 0.10 and versions of NumPy up to at least 1.18, see the example below)\n",
    "- Dask arrays\n",
    "- CuPy arrays\n",
    "\n",
    "### Technical Commentary\n",
    "\n",
    "Starting with version 0.10, Pint aims to interoperate with other duck arrays in a well-defined and well-supported fashion. Part of this support lies in implementing [`__array_ufunc__` to support NumPy ufuncs](https://numpy.org/neps/nep-0013-ufunc-overrides.html) and [`__array_function__` to support NumPy functions](https://numpy.org/neps/nep-0018-array-function-protocol.html). However, the central component to this interoperability is respecting a [type casting hierarchy](https://numpy.org/neps/nep-0018-array-function-protocol.html) of duck arrays. When all types in the hierarchy properly defer to those above it (in wrapping, arithmetic, and NumPy operations), a well-defined nesting and operator precedence order exists. When they don't, the graph of relations becomes cyclic, and the expected result of mixed-type operations becomes ambiguous.\n",
    "\n",
    "For Pint, following this hierarchy means declaring a list of types that are above it in the hierarchy and to which it defers (\"upcast types\") and assuming all others are below it and wrappable by it (\"downcast types\"). To date, Pint's declared upcast types are:\n",
    "\n",
    "- `PintArray`, as defined by pint-pandas\n",
    "- `Series`, as defined by Pandas\n",
    "- `DataArray`, `Dataset`, and `Variable`, as defined by xarray\n",
    "\n",
    "(Note: if your application requires extension of this collection of types, it is available in Pint's API at `pint.compat.upcast_types`.)\n",
    "\n",
    "While Pint assumes it can wrap any other duck array (meaning, for now, those that implement `__array_function__`, `shape`, `ndim`, and `dtype`, at least until [NEP 30](https://numpy.org/neps/nep-0030-duck-array-protocol.html) is implemented), there are a few common types that Pint explicitly tests (or plans to test) for optimal interoperability. These are listed above in the overview section and included in the below chart.\n",
    "\n",
    "This type casting hierarchy of ndarray-like types can be shown by the below acyclic graph, where solid lines represent declared support, and dashed lines represent planned support:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from graphviz import Digraph\n",
    "\n",
    "g = Digraph(graph_attr={'size': '8,5'}, node_attr={'fontname': 'courier'})\n",
    "g.edge('Dask array', 'NumPy ndarray')\n",
    "g.edge('Dask array', 'CuPy ndarray')\n",
    "g.edge('Dask array', 'Sparse COO')\n",
    "g.edge('Dask array', 'NumPy masked array', style='dashed')\n",
    "g.edge('CuPy ndarray', 'NumPy ndarray')\n",
    "g.edge('Sparse COO', 'NumPy ndarray')\n",
    "g.edge('NumPy masked array', 'NumPy ndarray')\n",
    "g.edge('Jax array', 'NumPy ndarray')\n",
    "g.edge('Pint Quantity', 'Dask array', style='dashed')\n",
    "g.edge('Pint Quantity', 'NumPy ndarray')\n",
    "g.edge('Pint Quantity', 'CuPy ndarray', style='dashed')\n",
    "g.edge('Pint Quantity', 'Sparse COO')\n",
    "g.edge('Pint Quantity', 'NumPy masked array', style='dashed')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'Dask array')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'CuPy ndarray', style='dashed')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'Sparse COO')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'NumPy ndarray')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'NumPy masked array', style='dashed')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'Pint Quantity')\n",
    "g.edge('xarray Dataset/DataArray/Variable', 'Jax array', style='dashed')\n",
    "g"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Examples\n",
    "\n",
    "**xarray wrapping Pint Quantity**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import xarray as xr\n",
    "\n",
    "# Load tutorial data\n",
    "air = xr.tutorial.load_dataset('air_temperature')['air'][0]\n",
    "\n",
    "# Convert to Quantity\n",
    "air.data = Q_(air.data, air.attrs.pop('units', ''))\n",
    "\n",
    "print(air)\n",
    "print()\n",
    "print(air.max())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Pint Quantity wrapping Sparse COO**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sparse import COO\n",
    "\n",
    "np.random.seed(80243963)\n",
    "\n",
    "x = np.random.random((100, 100, 100))\n",
    "x[x < 0.9] = 0  # fill most of the array with zeros\n",
    "s = COO(x)\n",
    "\n",
    "q = s * ureg.m\n",
    "\n",
    "print(q)\n",
    "print()\n",
    "print(np.mean(q))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Pint Quantity wrapping NumPy Masked Array**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "m = np.ma.masked_array([2, 3, 5, 7], mask=[False, True, False, True])\n",
    "\n",
    "# Must create using Quantity class\n",
    "print(repr(ureg.Quantity(m, 'm')))\n",
    "print()\n",
    "\n",
    "# DO NOT create using multiplication until\n",
    "# https://github.com/numpy/numpy/issues/15200 is resolved, as\n",
    "# unexpected behavior may result\n",
    "print(repr(m * ureg.m))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Pint Quantity wrapping Dask Array**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import dask.array as da\n",
    "\n",
    "d = da.arange(500, chunks=50)\n",
    "\n",
    "# Must create using Quantity class, otherwise Dask will wrap Pint Quantity\n",
    "q = ureg.Quantity(d, ureg.kelvin)\n",
    "\n",
    "print(repr(q))\n",
    "print()\n",
    "\n",
    "# DO NOT create using multiplication on the right until\n",
    "# https://github.com/dask/dask/issues/4583 is resolved, as\n",
    "# unexpected behavior may result\n",
    "print(repr(d * ureg.kelvin))\n",
    "print(repr(ureg.kelvin * d))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**xarray wrapping Pint Quantity wrapping Dask array wrapping Sparse COO**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import dask.array as da\n",
    "\n",
    "x = da.random.random((100, 100, 100), chunks=(100, 1, 1))\n",
    "x[x < 0.95] = 0\n",
    "\n",
    "data = xr.DataArray(\n",
    "    Q_(x.map_blocks(COO), 'm'),\n",
    "    dims=('z', 'y', 'x'),\n",
    "    coords={\n",
    "        'z': np.arange(100),\n",
    "        'y': np.arange(100) - 50,\n",
    "        'x': np.arange(100) * 1.5 - 20\n",
    "    },\n",
    "    name='test'\n",
    ")\n",
    "\n",
    "print(data)\n",
    "print()\n",
    "print(data.sel(x=125.5, y=-46).mean())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compatibility Packages\n",
    "\n",
    "To aid in integration between various array types and Pint (such as by providing convenience methods), the following compatibility packages are available:\n",
    "\n",
    "- [pint-pandas](https://github.com/hgrecco/pint-pandas)\n",
    "- [pint-xarray](https://github.com/xarray-contrib/pint-xarray/)\n",
    "\n",
    "(Note: if you have developed a compatibility package for Pint, please submit a pull request to add it to this list!)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Additional Comments\n",
    "\n",
    "What follows is a short discussion about how NumPy support is implemented in Pint's `Quantity` Object.\n",
    "\n",
    "For the supported functions, Pint expects certain units and attempts to convert the input (or inputs). For example, the argument of the exponential function (`numpy.exp`) must be dimensionless. Units will be simplified (converting the magnitude appropriately) and `numpy.exp` will be applied to the resulting magnitude. If the input is not dimensionless, a `DimensionalityError` exception will be raised.\n",
    "\n",
    "In some functions that take 2 or more arguments (e.g. `arctan2`), the second argument is converted to the units of the first. Again, a `DimensionalityError` exception will be raised if this is not possible. ndarray or downcast type arguments are generally treated as if they were dimensionless quantities, whereas Pint defers to its declared upcast types by always returning `NotImplemented` when they are encountered (see above).\n",
    "\n",
    "To achive these function and ufunc overrides, Pint uses the ``__array_function__`` and ``__array_ufunc__`` protocols respectively, as recommened by NumPy. This means that functions and ufuncs that Pint does not explicitly handle will error, rather than return a value with units stripped (in contrast to Pint's behavior prior to v0.10). For more\n",
    "information on these protocols, see <https://docs.scipy.org/doc/numpy-1.17.0/user/basics.dispatch.html>.\n",
    "\n",
    "This behaviour introduces some performance penalties and increased memory usage. Quantities that must be converted to other units require additional memory and CPU cycles. Therefore, for numerically intensive code, you might want to convert the objects first and then use directly the magnitude, such as by using Pint's `wraps` utility (see [wrapping](wrapping.rst)).\n",
    "\n",
    "Attempting to access array interface protocol attributes (such as `__array_struct__` and `__array_interface__`) on Pint Quantities will raise an AttributeError, since a Quantity is meant to behave as a \"duck array,\" and not a pure ndarray."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}