gPhoto2 Camera Library Developer's Guide

Scott Fritzinger
2000-07-26
Revision 1


1 Reverse Engineering the Camera Protocol

The most difficult part for most developers is obtaining the transfer
protocol. If the OEMs are lucky enough, they will simply provide
us with the protocol specifications for their cameras and the drivers
will be written at no cost to them. Most OEMs refuse to do so though,
citing trade secrets or company policy; this is truly unfortunate
in that they have effectively told their own customers who use operating
systems other than Windows and the Mac that they don't want their
future business and that they aren't valued customers to begin with.

When OEMs do not cooperate, the developer is left to determine the
protocol on their own through reverse engineering.

1.1 Sniffing the Protocol

What follows are the most common setups for sniffing camera protocol
traffic. In all setups, a host computer runs the native camera drivers;
typically, the Windows serial port drivers are used for reverse engineering.
The drivers are run through a series of operations that include getting
a picture index, downloading thumbnails, downloading full images, deleting
images, and changing camera configuration options, in addition to any
other features a camera might have. During these operations, one or more
of the following methods are used to capture the communication between
the host computer and the camera.

1.1.1 Serial Repeater

A serial repeater consists of the host computer, a computer used as
a repeater, and the camera. The setup is shown in the accompanying figure.

The repeater runs special software which reads data from one serial
port, logs the communication, and then outputs the data to the other
serial port. Data travelling in both directions, from the host computer
to the camera and from the camera to the host computer, is logged sequentially
in a single log file. Information logged includes hexadecimal data values,
the direction of the communication, and time stamps for synchronization.
An example sniffer to use for this configuration is ``sersniff''.
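
As a rough illustration of the kind of entry such a log might hold, the sketch below formats a time stamp, a direction, and the hexadecimal byte values into one line. It is not taken from ``sersniff''; the function name, the enum, and the line layout are all assumptions made for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Direction of a logged chunk (names are illustrative only). */
enum direction { HOST_TO_CAMERA, CAMERA_TO_HOST };

/*
 * Format one log entry as "<ms> <direction> <hex bytes>" into out.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small.
 */
int format_log_line(char *out, size_t outlen, unsigned long ms,
                    enum direction dir,
                    const unsigned char *data, size_t len)
{
    size_t i;
    int n, used;

    n = snprintf(out, outlen, "%8lu %s", ms,
                 dir == HOST_TO_CAMERA ? "host->cam" : "cam->host");
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    used = n;

    /* Append each byte as a space followed by two hex digits. */
    for (i = 0; i < len; i++) {
        n = snprintf(out + used, outlen - (size_t)used, " %02x", data[i]);
        if (n < 0 || (size_t)n >= outlen - (size_t)used)
            return -1;
        used += n;
    }
    return used;
}
```

Keeping both directions in one time-ordered file makes it easy to see which request triggered which response.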

1.1.2 ``Y'' Serial Cable

To avoid using two computers, a Y serial cable can be used. The ``trunk''
end of the serial cable attaches to the camera's serial transfer cable,
while the two ``branches'' plug in to two serial ports on the host
computer. The accompanying figure shows this setup.

The camera drivers use one of the serial ports on the host computer,
while the other port is opened with a hexadecimal monitor application
that dumps all communications on the port to a file. The downside
to this approach is that the developer has to determine which data
was generated by the camera and which by the host computer. Also, a
Y cable has to be either built or purchased from an electronics
supply store.

1.1.3 Virtual Device Driver Hooks

The Windows platform allows virtual device drivers to ``hook'' into
other drivers to provide additional functionality or feature enhancements.
A combination GUI and device driver named PortMon by Systems Internals
is a communications debugging utility that hooks into the existing
Windows serial device driver (vcomm.vxd) and logs communications.
The accompanying figure shows this equipment arrangement.

This setup requires no extra hardware and relies on software alone.
It is perhaps the easiest method for capturing camera data.

1.2 Making Sense Out of the Protocol

What follows are some pointers on decoding camera protocols. The examples
use a made-up protocol that isn't really any camera protocol in particular,
but they should demonstrate some commonalities between most camera protocols.

1. Cameras like to ping. This is in the form of an "ACK" command
  that is different for different cameras. It is usually
  a short packet (probably 1 byte) that is sent both ways so that
  the camera knows the computer is there, and vice versa. It is
  also sometimes used to wake up a camera that has gone into power-save
  mode. It usually starts the communication and also confirms
  each packet in any sort of "mass" transfer. The opposite, a "NAK",
  is sent to say that the last packet was not received or that an
  error has occurred. Again, this is usually just a single byte.
  
  Example:
  Computer: 01
  Camera : 01
  
  The Computer sent an ACK ("01") and the Camera responded with an
  ACK as well.

2. Transfers are usually in "reverse network order" (little-endian), meaning
  least significant bytes come before most significant bytes. For example,
  a 16-bit value sent as ``00 08'' should actually be reassembled as ``08 00''.

3. Most protocols use starting and stopping bytes. 
  
  Example:
  Computer: 03 50 00 0f e0 04 
  Camera : 03 03 00 3f 03 04 
  Computer: 01 
  
  For this example, notice the packets begin with "03" and end with
  "04" (don't pay attention to what is between them). Also notice
  the Computer sent an "ACK" to confirm it got the packet.

4. Packets usually have a "command" byte, which tells either the computer
  or the camera what to do. Let's say you told the software to retrieve
  the number of pictures, which at the time happened to be "8", and
  you got the following:
  
  Computer: 03 01 00 00 00 04 
  Camera : 03 01 00 00 08 04 
  Computer: 01
  
  In this example, you notice the "03" and "04" specifying the start
  and stop of the packet. Also, you notice the second byte in the
  Computer packet is "01". The camera responds with the above packet,
  and lo and behold, you see the number 8 in the same packet. It
  would appear, initially, that the second byte is used as a command
  byte, and that "01" tells the camera to return the number of
  pictures. This may very well be right, but don't jump into it yet.
  Make sure you look at a bunch of similar situations to confirm this.
  (Again, notice the "ACK" sent by the computer).

5. Most protocols have a "data size" byte(s) in data packets. Let's
  say that you told the camera to retrieve thumbnail 8 and you get
  the following:
  
  Computer: 03 02 00 00 08 04
  Camera : 03 02 00 0F (15 bytes) 04
  Computer: 01 
  
  OK, here's a brief breakdown of this transaction:
  
  - It looks like the command to retrieve a thumbnail is "02" (2nd byte
  in the computer packet), and that the byte that is "08" specifies
  which thumbnail to return.
  - The camera responds with a "02" in the command field, specifying
  it is returning a thumbnail, and then sends "0F" and 15 bytes of
  data.
  - It looks like the byte "0F" specifies how many data bytes follow it
  in the same packet. This is a data size byte.
  (Note: this is a simplistic example. No thumbnail will only be 15
  bytes :) This leads up to the next thing to consider.)

6. Most protocols have an "order" or "counter" byte. This is used so
  that, in large data transfers where the picture may be split up
  into several different packets, the computer knows how to reassemble
  all the data. The entire thumbnail more than likely will not be
  contained in a single packet for logistical reasons, so they break
  up the data into many different packets and give each packet a unique
  number (or ``order'' byte). Let's say you told your camera to return
  thumbnail 8 (which is, as mentioned, pretty big), and you get the
  following:
  
  Computer: 02 03 00 00 08 03 
  Camera : 02 03 00 0F (15 bytes) 03 
  Computer: 01
  Camera : 02 03 01 0F (15 bytes) 03 
  Computer: 01
  Camera : 02 03 02 0F (15 bytes) 03
  Computer: 01
  ... 5 more packets and ACKs ...
  Camera : 02 03 08 09 (9 bytes) 03
  Computer: 01
  
  You notice that the 3rd byte of each of the camera packets increments
  with each packet sent from the camera. This looks like it is an
  order (counter) byte. The computer can then reassemble the data
  from all the packets in order to reproduce the image.

7. Most protocols have some sort of error detection byte(s) at the end
  of the packet. This is usually a simple checksum (summation of bytes),
  or a CRC (a somewhat more complex algorithm that reduces the probability
  of accepting a corrupted packet by orders of magnitude). These bytes
  can take into account only the data, or maybe the entire packet
  excluding those error detection bytes. If this isn't a known scheme,
  this winds up being the hardest part of reimplementing the protocol.
  Let's take the above example again; this time we'll add a couple of
  bytes on the end for error detection:
  
  Computer: 02 03 00 00 08 03
  Camera : 02 03 00 0F (15 bytes) 0f 02 03
  Computer: 01
  Camera : 02 03 01 0F (15 bytes) 0e 00 03
  Computer: 02
  Camera : 02 03 01 0F (15 bytes) fa d0 03
  Computer: 01
  Camera : 02 03 02 0F (15 bytes) fa d0 03
  Computer: 01
  ... 5 more packets and ACKs
  Camera : 02 03 08 09 (9 bytes) d7 38 03
  Computer: 01
  
  Notice how the error detection bytes are usually different for each
  packet. These may be checksums, or CRCs, or something else. The only
  way to find out really is to try each one on different combinations
  of packet parts (data, order byte, command byte, etc...) and see
  if you get the same thing. Try this on the shorter packets to make
  life easier.
  
  Look at one more thing that sticks out in this transaction: for the
  packet with order byte ``01'', the Computer responded with a "02", and
  the Camera then resent the packet with the same order byte. This shows
  that the NAK byte is "02". This could happen because the error
  detection bytes didn't match the data, or maybe something else
  happened. Either way, the camera resent the last packet, and now
  you know how the camera can recover from transfer errors. If you
  didn't get the packet you were expecting, send the camera a NAK
  and it will resend the same packet again.
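
The pointers above can be pulled together in a small sketch that checks one packet of the made-up protocol from item 7: start byte, command byte, order byte, size byte, data, two error detection bytes, and stop byte. The error detection scheme used here (a 16-bit sum of the data bytes, stored least significant byte first) is only one candidate to try, not something the example traffic confirms, and every function and type name is invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Values taken from the made-up example protocol above. */
#define PKT_START 0x02
#define PKT_STOP  0x03
#define PKT_ACK   0x01
#define PKT_NAK   0x02

/* Reassemble two bytes that were sent least significant byte first. */
uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Candidate check value: 16-bit sum of the bytes (an assumption to test). */
uint16_t sum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum = (uint16_t)(sum + data[i]);
    return sum;
}

struct packet {
    uint8_t command;
    uint8_t order;
    uint8_t size;
    const uint8_t *data;   /* points into the caller's buffer */
};

/*
 * Expected layout:
 *   START, command, order, size, data[size], check lo, check hi, STOP
 * Returns PKT_ACK and fills *out on success, PKT_NAK otherwise.
 */
uint8_t parse_packet(const uint8_t *buf, size_t len, struct packet *out)
{
    size_t size;

    /* Seven bytes of overhead surround the data. */
    if (len < 7 || buf[0] != PKT_START || buf[len - 1] != PKT_STOP)
        return PKT_NAK;
    size = buf[3];
    if (len != size + 7)
        return PKT_NAK;
    if (sum16(buf + 4, size) != le16(buf + 4 + size))
        return PKT_NAK;

    out->command = buf[1];
    out->order = buf[2];
    out->size = buf[3];
    out->data = buf + 4;
    return PKT_ACK;
}
```

A receiver built this way would send the returned byte back to the camera, so a failed check turns into a NAK and a resend of the same packet.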

2 Understanding the gPhoto2 Design

The gPhoto2 design is the same three-tiered structure that has worked
extremely well in the past with other software packages. Here is a
listing of the 3 tiers, followed by the ``core'' that coordinates them:

* the camera library

* the I/O library

* the front-end

* the ``core''

2.1 Role of the Camera Library

The camera library is in charge of talking directly with the camera.
The library uses the gPhoto2 Camera API in order to provide a common
access-method for the library itself. Being dynamically linked, the
libraries are loaded at run-time depending on the camera model the
end-user would like to access. 

In order to provide flexibility with variations in camera design, there
are camera ``abilities'' which list, well, the abilities of each camera
model. Some cameras may support serial port connections only, while
others may be able to use USB and a serial port. We've run into cameras
that don't support thumbnailing on the camera, so there is an ``abilities''
field to specify whether or not the camera supports thumbnailing.
The ``abilities'' also list other things such as supported serial
transfer speeds, file deletion, and other functionality.
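
To make the idea concrete, here is a minimal sketch of what an abilities record and a lookup on it could look like. The structure, its field names, and the two camera models are hypothetical and exist only for this example; they are not the actual gPhoto2 type.

```c
#include <stddef.h>

/* Hypothetical stand-in for an abilities record; NOT the gPhoto2 type. */
struct example_abilities {
    const char *model;
    int serial;          /* non-zero if serial connections are supported */
    int usb;             /* non-zero if USB connections are supported */
    int speeds[8];       /* supported serial speeds, terminated by 0 */
    int file_preview;    /* non-zero if the camera can produce thumbnails */
    int file_delete;     /* non-zero if files can be deleted */
};

/* Two invented models with different capabilities. */
static const struct example_abilities example_models[] = {
    { "Example Cam 100", 1, 0, { 9600, 19200, 57600, 0 }, 1, 1 },
    { "Example Cam 200", 1, 1, { 57600, 115200, 0 }, 0, 1 },
};

/* Return the fastest serial speed a model lists, or 0 if none. */
int fastest_speed(const struct example_abilities *a)
{
    int best = 0;
    size_t i;

    for (i = 0; i < 8 && a->speeds[i] != 0; i++)
        if (a->speeds[i] > best)
            best = a->speeds[i];
    return best;
}

/* A front-end could skip its thumbnail view when this returns 0. */
int can_show_thumbnails(const struct example_abilities *a)
{
    return a->file_preview != 0;
}
```

Describing each model as data lets the rest of the system ask what a camera can do instead of hard-coding model names.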

The camera libraries only make function calls to the I/O library and
to the gPhoto2 core.

There is more information on the specifics of the camera library in
section 3 of this document.

2.2 Role of the I/O Library

The gPhoto2 I/O library is a platform-independent communications library
that supports serial, parallel, USB, firewire, and network connections.
It is a work-in-progress with a constantly expanding list of supported
platforms. This library uses the gPhoto2 I/O library API for accessing
communications devices. It enumerates the devices available on a system,
and provides read/write access.

The camera libraries all use the I/O library for communications with
the cameras. By having all communications go through a single
library, the camera libraries become as portable as the I/O library.
Porting gPhoto2 to other platforms becomes extremely easy.
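
The portability argument can be sketched with a small transport interface: camera code calls through function pointers and never touches a device directly, so only the back end changes per platform. Everything below is an assumption made for illustration, including the struct and function names; it is not the gPhoto2 I/O library API. The ACK value is borrowed from the made-up protocol in section 1.2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical transport interface; the real I/O library API differs. */
struct transport {
    int (*write)(void *ctx, const unsigned char *buf, size_t len);
    int (*read)(void *ctx, unsigned char *buf, size_t len);
    void *ctx;
};

/* Camera-side code sees only the interface, never the device. */
int send_ack(struct transport *t)
{
    unsigned char ack = 0x01;   /* ACK value from the example protocol */

    return t->write(t->ctx, &ack, 1);
}

/* In-memory loopback standing in for a serial or USB back end. */
struct loopback {
    unsigned char data[64];
    size_t used;
};

/* Append bytes to the buffer; returns the count or -1 if it is full. */
int loopback_write(void *ctx, const unsigned char *buf, size_t len)
{
    struct loopback *lb = ctx;

    if (len > sizeof lb->data - lb->used)
        return -1;
    memcpy(lb->data + lb->used, buf, len);
    lb->used += len;
    return (int)len;
}

/* Remove up to len bytes from the front; returns the count read. */
int loopback_read(void *ctx, unsigned char *buf, size_t len)
{
    struct loopback *lb = ctx;

    if (len > lb->used)
        len = lb->used;
    memcpy(buf, lb->data, len);
    memmove(lb->data, lb->data + len, lb->used - len);
    lb->used -= len;
    return (int)len;
}
```

Swapping the loopback for a real serial back end would leave send_ack untouched, which is the property the paragraph above relies on.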

There is more information on the specifics of the I/O library in section
3 of this document.

2.3 Role of the Front-end

The front-end is the application that the user interacts with. It is
usually a command-line program, or a graphical point-and-click interface.
The front-end talks only with the gPhoto2 core in order to retrieve
pictures and perform other functions with the camera.

2.4 Role of the gPhoto2 Core

The gPhoto2 ``core'' is the heart of gPhoto2. It provides services
to both the camera libraries and the front-ends. Most of the services
deal with error-checking and enumeration of devices (cameras, I/O
devices, etc...). The core performs validity checking on data passed
to/from the front-end or the camera library.

You could consider the core a translator/interpreter/spell-checker/army-general
in the ``big picture'' of gPhoto2. It does the grunt-work and performs
the coordination of the other parts.

3 Implementing the Library

gPhoto2 camera libraries use the gPhoto2 Camera API (CAPI) for implementation.
Here is a listing of the CAPI functions: 

camera_id 

camera_abilities 

camera_init 

camera_exit

camera_folder_list

camera_file_list 

camera_file_get

camera_file_get_preview

camera_file_put 

camera_file_delete 

camera_config_get

camera_config_set

camera_capture

camera_summary

camera_manual

camera_about


Section 3.1 details the purpose of each of these functions, while Section
3.2 discusses how to use the I/O library.

3.1 Camera API

The CAPI provides the full set of functions for doing various tasks
with the camera. All CAPI functions return either GP_OK for successful
execution, or GP_ERROR for a failure of execution.
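
A minimal sketch of how this return convention tends to be used inside a library: each step is checked, and the first failure is handed straight back to the caller. The numeric values of GP_OK and GP_ERROR below are placeholders in case the gPhoto2 headers are not included, and the three example_ functions are hypothetical helpers, not part of the CAPI.

```c
#include <stddef.h>

/* Placeholder values; the real ones come from the gPhoto2 headers. */
#ifndef GP_OK
#define GP_OK 0
#endif
#ifndef GP_ERROR
#define GP_ERROR (-1)
#endif

/* Hypothetical helper that fails when handed a NULL folder name. */
int example_open_folder(const char *folder)
{
    return folder != NULL ? GP_OK : GP_ERROR;
}

/* Hypothetical helper that fails when handed an empty file name. */
int example_delete_file(const char *filename)
{
    return (filename != NULL && filename[0] != '\0') ? GP_OK : GP_ERROR;
}

/* Stop at the first failing step and report the failure to the caller. */
int example_delete(const char *folder, const char *filename)
{
    if (example_open_folder(folder) != GP_OK)
        return GP_ERROR;
    if (example_delete_file(filename) != GP_OK)
        return GP_ERROR;
    return GP_OK;
}
```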

What follows is a listing of the functions, including prototypes and
data exchange:

3.1.1 camera_id

Purpose: Retrieve the unique id for the camera library.

Prototype: int camera_id (CameraText *id); 

Arguments: 

CameraText *id : unique string to represent the camera library


In order to guarantee that only one instance of the camera library
is loaded for each instance of the core, the camera library must copy
a unique string into the ``id''. Please consult the gPhoto developers
to determine which string you should use.


Example:

#include <string.h>

int camera_id(CameraText *id) {

        /* Copy this library's unique identifier into the caller's buffer. */
        strcpy(id->text, "my-unique-string");

        return (GP_OK);

}

3.1.2 camera_abilities

Purpose: Retrieve the list of supported cameras and the abilities for
each camera

Prototype: int camera_abilities (CameraAbilitiesList *list); 

Arguments:

CameraAbilities *abilities : the list of abilities for the supported
cameras

int *count : the number of 

3.1.3 camera_init

Purpose: Initialize the camera

Prototype: int camera_init (Camera *camera, CameraInit *init); 

Arguments:

3.1.4 camera_exit 

Purpose: Close the camera

Prototype: int camera_exit (Camera *camera); 

Arguments:

3.1.5 camera_file_list

Purpose: List the files in a particular folder on the camera

Prototype: int camera_file_list(Camera *camera, CameraList *list, char
*folder); 

Arguments:

3.1.6 camera_folder_list

Purpose: List the subfolders in a particular folder on the camera

Prototype: int camera_folder_list(Camera *camera, CameraList *list,
char *folder); 

Arguments:

3.1.7 camera_file_get

Purpose: Retrieve a file from the camera

Prototype: int camera_file_get (Camera *camera, CameraFile *file, char
*folder, char *filename); 

Arguments:

3.1.8 camera_file_get_preview

Purpose: Retrieve a file's preview from the camera

Prototype: int camera_file_get_preview (Camera *camera, CameraFile
*file, char *folder, char *filename); 

Arguments:

3.1.9 camera_file_put

Purpose: Place (upload) a file to the camera

Prototype: int camera_file_put (Camera *camera, CameraFile *file, char
*folder); 

Arguments:

3.1.10 camera_file_delete

Purpose: Delete a file from the camera

Prototype: int camera_file_delete (Camera *camera, char *folder, char
*filename); 

Arguments:

3.1.11 camera_config_get

Purpose: Retrieve the configuration window.

Prototype: int camera_config_get (Camera *camera, CameraWidget *window); 

Arguments:

3.1.12 camera_config_set

Purpose: Set camera configuration

Prototype: int camera_config_set (Camera *camera, CameraSetting *setting,
int count); 

Arguments:

3.1.13 camera_capture

Purpose: Retrieve live data from the camera

Prototype: int camera_capture (Camera *camera, CameraFile *file, CameraCaptureInfo
*info); 

Arguments:

3.1.14 camera_summary

Purpose: Retrieve the camera summary information

Prototype: int camera_summary (Camera *camera, CameraText *summary); 

Arguments:

3.1.15 camera_manual

Purpose: Retrieve the camera user's guide (manual)

Prototype: int camera_manual (Camera *camera, CameraText *manual); 

Arguments:

3.1.16 camera_about

Purpose: Retrieve information about the camera library

Prototype: int camera_about (Camera *camera, CameraText *about);

Arguments:

3.2 The gPhoto2 I/O Library