gPhoto2 Camera Library Developer's Guide
Scott Fritzinger
2000-07-26 Revision 1

1 Reverse Engineering the Camera Protocol

The most difficult part for most developers is obtaining the transfer protocol. If we are lucky, the OEM simply provides us with the protocol specifications for its cameras, and the drivers are written at no cost to the OEM. Most OEMs refuse to do so, though, citing trade secrets or company policy. This is truly unfortunate: they have effectively told their own customers who use operating systems other than Windows and the Mac that they don't want their future business and that they weren't valued customers to begin with. When OEMs do not cooperate, the developer is left to determine the protocol through reverse engineering.

1.1 Sniffing the Protocol

What follows are the most common setups for sniffing camera protocol traffic. In all setups, a host computer runs the native camera drivers; typically, the Windows serial port drivers are used for reverse engineering. The drivers are run through a series of functions that include getting a picture index, downloading thumbnails, downloading full images, deleting images, and setting camera configuration options, in addition to any other features a camera might have. During these operations, one or more of the following methods are used to capture the communication between the host computer and the camera.

1.1.1 Serial Repeater

A serial repeater consists of the host computer, a computer used as a repeater, and the camera. The setup is shown in the accompanying figure. The repeater runs special software which reads data from one serial port, logs the communication, and then outputs the data to the other serial port. Data travelling from the host computer to the camera and from the camera to the host computer is logged sequentially in a single log file. The information logged includes hexadecimal data values, the direction of the communication, and time stamps for synchronization.
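
To make the log contents described above concrete, here is a minimal sketch of how a repeater might format one log record. It is only an illustration: the function name format_log_line, the direction labels, and the "seconds.milliseconds" time stamp layout are all assumptions made for this example, not the output format of sersniff or any other existing tool.

```c
#include <stddef.h>
#include <stdio.h>

/* Direction of a logged chunk (hypothetical names, for illustration only). */
enum direction { HOST_TO_CAMERA, CAMERA_TO_HOST };

/*
 * Format one log record as "<sec>.<msec> <direction> <hex bytes>".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the output buffer is too small.
 */
int format_log_line(char *out, size_t out_size,
                    unsigned long sec, unsigned int msec,
                    enum direction dir,
                    const unsigned char *data, size_t len)
{
    size_t used;
    size_t i;
    int n = snprintf(out, out_size, "%lu.%03u %s", sec, msec,
                     dir == HOST_TO_CAMERA ? "host->cam" : "cam->host");

    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used = (size_t)n;

    for (i = 0; i < len; i++) {
        /* Each byte needs three characters (" xx") plus the final NUL. */
        if (out_size - used < 4)
            return -1;
        used += (size_t)snprintf(out + used, out_size - used,
                                 " %02x", (unsigned int)data[i]);
    }
    return (int)used;
}
```

Keeping both directions in one time-ordered file, with the direction marked on every record, is what later lets you read the capture as a conversation between the host computer and the camera.
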
An example sniffer to use for this configuration is ``sersniff''.

1.1.2 ``Y'' Serial Cable

To avoid using two computers, a Y serial cable can be used. The ``trunk'' end of the serial cable attaches to the camera's serial transfer cable, while the two ``branches'' plug in to two serial ports on the host computer. The accompanying figure shows this setup. The camera drivers use one of the serial ports on the host computer, while the other port is opened with a hexadecimal monitor application that dumps all communications on the port to a file. The downside to this approach is that the developer has to determine which data was generated by the camera and which by the host computer. Also, a Y cable has to be either built or purchased from an electronics supply store.

1.1.3 Virtual Device Driver Hooks

The Windows platform allows virtual device drivers to ``hook'' into other drivers to provide additional functionality or feature enhancements. PortMon by Systems Internals, a combination of GUI and device driver, is a communications debugging utility that hooks into the existing Windows serial device driver (vcomm.vxd) and logs communications. The accompanying figure shows this equipment arrangement. This setup needs no extra hardware because it relies on software alone. It is perhaps the easiest method for capturing camera data.

1.2 Making Sense Out of the Protocol

What follows are some pointers on decoding camera protocols. The examples use a protocol that isn't really any camera protocol in particular, but they should demonstrate some commonalities between most camera protocols.

1. Cameras like to ping. This takes the form of an "ACK" command that differs from camera to camera. It is usually a short packet (probably 1 byte) that is sent both ways so that the camera knows the computer is there, or vice versa. It is also sometimes used to wake up a camera that has gone into power-save mode.
It usually starts out the communications, and it also confirms each packet in any sort of "mass" transfer. The opposite, a "NAK", is sent to say that the last packet was not received or that an error has occurred. Again, this is usually just a single byte. Example:

    Computer: 01
    Camera  : 01

The Computer sent an ACK ("01") and the Camera responded with an ACK as well.

2. Transfers are usually in "reverse network order" (little-endian), meaning least significant bytes come before most significant bytes. For example, the bytes ``00 08'' as seen on the wire should actually be reassembled as ``08 00''.

3. Most protocols use starting and stopping bytes. Example:

    Computer: 03 50 00 0f e0 04
    Camera  : 03 03 00 3f 03 04
    Computer: 01

For this example, notice that the packets begin with "03" and end with "04" (don't pay attention to what is between them). Also notice that the Computer sent an "ACK" to confirm it got the packet.

4. Packets usually have a "command" byte, which tells either the computer or the camera what to do. Let's say you told the software to retrieve the number of pictures, which at the time happened to be "8", and you got the following:

    Computer: 03 01 00 00 00 04
    Camera  : 03 01 00 00 08 04
    Computer: 01

In this example, you notice the "03" and "04" marking the start and stop of the packet. You also notice that the second byte in the Computer packet is "01". The camera responds with the packet above and, lo and behold, you see the number 8 in it. It would appear, initially, that the second byte is used as a command byte, and that "01" tells the camera to return the number of pictures. This may very well be right, but don't jump to that conclusion yet. Make sure you look at a number of similar situations to confirm it. (Again, notice the "ACK" sent by the computer.)

5. Most protocols have a "data size" byte(s) in data packets.
Let's say that you told the camera to retrieve thumbnail 8 and you get the following:

    Computer: 03 02 00 00 08 04
    Camera  : 03 02 00 0F (15 bytes) 04
    Computer: 01

OK, here's a brief breakdown of this transaction:

- It looks like the command to retrieve a thumbnail is "02" (2nd byte in the computer packet), and the byte that is "08" specifies which thumbnail to return.
- The camera responds with a "02" in the command field, specifying that it is returning a thumbnail, and then sends "0F" followed by 15 bytes of data.
- It looks like the byte "0F" specifies how many bytes come after it in the same packet. This is a data size byte.

(Note: this is a simplistic example. No thumbnail will be only 15 bytes :) This leads up to the next thing to consider.)

6. Most protocols have an "order" or "counter" byte. This is used so that, in large data transfers where the picture may be split up into several different packets, the computer knows how to reassemble all the data. The entire thumbnail more than likely will not be contained in a single packet for logistical reasons, so the camera breaks the data up into many packets and gives each packet a unique number (the ``order'' byte). Let's say you told your camera to return thumbnail 8 (which is, as mentioned, pretty big), and you get the following:

    Computer: 02 03 00 00 08 03
    Camera  : 02 03 00 0F (15 bytes) 03
    Computer: 01
    Camera  : 02 03 01 0F (15 bytes) 03
    Computer: 01
    Camera  : 02 03 02 0F (15 bytes) 03
    Computer: 01
    ... 5 more packets and ACKs ...
    Camera  : 02 03 08 09 (9 bytes) 03
    Computer: 01

You notice that the 3rd byte of each of the camera packets increments with each packet sent from the camera. This looks like an order (counter) byte. The computer can then reassemble the data from all the packets in order to reproduce the image.

7. Most protocols have some sort of error detection byte(s) at the end of the packet.
This is usually a simple checksum (a summation of bytes) or a CRC (a somewhat more complex algorithm that reduces the probability of misdiagnosing a packet with errors by orders of magnitude). These bytes can take into account only the data, or maybe the entire packet excluding the error detection bytes themselves. If this isn't a known scheme, it winds up being the hardest part of reimplementing the protocol. Let's take the above example again; this time we'll add a couple of bytes on the end for error detection:

    Computer: 02 03 00 00 08 03
    Camera  : 02 03 00 0F (15 bytes) 0f 02 03
    Computer: 01
    Camera  : 02 03 01 0F (15 bytes) 0e 00 03
    Computer: 02
    Camera  : 02 03 01 0F (15 bytes) fa d0 03
    Computer: 01
    Camera  : 02 03 02 0F (15 bytes) fa d0 03
    Computer: 01
    ... 5 more packets and ACKs ...
    Camera  : 02 03 08 09 (9 bytes) d7 38 03
    Computer: 01

Notice how the error detection bytes are usually different for each packet. These may be checksums, CRCs, or something else. The only way to find out is to try each one on different combinations of packet parts (data, order byte, command byte, etc.) and see if you get the same result. Try this on the shorter packets to make life easier.

Look at one more thing that sticks out in this transaction: for the packet with order byte ``01'', the Computer responded with a ``02'', and the Camera then resent the packet it had just sent. This shows that the NAK byte is "02". This could happen because the error detection bytes didn't match the data, or because something else went wrong. Either way, the camera resent the last packet, and now you know how the camera can recover from transfer errors. If you didn't get the packet you were expecting, send the camera a NAK and it will resend the same packet.

2 Understanding the gPhoto2 Design

The gPhoto2 design is the same three-tiered structure that has worked extremely well in the past with other software packages.
Here is a listing of the components:

* the camera library
* the I/O library
* the front-end
* the ``core''

2.1 Role of the Camera Library

The camera library is in charge of talking directly with the camera. The library uses the gPhoto2 Camera API in order to provide a common access method for the library itself. Being dynamically linked, the libraries are loaded at run-time depending on the camera model the end-user would like to access.

In order to provide flexibility with variations in camera design, there are camera ``abilities'' which list, well, the abilities of each camera model. Some cameras may support serial port connections only, while others may be able to use both USB and a serial port. We've run into cameras that don't support thumbnailing on the camera, so there is an ``abilities'' field to specify whether or not the camera supports thumbnailing. The ``abilities'' also list other things such as supported serial transfer speeds, file deletion, and other functionality.

The camera libraries only make function calls to the I/O library and to the gPhoto2 core. There is more information on the specifics of the camera library in section 3.1 of this document.

2.2 Role of the I/O Library

The gPhoto2 I/O library is a platform-independent communications library that supports serial, parallel, USB, firewire, and network connections. It is a work-in-progress with a constantly expanding list of supported platforms. This library uses the gPhoto2 I/O library API for accessing communications devices. It enumerates the devices available on a system, and provides read/write access.

The camera libraries all use the I/O library for communications with the cameras. By having all communications go through a single library, the camera libraries become as portable as the I/O library, and porting gPhoto2 to other platforms becomes extremely easy. There is more information on the specifics of the I/O library in section 3.2 of this document.
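
As a rough illustration of why routing all communication through one library keeps the camera libraries portable, consider the sketch below. Everything in it is hypothetical: struct io_port, camera_ping, and the loopback backend are names invented for this example and are not the gPhoto2 I/O library API. The point is only that the camera-side code calls read and write through an abstract port and never touches an operating system interface directly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical port abstraction; NOT the real gPhoto2 I/O library API. */
struct io_port {
    int (*write)(struct io_port *port, const unsigned char *buf, size_t len);
    int (*read)(struct io_port *port, unsigned char *buf, size_t len);
    void *backend;               /* backend-private state */
};

/*
 * Camera-side code: send a one-byte ACK and expect the same byte back.
 * Only the function pointers are used, so this compiles unchanged on
 * any platform that has a backend. Returns 0 on success, -1 on failure.
 */
int camera_ping(struct io_port *port, unsigned char ack)
{
    unsigned char reply = 0;

    if (port->write(port, &ack, 1) != 1)
        return -1;
    if (port->read(port, &reply, 1) != 1)
        return -1;
    return reply == ack ? 0 : -1;
}

/* Loopback backend (stand-in for a serial or USB backend): echoes bytes. */
struct loopback {
    unsigned char buf[16];
    size_t len;
};

static int loopback_write(struct io_port *port,
                          const unsigned char *buf, size_t len)
{
    struct loopback *lb = port->backend;

    if (len > sizeof lb->buf - lb->len)
        return -1;               /* not enough room in the echo buffer */
    memcpy(lb->buf + lb->len, buf, len);
    lb->len += len;
    return (int)len;
}

static int loopback_read(struct io_port *port, unsigned char *buf, size_t len)
{
    struct loopback *lb = port->backend;

    if (len > lb->len)
        len = lb->len;           /* return only what is available */
    memcpy(buf, lb->buf, len);
    memmove(lb->buf, lb->buf + len, lb->len - len);
    lb->len -= len;
    return (int)len;
}

/* Attach the loopback backend to a port. */
void loopback_init(struct io_port *port, struct loopback *state)
{
    state->len = 0;
    port->write = loopback_write;
    port->read = loopback_read;
    port->backend = state;
}
```

In this sketch, supporting a new platform would mean writing one more backend with the same two functions; camera_ping and any other camera-side code would stay untouched.
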
2.3 Role of the Front-end

The front-end is the application that the user interacts with. It is usually a command-line program or a graphical point-and-click interface. The front-end talks only with the gPhoto2 core in order to retrieve pictures and perform other functions with the camera.

2.4 Role of the gPhoto2 Core

The gPhoto2 ``core'' is the heart of gPhoto2. It provides services to both the camera libraries and the front-ends. Most of the services deal with error-checking and enumeration of devices (cameras, I/O devices, etc.). The core performs validity checking on data passed to/from the front-end or the camera library. You could consider the core a translator/interpreter/spell-checker/army-general in the ``big picture'' of gPhoto2. It does the grunt-work and performs the coordination of the other parts.

3 Implementing the Library

gPhoto2 camera libraries use the gPhoto2 Camera API (CAPI) for implementation. Here is a listing of the CAPI functions:

    camera_id
    camera_abilities
    camera_init
    camera_exit
    camera_folder_list
    camera_file_list
    camera_file_get
    camera_file_get_preview
    camera_file_put
    camera_file_delete
    camera_config_get
    camera_config_set
    camera_capture
    camera_summary
    camera_manual
    camera_about

Section 3.1 details the purpose of each of these functions, while Section 3.2 discusses how to use the I/O library.

3.1 Camera API

The CAPI provides the full set of functions for doing various tasks with the camera. All CAPI functions return either GP_OK for successful execution or GP_ERROR for a failure of execution. What follows is a listing of the functions, including prototypes and data exchange:

3.1.1 camera_id

Purpose: Retrieve the unique id for the camera library.
Prototype: int camera_id (CameraText *id);
Arguments:
    CameraText *id : unique string to represent the camera library

In order to guarantee that only one instance of the camera library is loaded for each instance of the core, the camera library must copy a unique string into the ``id''.
Please consult the gPhoto developers to determine which string you should use. Example:

    #include <string.h>  /* strcpy */

    int camera_id(CameraText *id)
    {
        /* Copy this library's unique identifier into the caller's buffer. */
        strcpy(id->text, "my-unique-string");
        return (GP_OK);
    }

3.1.2 camera_abilities

Purpose: Retrieve the list of supported cameras and the abilities for each camera
Prototype: int camera_abilities (CameraAbilitiesList *list);
Arguments:
    CameraAbilities *abilities : the list of abilities for the supported cameras
    int *count : the number of

3.1.3 camera_init

Purpose: Initialize the camera
Prototype: int camera_init (Camera *camera, CameraInit *init);
Arguments: (not yet documented)

3.1.4 camera_exit

Purpose: Close the camera
Prototype: int camera_exit (Camera *camera);
Arguments: (not yet documented)

3.1.5 camera_file_list

Purpose: List the files in a particular folder on the camera
Prototype: int camera_file_list(Camera *camera, CameraList *list, char *folder);
Arguments: (not yet documented)

3.1.6 camera_folder_list

Purpose: List the subfolders in a particular folder on the camera
Prototype: int camera_folder_list(Camera *camera, CameraList *list, char *folder);
Arguments: (not yet documented)

3.1.7 camera_file_get

Purpose: Retrieve a file from the camera
Prototype: int camera_file_get (Camera *camera, CameraFile *file, char *folder, char *filename);
Arguments: (not yet documented)

3.1.8 camera_file_get_preview

Purpose: Retrieve a file's preview from the camera
Prototype: int camera_file_get_preview (Camera *camera, CameraFile *file, char *folder, char *filename);
Arguments: (not yet documented)

3.1.9 camera_file_put

Purpose: Place (upload) a file to the camera
Prototype: int camera_file_put (Camera *camera, CameraFile *file, char *folder);
Arguments: (not yet documented)

3.1.10 camera_file_delete

Purpose: Delete a file from the camera
Prototype: int camera_file_delete (Camera *camera, char *folder, char *filename);
Arguments: (not yet documented)

3.1.11 camera_config_get

Purpose: Retrieve the configuration window.
Prototype: int camera_config_get (Camera *camera, CameraWidget *window);
Arguments: (not yet documented)

3.1.12 camera_config_set

Purpose: Set camera configuration
Prototype: int camera_config_set (Camera *camera, CameraSetting *setting, int count);
Arguments: (not yet documented)

3.1.13 camera_capture

Purpose: Retrieve live data from the camera
Prototype: int camera_capture (Camera *camera, CameraFile *file, CameraCaptureInfo *info);
Arguments: (not yet documented)

3.1.14 camera_summary

Purpose: Retrieve the camera summary information
Prototype: int camera_summary (Camera *camera, CameraText *summary);
Arguments: (not yet documented)

3.1.15 camera_manual

Purpose: Retrieve the camera user's guide (manual)
Prototype: int camera_manual (Camera *camera, CameraText *manual);
Arguments: (not yet documented)

3.1.16 camera_about

Purpose: Retrieve information about the camera library
Prototype: int camera_about (Camera *camera, CameraText *about);
Arguments: (not yet documented)

3.2 The gPhoto2 I/O Library