INVITED COMMENTARY ON HOT ARTICLES
I read with great interest the recent paper by Karargyris et al on the use of elastic video interpolation in the three-dimensional (3-D) reconstruction of the digestive wall in capsule endoscopy (CE) videos. The article presents promising 3-D approaches in CE; owing to its technical nature, however, it may be somewhat difficult for the general gastroenterologist to follow. Therefore, I invited its first author to help me present the clinically relevant points. These authors believe that proper consideration should be given to its further clinical application in CE.
Wireless CE, a milestone in minimally invasive investigation of the gastrointestinal tract, was made possible by the advent of power-efficient, low-cost image sensors based on complementary metal oxide semiconductor (CMOS) technology[2,3]. When the world’s first gastro-camera made its debut in the 1950s, few could have envisaged that, just half a century later, imaging of the human digestive tract would become wireless[4-6]. However, this spectacular advance in medical endoscopy is not without its limitations. A complete capsule platform comprises six fundamental modules: locomotion, vision, telemetry, localisation, power and diagnostic/tissue manipulation tools[1,7]. To date, owing to space constraints, the majority of commercially available capsules include only a subset of these modules. Furthermore, power limitation is a major hurdle which current CE technology has yet to overcome[1,5,7].
Since the capsule needs 6-8 h to traverse the small-bowel[7,8], cameras within the currently marketed capsule endoscopes work at a capture rate of 2-3 frames per second (fps) in order to comply with power requirements. However, this low capture rate has an adverse effect on the smoothness of motion between consecutive frames and creates a visually unpleasant effect for the human eye[1,7]. Furthermore, shape is an important element in human perception, yet CE, unlike other diagnostic modalities (e.g., computed tomography and magnetic resonance imaging), suffers from a lack of 3-D information. In practice, acquiring true 3-D information will only be feasible with new-generation devices that have yet to be realised.
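To put the capture rate and transit time quoted above in perspective, the short calculation below estimates how many frames a single small-bowel examination produces. It is only an illustrative sketch: the function name `frame_count` is hypothetical, and the figures are simply the 6-8 h and 2-3 fps ranges given in the text.

```python
def frame_count(hours, fps):
    """Number of frames captured over a transit of `hours` at `fps` frames per second."""
    return int(hours * 3600 * fps)

# Lower and upper bounds taken from the ranges quoted in the text.
fewest = frame_count(6, 2)   # 6 h at 2 fps
most = frame_count(8, 3)     # 8 h at 3 fps

print(f"Frames per examination: {fewest} to {most}")
print(f"Gap between consecutive frames: {1 / 3:.2f} s to {1 / 2:.2f} s")
```

With these figures a single examination yields between 43200 and 86400 frames, with a third to half a second elapsing between consecutive frames, which is why motion between frames appears jerky to the reader.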
3-D technology is already in use; for example, a magnetometer can provide not only acceleration values on the three axes but also the 3-D orientation of the device (Figure 1A). Commercial time-of-flight range cameras (e.g., Microsoft’s Kinect Project, Figure 1B) already exist on the market, and in the near future they may be further improved and miniaturised for use inside a capsule endoscope. These cameras offer information on depth and colour. Moreover, 3-D guidance systems are already used in endoscopic surgery, offering 3-D position information of the sensor (Figure 1C). Therefore, by using the information acquired from these miniature sensors (orientation, acceleration, depth values, position, etc.) in conjunction with sophisticated registration software algorithms, an accurate 3-D representation of the digestive tract could be created.
Figure 1 Various sensors which are commercially available.
A: Miniature magnetometers offer orientation and acceleration information [Bertda Services (SEA) Pte. Ltd.]; B: Three-dimensional (3-D) range camera used in a widely available commercial product; C: 3-D guidance system used in endoscopy devices (inition).
3-D representation for endoscopy
To date, limited research has been carried out on developing methods and materials that offer 3-D representation of the digestive tract. For conventional endoscopy systems, stereo technology has been introduced to capture stereo images, derive depth information and thereby enable 3-D reconstruction of digestive structures. However, due to issues with size, such systems have not been widely accepted[1,5]. Likewise, in CE there has been a hardware approach that provides both 3-D information and texture in real time using an infrared projector and a CMOS camera. The major drawbacks of this system are its size, power consumption and packaging issues. Therefore, in order to tackle the problem of the current hardware limitations, Karargyris et al[1,12] proposed the use of a software approach to approximate a 3-D representation of the digestive tract surface utilising current CE technology.
Shape from shading
The Shape from Shading (SfS) technique is a member of a family of “shape recovery” algorithms called shape-from-X techniques[1,13]. Essentially, SfS algorithms try to recover the shape of objects by using the gradual variation of shading (Figure 2). SfS techniques can be divided into four groups or approaches: minimisation, propagation, local, and linear approaches[13,14].
Figure 2 The Shape from Shading flow[12,13].
Capturing a surface using a camera removes depth information. Shape from Shading techniques try to reproduce the missing depth information from a given image.
Karargyris et al propose using a specific sub-category of SfS methods, Tsai’s method (the linear approach)[1,15], because (1) it produces good results for spherical surfaces, which is the case with most shapes of digestive tract pathology, e.g., polyps; (2) it is very fast; and (3) it “behaves” relatively well with specular surfaces (surfaces with mirror-like reflection of light). Of note, in CE the light source axis and the miniature camera axis are essentially aligned. Tsai’s method works well on smooth objects and spherical structures. Additionally, Tsai’s method takes into consideration only the light direction, which in our case is 0° (since the lighting direction and that of the camera are parallel).
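For readers who wish to see what a linear SfS iteration looks like in practice, the following is a minimal sketch in the style of the Tsai-Shah linear approach, assuming a Lambertian surface and a greyscale frame. It is not the authors' implementation: the function name `tsai_shah_sfs`, the gain constants and the default slant are hypothetical choices made for illustration. In particular, the sketch defaults to a small non-zero slant, because with an exactly frontal light (0°) and a flat starting surface the derivative used in the update is zero and the iteration would not move; how the original software handles this case is not stated in the text.

```python
import numpy as np

def tsai_shah_sfs(image, slant=0.05, tilt=0.0, n_iter=10):
    """Estimate a relative depth map from one greyscale image (linear SfS sketch).

    slant and tilt give the light direction in radians. The default slant is an
    assumption made here to avoid a zero derivative on a flat starting surface.
    """
    E = np.asarray(image, dtype=float)
    span = E.max() - E.min()
    E = (E - E.min()) / span if span > 0 else np.zeros_like(E)

    # Light direction expressed as surface-gradient components (ps, qs).
    ps = np.cos(tilt) * np.tan(slant)
    qs = np.sin(tilt) * np.tan(slant)
    light_norm = np.sqrt(1.0 + ps**2 + qs**2)

    Z = np.zeros_like(E)        # depth estimate, flat start
    S = np.full_like(E, 0.01)   # per-pixel gain state (hypothetical starting value)
    noise = 1e-8                # keeps the gain finite where the slope is near zero

    for _ in range(n_iter):
        # Backward differences approximate the surface gradient (p, q);
        # the first column and row are compared with themselves (gradient 0).
        p = np.diff(Z, axis=1, prepend=Z[:, :1])
        q = np.diff(Z, axis=0, prepend=Z[:1, :])
        pq = 1.0 + p**2 + q**2
        dot = 1.0 + p * ps + q * qs

        # Lambertian brightness predicted from the current depth map.
        R = np.maximum(0.0, dot / (np.sqrt(pq) * light_norm))
        # Derivative of the unclipped reflectance with respect to Z(x, y).
        dR = (ps + qs) / (np.sqrt(pq) * light_norm) \
            - (p + q) * dot / (pq**1.5 * light_norm)

        f = E - R                            # brightness residual
        df = -dR                             # derivative of the residual
        K = S * df / (noise + df * S * df)   # damped approximation of 1 / df
        S = (1.0 - K * df) * S
        Z = Z - K * f                        # Newton-style update
    return Z
```

Calling `tsai_shah_sfs(frame)` on a 2-D array of pixel intensities returns an array of the same shape holding relative (not metric) depth values, which can then be rendered as a surface. The output is only meaningful up to scale, and the sketch ignores camera distortion.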
Correction for camera distortion is not performed because there are insufficient data on the specifications of the capsule camera optics. The result of applying Tsai’s method to a CE frame is given in Figure 3. Figure 3A shows a polyp in a conventional 2-D CE frame, whereas in Figure 3B the same polyp is presented using the 3-D software. The SfS approach gives promising outcomes, especially for visualising polyps and vascular lesions, but less so for ulcers or lymphangiectasia (Figure 4). For polyps and vascular lesions, the 3-D result is rather exciting, supporting the argument for further evaluation of this technique for use in clinical practice. Additionally, a 3-D representation may be helpful in the design of more accurate and robust computer-aided detection algorithms, incorporating other image enhancement tools e.g., virtual chromoendoscopy (FICE) or colour (blue) mode analysis of CE videos instead of still CE frames. However, prior to any software integration, further qualitative and accuracy assessment should take place (work currently undertaken by these authors).
Figure 3 Original capsule endoscopy and 3-D represented image depicting a protrusion.
A: Original capsule endoscopy image depicting a protrusion (polyp); B: Its 3-D represented image processed by the proposed Shape from Shading approach.
Figure 4 Two-dimensional capsule endoscopy images (upper panel) and three-dimensional representations of the same structures (lower panel).
A1, A2: P2 angioectasia; B1, B2: P1 angioectasia; C1, C2: Lymphoid hyperplasia with superficial ulceration; D1, D2: Aphthous ulcers; E1, E2: Serpiginous ulcers; F1, F2: Nodular lymphangiectasia; G1, G2: Another capsule endoscope still inside the small-bowel.
Conversely, Tsai’s method fails in cases of sudden, large intensity change, because it relies on the preservation of the intensity gradient. In non-artificial images such as CE frames, however, the brightness (image pixel values) generally transitions smoothly with no abrupt changes. Highlights are an exception: they can be seen in Figure 5 and introduce false information about the shape. Highlights are essentially linear combinations of the specular and diffuse (light reflected at various angles) reflection components of the surface.
Figure 5 3-D representation of a single video capsule image with highlight removal (arrows).
Many objects in the real world are dielectric and homogeneous, hence displaying both types of reflection. Most digestive structures fall into this category. When light beams fall on such an object, some of them reflect back immediately, creating the specular reflection, while the rest first penetrate the object and then reflect, creating the diffuse reflection. Alongside the 3-D representation algorithm, a highlight suppression scheme is applied to the original CE images to produce better results, whilst maintaining the shapes and structures of the digestive tract. Figure 5 shows successful highlight removal without affecting the shapes of other objects. It should be mentioned that highlights from light reflections on the surface of the digestive tract remain an open problem, not only for 3-D representation but also for traditional CE review.
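The text does not specify which highlight suppression scheme is used, so the sketch below shows only one simple, generic possibility and should not be read as the authors' method. It assumes that specular highlights appear as very bright, nearly colourless pixels, marks them with two hypothetical thresholds (`value_thresh` and `sat_thresh`), and fills them from their neighbours by repeated averaging so that the surrounding structures are left untouched.

```python
import numpy as np

def suppress_highlights(rgb, value_thresh=0.9, sat_thresh=0.2, n_iter=50):
    """Replace bright, low-saturation pixels with values diffused from neighbours.

    Returns the corrected image (floats in [0, 1]) and the boolean highlight mask.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    img = np.asarray(rgb, dtype=float)
    if img.max() > 1.0:          # assume 8-bit input and rescale to [0, 1]
        img = img / 255.0

    vmax = img.max(axis=2)
    vmin = img.min(axis=2)
    saturation = (vmax - vmin) / np.maximum(vmax, 1e-12)
    mask = (vmax >= value_thresh) & (saturation <= sat_thresh)

    out = img.copy()
    for _ in range(n_iter):
        # Average of the four direct neighbours, with edges replicated.
        padded = np.pad(out, ((1, 1), (1, 1), (0, 0)), mode="edge")
        neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                      + padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
        # Only highlight pixels are overwritten; all others keep their value.
        out[mask] = neighbours[mask]
    return out, mask
```

In a pipeline of the kind described above, the corrected image would be converted to greyscale and passed to the shape recovery step, so that the bright spots are not misread as sharp protrusions.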
In conclusion, the software by Karargyris et al has the potential to reform and enhance the existing reading software in capsule endoscopy by improving lesion demarcation and highlighting the textural features of ulcers, angioectasias and polyps[5,19]. Further work is needed to prove its clinical validity, but the idea of a 3-D aid (with the present level of CE technology) seems, at least to this author, not only captivating but promising as well. However, it should not be forgotten that true 3-D capability requires dual video images, although the inclusion of two cameras within the shell of a capsule endoscope might be unwieldy at present.