It’s been an extremely hectic but exciting week thus far thanks to the caBIG 2009 Annual Meeting. First, some background on the situation before I explain the exciting part. My focus within caBIG lies with the Imaging workspace, which seeks to provide both open source imaging applications and freely available radiological data sets to the community. A large chunk of my day job is spent working with its most mature tool, the National Biomedical Imaging Archive (NBIA). The purpose of this tool is to serve as an indexed data warehouse for DICOM medical images, and one of my major tasks at work is helping facilitate the accrual of scientifically interesting data sets into the archive for researchers to use.
DICOM is the standard for medical images these days, and DICOM files are structured very similarly to MP3 files. They consist of a header that contains a great many “tags,” or metadata, much like how you store the artist name, song name, etc. for an MP3. However, in DICOM there are thousands of tags that provide information on a wide range of things. Some of it is patient-centric, like patient name, height or weight, to name a few. There are also things like modality, slice thickness, software version or hardware manufacturer’s name that provide details about the machine the patient was lying on. There are lots of other types of tags, but I would end up having to write a book if I got into explaining all of that, and there are much smarter people like David Clunie who could give you better information on the subject.
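To make the tag idea a bit more concrete, here is a minimal Python sketch that models a header as a mapping from (group, element) tag numbers to values. The tag numbers and keywords are the standard DICOM ones for the attributes mentioned above, but the sample values and the `describe` helper are made up for illustration, and a real DICOM file is a binary format that you would read with a proper DICOM library rather than a plain dictionary.

```python
# Illustrative only: a DICOM header modeled as {(group, element): value}.
# The sample values are invented; real files are binary and need a DICOM parser.

# A handful of standard DICOM tags and their keywords.
TAG_NAMES = {
    (0x0008, 0x0060): "Modality",
    (0x0008, 0x0070): "Manufacturer",
    (0x0010, 0x0010): "PatientName",
    (0x0010, 0x1030): "PatientWeight",
    (0x0018, 0x0050): "SliceThickness",
    (0x0018, 0x1020): "SoftwareVersions",
}

# Hypothetical header for a single CT slice.
sample_header = {
    (0x0010, 0x0010): "DOE^JANE",
    (0x0010, 0x1030): 68.0,           # kilograms
    (0x0008, 0x0060): "CT",
    (0x0008, 0x0070): "ExampleCorp",  # made-up manufacturer
    (0x0018, 0x0050): 2.5,            # millimeters
    (0x0018, 0x1020): "1.0.0",
}


def describe(header):
    """Return one readable line per tag, ordered by tag number."""
    lines = []
    for tag in sorted(header):
        group, element = tag
        name = TAG_NAMES.get(tag, "Unknown")
        lines.append(f"({group:04X},{element:04X}) {name}: {header[tag]}")
    return lines


if __name__ == "__main__":
    for line in describe(sample_header):
        print(line)
```

Just like the ID3 tags in an MP3, each entry is a small labeled piece of metadata that travels with the file, separate from the pixel data itself.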
So when these images are submitted to the NBIA, the server parses the metadata found in a subset of these tags so that end users can then search on that information to find the type of images they want. The link I provided above points to the NBIA server hosted at the National Cancer Institute here in the United States, but thanks to caBIG’s federated grid structure, institutions across the world can also set up their own NBIA servers and choose to expose subsets of their own data sets across the grid to share with each other.
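The general idea of indexing a subset of tags and then searching on them can be sketched in a few lines of Python. To be clear, this is not how the NBIA is actually implemented; the choice of indexed tags, the image IDs, the sample headers, and the `build_index` and `search` helpers are all hypothetical and only meant to show the concept.

```python
# Conceptual sketch only, not NBIA's real implementation.
from collections import defaultdict

# Assumption: only these tags are indexed; everything else is ignored.
INDEXED_TAGS = ("Modality", "Manufacturer", "SliceThickness")


def build_index(images):
    """Map each (tag, value) pair to the set of image IDs that carry it."""
    index = defaultdict(set)
    for image_id, header in images.items():
        for tag in INDEXED_TAGS:
            if tag in header:
                index[(tag, header[tag])].add(image_id)
    return index


def search(index, **criteria):
    """Return the sorted IDs of images matching every tag=value criterion."""
    matches = None
    for tag, value in criteria.items():
        ids = index.get((tag, value), set())
        matches = set(ids) if matches is None else matches & ids
    return sorted(matches or [])


# Made-up submissions keyed by a made-up image ID.
images = {
    "img-001": {"Modality": "CT", "Manufacturer": "ExampleCorp",
                "SliceThickness": 2.5, "PatientName": "DOE^JANE"},
    "img-002": {"Modality": "MR", "Manufacturer": "ExampleCorp",
                "SliceThickness": 5.0},
    "img-003": {"Modality": "CT", "Manufacturer": "OtherVendor",
                "SliceThickness": 2.5},
}

index = build_index(images)

if __name__ == "__main__":
    print(search(index, Modality="CT"))
    print(search(index, Modality="CT", Manufacturer="ExampleCorp"))
```

Because only a subset of tags goes into the index, a query on a tag outside that subset (PatientName in this sketch) simply finds nothing, even though the value is still present in the original header.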
Now, on to the news from the conference. This year at caBIG’s annual meeting there were some great things announced regarding new advances in the development of Imaging workspace applications, as well as some interesting news about data sets being added to NCI’s NBIA server. The most exciting news centers on a project called The Cancer Genome Atlas (TCGA). Here is the official project summary from their site:
The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. TCGA is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which are both part of the National Institutes of Health, U.S. Department of Health and Human Services.
The overarching goal of TCGA is to improve our ability to diagnose, treat, and prevent cancer.
To support the vast amount of data accrual that has been occurring for this project, the NCI established the TCGA Data Portal. While this has been a fantastic pilot project to this point, one element that was not officially included in the data accrual at NCI was imaging. However, over the past couple of months the team I work on at the Cancer Imaging Program has been in constant communication with a few of the sites providing data to the TCGA project. I’m proud to report that these efforts have paid off greatly, resulting in a partnership with both Henry Ford and the University of California San Diego (UCSD) to retroactively obtain images for TCGA cases and load them to the NBIA here at the NCI! This was a critical step and the cornerstone in allowing the development of the new projects announced at this year’s caBIG meeting.
During the meeting it was officially announced that the Imaging workspace within caBIG would be starting a project of its own to add even more support to the TCGA initiatives via the addition of some new open source tools to enrich the data set. The goal is to provide visualization and annotation tools to accompany the image data that would allow physicians/researchers to load up the images, make measurements of the tumors, record text annotations, and then store them back to a data service that could be queried by others who are interested in reviewing them. This will be done by developing new plugins for two very popular existing medical image viewers and one NCI homegrown one. The first is ClearCanvas, which was developed for the Windows platform. The second is OsiriX (free, but I don’t think it’s fully open source), which was developed for Mac. There is also a third, cross-platform tool called AVT, created using yet another Imaging workspace product called the XIP Toolkit. XIP was designed for rapid imaging application development and is based on Qt, which all of you KDE folks already know and love.
Hopefully I will be able to report back on some positive progress in these areas prior to the RSNA annual meeting, which occurs at the end of November. I believe the current goal is to have functional applications ready in time to show off at that meeting!