| Interview with Frank Warmerdam |
|
This week we have the pleasure of interviewing Frank Warmerdam. Frank is well known in the Open Source GIS community for GDAL (Geospatial Data Abstraction Library)/OGR, which provides seamless access to a wide range of vector and raster GIS data formats. GDAL/OGR is used in virtually all of the popular Open Source GIS applications out there, including of course QGIS.
TS: Welcome, Frank Warmerdam, to this, the 10th in the QGISSER interview series! Can you begin by giving us a brief personal introduction? Where do you live, how many pets do you have, etc.?
I am a software developer, living in Toronto, Ontario, Canada. I have a wife and two kids (8 & 10). My wife has a cat.
TS: How did you become a 'geo-spatial programmer for hire'? Do you make a living exclusively by contracting?
While I was a student at the University of Waterloo, I did several co-op terms for a scientific visualization company. When I graduated in 1991 I decided I wanted to work for a scientific company, hopefully in something visualization related. I interviewed with PCI (now PCI Geomatics) and took the job, despite no background in GIS or geospatial. So, I am definitely a "learned on the job" sort of geospatial guy, rather than having a formal background in it.
In late 1998 I left PCI, and really wanted to be my own boss, and to ensure I would own the intellectual product of my labour. So I decided to strike out as a contract developer. I had been interested in open source since the late 80's, but had been pretty turned on in 1998 by "The Cathedral and the Bazaar" and it encouraged me to focus on open source development. I have been working for a wide variety of companies ever since, mostly on open source work, and making a respectable living.
TS: Could you give us a brief history of GDAL/OGR? Why did you start writing it?
I did a lot of work on data translation at PCI, including writing a TIFF implementation in FORTRAN (ack!). Circa 1993 I started a library at PCI called GDB (Generic Database). It was intended to be transparent read/write access to a variety of raster and vector geospatial formats. While I worked on a variety of other things while I was there, I think that was my greatest technical achievement. When I left PCI, I spent some time thinking about a niche I could work in. Something companies would be willing to farm out and that I could offer good value in. I decided that data access was a great niche I knew a lot about, and liked working on. So GDAL/OGR is in many senses the "next generation GDB".
TS: Did you start GDAL in a void or were other (open source) libraries trying to achieve the same end goal?
I was not, and still am not, aware of any open source libraries trying to address general raster (and vector) data access in the geospatial field. There are a number of projects with multi-format support built into the project, but not especially trying to offer the multi-format library as a separable item. In the commercial world, there was FME addressing this area for vector data, but mostly as an application rather than a library. In the raster realm, there were folks like Blue Marble offering some components for this but they didn't seem to have widespread use as library components. There were also lots of multi-format raster libraries aimed at general image formats, but without a geospatial focus. But the main influences in the formation of GDAL were GDB, libtiff and the Open GIS Consortium.
TS: Can you tell us a bit about how the GDAL development process works? How many contributions do you get from other developers, or are you mainly 'flying solo'?
I would say I am the primary developer, and rule the architecture as a dictator. Unlike projects such as MapServer, QGIS or GRASS, GDAL isn't exactly a "community" project. Andrey Kiselev (in St. Petersburg) has been a prolific contributor on the raster side of things. His work was often on subcontract through me from my clients, but he also does some work (e.g. HDF) driven by needs of his co-workers in Russia. There are also a number of other raster drivers contributed (and sometimes maintained) by other folks, such as Keyhole, the folks at ITC in the Netherlands, Gillian Walter at Atlantis Scientific and a number of others. On the vector side, Daniel Morissette is the largest co-contributor, having done the majority of the MapInfo support. There are also a variety of other contributors with fixes, new features, docs and improvements to the build system.
I hesitate to name too many more names, since I know how easy it is to miss important folks. So, there are definitely lots of folks out there supporting GDAL in a lot of ways.
TS: Write support in GDAL/OGR is not as complete as read. Is this because demand mainly lies with read access to data, or are there issues relating to many formats that make implementing write access more difficult than read access?
It is some of both factors. Support for esoteric formats is usually driven by the need to pull data into more generally useful formats. Many "product formats" like CEOS, HDF, TIGER, SDTS, etc. are things that usually only one or a few data production agencies would ever actually produce. Also, complex formats are often harder to write *properly* than to read. I often curse folks who implement crappy writers for formats since it makes my job writing readers so much more problem prone. So I try not to implement write support for a format unless I can be sure I am doing a decent job. But I feel it is less harmful to do an incomplete job in a reader. "Back in the day" I lost some hair implementing read support for USGS DEM files, because so many folks had written crappy writers that violated various rules I had come to depend on.
TS: Can you tell us about prospects of free and open source readers for MrSID and ECW? What kind of issues are involved with integrating these formats?
MrSID and ECW formats are both quite complex and proprietary. As such I am dubious about true open source implementations of either format without the support of the companies owning the existing rights. However, we have made excellent recent progress with getting support for these formats in open source software using their not-entirely-free libraries.
TS: Is it even possible to produce open drivers for these formats or do they have the IP locked down too tight?
There are likely to be patent issues with any independently developed implementation of MrSID or ECW formats. But then patents can impinge even on relatively open formats like JPEG2000 too. So for ECW and MrSID we are essentially dependent on what the vendors are willing to provide. In the case of ECW we have source code for the ECW SDK and this allows us to port it freely. However, it comes with various strings attached, including the onerous restriction about not being allowed to use it for "servers". For MrSID, the strings seem less onerous, but we are only provided with binary libraries, no source.
I would add I am trying to tread lightly in this area since ERMapper is my primary sponsor. Unlike some folks in the OSGIS world, I still want to maintain good working relationships with a lot of the commercial software vendors so I try not to be too pedantic, especially in public. I do work directly, and indirectly, for many of the big names.
TS: When you do contract work for companies, is it mainly to extend or add specific functionality to GDAL (as opposed to writing separate bespoke software)? If so, how do companies feel about the idea of intellectual property developed with their money entering the public domain?
Do they 'get the idea' that they are also gaining a lot by you bringing a mature stable library into the arena onto which you will build their software (thus saving them a lot of development time and costs)?
Nowadays most work is for folks using GDAL already, so there is generally no problem with the improvements being rolled back into the free software. When I got started it was sometimes a bit complicated to justify. I did an early contract for Intergraph and the way we satisfied their lawyers was that we let the software copyright be held by them, but under the normal GDAL license. Somehow that satisfied some suits that they were "getting something".
But part of the reason I picked file translators is that very few companies see it as a "crown jewels" sort of property. It is, rather, a cost of doing business. Nowadays some folks accept that the side effect of me building them something is an addition to GDAL. For others this is seen as a big plus, and they are eager to support GDAL since they have already gotten lots of benefit. But I am pleased that in general the open sourceness of the product has not been an issue. I would add that I take care to use a "business friendly" MIT/X license, and I am not particularly ideological in discussions with clients.
TS: Where do your plans for the future direction of GDAL lie? Where does major user demand lie in terms of features? Better performance? More data formats? Language bindings?
Kevin Ruland and Howard Butler have done a lot of work over the last few months for a "next generation" SWIG binding approach for GDAL. This should provide bindings in many more languages. So new bindings is a big direction, including Java and .NET compatible interfaces. Also, I am hoping to address thread safety and multi-threading performance issues at some point. Currently GDAL is completely not thread safe. Other than that, adding more drivers is the main direction.
Also, I am hoping for a "grand unification" between GDAL and OGR at some point. Currently they live in the same source tree, but have separate approaches to managing drivers, metadata, etc. I would like them to share this, and also to make it easier to support raster/vector mixed formats in a more sensible fashion than now.
GS: Let's switch gears a bit. Can you give us your "State of the OS GIS World" view?
That's a pretty broad question. I think the variety and maturity of open source geospatial offerings is continuing to improve year by year. I also see a greater willingness among the user and integrator community to use open source software. I often feel like we have a bit of a schism in OSGIS between the Java crowd (uDig, GeoTools, etc.) and the rest of the community using a mixture of C, C++, Python, Perl, etc. I would like to see more effective bridging of that divide. I am especially pleased with the growing momentum of the OS GEO conference. I think it can play a big role in bringing together the community.
GS: Do you think there are too many OS GIS offerings?
I'm a big fan of variety, and so I don't think there are too many. Of course, there is the risk of diffusing development resources so widely it is hard for any of the projects to do all they should. So, I definitely have to limit my direct participation to a few projects, even though I like to keep tabs on lots of what is going on.
GS: What are the major obstacles facing the OS GIS community?
One is finding funding models to support continued progress. Another is providing easier to use "binary installations" of individual tools, and suites of tools. Also, I think "data lockup" is a big obstacle to the use of OS GIS software by hobbyists, NGOs, or interest groups beyond the existing GIS crowd. This is a general drag on the geospatial field, but particularly frustrating in the OS GIS world where data availability is often the only thing holding back lots of interesting projects.
I am also concerned about software patents, though there have luckily only been a few instances of this affecting the geospatial field directly.
GS: What can be done to ease the entry into OS GIS? Easier installs, free data, live CDs and so forth?
Well, I think easier installs are very important. I'm hoping to have a "birds of a feather" meeting among some of the folks who prepare binary distributions at the OS GEO conference to see what we can do better. I think live CDs are neat, but ultimately not much more than a proof of concept solution. There is also lots of free data available for trying stuff, though it is often challenging to get the data for your region at the desired scale. But I am doubtful about us as a community being able to affect that much. I would add that better documentation is also key, and I have been very pleased to see the release of some books lately aimed at easier use. But mostly, I think producing some good pre-built versions of software is the highest-leverage place to start.
GS: The QGIS team is currently in the process of implementing projection support using proj4. Can you tell us the current state of the proj4 project?
PROJ.4 is pretty mature, and most of the work done on it in recent years has been related to datum shifting and other issues outside of the core support for "projections". Gerald Evenden, the original author, is now producing new releases of the core software as "libproj4", and my long term plan is to use that as the core for PROJ.4 with me maintaining all the "datum" and other stuff that Gerald doesn't want to focus on. I do think it is exciting how widely PROJ.4 is now used in the various projects, and I think it makes it easier for GDAL, MapServer, and GRASS users to all share coordinate system information. Hopefully that will extend to QGIS. That said, there are still challenges with PROJ.4.
There is good documentation in the form of the original PostScript white papers, but it isn't very accessible as quick look-up reference docs.
GS: You have partially addressed this question. One of the problems we encountered is lack of current documentation, or at least one volume of documentation. Is that going to change any time soon?
Not as far as I know. Gerald may produce new unified documentation for libproj4 which would help a lot. But that will still leave everything about datums, alternate prime meridians, EPSG lookup files and so forth separate.
GS: It has "leaked" that you have been nominated for the first ever annual Sol Katz award. How do you feel about that?
I am planning to counter by nominating several other people! Seriously, I think the award is a great idea. I interacted a bit with Sol back in the 90s, and he was definitely one of the few early movers in the world of data translator utilities. Also, at my first OGC meeting in 1999, they gave out the first Kenneth Gardels award for contribution to OGC and its community. I went to that meeting a bit doubtful of the motives of lots of the folks involved. But the speeches and discussions at the time of the award really helped me see that there are lots of folks making big contributions in the geospatial world. And that taking some time to acknowledge them is useful, and helps us take a longer view of our accomplishments as a community and where we are going. I hope that the Sol Katz awards can provide a similar focus to the open source GIS community. I would add there are lots of fantastic contributors in our community, and I can think of lots of folks deserving of such an award.
GS: We would like to thank you for your time today and especially for your work on GDAL/OGR/PROJ. It has made QGIS much of what it is today. Do you have any parting comments you would like to add?
Folks should try to make it to Open Source Geospatial '05 if they can! And, I think these interviews are great.
I look forward to seeing lots more.
TS: I would also like to thank you very much for your time for this interview and for the great stuff you do with GDAL/OGR!
My pleasure. |


