VSLive! Speaker Interview—Dave Reed on Mapping Your Data Between XML and Your RDBMS

As Microsoft's General Manager of XML and Data Technologies, Dave Reed runs the SQL Server group's Web data team. He's driving the direction of XML and data access, including ADO.NET, MSXML, System.XML (the XML classes in the .NET Framework), ODBC, OLE DB, MDAC, and the SOAP Toolkit (the native implementation of SOAP). And a few years back he was on the original development team of Microsoft Transaction Server. We're proud to feature him as a keynoter at this year's VSLive! Orlando, where he'll be talking about Microsoft's roadmap for XML and SQL Server in the rapidly evolving data landscape. In this interview Dave will give you a sneak preview of his talk, along with other insights about where XML can (and can't) fit into your enterprise programming models.

XML represents a vast improvement over flat files for bridging disparate systems and providing for self-documenting data. It makes things like Web Services possible. It provides a great metagrammar for doing all this. XML is entrenched and ubiquitous, and has created a whole aftermarket of tools and expertise. This is all great, but do you think anyone is left out there who doesn't know about XML already?

It's true that we did a lot of work to make VS.NET (especially ASP.NET) a powerful Web Services development environment. We've provided a lot of framework there. And we provided rich XML support, including the XML Reader. We've gotten a lot of positive feedback about that API. I think if you look at the development community, you'll find a nontrivial number of people building Web services. That number is probably larger than you or I would guess, but it's still not the majority. So to the majority still not building Web services, we need to say that even if you're not building Web services in VS.NET you're still going to be touching XML in some capacity: in a file format, or as the way you obtain a business document through a trading partner. XML is appearing everywhere. Microsoft Office is going to be using an XML file format. We rearchitected it and refactored the APIs to reflect this new reality.

What's your vision regarding XML for Visual Studio developers?

The vision here is recognizing that your data is going to continue to live in relational form. But there's this tension in that a lot of data in the middle tier is now in XML format. So the core scenario is simple: back and forth data mapping between relational and XML. You're working with XML in the middle tier and data in relational form. You're not going to store your XML natively because of application requirements. Maybe you're processing XBRL documents. You'll have interesting data in there that needs relational mapping.

You could do this mapping by hand, using the DOM or some API like that, then process it manually using ADO.NET. In some situations today you have to do that anyway. It's required for complex mappings with many different tables and complex key relationships. That kind of scenario won't be supported until the next release. Otherwise the non-manual way is using SQLXML.
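As a rough illustration of the by-hand approach, the sketch below walks a small XML document and inserts its contents into two related tables. It is written in Python, with the standard library's `xml.etree.ElementTree` and `sqlite3` standing in for the DOM and ADO.NET. The order document, table names, and column names are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; element and attribute names are illustrative only.
ORDER_XML = """
<Order id="1001" customer="ACME">
  <Item sku="A-1" qty="2"/>
  <Item sku="B-7" qty="5"/>
</Order>
"""


def shred_order(conn, xml_text):
    """Walk the XML tree by hand and insert one parent row plus its child rows."""
    root = ET.fromstring(xml_text)
    order_id = int(root.get("id"))
    conn.execute(
        "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
        (order_id, root.get("customer")),
    )
    # Each <Item> becomes a row that carries the parent's key.
    for item in root.findall("Item"):
        conn.execute(
            "INSERT INTO order_items (order_id, sku, qty) VALUES (?, ?, ?)",
            (order_id, item.get("sku"), int(item.get("qty"))),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
shred_order(conn, ORDER_XML)

order_rows = conn.execute("SELECT order_id, customer FROM orders").fetchall()
item_rows = conn.execute(
    "SELECT order_id, sku, qty FROM order_items ORDER BY sku"
).fetchall()
```

Every new document shape needs its own version of `shred_order`, which is why hand mapping gets tedious once many tables and key relationships are involved.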

Here's how it works: You have an XML document, and you use a schema representing that document. Take that schema and annotate it with Microsoft metadata, mapping tags into columns. Then our middleware maps the document directly into the database, using XML Views or annotated schemas (they're the same thing).
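To make the annotation idea concrete, here is a minimal sketch in Python of a loader driven entirely by schema annotations. It is not SQLXML itself: the `sql:relation` and `sql:field` attributes are modeled on SQLXML's mapping-schema annotations (check the SQLXML documentation for the exact vocabulary), while the schema, the document, the table, and the functions `read_mapping` and `load_document` are hypothetical. SQLite stands in for the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"
SQL = "{urn:schemas-microsoft-com:mapping-schema}"

# Hypothetical annotated schema: sql:relation names the table, sql:field the column.
ANNOTATED_SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
  <xsd:element name="Customer" sql:relation="customers">
    <xsd:complexType>
      <xsd:attribute name="ID" type="xsd:string" sql:field="customer_id"/>
      <xsd:attribute name="Name" type="xsd:string" sql:field="company_name"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

DOCUMENT = """
<Customers>
  <Customer ID="C1" Name="Contoso"/>
  <Customer ID="C2" Name="Fabrikam"/>
</Customers>
"""


def read_mapping(schema_text):
    """Return {element name: (table, {attribute name: column})} from the annotations."""
    mapping = {}
    for element in ET.fromstring(schema_text).iter(XSD + "element"):
        table = element.get(SQL + "relation")
        if table is None:
            continue  # element is not mapped to a table
        columns = {
            attr.get("name"): attr.get(SQL + "field")
            for attr in element.iter(XSD + "attribute")
            if attr.get(SQL + "field")
        }
        mapping[element.get("name")] = (table, columns)
    return mapping


def load_document(conn, document_text, mapping):
    """Insert one row per mapped element, using only what the mapping says."""
    for node in ET.fromstring(document_text).iter():
        if node.tag not in mapping:
            continue
        table, columns = mapping[node.tag]
        column_list = ", ".join(columns.values())
        placeholders = ", ".join("?" for _ in columns)
        values = [node.get(attr_name) for attr_name in columns]
        # Table and column names come from the trusted schema, not from the document.
        conn.execute(
            f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})", values
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, company_name TEXT)")
mapping = read_mapping(ANNOTATED_SCHEMA)
load_document(conn, DOCUMENT, mapping)
loaded = conn.execute(
    "SELECT customer_id, company_name FROM customers ORDER BY customer_id"
).fetchall()
```

The point of the design is that changing the mapping means editing annotations in the schema rather than rewriting loader code.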

Do you have customers living the vision today?

We have case studies on customers such as JetBlue Airlines, and Accor, which runs the Motel 6 chain. They use XML and SQL Server at a data level for their e-procurement application. And there are lots more. It's growing like mad.

How representative are they?

Enterprise CIOs have come around in the last year or so. They're telling us one of their top concerns for the next few years is how to develop Web services—it's the thing their developers are focusing on the most. It's about being connected. You have a connected economy where a company is sharing information with suppliers, vendors, customers, and employees. Now we need to do it more efficiently with Web services, which reflect that reality. Probably the company that comes most to mind with this is Ford, who wants all suppliers connected through Web services. Next thing you know you have 20,000 companies wanting to build Web services.

What are the SQL Server group's plans for XQuery (XML Query)?

We're deeply involved with the W3C in developing the spec. We have three members on the technical committee and were the first company to put something on the Web for trial use. You can get it and play around with it. As for XQuery support in our products, we're committed to putting it into Yukon [the next major release of SQL Server]. We expect and hope that the spec will be complete by then. And we'll be doing some work on the update side as well, working alongside other companies (such as, potentially, IBM and Oracle) to come up with something to go into the standard. I don't know how much stronger we could be in our commitment.

What should developers be doing with XQuery today?

Until XQuery and a number of products supporting it are released, you should be learning about XQuery, but just to familiarize yourself. The thing with a more immediate impact is GXA.

Speaking of GXA, are there any plans to extend the SQL Server Web Services Toolkit to support emerging GXA specifications such as WS-Security?

In SQL Server 2000 we released XML support in the flavor of relational tables you can expose as XML, using a mapping layer to get results in XML form. We called that SQLXML. It wasn't just for queries but for updates as well. If you want to interact with the database completely in XML you can. And you can compare two XML docs. This technology is really what the Web Services Toolkit is.
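The relational-to-XML direction can be sketched in a few lines of Python. This is only an illustration of the idea of exposing query results as XML, not how SQLXML's mapping layer is implemented. The helper `rows_to_xml`, the table, and the element names are hypothetical, and SQLite stands in for SQL Server.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_name, row_name):
    """Run a query and wrap each row in an element, with one attribute per column."""
    cursor = conn.execute(query)
    column_names = [description[0] for description in cursor.description]
    root = ET.Element(root_name)
    for row in cursor:
        attributes = {name: str(value) for name, value in zip(column_names, row)}
        ET.SubElement(root, row_name, attributes)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id TEXT, company_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("C1", "Contoso"), ("C2", "Fabrikam")],
)

xml_result = rows_to_xml(
    conn,
    "SELECT customer_id, company_name FROM customers ORDER BY customer_id",
    "Customers",
    "Customer",
)
```

Here `xml_result` holds a `<Customers>` element with one `<Customer>` child per row, so a middle tier that speaks XML never has to touch a rowset directly.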

After we shipped the initial release of SQL 2000, XSD got standardized. The timing was perfect for VS.NET, but SQL 2000 had our own flavor of XML support. We needed to include XSD, so we did a subsequent release of SQL 2000 to do that, then recently re-released the Web Services Toolkit for SQL Server, incorporating XSD. The thing that made us associate the name with it is the added ability to invoke stored procedures through SOAP. As for the WS-Security spec, we're helping get that into OASIS (Organization for the Advancement of Structured Information Standards).

These things will emerge more quickly than many people anticipate, due to demand. There's a nontrivial number of people building Web services, and they're discovering we need more security. It's also a momentum thing. Microsoft wants to keep the innovation coming and meet the needs of the emerging scenarios. So Microsoft is pushing these standards. We're the primary company to do so, besides IBM. So the answer is yes. I don't have an exact timeframe, but toolkits supporting some of these specs will come out later on this year.

How are developers using SQLXML today?

I recently spent some time talking with developers at customer sites about this. One interesting scenario I saw is people who want to expose their UI right from the server. They layer XSLT on top of SQL Server, getting documents from the server using XSLT to generate a style sheet on the browser. For example, a financial services company we know of is doing this with a trading desk app. They have a lot of JavaScript running in the browser, with lots of logic in the back end.

What do developers need to know if they're .NET newbies?

The most common problem has been us not providing enough prescriptive guidance. Then they'll build an app in an inefficient way, based on prior experience. They end up building something that doesn't perform because it's outside our prescriptive design pattern.

So they need low-level stuff. The info that's most critical to them right now is how to most effectively build apps using the new technology: design guidance, sample code. And there are still some paradigm shifts in there that people are struggling with. ADO.NET is pretty foreign to ADO programmers. Though people come back after they finally get it and they love it.

You need to know about SOAP. We've completed SOAP 1.1, WSDL, and to a certain degree XML schema, because you can't really have SOAP interactions without that. And you need to know about UDDI, which is outside SOAP but relevant. It's pretty critical to get up to speed on this stuff. People are implementing based on this.

You need to know where you'd use the DataSet and where you wouldn't. There are two sides to the ADO.NET API. The provider API for the SQL client is simple. Then on top of that is this disconnected cached DataSet. You have a middle tier generating Web pages of read-only data, and you stick the data into the cache [kind of an ad hoc warehouse]. This has its uses, but it would be overkill if you were doing a simple dialog and filling in a form.

Also, people need to know the roadmap with respect to things such as XQuery, and all the W3C work that's going on.

What are you doing to help people over the humps?

To tackle the information gap we've put a tremendous amount of energy into getting many textbooks out. We've got some ADO.NET stuff coming out that looks pretty good.

We just released the 3.0 SOAP Toolkit for VS 6.0 users. We added DIME support: the ability to associate non-XML data, a fax document for example, with an XML payload. [DIME stands for Direct Internet Message Encapsulation, a message format protocol that simplifies certain interactions in Web services, such as the transmission of video, graphics, and audio files.]

And we just launched the MSDN Architecture Center to help people move to the revolutionary object-oriented .NET programming model—towards more of a multithreaded enterprise. VB6 was great and people extended it, often exploiting undocumented features. Now with VB.NET it's gotten a lot better. This parallels the move from client/server to multitier and over the Web. The architecture center will help with moving outside the firewall to disconnected, asynch operation, and in general with how to architect your solutions so you're not starting out with a bad design.

There's the whole community side as well. Gord Mangione as VP and I as general manager encourage our developers to interact with the various communities. My team is active on the relevant news groups. And not just MSDN and our own news groups.

Is a native XML database in the offing?

There's a lot of pressure—especially from smaller shops—about having native XML support in the database. They ask if we're going to have a native XML database. However, others don't care about XML in the database at all. Most of the interested people are on the document processing/content management side. They understand it as a data model.

I've done some research on the idea of a native XML database. I'm defining that as a database you access in XML style, using XQuery, XPath, and the like. So what you put into the data storage and what you get out is identical, including processing instructions and white space you'd lose if you stored the data relationally. An XML database would be for preserving the exact content of a document for document processing, or for legal reasons. Those are the two primary purposes. The physical representation is secondary. It's the logical representation that matters. And in the Yukon timeframe we will support these properties. They represent an important addition to the functionality.

You'll have two ways to achieve this logical representation. The more important one is a mapping scenario, which we already support. This lets you access a SQL Server database as if it were an XML database. The other way is storing XML docs in the server without mapping. That's the new functionality. But we don't think everyone will jump to do it. Many will want to keep storing data relationally.

What is your group doing to help with developing mobile apps?

We'll be releasing SQL Server CE 2.0 soon. I'll be discussing this in my keynote in Orlando. CE is our mobile, secure database for the Pocket PC. I think of CE from a SQL Server standpoint: we want to scale SQL Server up and down to the device, so you can build rich apps on Pocket PC devices. End users include salespeople and people with distributed apps, such as package delivery companies and mobile delivery forces in food distribution and healthcare. They use wireless with SQL Server but CE provides local storage. We replicate the data back and forth.

I'll also be discussing our new Notification Services. Have you ever used MSN Mobile? That's what it is: a class of apps sending timely, accurate, personalized information out to customers in the field. MSN had such a solution running on 16 machines, with several million subscribers. This involved a forward loop: they had to check every subscription. The MSN Mobile people decided it sounded like database stuff, and said to themselves "Let's bring in the SQL Server folks to do subscriptions and notifications." We came in and thought that many customers are probably building eventing apps like this. So instead of reinventing the wheel in a nonscalable way, we decided to package and ship it.
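The contrast between checking every subscription in a loop and treating the problem as "database stuff" can be sketched as follows. This is a minimal Python illustration using `sqlite3`, not a description of how Notification Services is built. The stock-alert schema, the sample data, and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriptions (subscriber TEXT, symbol TEXT, threshold REAL);
    CREATE TABLE events (symbol TEXT, price REAL);
""")
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?)",
    [("alice", "MSFT", 60.0), ("bob", "MSFT", 70.0), ("carol", "IBM", 100.0)],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("MSFT", 72.5), ("IBM", 90.0)],
)


def match_with_loop(conn):
    """Application-side matching: test every subscription against every event."""
    subscriptions = conn.execute(
        "SELECT subscriber, symbol, threshold FROM subscriptions"
    ).fetchall()
    events = conn.execute("SELECT symbol, price FROM events").fetchall()
    matches = []
    for subscriber, symbol, threshold in subscriptions:
        for event_symbol, price in events:
            if event_symbol == symbol and price >= threshold:
                matches.append((subscriber, event_symbol, price))
    return sorted(matches)


def match_with_join(conn):
    """Set-based matching: one join pairs the whole event batch with subscriptions."""
    return conn.execute("""
        SELECT s.subscriber, e.symbol, e.price
        FROM events AS e
        JOIN subscriptions AS s
          ON s.symbol = e.symbol AND e.price >= s.threshold
        ORDER BY s.subscriber
    """).fetchall()


loop_matches = match_with_loop(conn)
join_matches = match_with_join(conn)
```

Both functions return the same notifications, but the join hands the matching to the query engine, which can use indexes and process events in batches instead of visiting each subscription one at a time.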

Notification Services are built on SQL Server 2000 and the .NET Framework, with a set of APIs developers will be able to use as a free download on the day of the keynote at VSLive!. MSN Mobile, Expedia, and the NY Times are participating. Anyone who has to build a subscription engine and notifications should check this out. Not only is it based on .NET but it's extensible to listen to any type of event source, including Oracle, BizTalk, and SharePoint. Plus it's extensible in the notifications it can send out: to email, cell phone, PDA, .NET Alerts, MSN Messenger, and more. You can post to a database, a Web page, or write out a file. You can build a VB component and ship it out this way.

Will your new mobile products help you provide a viable alternative to Oracle's mobile offerings?

Oracle isn't widely used. Sybase is probably more of a competitor. Their mobile solution is better than Oracle's. But ours integrates with tools such as Embedded VB and C++. At VSLive! we'll show smart device extensions and a version of the .NET Framework running on a Pocket PC, the .NET Compact Framework. You'll be able to use this to build rich clients or Web apps running on a local device. Plus we'll show good admin tools, such as a query analyzer.

Can you let us in on anything else you'll be discussing in your keynote?

My keynote will include what we're doing with object-relational mapping. You can access data three ways: as data row sets relationally, as XML, and as O-R. I'll also cover the taxonomy and direction we're taking with the APIs in each of these three areas—and where you can use which APIs with what skill sets and uses.

Any last thoughts?

One way you can evaluate developer tools is to ask what the tools themselves are written in. For example, ADO.NET is written in C#. We use our own technology to build our own frameworks. And if you're not using your own tools, you're probably not doing something that's useful.



Dave Reed

As General Manager of XML and Data Technologies for Microsoft, Dave Reed drives the direction of XML and Data Access. Technologies that Dave is responsible for include MSXML, System.XML, ADO.NET, ODBC, OLE DB, and the SOAP Toolkit. Prior to leading the XML efforts at Microsoft, he was on the original development team of Microsoft Transaction Server version 1.0, through the release of COM+ in Windows 2000. Prior to joining Microsoft in 1995, Dave developed Saber-C++, a UNIX-based C/C++ programming environment at Saber Software, Inc.