Generating Revenue from Information Infrastructure

Thoughts on Infrastructure Business

A Historical Context

Numerous historical examples indicate that inventions of infrastructure systems, whether physical or intellectual, do more to expand economic growth than perhaps any other human creation. Some examples are written language, the printing press, the railroad, electricity, ancient aqueducts, counting systems, and of course the Internet. Innovation of infrastructure systems is not without significant risk, however. More often than not such inventions are business failures before they are adapted into a successful format, and even then it is typically unknown second- and third-order effects that drive the greatest revenue growth.

The machine-powered railroad was invented in late 18th century England. Several revisions were attempted, but the idea was at first not widely adopted in either England or Continental Europe. When applied in the United States the idea was adopted slowly and for short distances starting in the early 1830s. By the 1850s railroads ran over long distances. This new infrastructure ushered in the American industrial revolution and quickly transformed the American economy. It should be noted that several of the wealthiest persons in human history, when adjusted for inflation, were Americans who ran industrial businesses shortly after this period.

The railroad example is repeated in a similar manner with computers. The first computers were large mainframes, but the companies (and individuals) that eventually generated the most money were software companies. The example repeats itself with the introduction of the World Wide Web. Many of the first web companies failed, and the investment failures culminated in the "Dot-com Bubble". In the following years several web companies would grow to be among the largest companies in the world, and their leaders would become among the wealthiest living persons.

Infrastructure Revenue Models

New infrastructure allows for the creation of new revenue. This new revenue can be realized directly, indirectly through access to transmission services and applications, or through second- and third-order effects entirely unknown to the infrastructure's designers.

The direct revenue model is about charging money to access the infrastructure. Examples include a toll road, where a driver is charged every time they use the road, or a subway or city bus, where a commuter pays for every ride. Direct revenue is perhaps the simplest and most visible means of returning value from the new infrastructure, but it also limits access to it. Limited access means the total value of the infrastructure is less because the quantity of revenue transactions is smaller. With limited access the growth of adoption is substantially slowed.

Indirectly generating revenue from supporting applications and services is far less limiting to infrastructure traffic than a direct revenue model. In this model anybody is free to access the new infrastructure at their liberty, and revenue is generated from access to helpful features. While this model is far less limiting than the direct model, it still limits the total value of the new infrastructure: charging for access to premium features keeps users from the infrastructure's full potential.

This model is particularly limiting to developers and laypersons interested in experimenting with and extending the infrastructure with new features and innovations. Such efforts are responsible for ideas and inventions different from those envisioned by the original infrastructure designers, which means second- and third-order effects are slowed or prevented.

The third model of generating revenue is directly related to second- and third-order consequences completely outside the vision and control of the infrastructure's developers. In this model the design of the infrastructure must set absolute boundaries. These boundaries must be technical qualities that intentionally limit access or engagement as necessary to prevent harm, or behaviors that would predictably diminish the value of the system as a whole. Simultaneously the system designers must anticipate the emergence of other negative behaviors and design the system so that it can be extended to limit behaviors that are yet unknown. Aside from curbing negative behaviors there are no arbitrary limits upon the system itself or the features therein.

The third revenue model is the most lucrative, but it is also the slowest to realize profitability and carries the most risk. It is certainly possible this model may never realize profitability, as it is reliant upon factors beyond the control of the system developers and depends upon the value of the system as a whole.

The ideal means of balancing risk against revenue generation is to use all three revenue models where appropriate. The direct model is the most limiting, so it can be used as a means of limiting negative behaviors while generating support revenue to solve for those limitations. The indirect revenue model is less limiting, so it can be used more frequently and should be used to limit negative extensions onto the system.

Message Digest

A Service Oriented Web

The web is essentially the HTTP protocol. The way HTTP works is that an agent makes a request for a resource at a given address and the server containing the resource sends a response. HTTP is inherently designed to be simple and primitive, so the responding server treats each and every request as new, anonymous, and independent. This is known as sessionless, or stateless, communication.
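The sessionless cycle described above can be sketched in a few lines. This is a minimal illustration, not a real client: it only builds the text of an HTTP/1.1 request to show that each request is self-contained and carries no session.

```python
# A minimal sketch of HTTP's sessionless request/response cycle.
# Each request carries everything the server needs; nothing persists
# between requests unless an application layers state on top (cookies).

def build_request(method: str, path: str, host: str) -> str:
    # An HTTP/1.1 request is just structured text: a request line,
    # headers, a blank line, and an optional body.
    return (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Connection: close\r\n"
        f"\r\n"
    )

# Two requests from the same agent are byte-identical and anonymous:
# the protocol itself gives the server no way to link them as a session.
first = build_request("GET", "/index.html", "example.com")
second = build_request("GET", "/index.html", "example.com")
print(first == second)  # True: HTTP itself carries no session state
```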

In the early days of the web content was inherently bound to a fixed resource at a given location. It wasn't until a few years later that technologies like CGI, ASP, and PHP were created to supplement the primitive nature of HTTP and dynamically modify content at a given resource address. Now the web could be an application experience where content is dynamically populated onto a page in response to a given user's session with a service application.

Several years later Microsoft created a feature called XMLHTTPRequest for use in the web portal side of their popular email server application. This feature allows web browsers to send an HTTP request to a given resource address without regard for the context of the page in the web browser. Several years later this feature became a common standard named XMLHttpRequest, and now transmission state could be separated from both the application session and the content context of a given address.

The separation of state and transmission allowed for a service oriented web. With the advancement of fast interpretation of client-side application code, namely JavaScript, entire applications can be downloaded at a given resource address and executed on the requesting agent without regard for the dynamically populated data supplied by application code supplementing the HTTP server. Since data can be dynamically requested at any time, apart from any application state or application interface, and execution upon this data may occur at a separate location, the important considerations become the availability of transmission and the integrity of the data returned in the transmission's response.

The primary limitation of the service oriented web is that it is still reliant upon a primitive protocol, HTTP, and a supplementing application. Even if the supplementing application is itself primitive and does nothing more than populate data into a transmission response, these are still two features which cannot be further separated. This limitation is problematic for several reasons.

Most of the problems experienced in the web at the time of this writing are due to the previously mentioned limitations. These problems manifest as government wiretapping, server session concurrency, security limitations, extensibility only through supplemental and equally limited services, and so forth.

Message Oriented Traffic

The practical evolution of the web is to separate data from transmission, but this separation cannot occur on the web itself. A new format is required. Any content format will require two parts: a transmission protocol and a message format. This two-part model is evident in both the web and email.

Fortunately, Mail Markup Language solves for one of these two requirements: the message format. A new transmission protocol is still required. The ideal characteristics of such a protocol are: primitive simplicity like HTTP, routing independence from application states like SMTP, and ease of configuration at both end points and routing relays.
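The separation of transmission from data can be sketched as a hypothetical message envelope. Everything here is illustrative, not a real protocol: the field names, addresses, and relay table are assumptions, showing only that a relay needs the routing header and never the body.

```python
# A hypothetical message envelope illustrating the separation of
# transmission from data: the routing header is all a relay needs,
# while the body is an opaque payload the relay never interprets.
from dataclasses import dataclass

@dataclass
class Envelope:
    source: str       # routing information, like SMTP's envelope
    destination: str
    body: bytes       # opaque message content, e.g. a Mail Markup
                      # Language document, possibly encrypted

def route(envelope: Envelope, relay_table: dict) -> str:
    # A relay consults only the destination; the body is untouched.
    return relay_table[envelope.destination]

relays = {"alice@example.com": "relay-a.example.net"}
msg = Envelope("bob@example.org", "alice@example.com", b"<message/>")
print(route(msg, relays))  # relay-a.example.net
```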

Schemes fitting the characteristics described here should be known as message oriented, as distinct from the web's service oriented model. Examples of this evolution and accompanying schemes already exist.

Historical Example of a Message Oriented Scheme

Napster was originally formed to help people share music over the Internet. The original service scanned a list of songs on your computer and uploaded this list to a central server. Any other user that wanted your music could then download it from your computer, provided the central service supplied the proper connection between the requester and the data source. This model of data sharing was common in older technologies, such as DCC via IRC. What made Napster unique is that it was multicast in its search data, like IRC, but content specific to the MP3 file format.
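The central-index model just described can be sketched in a few lines. The peer addresses and song names are made up; the point is only that the index brokers connections while the files themselves never pass through the central server.

```python
# A sketch of a Napster-style central index, under simplified
# assumptions: peers upload their song lists, searches return peer
# addresses, and the actual file transfer happens peer to peer.

index = {}  # song name -> set of peer addresses holding it

def register(peer: str, songs: list):
    for song in songs:
        index.setdefault(song, set()).add(peer)

def search(song: str) -> set:
    # The central service only brokers the connection;
    # the data itself never touches the central server.
    return index.get(song, set())

register("10.0.0.5:6699", ["song_a.mp3", "song_b.mp3"])
register("10.0.0.9:6699", ["song_b.mp3"])
print(search("song_b.mp3"))  # either peer can serve the file
```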

Eventually Napster became the target of government action because it provided a central service, with central hardware, used to violate intellectual property protections. As a result numerous decentralized services, commonly known as peer-to-peer, emerged to take its place. The most popular of these were Kazaa and BitTorrent. Kazaa is an interesting example because its technology was later adapted to voice communications, which became the popular application Skype. BitTorrent is interesting in that its central service provides a search mechanism for seed file broadcasting, where a seed file represents a hash and map of a given media artifact. This feature allows any single media piece to be downloaded from as many different sources as possible, all without need for the central service.
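The "hash and map" idea behind a seed file can be sketched as follows. This is a drastically simplified illustration, not the BitTorrent format: the piece size is tiny and the file is a toy, but it shows why any source can supply any piece, since integrity is verified locally against the seed.

```python
# A simplified sketch of a "seed file" as a hash and map of a media
# artifact: the file is split into pieces, each piece is hashed, and
# any source may supply any piece because integrity is checked locally.
import hashlib

PIECE_SIZE = 4  # tiny for illustration; real piece sizes are far larger

def make_seed(data: bytes) -> list:
    pieces = [data[i:i + PIECE_SIZE] for i in range(0, len(data), PIECE_SIZE)]
    return [hashlib.sha1(p).hexdigest() for p in pieces]

def verify_piece(index: int, piece: bytes, seed: list) -> bool:
    # A downloader accepts a piece from ANY source if its hash matches.
    return hashlib.sha1(piece).hexdigest() == seed[index]

artifact = b"some media bytes"
seed = make_seed(artifact)
print(verify_piece(0, b"some", seed))   # True: piece from any peer
print(verify_piece(0, b"fake", seed))   # False: corrupt or forged piece
```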

Achieving Privacy by Means of Anonymous Traffic

Content traversing the web cannot ever be anonymous because HTTP is too primitive: the server is always known. A requesting agent can be anonymous to the service, other agents, and snooping parties through use of various schemes, but the traffic itself will not be anonymous.

A message oriented scheme can be completely anonymous through use of an intermediary. This intermediary would have to know in advance that messages from one address bound to a specified other address require its help. Once a given rule is established, this service can act as a message relay that passes fully encrypted messages from source to destination. Since transmission to the relay and from the relay to the destination are all that is necessary, even the transmission header of the original message can be encrypted along with the message body.
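The rule-based relay described above can be sketched as follows. The addresses and rule table are hypothetical; the sketch shows only the mechanics: a pre-established rule maps a source/destination pair to a next hop, and the relay forwards opaque encrypted payloads without reading them.

```python
# A sketch of the intermediary described above: a relay holds
# pre-established rules mapping a source/destination pair to a next
# hop, and forwards fully encrypted payloads without interpreting them.

rules = {}  # (source, destination) -> next-hop address

def establish_rule(source: str, destination: str, next_hop: str):
    # The rule must exist in advance, before any message arrives.
    rules[(source, destination)] = next_hop

def relay(source: str, destination: str, payload: bytes):
    # The payload, including any encrypted original header, is opaque.
    next_hop = rules.get((source, destination))
    if next_hop is None:
        return None  # no rule: the relay refuses to forward
    return (next_hop, payload)

establish_rule("a@one.example", "b@two.example", "relay2.example")
print(relay("a@one.example", "b@two.example", b"...encrypted..."))
```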

When a given relay server moves a high enough quantity of traffic it is unlikely that any snooping party would be able to determine which input is related to which output. Where this is a concern, output messages could be delayed by a random duration so as to prevent identification via traffic pattern analysis.

When a single anonymous relay is not enough these services could be used in plurality. One means is for relays to establish dynamically identified routes between themselves so that they know which messages go to which destination without regard for origin. Another approach is to encrypt a message at a relay with a public key that can be decrypted at another relay with a private key, in such a manner that relay services could be stacked with a series of encryptions.
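The stacking of encryptions across relays can be illustrated as a layering exercise. To keep the sketch self-contained it uses XOR with per-relay keys, which is NOT secure and only stands in for real public-key cryptography: the point is that the sender wraps once per relay and each relay peels exactly one layer.

```python
# A toy illustration of stacking encryptions across a series of relays.
# Real systems use public-key cryptography; XOR with per-relay keys is
# used here only to show the layering, and provides no actual security.
from itertools import cycle

def xor(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

message = b"hello"
relay_keys = [b"key-one", b"key-two", b"key-three"]  # hypothetical relays

# The sender wraps the message once per relay, innermost layer last.
wrapped = message
for key in reversed(relay_keys):
    wrapped = xor(wrapped, key)

# Each relay in turn peels exactly one layer; only after the final
# relay's layer is removed does the plaintext appear.
for key in relay_keys:
    wrapped = xor(wrapped, key)

print(wrapped == message)  # True once every relay has peeled its layer
```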

Eliminating Single Points of Failure

Web services are single points of failure. If the Facebook domain goes down the entire service is unavailable. This problem would not exist in a message oriented scheme, where any agent or end point is a source of data regardless of its address. These sources of data could be dynamic, if supplemented with an application service, no different from a web service.

The benefit is that if a government blocks access to Facebook at one source, any other source at a completely unrelated address containing the wanted data remains unblocked. The difference is that in a message oriented scheme data exists at end points while transmission is separated from that data. Once such a scheme is populated with communicating agents the only means by which to block access to data is by blocking access to the scheme itself.

Since access to data cannot be so easily blocked, the interception of that data likewise becomes more challenging. The only way to intercept service data would be to target specific users. If specific users encrypt their messages then snoopers can only determine to whom they are sending data. If encryption and an anonymizing agent are used then snoopers cannot determine the contents of the communication or to whom that communication is directed. What remains visible to snoopers are only the sent and received traffic patterns of a particular user. Unless the goal of snooping is specific to a single user this sort of oversight would not be helpful for monitoring usage of a data system or any large scale social collaboration.

Simple Mashups

A mashup is a use of data from different sources combined in a way that is more helpful to a consuming audience. An example is location-based shopping data, where a user wants to know which restaurants within a given radius offer meals for less than $10. Another example is using social data mined from a friend's contacts to influence gift purchasing decisions for that friend's birthday.
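The restaurant example can be sketched as a cross-reference of two independent data sources. The restaurant names, coordinates, and prices are invented for illustration; the sketch shows only the combining step a mashup performs.

```python
# A sketch of the restaurant mashup described above: one hypothetical
# source supplies locations, another supplies menu prices, and the
# mashup cross-references them on behalf of the consumer.
import math

# Two independent data sources, keyed by restaurant name.
locations = {"Cafe A": (0.5, 0.2), "Diner B": (3.0, 4.0), "Grill C": (0.1, 0.9)}
min_meal_price = {"Cafe A": 8.00, "Diner B": 6.50, "Grill C": 14.00}

def nearby_cheap_eats(origin, radius, max_price):
    results = []
    for name, (x, y) in locations.items():
        distance = math.dist(origin, (x, y))
        if distance <= radius and min_meal_price.get(name, float("inf")) <= max_price:
            results.append(name)
    return sorted(results)

# Restaurants within 2 units of the origin with meals under $10:
print(nearby_cheap_eats((0, 0), 2.0, 10.00))  # ['Cafe A']
```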

On the web mashups have proven extremely difficult. Since data is not separated from transmission, there is little motivation for anybody to spend the time and effort to cross-reference data unless they already have all the data they need and the software development resources to write the supporting software. This is a poor economic strategy: although revenue could prove lucrative, the risk of failure is tremendous. Since the chance of success is tiny, organizations that produce the data are unlikely to release it for common consumption. In the rare cases of success this economic model proves lucrative, because there is limited competition.

In a system where transmission and data are separated, the risks of data publishing and acquisition oriented businesses decrease over time as data sources become more openly available. The offline values of marketing and consumer awareness that result from publishing or consuming commercial data are present irrespective of the previously described risk. Therefore as demand and availability increase, the risks of publishing and consuming data openly decrease until fully offset by technical factors of distribution and volume.

If these economic conclusions hold true then mashups would become exceedingly common in a new and more separated medium. This commonality would bear significant second- and third-order consequences compared to the web and mobile app models of data businesses. As data becomes more available to analyze, and analysis across various sources eases, the value of application development and sales would transfer to more resultant considerations that further drive demand for additional data. By transferring the primary economic value back onto the data, and its analyzed conclusions, as opposed to applications in the middle of the process, a step in the middle of a goods/services economic chain is devalued in favor of consumer interests. The decreasing cost between the consumer and their desired goods/services is a cyclic economic deleveraging.