In Taxing Data, Omri Marian argues that taxing data-rich markets requires rejecting income taxation—not only as implemented but also “in its optimal theoretical form”—as the best proxy for ability to pay. Instead, Marian makes the radical suggestion that data itself “may be a better proxy” for ability to pay, and he offers three fundamental features that should guide “a reimagined tax on data.”
The article is rich in detail and is at its most persuasive in discussing the income taxation of business entities. Drawing on the work of tax historians and scholars, Marian summarizes two dominant narratives explaining the origins of the corporate income tax: the corporate income tax as a proxy for shareholder income, and the corporate income tax as a means to rein in management. Marian points out that if corporate ownership and management are largely local and traceable, as they were “at the dawn of corporate taxation,” then “whether the attempt was to target shareholders’ ability to pay, or managerial interest, the taxation of corporate income made sense.”
On the horizon, however, was a perfect storm of globalization, dispersion, and “intangible-ization,” which, Marian asserts, “our data-rich economy amplifies by orders of magnitude.” These forces have now so completely swamped the corporate income tax’s ability to identify source or ownership and to measure value that it is time to “revisit our conceptual tax design.”
Dispersion, for example, is not only a matter of diffuse, multi-layered ownership, but it also “defines all other functions of the modern multinational corporate entity.” If corporate income taxation no longer operates “as a functional instrument to tax corporate owners or managers,” then there is “no substantively meaningful corporate ‘home’” to support residence-based taxation. Marian asserts further that, because “‘value creation’ is fragmented, outsourced, and facilitated through a multitude of intercompany transactions between affiliated companies,” “‘source taxation’ in the context of corporate entities seems equally meaningless.”
In turning to “intangible-ization,” Marian emphasizes that the “transformation of capitalism” wrought by intangible investment overtaking tangible investment has caused “insurmountable tax administration challenges.” Among these is the obvious difficulty of measuring intangible investment, and “[i]f we cannot measure investment, how can we measure the return on investment.”
As Marian notes, the challenges to the income tax arising from globalization, dispersion, and intangible-ization are well known and have yielded multiple reform efforts, including the OECD’s recent two-pillar approach. While Marian applauds these reform efforts, he concludes that they will be insufficient because they continue to rely on the concepts of source, residence, ownership, and monetary value.
Marian contends that, to reach the data economy, what is needed is a bold move away from reliance on income as a tax base. That move would extend to proposals tied to consumption or savings, since they are simply “economic components of income.” He argues that a data tax would provide an administrable, equitable, and efficient solution, and he proposes three fundamental features for such a data tax. The combined effect of the three fundamental features is somewhat reminiscent of a wealth tax, but one that is tied to data volume, flow, and use rather than to monetary value and ownership.
First, the tax base should be raw data, so that the “tax depends on the volume of data, not on the monetary value of data.” Marian explains that “each little piece of data” generally has no measurable value because “[o]nly when terabytes of data come together do they become valuable, and only because they are aggregated.”
Second, tax should be “collected on the flow of data.” This approach alleviates the need to source the data to a particular location. Marian explains that the source of value of data is “probably not where people whose data [is] collected reside because . . . each individual piece of data has no value.” Trying to locate source where “the data is analyzed . . . also seems theoretically farfetched” given the dispersion of those locations and that much of the “manipulation of data is outsourced to robots.” He posits that such a tax would be readily administrable, since internet service providers and cell phone data plans already tie fees to data volume.
Third, “the users of the data” should be the taxpayers. This approach allows for a tax not dependent on ownership, which is needed because owning data “is in no way an economically meaningful concept.” As an example, Marian discusses data “owned” both by an individual and by Google: “If both you and Google own your data, have you and Google experienced an equal increase in your ability to pay?” The answer is, unsurprisingly, that the “data is more beneficial to Google, because of the other data it has.”
Marian acknowledges that these features could lead to equity concerns, but he argues that, as is also the case with other types of taxes, “generous exemption[s]” could be provided in working out system details, and the tax burden could be made progressive based on data usage. He notes, “[d]ata-rich taxpayers are also the richest taxpayers in traditional terms.” Further, because a data tax would reach taxpayers who are adept at avoiding an income tax, adding such a tax would increase progressivity.
Marian reasons that a data tax would be efficient, in the sense that it is unlikely to change taxpayer behavior. He likens a data tax to mineral taxation: “Activity must happen where the minerals are. As long as the tax is not prohibitively expensive and there is profit to be made, activity will take place where the valuable resources are found.” It is possible, he concedes, that some data users will pass the burden on to others “by charging for . . . services that are now free.” He points out that this is not a bad thing: first, because it may decrease use of platforms that are arguably causing social harms, and second, because it would generate income that could be taxed under the current income tax system.
In the final section of his article, Marian reviews existing proposals for direct taxes on data or for data proxies. For example, he discusses a proposal from the mid-1990s by Arthur Cordell for a “bit tax,” as well as suggestions for taxing the infrastructure for the transmission and collection of data. Although Marian does not dismiss any proposal out of hand, given the three fundamental features outlined above, he clearly favors a tax that functions as an excise on data volume.
While the article’s scope did not permit addressing the full range of detailed issues that would have to be worked out in implementing a data tax, Marian more than fulfills his stated goal of “start[ing] a discussion about a data tax as a remedy for the failure of income taxes in data-rich markets.”






