Demonstrating Hashing/Cryptography for Intellectual Property Protection - Backed by a $178b Market Cap

2020-05-30

Patenting or copyrighting an idea is an expensive, intricate, and time-consuming process. If you’ve looked into it before, you’ve probably also come across the “poor man’s patent”. The poor man’s patent consists of outlining your creation and mailing it to yourself, so that you own a sealed, postmarked letter with the idea inside. You can probably see the problem with this, as can most people: it will never hold up in court. It would be easy for someone to tamper with or forge the sealed letter, making it untrustworthy. Speaking of trust, let’s talk about the blockchain.

Blockchain is a technology that implements numerous cryptographic functions to chain “blocks” that store important information. This is the infrastructure that Bitcoin and other cryptocurrencies are built on, and the most important takeaway is that blockchain technology doesn’t rely on trust between any parties whatsoever.

This must sound insane if you’ve never heard of blockchain technology before, but it’s been around since 2008, and cryptocurrencies such as Bitcoin together hold a market cap of $270B as of 5/30/2020. Because it would take pages to write about the inner workings of how a cryptocurrency like Bitcoin uses the blockchain, here’s what you need to know to understand how it can be used for copyright purposes:

The Blockchain is an immutable, digital, and distributed ledger consisting of blocks of transaction data
The Blockchain is built on highly secure cryptographic functions and is almost impossible to forge (you’d be the proud owner of the $270B market cap if you could)
Because nobody owns the ledger itself, it is kept track of by all parties
In order for a block of transaction data to be added to the ledger, a user (a miner) solves a computationally expensive puzzle using lots of processing power and is rewarded with the mining fees from the transactions in the block they solved
The whole network is constantly trying to solve the next block, and only the first person to solve it gets the reward
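
To make the points above a little more concrete, here is a toy sketch of a hash-chained ledger with a tiny mining puzzle. It is not how Bitcoin is actually implemented: the block layout, the function names (block_hash, mine_block, chain_is_valid), and the difficulty rule (a hash starting with three zero hex digits) are all assumptions made up for illustration.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, which include the previous block's hash."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def mine_block(prev_hash: str, transactions: list, difficulty: int = 3) -> dict:
    """Try nonces until the block hash starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        block = {"prev_hash": prev_hash, "transactions": transactions, "nonce": nonce}
        if block_hash(block).startswith("0" * difficulty):
            return block
        nonce += 1


def chain_is_valid(chain: list, difficulty: int = 3) -> bool:
    """Check that every block meets the puzzle and points at its predecessor."""
    for i, block in enumerate(chain):
        if not block_hash(block).startswith("0" * difficulty):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


genesis = mine_block("0" * 64, ["alice pays bob 1"])
second = mine_block(block_hash(genesis), ["bob pays carol 1"])
chain = [genesis, second]
print(chain_is_valid(chain))  # True

# Rewriting an old block changes its hash, so the chain no longer checks out.
tampered = dict(genesis, transactions=["alice pays mallory 100"])
print(chain_is_valid([tampered, second]))  # False
```

The second check fails because every block commits to the hash of the one before it. To forge history you would have to redo the puzzle for the altered block and every block after it, faster than the rest of the network.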

You might start to see the picture already. We have an immutable ledger recorded by computers all around the world, holding transaction data that cannot be forged and is visible to anybody at any time. Remember how I mentioned that you pay a little mining fee with your transactions to incentivize people to put your transaction in the block they solve? Those fees are based on how much data you want in your transaction. The transaction structure of many blockchains behind the most popular cryptocurrencies (e.g., Bitcoin, Ethereum) has extra fields that you can use to store whatever data you want.

For Bitcoin, the most popular cryptocurrency, this is the OP_RETURN output, which can hold up to 80 bytes of data under the standard relay rules.

For Ethereum, there’s no fixed cap on the size of additional transaction data. You pay gas for every byte, so the practical limits are cost and the block gas limit.

Putting two and two together, you can store your intellectual property in some transaction data in the blockchain, and it would be timestamped, immutable, and recorded across computers all around the world. While this would work for Ethereum because there’s no limit on additional transaction data, it wouldn’t work for Bitcoin due to the 80 byte size cap. In order to make this work for either blockchain and not pay for an unnecessary amount of space, we can use some of the underlying security technology that Bitcoin is built on. Let’s talk about hashing.

Hashing

A hash function takes data of any size and produces a fixed-length value. The most important takeaway is that it is a one-way function. You can hash any text, but you cannot feasibly find a text that produces a given hash. A good way to build intuition for one-way functions is multiplying prime numbers, the hard problem behind public-key encryption schemes such as RSA. Here’s a demonstration:

If we take two prime numbers, such as x = 31, and y = 79, we can say that x*y=2449.

In this analogy, 2449 plays the role of our hash value, and 31 and 79 are our inputs.

If you gave anyone a piece of paper with the inputs, within a minute they could give you 2449 by multiplying the two numbers. What would happen if you told them to find x and y from 2449? Because we used prime numbers, we know that there are only two factors, aside from 1 and 2449 itself.

In order to find out which numbers these are, you would have to try candidate divisors one by one until one divides 2449 evenly, in the worst case everything up to its square root (about 49). While this is tedious to do on paper, a computer can pull it off almost instantly.
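
Here is a minimal sketch of that search, using plain trial division. The function name factor_semiprime is just a label I picked for this example.

```python
import math


def factor_semiprime(n: int):
    """Return (p, q) with p * q == n via trial division, or None if n is prime."""
    # A composite n always has a divisor no larger than its square root.
    for candidate in range(2, math.isqrt(n) + 1):
        if n % candidate == 0:
            return candidate, n // candidate
    return None


print(factor_semiprime(2449))  # (31, 79)
```

For a number this small the loop runs at most 48 times. The trouble is that the number of candidates grows with the square root of n, which gets out of hand quickly as n gets longer.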

Now let’s use two other primes that can be generated by a computer:

x = 134236157900586495516489390947
y = 111030495840055605471507897841

x*y = 14904307171366116288857224710986293915036692212598386245427

As you can see, we can compute the output almost instantly with a program, but recovering the two inputs from the output is far more work, and the gap widens rapidly as the primes get larger. Hash functions such as SHA256, SHA512, and MD5 are built on the same one-way idea, but instead of multiplying primes they run the bytes of the input through many rounds of bit mixing so that the result cannot feasibly be reversed.
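
To see the asymmetry for yourself, here is a small sketch using the two numbers above. It only shows the cheap direction (multiplying, and checking a claimed factor) and estimates how many candidates naive trial division would face. It does not try to factor anything.

```python
import math

x = 134236157900586495516489390947
y = 111030495840055605471507897841

# Python integers have arbitrary precision, so this is a single fast operation.
product = x * y
print(product)
print(len(str(product)))  # number of decimal digits in the product

# Checking a claimed factor is just as cheap: one division.
print(product % x == 0 and product // x == y)  # True

# Naive trial division would have to search up to the square root of the
# product, which is a 30-digit number here.
print(len(str(math.isqrt(product))))
```

Dedicated factoring algorithms are much smarter than trial division, which is why real systems that rely on this problem use primes that are hundreds of digits long.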

If we hash the string “2020 has been quite the eventful year.” with SHA256, we get the output A7C71F3FC412DC08C8DA293B29396328931EA5000B1C883EF1BC632036B305C3. Regardless of our input size, running text through the SHA256 function will result in a 32-byte hash value, represented as 64 hexadecimal digits as shown above. Because we’re mapping inputs of any length onto a smaller, fixed-size output, it is possible for two inputs to lead to the same hash, but it’s extremely unlikely. Even though Bitcoin relies on the security of SHA256 and the function has been around since 2002, no SHA256 collision has ever been found, and given the size of the output space, one almost certainly never will be by chance.
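
If you’d like to try this locally, here is a short sketch using Python’s standard hashlib module. The helper name sha256_hex is my own. Note that the digest depends on the exact bytes being hashed, so the style of quotation marks, the text encoding, or a trailing newline will all change the result.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of UTF-8 text as 64 uppercase hex digits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()


# The output length is fixed no matter how large the input is.
short = sha256_hex("a")
long_ = sha256_hex("a" * 1_000_000)
print(len(short), len(long_))  # 64 64

# Changing a single character produces a completely different digest.
print(sha256_hex("2020 has been quite the eventful year."))
print(sha256_hex("2020 has been quite the eventful year!"))
```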

Hashing out the conclusion

Now that we have a good grasp of how hashing works, we can use it to record our intellectual property on the blockchain using only 32 bytes rather than the whole idea. This works because anybody who wants proof can hash your idea and see that the result exactly matches the hash you posted to the blockchain, which is immutable.
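
As a rough illustration of the space savings, the sketch below compares a long write-up against its digest, using the 80-byte Bitcoin cap mentioned earlier. The sample text, the constant OP_RETURN_LIMIT_BYTES, and the helper fits_in_op_return are placeholders invented for this example. Nothing here talks to a real blockchain.

```python
import hashlib

OP_RETURN_LIMIT_BYTES = 80  # the Bitcoin size cap quoted in this article


def fits_in_op_return(payload: bytes, limit: int = OP_RETURN_LIMIT_BYTES) -> bool:
    """Return True if the payload is small enough for the quoted cap."""
    return len(payload) <= limit


# Hypothetical stand-in for a long description of an idea.
idea = ("A detailed description of my invention. " * 50).encode("utf-8")
digest = hashlib.sha256(idea).digest()  # raw bytes rather than hex text

print(len(idea), fits_in_op_return(idea))      # far too large for the cap
print(len(digest), fits_in_op_return(digest))  # 32 True
```

Even if you store the digest as 64 hexadecimal characters instead of raw bytes, it still fits comfortably under the cap.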

As an example, I’ve gone ahead and copyrighted this article using the methods I’ve mentioned.

In order to prove this yourself, you can use any SHA-256 checksum generator and turn my article file into a hash. You’ll then see that I included that hash in the transaction data below by going to the EtherScan transaction link, clicking “more”, then viewing the data as UTF-8.
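
If you’d rather not trust an online checksum generator, the same check can be sketched in a few lines of Python. The function names and the demo file below are hypothetical. To verify a real document, point file_sha256 at your own copy of the file and compare against the hash published in the transaction.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's digest with a published hex string, ignoring case."""
    return file_sha256(path) == published_hex.strip().lower()


# Demo with a throwaway file standing in for the real document.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.txt")
    with open(demo_path, "wb") as handle:
        handle.write(b"my idea, written down")
    published = file_sha256(demo_path)  # what you would post on-chain
    print(matches_published_hash(demo_path, published))  # True

    # Any later edit to the file, however small, breaks the match.
    with open(demo_path, "ab") as handle:
        handle.write(b".")
    print(matches_published_hash(demo_path, published))  # False
```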

This simple proof on the Ethereum blockchain took less than a minute and cost me only 21¢, which is extremely minimal considering this article’s copyright is now secured by a $26B currency. This could be done on the Bitcoin blockchain just as easily, but for this PoC I only had Ethereum on hand.

File	SHA-256 Checksum	Proof
Article File	07603fe94c830ea4b77b129c0c216c656f477a00d756082c3497c90664968683	etherscan.io transaction link