

Introduction to Entropy

A new open source distributed and encrypted P2P network for anonymous and uncensored communication.


Credits:



Table of Contents:

  1. What is Entropy?
  2. Why a "net inside the net"?
    1. Entropy History
    2. Stop 1984 History
    3. What is Stop1984?
      1. What does Stop1984 want to accomplish?
      2. Political Goals
      3. Net inside a Net
      4. Why should I use Entropy?
      5. Stop 1984 materials
  3. How Does Entropy Work?
  4. What does Entropy look like?
  5. How Does Entropy differ from Freenet?
  6. What 3rd Party clients are available?
  7. System Requirements
  8. How can I get more information?
  9. Inner workings of Entropy
  10. i18n: Supported Languages
  11. P2P


1) What is Entropy?

ENTROPY stands for Emerging Network To Reduce Orwellian Potency Yield and as such describes the main goal of the project.

ENTROPY was developed as a response to increasing censorship and surveillance on the Internet. The program connects your computer to a network of machines which all run this software. The ENTROPY network runs parallel to the WWW and to other Internet services such as FTP, email, ICQ, etc.

For the user, the ENTROPY network looks like a collection of WWW pages. The difference from the WWW, however, is that there are no accesses to central servers, which is why there is no site operator who could log who downloaded what and when. Every computer taking part in the ENTROPY network (every node) is at the same time a server, a router for other nodes, a caching proxy, and a client for the user: that is, you.

After you have gained some experience with the ENTROPY network, there are command line tools for you to insert whole directory trees into the network as an ENTROPY site. So ENTROPY does for you what a webspace provider does in the WWW - but without the storage and bandwidth costs, and without any regulation or policy as to what kind of content you are allowed to publish. Everyone can contribute their own ENTROPY site for everybody else to browse. The content is stored in a distributed manner across all available and reachable nodes, and no one can find out who put what content into the network. Even if your node is not actively running, your content can be retrieved by others - without anyone knowing that it was actually you who published the files. Of course, this is only true if you do not publish your name (or leave your name or other personal data in the files you publish).





2) Why a "net inside the net"?
a) Entropy History

A little history by PullMoll:

"In spring 2001, I had a time when I was drinking way too much, hanging around, sick of the world and the people surrounding me, but I could not tell the real reason.

In the time shortly before easter 2001, I decided to take a break and even practice the abstinence (or lent) that Christians hold in this time. After a few days my body was weeping all that waste that I put into it before. I had deep thoughts, great experiences and some insights on the things going on around me.

I felt, for the first time, what it really means to be a part of this world. It's not that you're "just there". Every single wink has enormous consequences for everything else around you. You are not just an external part of the world, but a very important wheel in this clockwork.

Now, what has this all to do with Entropy? I 'saw' the things that inflow you every day. I could really 'feel' the pain while looking at TV ads, seeing the crying colors of leaflets with advertisements, the hypnotizing impact of the holographic effect on the new Euro banknotes... and I realized how much and how often all the companies and state authorities are invading your privacy, just to sell more, know more, keep an eye on you.

I hated it. I wanted to be alone and no one should know what I did, how I felt. I wanted to speak freely, read freely and discuss with others without having to fear any form of surveillance, any harassment because of my non-conforming ideas etc.

This was the time when I decided to entirely refrain from using Windows - and even Linux. I began using FreeBSD. I also cared, for the first time, about tools to increase my privacy: I began using GPG. I used Freenet before, but didn't think too much about what it really was good for - whenever it was in a working state.

I recalled the many things that had always scared me when using Windows. I was a little bit paranoid before, but now I saw many things that really frightened me. I didn't want to be the slave of my operating system, not even for money (money is what makes you a slave, not other people. They only use money as a whip to let you jump over their barricades and hurdles).

Then, one day, after reading some things about encryption, I had the idea of transmitting a replacement for random data for a one-time-pad within the same stream of data that contains the encrypted messages. I had the idea to duplicate or 'blow up' the safely transmitted random values, so that they could be used to refill the one-time-pad data that was used to encrypt both the plain text and the new random data... some kind of perpetuum mobile.

Well, this idea was not really good, as some people told me. I'm still not fully convinced that you can't make it work. Anyway, I wanted to try things out and therefore I needed a tool to play with my ideas for encryption. Freenet was there, but I never (really) understood how the algorithms used there did work. I still don't understand them in all detail - they're too much mathematics for my taste. I'm the algebra type and big numbers, primes or even elliptic curves make me nervous :-)

I liked the simple, clean approach of the one-time-pad and there is something very similar that is cryptographically not bad either: stream ciphers. They have the advantage of using new (pseudo) random data all the time, instead of one fixed, unchanged key for big amounts of transmitted data. You "only" have to be sure to a) use a good quality random source and b) have no stupid bugs sitting in your code, and you are very safe against unwanted listeners, wiretapping, surveillance etc.

I started working on a project that would be in many ways just like Freenet, while it should use only simple to understand (in my opinion) and still strong cryptographic algorithms. I decided to re-implement the front end interface: FCP. So I could use existing client software with my code.

From my Freenet experiences I knew about some of the pitfalls of P2P networks. The main problem is the availability of keys. You cannot run a network assuming that data will be available; you have to think of the network as a very forgetful mind. I first thought I could cure this by using a Hamming code to put some redundancy into the data. It took some time until I saw that Hamming codes were not suitable for the purpose. FEC (forward error correction) or - to be more specific - an erasure codec was what the network needed.

Freenet supports FEC on the client side of things which is, in my humble opinion, a bad idea. I already had implemented the low-level redundancy code, so all I had to do was switch over from Hamming codes to FEC using a fast erasure codec.

This gave (one of) the boost(s) to the network. Now any node that retrieves a piece of data needs only two thirds of the chunks (8 out of 12) and can, at the same time, regenerate up to one third (4 of 12) and put those chunks of data back into the network. Every piece of data that is successfully retrieved thus spreads more of its redundancy chunks around, keeping it intact and healing the holes in the network.

All this happens invisibly to the user, at the lowest level of the implementation. No front end or client author has to care about it.

There was one problem remaining, though. Every single file was encoded into a) the XML containing a list of hashes of at least 12 chunks, the one fragment and the whole document, and b) the 12 chunks of data and FEC bits. So even a 1-byte file took 13 chunks, and you needed 9 out of 13 (the XML plus 8 data or FEC chunks) for it to be reconstructed.

After I implemented the encryption of the XML texts (the main keys i.e. CHK@, SSK@ and KSK@), there was no good reason not to keep short files inside the XML.

Now every short file that can be squeezed into one chunk will require only one key. This was a great improvement, as it reduced the network load for the many, many small files quite a bit.

This is where we are today and I think that Entropy is a nice example or 'proof of concept' for my ideas. There are some things left to do like handling the retrieval of large files. But this is all about timing, retries, delays and won't lead to changes in the fundamental code.

I forgot to mention one of the other goals I had in mind when I designed the data layout: no one should have to fear having 'illegal data bits' on his hard drive. In a big network, where data is spread like it is supposed to be, you may have bit 0, bits 3-7, check bits 8 and b of some Britney-Spears.mp3 in your data store, but you do not have a copy of a copyrighted work. And since you cannot even tell whether you have, perhaps, half a bit of something, you cannot be held responsible.

Only the one who actively asks Entropy to reconstruct some data whose key (the CHK@) he knows will then - perhaps - be committing a crime. IMO this is the safest way to handle or condone unwanted, censored, 'illegal' data.

I don't even accept the term 'illegal' for any collection of data bits, but that's just my personal point of view and it is not widely accepted."


b) Stop 1984 History

1) What is STOP1984? Who are the people behind STOP1984?

STOP1984 is a group of people who work for informational self-determination, data security and free speech - people who, for these reasons, reject surveillance, censorship and data abuse.

STOP1984 is an open project.

Everyone who would like to can and should help!

2) What does STOP1984 want to achieve? What are the goals?

We would like to contribute by helping people to be conscious of:

1. the value of their own privacy

2. the value of their own data

3. the dangers of the abuse of data

4. the consequences of the loss of privacy

5. the political, social and personal consequences of increasing surveillance

6. the dangers of political lack of interest


3) Our political goals are:

1. A transparent examination and, if necessary, repeal of the TKUEV as well as of the European data retention directive, which approves preemptive data storage at the European Union level

2. Transparency regarding successes and failures of surveillance

3. Transparency regarding the kinds and extent of past and current surveillance

4. A right to data protection and informational self-determination fixed in the constitution as well as at the European Union level


4) Net inside a Net

The Entropy-net can hardly be put under surveillance; this is the difference between the "direct" Internet and the Entropy-net. Nobody is able to know who has up- or downloaded which content, as there are no central log files and no central server.



5) Why should I use Entropy?


If you do not want to accept the growing surveillance of all communication and the growing censorship - often in the name of copyright, trademark or similar laws which are used to restrict communication - you should protect your communication.

Software to encrypt your e-mail (PGP, GnuPG) does help. But anyone who is interested can still see that person A has communicated with person B. File transfer is usually unencrypted: HTTP, FTP etc. are unencrypted protocols. They leave traces, so someone can find out which files have been up- or downloaded, by whom, and to or from which server. Entropy tries to plug these holes (holes in the sense of data security) by hiding connection details.

6) Stop 1984 materials

a) Humans - private beings

Is it, nowadays, when people reveal their most intimate secrets about themselves in talk shows, actually still important to have any privacy? The answer is simple:

Even if many humans give up their privacy too voluntarily and carelessly, the total abolishment of privacy for all cannot be the result.

Privacy simply means there are areas of life into which, without our permission, nobody should have insight:

1. The private telephone call.

2. The short flirt in an Internet chat room.

3. One's own preferences (not only, but also, in sexual regard).

4. The daily mail.

5. The critical book about which one converses with the neighbor.

6. The walk across the marketplace.

Privacy refers to data protection, but it also covers things like the secrecy of telecommunications and the secrecy of letters.

Having privacy means being able to say:

Stop! Until this point and no further!


b) Surveillance

Do you know whether you are monitored?

In most cases, you don't! Unannounced video surveillance, undisclosed telephone monitoring...

The list of secret surveillance measures is long. And the uncertainty about whether or not we are being observed naturally affects us, too. Personal contact and social relations are made more difficult by the growing distrust.

Examples and forms:

1. Video surveillance

Video surveillance (also called CCTV) of objects or places is mainly used with the argument of preventing and monitoring criminal activities. Video surveillance is not only expensive; it often merely displaces criminality into unsupervised districts. With missing signs pointing out the cameras, as well as lacking clarification of the details (are the recordings stored, for how long, and who gets access to them?), citizens are left without transparency concerning by whom and for what purpose they are being watched.

2. Pattern search

This is the search for criminals and possible terrorists in national databases.

According to fixed methods (the pattern), a group of persons is examined individually.

The patterns are often arbitrarily or vaguely defined, so that law-abiding citizens may easily get stuck in the pattern.

3. Internet surveillance

Surveillance of connection data - and of content, too - is desired. The TKUEV (German telecommunication monitoring act) is a notorious example of this, and at the same time of preemptive data retention. One goal is the collection of data in advance, in order to search it for potential perpetrators if necessary. At the same time, electronic profiles of Internet users are built up. Such profiles may also be politically motivated and evaluated accordingly.

This is incompatible with the presumption of innocence, the individual's privacy and informational self-determination.

4. Data mining systems

Widely known are particularly "Echelon" and the planned "Total Information Awareness" system. Such mechanisms are maintained mainly by secret services, such as the US-American NSA. The goal is the comprehensive collection of data from the most diverse sources, like travel reservations, financial transactions, or e-mail and telephone services.

These systems are also incompatible with democratic principles such as the presumption of innocence.

c) Data security

Everyone has private data, but only few care about it.

Our own world of data starts with one's housing lease, recently includes motor traffic as well (for example in London), and ends with completely everyday data - for example the connection data of telephone and Internet communication, but also the details of financial transactions, even for small payments. The credit card makes life easier, but it also feeds the databases of financial corporations.

Another example would be personal data concerning health insurance - maybe your employer would like to take a look at these?

But who really cares about their data and its protection?

Examples from the world of data:

1. Data protection

Data protection concerns everybody, because in our world every human is defined by his data. Or don't you have anything to hide? Isn't it annoying, for example, to have unwanted guests trying to break into your home computer in order to spy on your data there?

Thus data avoidance and data protection are important - on the Internet, for example, through suitable technical measures like firewalls and proxies, as well as skillful configuration of the software. In this way, unpleasant visitors (viruses, worms, or would-be hackers) are kept out.

2. Security of telecommunications and TKUEV

The TKUEV (see above) renders the security of telecommunications useless.

The legislator obliges telephone companies and Internet providers to store connection data of telephone calls and Internet connections. Allegedly, this is done only to search for terrorists or potential criminals communicating.

Like the grand prize in a lottery, such an occurrence is rare - but the players try again and again...

3. Data bases and electronic profiles

In our digital world, in which everything can be noted and stored, protection is almost completely absent. The flood of data we produce is recorded automatically and stored arbitrarily long. Whether it is credit card usage, presenting a loyalty card in the supermarket, or a simple phone call, data is collected everywhere, stored, and linked, thus perfecting the citizen's profile more and more. In this way the respective organization is able to find the - from its point of view - ideal way of communicating with us.

d) Censorship

Censorship is the control of the information that citizens are able to get, by deciding which information is released and how it is accessible. It is not only powerful groups that use this technique; human beings also censor themselves.

Examples for external censorship:

1. Censorship in the Internet

Censorship on the Internet is realized using technical measures like IP-blocking and content filters. Widely known is the censorship in China and Saudi Arabia, where religious, dissident and/or pornographic sites are blocked. In Germany, Mr. Büssow (president of the regional government in the state of NRW), along with others, is currently trying to force ISPs to block US-based websites because of xenophobic, racist and neo-Nazi content, as well as a site which they claim is made up of inhuman content.

Not to mention Spain, where the LSSI Act is in effect and is used to stop unlicensed websites by applying political and economical pressure, which caused many websites to go offline.

2. Censorship in the mass-media

The "classic" media censor information, too. On the one hand, they ignore possibly important news for merely economic reasons; on the other hand, they try to control information for current political or economic reasons.

The result is obvious: the freedom of information and opinion is restricted, and unreflective, uniform thinking is fostered - whereas what is needed is the opportunity to inform oneself from freely accessible and uncensored sources, whenever and however one wants to.

Internal censorship - the bars in your own mind

Knowing that information is filtered and censored, we should keep in mind that the most important censorship happens in our own minds. We have to learn to tolerate and to consider the attitudes and opinions of other people. There is no other way to protect ourselves from our own intrinsic censorship.

It is a main goal of STOP1984 to educate people toward that sort of freedom of thought!


e) Information for anybody

1) Information for anybody

Imagine a world of tomorrow where there is no free flow of information at all. No free TV, no free radio stations, and on the shelves of the libraries there is just dust.

Unbelievable? Yet possible!

This is not yet reality:

The wired world delivers more and more information to us on a daily basis. But the right to receive that information is at risk - especially on the Internet. No other medium is faster and more borderless, none is cheaper and more flexible. None is more open, and none seems to be a bigger threat to governments and the powerful. The well-informed citizen seems to have become a threat! Without a doubt, countries are trying to restrict the free flow of information and to manipulate their citizens. In addition, political, economic and religious groups, associations and companies are trying to restrict the freedom of information and opinion.

2) Freedom of opinion - any opinion is of value!

Why such efforts to protect freedom of information, security of telecommunications and privacy, and to fight data retention?

Only with free and open sources of information, powerfully protected from censorship, and with the right to use those sources with full respect for privacy, is it possible for anybody to make up his own mind on politics, daily news, religious beliefs, etc.

3) The freedom of thought - and action - is what matters! These are basic civil rights! It is our freedom! Only by maintaining a free and open society can we reach a society rich in different attitudes, ideals and beliefs, far beyond an all-uniform society.

Is this a society worth being engaged in?

We think: YES!

3) How does Entropy work? (General)

Entropy is software that performs several tasks, all serving the purpose of anonymous and uncensored communication:

1. it connects your computer (via the Internet) with other computers where Entropy is installed and running (P2P - peer to peer)

2. it distributes or downloads pieces of data (chunks) to/from computers taking part in the Entropy network

3. data loss caused by "nodes" being (permanently or temporarily) offline is avoided through redundancy (FEC = forward error correction)

4. data exchange between nodes is encrypted

5. a single file (the cache) is used as the store for the data - both your own and that of other nodes (you can, of course, define the size of this cache)

6. it does not store complete files or file names in legible form(!) on any single computer

Entropy supports the Freenet Client Protocol (FCP) so that existing clients can easily and quickly be used for Entropy. Freenet and Entropy can be used at the same time.

One example for those clients is Frost, a software originally written for Freenet. Frost can be used for exchanging news and files (it serves as message board and file-sharing client at the same time) and can be used for both Freenet and Entropy.

4) What does Entropy look like?

Entropy can be used in different ways. The average user will probably use the web interface (proxy). In this case, Entropy will not look much different from the familiar pages on the WWW. The difference is that there are no central servers and no "active content" which could be used to track users' surfing behavior.

Clients like Frost are also aimed at the "average user" - at those who want to exchange opinions, texts, pictures, documents, music etc., or who want to have access to "blocked" information.

For the sophisticated user who is able to design HTML pages himself, there are helper programs (tools) to put one or more websites into the Entropy network. The difference from a conventional webspace provider is that there is no regulation of content, because there is no way to regulate it (as mentioned above, every down- or upload is anonymous).

Further tools are conceivable, and some are already work in progress (a shared SQL database, an HTTP proxy using Entropy as cache). The Freenet Client Protocol offers relatively simple ways to develop new applications if you are interested in creating them.

5) How does Entropy differ from Freenet?

a) Why further develop entropy since Freenet exists?

Choice and freedom. It's always good to have more than one option available to you.

b) Why use entropy since Freenet exists?

Entropy is faster and simpler than Freenet.

c) What is the relationship between entropy and Freenet?

Entropy considers Freenet to be another means to the same end. We have separate development paths, but our end goals are generally the same: anonymous communication. Ian Clarke of Freenet has, unfortunately, not been very kind in his remarks on Entropy.

d) In what instances should you use Freenet instead of Entropy?

If you need bullet-proof cryptography, use Freenet. Otherwise, we think that Entropy is very good, fast, and has a reasonably good set of stream ciphers.



6) What 3rd Party clients are available?

Entropy also supports the Freenet Client Protocol (FCP), so programs designed to use it should work with Entropy as well. The only major difference is the default FCP port number, which is 8482 for Entropy (while it is 8481 for Freenet). You can either configure the clients to use port 8482 for Entropy, or - if you don't use Freenet at all - configure Entropy to use port 8481 (by changing the line fcpport=8482 in entropy.conf).

a) Samizdat

Samizdat is an NNTP gateway for Entropy (and Freenet). It is designed so that you can use a standard newsreader (such as Mozilla Newsgroups, KNode, or tin) to read and post news articles securely and anonymously. You just have to create a new news server entry for localhost using port (or service) number 1119 (news is usually on port 119, but that is a privileged port). Then you choose a made-up email address and/or name and can subscribe to the configured newsgroups. While the Samizdat and Samizdat-nntp daemons are running, they will collect new messages and insert your postings in the background.

b) Frost

Frost is a Java program that is something between a message board (comparable to discussion forums) and a file sharing client. There is a search function, too. However, quite unlike in other P2P networks such as Gnutella, eDonkey or Morpheus/KaZaA, these searches take place on your own machine after downloading a list of keys from the network.

That is why your own node must run for some time before it can find some of the lists (so-called index files); only then can a search succeed. In other words: after starting Frost you should watch the log output for a while and wait. At some point Frost will output a series of stars and then dots (.), showing that a list of keys has been stored on your local drive. Texts on the message boards also take some time to arrive; Frost looks backward from today up to three days. I suggest you manually add a board by the name of 'test' (without the quotes) and write a "Hello World" there, just to see if you can upload and thus - most probably - also download messages.

c) Freenet Tools

The most important tools for those who want to insert their own content as a website into Freenet or Entropy are the Freenet Tools (or similar tools from other authors). For Freenet, there are some such programs linked from the project's pages (http://freenetproject.org). Not many of them will work with Entropy out of the box, as they sometimes rely on minor deviations in the FCP interface. Specifically, the newer tools supporting FEC FCP v1.1 will fail with Entropy, as Entropy does not yet fully support those changes to the Freenet Client Protocol. So I suggest you use ft for Entropy for now, since I can help you there with problems or questions.

d) Freemail

Freemail allows you to send encrypted, private, and anonymous email over Entropy and Freenet. It is written in Python.


7) System Requirements

Basic requirements:

You need the following environment and libraries on your *nix system:

1. GNU C Compiler (gcc)

2. GNU Make (e.g. gmake on systems where GNU make is not the default)

3. Zlib compression library (http://www.gzip.org/zlib/) 1.1.3 or newer

4. Expat XML library (http://expat.sourceforge.net/) 1.95.2 or newer


System requirements:

1. *nix-box, Windows-PC, or Mac OS X/Darwin with Internet-connection (at least 56K-Modem)

2. at least 100 MB free disk space (for the cache described above)

3. the Entropy software (download of ca. 556.4K source or 1.5M Windows setup)

4. some online-time with an IP-address not changing too quickly

5. a free TCP-port (you probably have to adapt your firewall or NAT)

6. a web-browser addressing localhost/127.0.0.1 without proxy (for security)



8) How can I get more information on Entropy? How can I help?

The Entropy homepage can be found online at the following URL: http://entropy.stop1984.com/

On the Entropy homepage you can find more information about how Entropy works as well as download various tools which work with Entropy.

If you have questions you can:

1. post a message to http://f27.parsimony.net/forum66166/

2. post a message to the "entropy" news board (from inside entropy)



9) Inner workings of Entropy (detailed)

Entropy is designed to meet several goals at once:

1. Distribute documents widely across the network, so that it is practically impossible to locate them (i.e. make censorship impossible).

2. Keep documents retrievable even if some of the nodes keeping copies of them are offline.

3. Hide from users, admins or authorities which node on the network actually keeps parts (fragments, bit-chunks) of which documents (key encryption).

4. Optimize data retrieval for the most common user base with ADSL, where the downstream bandwidth is several times the upstream bandwidth (e.g. 768 kbit/s vs. 128 kbit/s with many German ISPs).

5. Hide from network operators what kind of communication happens between nodes of the network (transport encryption).

These and some more goals have been achieved. Some still have to be tested and optimized on a large scale network.


Entropy is divided into several modules, which are run as separate processes (forked) that use shared memory to communicate with each other. These modules are:

1. Peer management with bandwidth limiter

2. Peer outgoing connections launcher

3. Peer incoming connections listener

4. Data store management

5. Freenet Client Protocol (FCP) server

6. HTTP gateway, aka proxy

a) Peer management with bandwidth limiter

This module and its process are rather simple. You can configure a maximum bandwidth to use for incoming and outgoing connections by setting a total number of bytes per second in entropy.conf. The limiter runs at 10 ticks per second (in the file include/config.h there is a line #define TICKS 10) and, on each tick, adds 1/10th of the per-second bandwidth to two shared memory variables, one for each direction. The socket I/O functions grab their required bandwidth from these variables. If, for example, a send buffer is 10K bytes and there are currently only 1000 bytes/s available, the sock_writeall() function will sleep and later try to get more bandwidth until it has finished writing all the data out. The various connection processes concurrently lock the bandwidth variables and take a fraction (80%, hard-coded in src/sock.c) of the bandwidth they require until they are done.

This method is perhaps far from perfect and some operating systems may have much better ways to limit a socket or network bandwidth. However, Entropy runs on systems where such things are not available and there is no general standard anyway. Entropy at least allows you to keep some spare bandwidth for your other jobs and this is all of what the bandwidth limiter is intended to do.

b) Peer outgoing connections launcher

Entropy has its very own way of finding out about possible peer nodes it can contact. Besides the initial list, which is defined by listing hostname:port or ipaddr:port lines in your seed.txt file (or whatever you name it in your entropy.conf), a node every now and then tells its outgoing connections about other connections it has.

These so-called node announcements are made inside the network. For this purpose, a node creates a file with zero contents but meta-data only. In this meta-data there is an XML text specifying some information about a node:

1. the node's IP address

2. the node's contact port (world-accessible port)

3. the node's preferred encryption module

This information is packed into an XML text of the form:

<?xml version="1.0" standalone='yes'?>

<p2p>

<peer hostname='aaa.bbb.ccc.ddd' port='nnnn' /> <crypto module='something' />

</p2p>

The hostname= attribute here actually contains an IP address. It could contain a hostname as well, but all hostnames are resolved prior to spreading info around, since this saves many hostname lookups.

The crypto module name is a hint for a node wanting to contact this peer how to try to talk to it. The peer might just ignore the incoming connection, if it does not (actually) want to support the incoming node's crypto module.

These announcements are now kept in documents (with zero length) in the meta-data part. The documents have known names. The current names are:

entropy:KSK@utc-timeslice-ipaddr:port

entropy:KSK@utc-timeslice-2-digit-hex-value

The first form is only used to derive a hash value from it. The first byte of this hash value is then used to create the second form. This KSK (key signed key) is then redirected to the content hash key that contains the meta-data with the p2p information. The timeslice part of both keys is a hex number derived from the current UTC (or GMT) time rounded to the next 10-minute timeslice. So an active node can be found in another node's data store under some key KSK@utc-current-timeslice-some-2-digit-hex-value.
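Assuming the timeslice is taken by integer division of the UTC time into 10-minute slices and the key strings are laid out as shown above, the derivation could be sketched in Python like this (the exact string layout and rounding are assumptions; function names are invented for illustration):

```python
import hashlib

def timeslice_hex(utc_seconds):
    # Assumed rounding: integer division into 10-minute (600 s) slices, as hex.
    return '%x' % (utc_seconds // 600)

def announcement_keys(ipaddr, port, utc_seconds):
    """Return the two assumed announcement key forms for a node."""
    ts = timeslice_hex(utc_seconds)
    first = 'entropy:KSK@%s-%s:%d' % (ts, ipaddr, port)
    # The first byte of the SHA1 hash of the first form yields the 2-digit slot.
    slot = hashlib.sha1(first.encode()).hexdigest()[:2]
    second = 'entropy:KSK@%s-%s' % (ts, slot)
    return first, second
```

The important property is that any node can recompute both forms for the current timeslice, so the 256 possible slots of the second form can be enumerated without knowing any peer addresses in advance.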

A node's outgoing connections launcher looks up all 256 possible keys of the current timeslice until it finds an entry that is:

1. not yet connected

2. not blocked (due to errors or permanently)

3. resolvable (if it is a hostname)

and then tries to make a new outgoing connection to that peer. The rate at which outgoing connections are sought decreases with the number of already existing outgoing connections. In other words: the more connections your node already has, the slower it will make new ones. If a node has all of its outgoing connections used up (currently 32), it does not search for more until one or more connections are dropped.

If a connection fails, the peer's IP address and port are entered into a failure list (in shared memory). If this happens a second time within a certain period (currently 1 hour), the peer's IP address and port are blocked for 10 minutes: they are put into the blocked list with a timestamp of now + 10 minutes, which means the peer_node_search() function in src/peer.c will skip this ip/port for some time.
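A minimal Python model of this failure/block bookkeeping might look as follows (the data structures and names are illustrative; the real lists live in shared memory and are consulted by peer_node_search() in src/peer.c):

```python
import time

FAIL_WINDOW = 3600   # a second failure within 1 hour triggers a block
BLOCK_TIME = 600     # blocked peers are skipped for 10 minutes

failures = {}        # (ip, port) -> time of the last failure
blocked = {}         # (ip, port) -> time until which the peer is skipped

def record_failure(peer, now=None):
    """Note a failed connection; block the peer on a repeat within the window."""
    now = time.time() if now is None else now
    last = failures.get(peer)
    failures[peer] = now
    if last is not None and now - last < FAIL_WINDOW:
        blocked[peer] = now + BLOCK_TIME  # timestamp of now + 10 minutes

def is_blocked(peer, now=None):
    """True while the peer's block timestamp lies in the future."""
    now = time.time() if now is None else now
    return blocked.get(peer, 0) > now
```

A single failure does not block a peer; only the second failure inside the one-hour window does, and the block expires on its own once the timestamp passes.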

You might wonder whether two nodes could produce the same 2-digit hex value for a specific time slice. Yes, they could, and sometimes they will. It just isn't much of a problem, since one of the keys will still be somewhere on the network. The next time slice changes the resulting numbers, and the probability that two nodes constantly collide in their announcement keys is very low (unless they had the same ipaddr/port - which cannot happen if you think about it :). Should the net ever become so big that 256 slots are not sufficient for every node to find enough outgoing connections after some time, I might add another digit of the hash and then search 4096 slots. I don't believe this will ever be needed, though.

c) Peer incoming connections listener

This module and its process are rather simple. It sets up a socket, binds it to the special IP address 0.0.0.0 (INADDR_ANY) with the port number specified for Entropy (nodeport= in entropy.conf), so that anyone can contact it, and then listens for incoming connections. Whenever accept() returns a new connection, a new child process is started to handle it.

This child process first checks that there is a free slot for incoming connections. If a node has 32 incoming connections, it won't accept any more. This has not happened yet, since the network is still too small, but it is expected to eventually.

Then the child process reads an initial message from the peer node that contains an XML text block of the form:

<?xml version="1.0" standalone='yes'?>

<p2p>

<peer hostname='node.somewhere.net' port='nnnn' /> <node type='entropy' major='0' minor='0.30' build='xyz' /> <store fingerprint='[32 hex digits]' />

<crypto module='something' size='xxx' initial='[2 * size hex digits]' /> </p2p>

Do some of the tags and attributes look familiar? Yes, the message is similar to the node announcement's XML text and is in fact parsed by the same functions. It carries some more information, since in this case the node hostname= itself is contacting us. At least, that is what we assume. One of the first things the peer_in_child() function in src/peer.c does is verify the hostname against the incoming IP address. The incoming address is set by the listener process during accept() and kept in the connection info. If a hostname lookup matches the IP address, Entropy updates its internal lists of contacted peers to show hostname:port for every connection to that IP address. But this is just cosmetics. If a node claims to be whitehouse.gov and is (most probably) not at this address, it will still work with Entropy. And since its address, not its hostname, will be spread around inside the net, it will also be contacted by other nodes - sooner or later.

The node tag and its attributes should be obvious. This is purely informational for now but might be used to e.g. block certain implementations or (broken) builds from polluting the network. Of course, someone with malicious intent could fake these fields, as they are not verified in any way.

The fingerprint attribute of the store tag serves an important purpose. A node's fingerprint determines its place in the routing inside the network. Such a fingerprint consists of 16 byte-sized values, i.e. unsigned numbers between 0 and 255. The node derives its own fingerprint by weighing the numbers of keys of certain kinds in its data store. For routing purposes and the fingerprint, Entropy simply uses the last digit of the SHA1 hashes of the keys. So if a node has at most 1000 keys ending in '9', 500 ending in '3' and 100 ending in 'f', plus some more keys in unimportant amounts (say below 10), its fingerprint will look like this:

01 03 00 7f 00 02 00 00 01 ff 00 00 02 00 00 19

You see the ff in 'slot' number 9 (counted from zero), the 7f in slot number 3 and the 19 in slot number f. These numbers determine when a node is asked for a key (request) and when it is told about a new key (advertise).
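The example values above are consistent with a simple linear scaling where the largest count maps to 0xff. A Python sketch of such a fingerprint calculation (the exact scaling rule is an assumption inferred from the example, not taken from the source):

```python
def fingerprint(counts):
    """counts: list of 16 key counts, indexed by the last hex digit (0..f).

    Assumed rule: scale linearly so the largest count maps to 0xff.
    """
    peak = max(counts)
    if peak == 0:
        return [0] * 16
    return [count * 255 // peak for count in counts]
```

With 1000 keys ending in '9', 500 in '3' and 100 in 'f', this reproduces the ff, 7f and 19 slots of the example fingerprint (500 * 255 // 1000 = 127 = 0x7f, 100 * 255 // 1000 = 25 = 0x19).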

Fingerprints are updated from time to time (currently every five minutes), when a node sends its current fingerprint as a message to all contacted peers. The difference from other messages is that this special message contains a hops-to-live value of zero (messages with hops to live zero wouldn't normally leave a node). So if a node's fingerprint changes - slightly or dramatically, e.g. because it collected a lot of data locally or from other peers - its peer nodes will be informed about what to expect from it or what to send to it. The whole story with fingerprints is nothing more (and nothing less) than a way to load-balance the network and to diverge keys, so that they are kept in different places in the network.

In the final tag of the initial message (crypto), a node tells its peer which crypto module it intends to use and what initial data (comparable to a session key) it will use. The crypto module= names are simply placeholders for the implemented methods. I am experimenting with some stream-cipher number generators, and Entropy currently prefers the module 'crypt3', an implementation of an S-box algorithm. Other modules do nothing (crypt0 is a null layer), are obsolete (crypt1 is the old, now unused method of Entropy up to 0.2.x), or are experimental and not working.

Since this initial message is currently not encrypted, the whole communication between two nodes could be eavesdropped by simply logging the initial= attribute of a connection and re-running the same algorithm. The communications encryption will only become fully meaningful once node-key handling is in place, so that the initial message can be sent encrypted, too; still, it makes it impractical for a casual listener to follow the communications on the line.

For the time being, this communications-layer encryption is nothing more than hiding away what goes on on a connection. Still, this is much better than what most P2P networks do about their users' privacy: nothing.

d) Data store management

Entropy's data store is kept in a tree of directories below a configurable base path; storepath=store is the default in entropy.conf. Depending on the setting of storedepth=x in the configuration, there is a zero, one, two or three level tree of directories. Every directory's name is just one lower case hex digit, that is a number between 0-9 or a-f.

The best storedepth to choose depends on the abilities of your filesystem. For most Unix filesystems and also NTFS, a storedepth of 1 seems to be a good choice. Whenever Entropy is looking for a key or for a place to store a key, your system has to traverse the directory contents to see whether a given filename already exists. If you intend to have a huge store, it is probably wise not to use a flat directory, that is, all files in one directory (storedepth=0), because then several hundred thousand or even a million filenames must be scanned for every action.

On the other hand, it is not wise to choose deep nesting for the directories if you're going with the default store size or not much more. The reason is that your system can keep only some directories in memory. It will try to do this for recently accessed directories, but perhaps not for 256 (storedepth=2) or even 4096 (storedepth=3) of them. On FreeBSD with a UFS file system I got the best results with storedepth=1.

Now how does Entropy decide where to look for a file? Each file's name is a 40-hex-digit string representing the SHA1 hash value for the file (= key). For most files this is the SHA1 hash of the contents of the file: for the bit chunks. For redirecting keys (CHK@, SSK@ and KSK@) however, the file's name is the SHA1 hash of the contents you would get if you reconstructed the file from the list of chunks inside the key.

Right now the contents of the CHK@, SSK@ and KSK@ keys are not encrypted. They are stored as plain XML text files - possibly gzipped for larger files. In the next stage of Entropy's development those keys will be encrypted. Only then will no one (not me, not you, not any authority) be able to tell exactly what is contained in your data store.

Until this is done it is possible to scan for the XML texts, reconstruct a key hierarchy, and then check whether and which chunks of the keys are in your data store. This is bad and will thus become impossible soon.

For the details of how Entropy chooses a directory for a key, take a look at src/store.c. The last four digits of the SHA1 hash are used to pre-select a directory and to determine the routing. The fourth digit from the right determines the first directory level, the third digit the second level (if any), and the second digit the third level (if any). The last digit is not used in the store, but to build fingerprints and to decide on routes.
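Under these rules, the directory selection can be modeled in Python like this (a sketch; the authoritative logic is in src/store.c, and the function name is invented here):

```python
import os

def store_path(storepath, sha1_hex, storedepth):
    """Pick the store location for a key from its 40-hex-digit SHA1 name.

    The 4th digit from the right selects directory level 1, the 3rd digit
    level 2, the 2nd digit level 3; the last digit is reserved for
    fingerprints and routing and is not used here.
    """
    assert len(sha1_hex) == 40 and 0 <= storedepth <= 3
    levels = [sha1_hex[-4], sha1_hex[-3], sha1_hex[-2]][:storedepth]
    return os.path.join(storepath, *levels, sha1_hex)
```

With storedepth=0 every key lands directly in the store directory; with storedepth=1 the keys are spread over 16 subdirectories, with storedepth=2 over 256, and so on.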


e) Freenet Client Protocol

The Freenet Client Protocol, as the name suggests, was designed by the developers of Freenet. It is a specification for a set of names and conventions by which a client application talks to a Freenet node. Entropy implements a very similar interface, except for some minor differences that should not hurt any well-designed client (some differences do hurt FCP clients, but only because those rely on undocumented details or definitions of Freenet).

A Freenet node usually listens on 127.0.0.1:8481 for FCP clients; that's why I chose the default port number (or service) 8482 for Entropy. You can run both Freenet and Entropy on the same machine without collisions.

After a client has made a socket connection to the FCP port, it sends a request that is plain text for the most part. But prior to any text it must send a header of four bytes. This header might be used for protocol identification in the future. Right now the only possible and accepted header is 00 00 00 02 - this is true for Freenet as well as for Entropy, and all FCP clients seem to have it hardcoded.

The text part of every request consists of two or more lines, each terminated with LF (\n in C notation). The first line is the type of request being sent. Then there can be zero or more lines with parameters for the request, and finally there is a message-terminating line.

The list of commands or requests that entropy understands is as follows:

ClientHello ... EndMessage

ClientGet ... EndMessage

ClientPut ... Data

GenerateCHK ... EndMessage

ClientDelete ... EndMessage

GenerateSVKPair ... EndMessage

FECSegmentFile ... EndMessage

FECEncodeSegment ... Data

FECDecodeSegment ... EndMessage

FECMakeMetadata ... Data

GenerateSHA1 ... Data


ClientHello

The client says "Hello!" to the node and expects the node to reply. Nodes are friendly beings and usually tell who they are and what they are willing to do. The request has no parameters, so the second line is an EndMessage text. After sending this line, you should read from the socket until end-of-file and/or until you have received an EndMessage line. So this is what your client application should send on the socket connection to an FCP server (values in square brackets are binary, i.e. bytes):

[00][00][00][02]

ClientHello

EndMessage

Then you can expect the server to reply with some lines like this:

NodeHello

Node=ENTROPY,0,3.0,215

Protocol=1.2

MaxFilesize=1fffff

EndMessage

This is the reply you'll receive from Entropy. The Node= line contains the node's name, version and build numbers (they will increase with every release). The Protocol= line describes the current version of the implemented FCP features. Entropy is not fully compatible with Freenet (yet) but nevertheless replies with 1.2 so as not to confuse some clients. The MaxFilesize= line is the smaller of 2GB - 1 and storesize - 1.
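The handshake above can be exercised with a minimal FCP client. The following Python sketch follows the protocol as described (four-byte header, LF-terminated lines, EndMessage terminator); the function name is invented, and error handling is omitted:

```python
import socket

FCP_HEADER = b'\x00\x00\x00\x02'  # the only accepted protocol header

def client_hello(host='127.0.0.1', port=8482):
    """Send ClientHello to a node's FCP port; return NodeHello fields as a dict."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(FCP_HEADER + b'ClientHello\nEndMessage\n')
        data = b''
        # Read until end-of-file or until an EndMessage line arrives.
        while not data.endswith(b'EndMessage\n'):
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    # Collect the Key=Value lines of the NodeHello reply.
    fields = {}
    for line in data.decode().splitlines():
        if '=' in line:
            key, _, value = line.partition('=')
            fields[key] = value
    return fields
```

Against an Entropy node on the default port, the returned dict would hold the Node=, Protocol= and MaxFilesize= values shown above.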

As of build number 284 and newer, Entropy defines an additional protocol header that is used to support clients which understand some additional replies sent by the node. The header for this extension is:

[00][00][01][02]

ClientHello

EndMessage

If your client sends this header, there will be two more fields in the NodeHello reply:

NodeHello

Node=ENTROPY,0,3.0,215

Protocol=1.2

MaxFilesize=1fffff

MaxHopsToLive=a

SVKExtension=BCMA

EndMessage

These two fields, MaxHopsToLive and SVKExtension, are unique to Entropy; Freenet does not return them in a NodeHello. The default SVKExtension for a client should be PagM (if it is intended to run on both Freenet and Entropy), because this is how Freenet extends a public sub space key (SSK@). For Entropy this extension is BCMA. So if your client sees the line SVKExtension=BCMA, you should change your (otherwise hard-coded?) defaults. The MaxHopsToLive value can be used to scale your client's range of retries when inserting or fetching data. Freenet is usually configured to support a maximum HopsToLive of 25 (decimal; 19 hex); however, there is no way for a client application to know the current setting.

ClientGet

ClientGet is the work-horse of the FCP. It is used to request data that was inserted under a specific key. Each ClientGet request must at least specify the URI of the request (uniform resource identifier) and should specify a HopsToLive= value. In the case of Entropy, the URI is of the form entropy:CHK@xxxxxxxx,yyyyy for a content hash key, entropy:SSK@xxxxxxxx,yyyyyy/path/file for a file below a sub space key, or entropy:KSK@somename for a key signed key (which in Entropy is nothing but a redirecting key to the CHK@ of the contents of the file). The HopsToLive= value is specified as hexadecimal digits, e.g. HopsToLive=a for a request that should go 10 hops at most. So a request looks like this:

ClientGet

URI=entropy:KSK@gpl.txt

HopsToLive=19

EndMessage

This would request the infamous test key gpl.txt with a hops-to-live value of 25 (19 hex is 25 decimal). The reply from the node depends on several things: whether your request was formally okay, whether the key can be requested at all (outgoing connections with sufficient routes for the key), and whether the data is available - in the local store, or perhaps after some incoming connection sent it. Here's the list of possible replies:

NodeFailed

NodeFailed

Reason=Some description of the reason why the request failed

EndMessage

The node failed to successfully complete the get request, usually because of some internal problem. You shouldn't see this reply too often, except for broken builds or bad configurations.

URIError

URIError

URI=xxx

EndMessage

Your URI was wrong. This could be for one of several reasons:

1. You specified a CHK@ in the wrong format (is it perhaps a Freenet CHK@?), or you specified an SSK@ without the correct SVKExtension (BCMA for Entropy).

2. The URI is too long or contains a typo. Watch out for URL-encoded strings; you will have to decode them.

3. Characters of an Entropy URI that might be URL-encoded include the at sign (@), the tilde (~) and the colon (:).

RouteNotFound (aka RNF)

RouteNotFound

Reason=No route found

EndMessage

A suitable route for requesting a key (either the main key or one or more of the bit-chunk keys that follow it internally) could not be found, even after some retries. This message is to be expected more often, especially in these cases:

Your node has no or very few outgoing connections. Be sure to check that your seed.txt has some valid entries and that your DNS is able to resolve the nodes listed there.

Your node is overloaded with requests and hardly finds enough free queue entries in outgoing connections to handle your local requests. You could try to lower your inbound bandwidth or increase your outbound bandwidth. As a last resort you might try HopsToLive values closer to the maximum, because this leads to a broader spreading of keys (to not-so-well-matching routes, too).

DataNotFound (aka DNF)

DataNotFound

Reason=No data found

EndMessage

This is the most common message. A key you requested could not be found. This could be for one or more of several reasons:

1. You have no inbound connections (look at /node/peers.html).

2. The key never existed.

3. The key existed but fell out of the network, because all participating nodes' data stores were full.

4. The key is somewhere, but it did not make it to the nodes that your node has contacts to (or to be more specific: nodes that did contact your node).

Only in the latter case does it make sense to try fetching the key again, perhaps with an increased HopsToLive value. It can also make sense to retry with the same or even a lower HopsToLive value, because it could simply be that high network load kept the key from arriving close to your node yet.

The general format of a ClientGet request is:

ClientGet

URI=xxx

[HopsToLive=xx]

[MaxFileSize=xxxx]

[Verbose={boolean}]

EndMessage

The lines in square brackets are optional fields for the request. Note that the MaxFileSize= and Verbose= fields are Entropy extensions; Freenet does not support them. MaxFileSize= expects a hexadecimal value for the maximum size your application wants to receive. This can be used to limit the impact of retrieving unknown keys in a polling client application, if malicious spammers are polluting your name space. Entropy uses this option itself, internally, to limit the size of news messages to 64K; no message longer than this limit will be retrieved or displayed.

The Verbose= option is also Entropy-specific. Freenet sends a reply only in case of errors or retries (Restarted message). Entropy can be more verbose: with Verbose=true (or Verbose=yes, Verbose=1), your client will receive a NodeGet message for every fragment that Entropy is trying to collect and reconstruct. The format is this:

NodeGet

URI={internal-fragment-key}

Offset=xxx

DataLength=xxx

Percent=dd%

EndMessage

You can use this info to display some kind of progress information for long lasting downloads. Note that the Offset and Percent values may "jump back" after an internal retry.

The general reply for a successful request is:

DataFound

DataLength=xxxx

MetadataLength=xxxx

EndMessage

This is the general format of the node's reply if the data you requested could be retrieved. The DataLength and MetadataLength lines tell your application what amount of meta data and data will follow. Note that DataLength is the total number of bytes following, including the meta data! So a reply with DataLength=123 and MetadataLength=23 means you should expect 23 hex (35 decimal) bytes of meta data first, followed by 100 hex (256 decimal) bytes of document data. I don't know who invented this specification - I would have designed it differently; anyway, the meta data and data now follow in packets:

The DataChunk reply

DataChunk

Length=xxxx

Data

{raw binary data}

This is how the node sends the meta data and data to your client application. There is no guarantee about the alignment of meta data (if any) and data chunks. For Entropy the meta data part will come in its own chunk (or multiple chunks), but you cannot rely on this. You will have to count the number of bytes given in the DataFound header to know where meta data ends and raw document data starts.
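The byte counting described above (DataLength includes the meta data, and chunk boundaries carry no meaning) can be illustrated with a small Python helper. The function is hypothetical, written for this illustration only:

```python
def split_payload(datalength_hex, metadatalength_hex, chunks):
    """Reassemble DataChunk payloads and split meta data from document data.

    DataLength counts ALL bytes that follow, meta data included, so the
    document data is whatever remains after the first MetadataLength bytes.
    Both lengths arrive as hexadecimal strings.
    """
    total = int(datalength_hex, 16)
    meta_len = int(metadatalength_hex, 16)
    payload = b''.join(chunks)       # chunk boundaries are arbitrary
    assert len(payload) == total     # we received exactly DataLength bytes
    return payload[:meta_len], payload[meta_len:]
```

Using the example from the text, DataLength=123 and MetadataLength=23 split into 35 bytes of meta data followed by 256 bytes of document data, regardless of how the node chose to chunk them.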

ClientPut

GenerateCHK

ClientDelete

GenerateSVKPair

FECSegmentFile

FECEncodeSegment

FECDecodeSegment

FECMakeMetadata

GenerateSHA1

This command is a helper for clients that have no native function to generate a SHA1 hash for a sequence of data bytes. Many languages have such a function or a library you can include, and using one of those will be faster most of the time. Sending huge files over a socket can be adventurous, so use this only as a last resort.


10) i18n: Supported Languages

Currently English and German are supported. Entropy has an external translation file that allows for easy translation into additional languages. Entropy supports UTF-8.


11) P2P

There have always been ways to protect one's e-mail communication. You have probably heard about Pretty Good Privacy (PGP) from Phil Zimmermann, or about the GNU Privacy Guard (GnuPG, or GPG for short), which comes as free software under the GPL (GNU General Public License). Those programs are very well suited to making it hard for eavesdroppers to read your electronic communication with others.

But even if the contents of the communication are encrypted, unveiling with whom you had e-mail contact, and when, raises a lot of privacy concerns. That's why I was looking 'for more'. Please don't take this too seriously ;-) It is not that I want to replace or overcome PGP or GPG, but rather to look for another way to anonymize data and information exchange. And I want to achieve impossibility of censorship, if this goal is reachable at all.


a) What is a Peer-To-Peer network (P2P)?

Peer-To-Peer means 'hand-in-hand' or 'face-to-face' and describes how computers are connected to each other. The most common form of connection on the Internet has one computer doing the job of a server and one or more (many) clients. This approach distinguishes between who serves and who requests. True peer-to-peer solutions, on the other hand, let computers play all roles: server, client, and sometimes also router or forwarder. While protocols such as HTTP or FTP have dedicated machines for the server job, with data stored in one location and always retrieved from there, many peer-to-peer networks have no dedicated servers, and data is spread over many computers in the network. This is the main difference of the re-invented p2p technology compared to many established services.

The difference between a server and a client on the Internet isn't actually that big. For a small server, which does not have to serve hundreds or thousands of requests, any 'normal' Internet connection is sufficient. A p2p network uses the fact that small networks with few connections can be handled by almost any (non-dedicated) computer. The network as a whole may hold a lot of data and also give high-speed access to that data, but these accesses do not have to be handled by a single, big machine. The 'nodes' of a p2p network serve the content together, in a distributed manner, and they also play the role of forwarding or routing the data. The main job of peer-to-peer software is to define and implement a routing protocol by which data can be put into and retrieved from a network of nodes. This routing connects 'neighbor' nodes to each other, where neighbor does not necessarily mean a geographically short distance, but a short distance in the network topology.

Once some connections between computers are established, they can be used to exchange texts like e-mails, messages like ICQ, or data and files. The files could be pictures, music, films... anything you can store electronically on your computer. The p2p network software now assigns a key (or shortcut) to any such file or message, and it is these keys that make their way through the network first. It would not make much sense to simply use one's local filenames as keys: many people use the same names for their files, like 'text1.txt', and still have different contents in them. So what is needed is a system to categorize and label files in a usable way. And it must be possible to look for a certain file or message on the network and identify it correctly.

b) About keys and hashes - specifically SHA1

The key to bringing some order into the chaos is the key, or hash value, of a file. A hash value is something like a shortcut or handle describing the contents of a file in a short but unique way. There are several methods to find such unique hashes for a file's contents. One of the newer algorithms used for it is SHA1 (secure hash algorithm 1), and it is described here, for example.

To describe the function of a hash value in a non-technical way, assume the aim were to find a unique name for every file on your hard disc. Unique not only for your hard disc but unique in the whole world (or even unique in the universe and all other universes). If you were to use filenames like 'file1.txt', 'file2.txt' and so on, you would not get far. The 'trick' of SHA1 is that it does something like adding up the characters (or bytes) of a file. Of course it does not simply add the values of the bytes, because otherwise a file containing '12' would yield the same SHA1 hash as a file containing '21' (and it does not). The effect however should be clear: every file gets its own unique 'number' assigned based on its contents. This 'number' is the hash value, which in the case of SHA1 is a 160-bit number.

It might seem astonishing that it should be possible to uniquely identify any file, but the length of the hash value and the quality of the hash algorithm are the reason why unique hashes are possible. As I said before, the SHA1 hash length is 160 bits, so there are 2^160 possible keys.

Do you remember the story of the emperor of China who was told about a new game, 'chess', and who wanted to give something to its inventor? The inventor seemed to wish only a small fee for his invention: he asked the emperor to put one rice corn on the first field, two corns on the second, four on the third and so forth... The emperor was at first impressed by the modesty of the inventor; however, he did not expect what this wish would really mean when he agreed to pay the price. The number of corns on the fields is 2^64 - 1, which is somewhat above 18'446'744'000'000'000'000 - and I don't know if there have been that many rice corns on earth since rice has been cultivated. Now, if you got that figure, imagine as many universes with earths as the number of rice corns would have been, and in every universe a Chinese emperor with a problem understanding exponential growth. And if you finally have embraced that idea somehow, imagine as many collections of universes, each with gazillions of emperors with unavailable amounts of rice corns: this is SHA1.

So we have solved one problem: we can assign a unique number to any file, any name, and even any version of any text where only a single letter is modified. This is what we need to tell anyone else in the world exactly which file (text, picture, music) he could get from us. Nobody can really remember 40-digit numbers (which is the length of 160-bit numbers written in hexadecimal notation). Even the fact that ENTROPY, just like Freenet, uses a different notation called 'base64', which reduces the length of the numbers to 27 digits, doesn't help too much. Computers, however, have no problem juggling 160-bit numbers; it is their everyday job, so the numbers aren't a problem at all. And for the human beings handling files there is another type of key, which I will describe now: key signed keys.
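Computing such a key takes a single library call in most languages. A minimal illustration in Python, using the standard hashlib module (the function name is invented for this sketch):

```python
import hashlib

def entropy_key(data):
    """The 40-hex-digit key for some contents: its SHA1 hash."""
    return hashlib.sha1(data).hexdigest()

# A file containing '12' and a file containing '21' hold the same bytes
# in a different order, yet they get completely different keys:
key_12 = entropy_key(b'12')
key_21 = entropy_key(b'21')
```

This demonstrates the point made above: the hash depends on the order of the bytes, not just their sum, so even the smallest change in a file produces an entirely different 160-bit number.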

c) Content Hash Keys and Key Signed Keys (CHKs and KSKs)

The keys described in the previous paragraph are called content hash keys in technical terms. This term isn't restricted to Freenet or ENTROPY, but is a rather widely used term for the functionality of this type of key, which assigns a unique key to the contents of a file. This is why ENTROPY, too, uses the key type CHK, even if the technical details differ from those found in Freenet.

A name in Freenet, and thus in the ENTROPY project, results in a unique SHA1 hash value, too. The name itself is treated as file contents; that is, the characters making up the name are the contents. Such a name does not reference file contents directly, though: it references another SHA1 hash where the contents can be found. This kind of key is called a Key Signed Key, or KSK for short. Entropy uses this terminology just like Freenet does, though there are some differences in the implementation. You can find the contents of the GNU General Public License below a KSK with the name gpl.txt. In a browser window you can retrieve /gpl.txt, /KSK@gpl.txt or, written in the long notation, /Freenet:KSK@gpl.txt. If you retrieve this key, the Entropy code will quietly forward your request to the content hash key of the file gpl.txt. You can try it here if your setup uses the default values for the fcpproxy address and port. With a system like this I could, in theory, insert the contents of my entire hard disc into Entropy below KSKs like Freenet:KSK@pullmoll/harddisc/file1.txt etc. Anyone who knows the first part of the name (pullmoll/harddisc) and the filenames would then be able to fetch the files.

A very welcome and positive side effect of splitting the names from the contents is that if two people insert exactly the same file under different names, the network only holds one copy of the data plus the two (or more) references to the contents. In fact, re-using identical data goes even further: because files are split into fragments and chunks when they're inserted, even identical fragments of two different files will share the same data blocks (chunks).

It is important to keep in mind that this forwarding of names to contents, which key signed keys permit, is also a weak point. Keys of the type KSK are insecure, because there is no guarantee that you will find what a name suggests below the key. Simply put, two people running nodes, unconnected and unknown to each other, could insert two different files under the same name. What a requester retrieves when he asks for the KSK would then depend on which node replied faster to his request. If you want to be sure to retrieve a specific file, you have to request a CHK key. Those are impossible (as far as I know) to fake. But it would be quite possible to insert e.g. a picture below KSK@gpl.txt from a newly connected (and not yet well connected) node. This applies to Freenet, too; recently someone managed to fake the Freenet copy of KSK@gpl.txt, and you got a picture of some naked girl when you asked for that key.

So you should always be aware that a KSK is less secure and more questionable than a CHK. It is practically impossible to fake a CHK, and furthermore ENTROPY checks that only valid blocks (with matching SHA1 hashes) are inserted into a node's data store.
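That check is simple to state: before storing a block, recompute its SHA1 and compare it against the claimed key. A sketch of the idea (the function name `accept_block` is invented for this illustration):

```python
import hashlib

def accept_block(claimed_key: str, data: bytes) -> bool:
    """Only store a block whose SHA1 actually matches its claimed key."""
    return hashlib.sha1(data).hexdigest() == claimed_key

payload = b"some file fragment"
key = hashlib.sha1(payload).hexdigest()
print(accept_block(key, payload))             # True  - valid block
print(accept_block(key, b"forged fragment"))  # False - rejected by the node
```

This is why a CHK cannot be faked in practice: handing out different data under the same key would require a SHA1 collision.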

Fortunately there is a way out of this dilemma, which is induced by the KSK key type, and it is called Sub Space Keys, or SSK for short. A sub space is a region of the network where only one user - the one who knows a private key - can insert files.

d) Sub Space Keys (SSKs)

If every user of Freenet or ENTROPY had his own space, where only he had the right to publish data, then there would be no more name collisions. This is exactly what the Sub Space Keys achieve.

An SSK is really a key prepended to a file or path name. Only one person has the right to use this particular sub space. Of course, several persons could share a private key and publish under the same SSK. To keep things simple, however, we assume for now that the private key is in one person's hands.

Freenet utilizes a method known as DSA (Digital Signature Algorithm). An electronic signature of the sender, i.e. a signed file or message, can be verified against the sender's known public key. This means that in Freenet your node can verify that a file really comes from its claimed sender - with a very high probability.

However, dealing with electronic signatures and verifying them at the receiver's end is a very time-consuming job. It involves calculations with very big numbers, which are expensive to handle even on today's computers. I am not yet sure whether we need this scheme for Entropy, too. The reason I doubt it is that ENTROPY's main goal is to give you an anonymous way to communicate, not to give you guaranteed authenticity of content. There are other tools (e.g. GnuPG) to take care of that part of communication. And finally, you cannot trust content from an anonymous source 100% anyway. If you want to ensure the authenticity of messages, you should create a private and public key pair with GnuPG specifically for Freenet or Entropy, just as you would for (untrustworthy) e-mail transfers.

What Entropy does is give the average user no chance to accidentally overwrite the sub space keys of other users. It does so by simply hashing (SHA1) a random number (the private key), which results in the public key. This is a non-reversible operation, so no one is able to guess the private key from a publicly visible sub space key. However, since mapping an SSK@something/file is done by simply creating the content hash key for this string, a malicious attacker could guess e.g. the date based redirect or next edition key of a sub space and fake it by widely distributing data under a wrong CHK@. I want to see this happen before I continue to think about a solution to avoid this type of attack. For now, just be sure not to assume anything about files popping up under a certain sub space key. An Entropy sub space key tells you nothing about the authenticity of the contents. And I must admit that this could even be seen as an advantage, since no one can prove that a file under a certain SSK@ came from you either.
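The derivation described above amounts to a one-way hash of a random secret, with the lookup for SSK@something/file again being just a content hash over the combined string. A sketch under those assumptions (the helper name and the "/" joining convention are made up here):

```python
import hashlib
import os

# The owner keeps a random private key; its SHA1 is the public sub space key.
private_key = os.urandom(20)
public_ssk = hashlib.sha1(private_key).hexdigest()

def ssk_lookup_key(public_ssk: str, path: str) -> str:
    """Map SSK@<pubkey>/<path> to the hash the network is asked for."""
    return hashlib.sha1((public_ssk + "/" + path).encode("utf-8")).hexdigest()

# Without the private key, no one can find a preimage of public_ssk -
# but anyone who guesses a future path under it can precompute its lookup key,
# which is exactly the weakness discussed above.
print(ssk_lookup_key(public_ssk, "mysite/index.html"))
```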

To sum it up: SSKs in ENTROPY are used as a means to avoid name collisions, not as certificates of authenticity. Deriving any form of security from SSKs is the wrong approach. No one can say that contents appearing under my SSKs really came from me - and the same applies to anyone else's content. We will see whether this is a problem or an advantage. As long as cooperation and the exchange of data are the main interest of the majority, it should not be a problem.

I can't stress the consequences enough: do not execute programs downloaded from this network. You should never run programs from an unknown, untrusted source. Anyone who does might just as well lend his door keys to a stranger on the street. You can do this, but think hard about what you are doing.

Finally, a comment on Freenet, DSA and signed contents. If you trust this system, it means you trust the algorithm, the implementation, and the current source code or binary on your machine, which (supposedly) does DSA. Did you understand DSA? Did you understand the implementation? Did you read the source code and verify that it really detects wrong signatures? What I want to say is this: there is a long chain of things to verify and double check. Security is not in a program; security is a whole system. Or how do you verify that pgp.exe or /usr/local/bin/gpg is still the binary that was built from verified source code, and not a Trojan? I admit: I don't...

e) McEliece Crypto & MECH

Error correcting codes in a public key algorithm: McEliece cryptography is used in Entropy and in MECH (McEliece Crypto Harness). Entropy uses this method to encrypt the initial node communications. MECH is a simple PGP or GnuPG substitute featuring the crypto routines from Entropy: the McEliece PKCS for the public and private keys, and the Lili2 PRNG bit stream used to encrypt messages and the secret keys.

Starting from the easy to follow observation that intentionally induced errors in a cipher text make it much harder for a cryptanalyst to decipher the message, McEliece in 1978 had the idea of using error correcting codes as the basis for a crypto system. In this system the generator matrix of a Goppa code is disguised as the generator matrix of a general linear code by matrix multiplication. Since decoding a general linear code is NP-hard [1], while a Goppa code can be decoded efficiently, one can view the matrix multiplication as a one-way function and the secret decomposition into the single matrices as the trap-door information.
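The disguising step can be sketched with tiny 0/1 matrices. This is a toy only: G below is not a real Goppa generator matrix, and real McEliece parameters are far larger:

```python
def matmul_gf2(a, b):
    """Multiply two 0/1 matrices modulo 2 (arithmetic over GF(2))."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] & b[k][j] for k in range(inner)) % 2
             for j in range(cols)] for i in range(rows)]

# Secret pieces: a scrambler S, the structured generator G, a permutation P.
S = [[1, 1, 0], [0, 1, 0], [1, 0, 1]]            # invertible over GF(2)
G = [[1, 0, 0, 1, 1], [0, 1, 0, 1, 0], [0, 0, 1, 0, 1]]
P = [[0, 0, 0, 0, 1], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 0], [1, 0, 0, 0, 0]]           # swaps columns 0 and 4

# The published key G' = S*G*P looks like an arbitrary linear code;
# recovering the factors S, G, P from G' is the trap-door secret.
G_pub = matmul_gf2(matmul_gf2(S, G), P)
print(G_pub)  # [[1, 1, 0, 0, 1], [0, 1, 0, 1, 0], [0, 0, 1, 1, 1]]
```

An attacker who only sees G_pub faces the general (hard) decoding problem, while the legitimate owner undoes P and S and decodes with the fast Goppa structure.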

[1] Arto Salomaa, Public-Key Cryptography. EATCS Monographs on Theoretical Computer Science, Vol. 23, Springer, 1990.

Key creation, Encryption, Decryption, and Security details as well as a mathematical example are available on the Entropy website.