Review of "A Measurement Study of Peer-to-Peer File Sharing Systems"
Baoning Wu

This paper focuses on measuring two peer-to-peer systems, Napster and Gnutella. The measurement aims to characterize the population of end-user hosts in these p2p systems, for example their bandwidth and IP-level latencies. The paper presents many figures of experimental results along with a good deal of analysis.

Strengths of this paper:
1. The paper gives a lot of experimental results about p2p networks, and we can get much useful information from the figures and analysis. For example, we can learn how many users connect to the p2p system over a T1 link, how many over cable, etc. These results broaden our view of p2p systems.

2. The measurement targets the characteristics of end users, such as bandwidth. Some results should therefore help protocol designers take these characteristics into account and adjust the protocol to make the system more balanced. For example, a servent could respond at different speeds to peers depending on their latencies, or delay its responses by different amounts for hosts with different bandwidths.

3. The paper uses several kinds of figures, such as line plots, bar charts and scatter plots, which makes it lively. Even the topology is shown in Figure 15.

Weaknesses of this paper:
1. The experiment duration is rather short. They measured Napster for only 4 days and Gnutella for only 8 days, which may not be enough to understand such large systems. Also, their crawler seems to be single-threaded, or to start from a single point in the system. It would be better to run multiple crawlers simultaneously from many parts of the world, so that more data could be collected.

2. There seems to be some inaccuracy in the latency measurement. Latency is quite dynamic, and from their description they send only a TCP packet to measure it. It is quite possible that a traffic burst occurs at the moment of measurement. It may be better to measure each host several times at different times, calculate the average, and use that value as the latency to the host.

3. The authors believe they measured the extent to which peers deliberately misreport their bandwidths. But from this information we still cannot tell how willing peers are to cooperate; maybe they do not know their exact bandwidth or do not know how to set it correctly. This suggests something for protocol designers: let the p2p system measure the bandwidth itself instead of having users configure it. And if the system can find out the bandwidth, why bother asking users to report it?

4. Gnutella and Napster have different structures, and this may explain some of the experimental results, but the paper does not take this into account. For example, Gnutella hosts may send more packets to maintain the mesh structure of the system. Maybe this accounts for why Gnutella hosts have longer uptimes than Napster hosts, since a Gnutella host needs some time to join the mesh and sends some messages when leaving it. More explanations of this kind might be found for other issues in the paper.

5. On the nature of shared files, it seems that more work could be done. For example, file information can be obtained from other nodes, so the files could be categorized by type, such as MPEG, MP3, GIF, etc. This would yield more figures or results beyond Figure 11 in the paper, and we would learn more about how different kinds of files are distributed in p2p systems.

6. How did the authors know that some nodes are connected to the system through DSL or cable? I am not sure about this, since nothing in the p2p protocol tells us. They should explain how they obtained this information.

7. Many features are measured in this paper, and the authors use several tools to run the experiments, such as a crawler written in Java and the Sting platform. We might therefore think about building one complete tool that can measure all these features of a p2p system. This is a possible research topic: How should it be built? Which features should it test? Which systems should it test?

8. Their research must have some open problems or possible extensions, so future work is quite possible, but the paper does not discuss any.
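
To make the suggestion in weakness 2 concrete (probe each host several times at different times and combine the samples), here is a minimal sketch. It is not the authors' tool: the function names, the probe count and the gap between probes are all my own assumptions, and it times a full TCP connect as a rough stand-in for round-trip latency.

```python
import socket
import statistics
import time


def summarize_latency(samples_ms):
    """Combine several latency samples (in milliseconds) into mean and median."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return {
        "mean": statistics.mean(samples_ms),
        "median": statistics.median(samples_ms),
    }


def probe_once(host, port, timeout=3.0):
    """Time one TCP connect to (host, port); a rough proxy for round-trip latency."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000.0


def measure_host(host, port, probes=5, gap_s=60.0):
    """Probe a host several times, spaced gap_s seconds apart (assumed values)."""
    samples = []
    for i in range(probes):
        try:
            samples.append(probe_once(host, port))
        except OSError:
            pass  # host unreachable in this round; skip the sample
        if i < probes - 1:
            time.sleep(gap_s)
    return summarize_latency(samples) if samples else None
```

For example, the samples 40, 42, 41 and 400 ms give a mean of 130.75 ms but a median of 41.5 ms, so if one probe happens to hit a burst, the median may be a better summary than the plain average.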

Generally speaking, this is a good report because it provides so much data. Many p2p researchers can get helpful hints from this paper. Still, more work needs to be done to make it a better paper.