RIPE 58 - Amsterdam, the Netherlands

The plenary session commenced as follows:

CHAIR: Good morning to you all. I think we are just past 9:00, so we will get things started for this morning. For the opening plenary session of Tuesday, we have three speakers this morning, covering a couple of different topics, and hopefully something of interest to everyone.

So our first speaker is Tom Vest, if you wish to ascend the stage, and he will be talking about adverse selection risks and IPv4 transfer markets in all their glory.

TOM VEST: Thanks very much, thanks for giving me a chance to talk. I changed the name of the talk to reflect the content a little more. I will just introduce the simulation that we have been working on, which provides an overview of industry-wide, sort of pooled risks and opportunities associated with the looming addressing transition. OK, I think most of you know me; my name is Tom Vest, external consultant with the Science Group. This is work which is generously supported by the Science Group, but any conclusions or observations are mine alone, personally.

I am struck by all of the talks I see, all the time, on v6: almost everything that is said is expressed in terms of long-term risks if you decline to adapt to the reality sooner rather than later, and long-term opportunities. It's basically a mix of extensive discussions about risks and opportunities associated with the change of state, an inflection point in the -- and so this simulation I have been working on is a way to try to integrate those kinds of things, to express them and to investigate them systematically. So I guess we all know that there will be some kind of expectation of a requirement for accommodating the reality of the addressing transition, by both incumbent, current operators as well as any aspiring new entrants that emerge any time after the last day, when the unallocated pool is exhausted. So, in some sense, the need for operators to adapt to a future v6, or a potential future v6, is again sort of a medium to long-term risk-based strategy or proposal. For new entrants who might emerge on the day after the unallocated v4 pool is exhausted, it is the converse: securing some minimum v4 in order to enable basic connectivity with the rest of the Internet is a basic prerequisite. There is a 100 percent extinction risk for a new entrant which is unable to accomplish that; that new entrant will not enter, will not exist. So we have basically two kinds of populations, or candidates for surviving the looming transition, each with a very different kind of risk and opportunity profile.

To put the relative populations in context: for some reason or other the caption disappeared from this, but basically this is sort of a decade-long view of the series of initial allocations. So this is not all allocations; this is basically the closest approximation we have to new entrants, people that are arriving at the RIR for an initial allocation of IPv4. This is a quarterly view. When you normalise it down to a daily view, during very low periods, for example the end of 2002 and 2003, there are approximately two to three aspiring new entrants which walk away with an initial allocation every single day, somewhere in the world; so that is the global average. RIPE's part of that would be a little bit less than a third. And that 2 to 3 per day can double or more when the industry is actually humming along. So that is the daily demand for new entry in the marketplace that whatever future addressing regime we have is going to have to accommodate, if we are going to continue to enjoy the kind of growth process and dynamics that we have enjoyed to date.

So, try to visualise this as sort of trying to graft an incompatible new component onto a core of existing, kind of homogeneous resource; it's just a way of thinking about the task ahead. Thinking about the average annualised new entry rate, and considering churn and everything else in the current population demographics of the Internet, if nothing else changes, the population of, say, notionally native v6, post-run-out new entrants might approach the population of incumbents, those who got started based on native v4, you know, who got their resources before the run-out, in 15 to 20 years. So that is assuming nothing else changes at all. Again, this is sort of an approximation of the run rate for new entrants looking forward, assuming that every single year is not a depression, deep recession kind of year, as I assume the first year will be.
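As a rough check on the 15 to 20 year horizon, the arithmetic can be sketched as follows. This is only a back-of-the-envelope illustration, not the speaker's simulation: the function name `years_to_parity` is hypothetical, the incumbent count of 15,000 is an assumption (a figure of that order is floated as a rough count of operating entities in the Q&A of this session), and the entry rate is the low-period global figure of two to three initial allocations per day. Churn and any change in the entry rate are ignored.

```python
def years_to_parity(incumbents: int, entrants_per_day: float) -> float:
    """Years until cumulative new entrants equal the incumbent population.

    Simplification: constant entry rate, no churn, no mergers.
    """
    return incumbents / (entrants_per_day * 365)


# Assumption: roughly 15,000 incumbent operating entities.
INCUMBENTS = 15_000

# Low-period global entry rate quoted in the talk: 2 to 3 per day.
for rate in (2.0, 3.0):
    print(f"{rate:.0f} per day -> {years_to_parity(INCUMBENTS, rate):.1f} years")
```

Under these assumptions the range comes out at roughly 14 to 21 years, which is broadly in line with the figure quoted; a busier industry (a higher daily rate) would shorten it.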

So -- but of course, for that second ring and subsequent rings to be facing a different, better future than the first, the aspiring new entrants in the first ring, which is the first year after run-out, need to find some way to achieve connectivity with the rest of the built Internet. So let's talk first about the operators who are fortunate enough to secure their resources before the run-out. Initially they have a fairly low level of risk, and in fact I would say that if you look around at the distribution of activity in terms of v6 adoption now, you would say this is probably the consensus view: most operators are cognisant of the need and the risk and opportunity but don't feel it as a substantial risk or substantial opportunity right now. They recognise it may be, which means it may not be, in the future.

Right. And again, the risk part of the profile only increases to the extent that subsequent operators who are not native v4 based, that is to say future post-v4 operators, are actually able to emerge, are successful, and are able to start constituting an alternative population of connectivity opportunities. That is the only thing that causes the risk part of the equation to go up over time, so it's contingent on that assumption. Now, I guess the situation for new entrants is pretty straightforward. Basically, there is no possibility of emerging at all unless they are able to secure some kind of minimum v4, and, you know, many will not even be able to interconnect or exchange traffic with each other, so there is not much chance of small pockets of v6 developing, except extremely localised ones, at least initially.

So now, this is the optional feature of this post-v4 environment that we are counting on at the moment to -- is there -- there is nothing I think below that line, I guess it's not too bad. The optional feature we are counting on to make this growth dynamic possible is the possibility that, while simultaneously doing something to accommodate a potentially v6-based future, incumbent v4 operators will be able to part with some of their resources, and part with them not just to anyone but to entrants, to facilitate their entry into the marketplace, for some kind of compensation.

Right. And again, just to summarise: the rate of market entry for post-v4 based operators will be pretty much completely determined by the rate and the level of participation in that kind of dynamic by incumbent operators. Basically, the 100 percent risk of failure for new entrants will continue to be the norm unless incumbents begin to make themselves transparently accessible to native v6-based networks and/or to transfer some resources to the new entrants.

So again, this actually makes the situation quite familiar. You have a very large population of low-risk institutions and entities and a smaller population of high-risk entities and, in some sense, basically the two can't interact; the degree to which the smaller, high-risk population can emerge is determined by the level of participation by the low-risk group. That basically makes this exactly like an insurance pool, the kind which is managed by insurance companies that we all have some relationship with. The one thing that distinguishes this kind of economic dynamic from a standard insurance pool is that one of the parties is actually the insurance provider as well. So that does create some serious complications.

Now, in order to provide a means to systematically investigate some of these complications, I have developed a simulation with a mathematical kind of modelling platform, and it is something which is user tuneable: you can tune all kinds of different parameters. In fact, I am going to be posting the source code for this to the conference website after this, and there is a free Mathematica reader, like Acrobat Reader, a programme you can download for free, which will enable you to at least toggle the parameters that I have made variable there on the left side. So basically, you can evaluate for yourselves: if you think that the level of actual risk is this amount and the level of perceived risk is that amount, you cross the two and you can judge for yourself how those two interrelate. And again, the key feature of the simulation is that it provides an equilibrium price for participants of both kinds in the common insurance pool, and that price would be the equivalent of one unit of insurance for one new entrant. So it's basically equivalent to the unnumbering and renumbering cost for one average, say, /24 which is held by an incumbent v4 holder. So that is the pricing model, and everything that I will be showing is expressed in terms of multiples of that one price. Of course, for a large operator, unnumbering and renumbering one /24 is not going to insure your whole business against a future which is primarily IPv6-based, but assuming you have to do that for some number of multiples of /24s, the equation is still the same: as an insurance provider you will be able to get whatever that multiple is, or that is what the risk spread would support; you multiply it by the number of /24 equivalents that you would need to renumber for your own business arrangement.

So, in the simulation, basically the X axis is actual risk and the Y axis is perceived risk. As you adjust the parameters, members of the population, which you also define, will enter this matrix either as insurance takers, and they will appear dark, or as people that decline to participate; those won't be factored directly into the pricing, but they change the availability of the overall insurance for both parties. And again, this is very unlike a normal insurance pool, because you have one population where the risk is very, very low and another, separate population where the risks are quite different, in fact radically different, on the exact opposite side.

So here are a couple of examples. Assume for a moment that there are 1,000 institutions that constitute the built, IPv4-based Internet today and, say in the first month or week or so, 50 aspiring new entrants that are seeking to join the Internet at that time. At that point, again, there is no real risk and no real non-monetary incentive for incumbents either to make themselves v6 accessible or to release v4 to the aspiring entrants. So basically, in the absence of anything else, we are looking at pure monetary compensation. Based on the spread of risks and opportunities at that level, for the equivalent of one minimum-size routable prefix, whatever the cost is to unnumber and renumber that, the price that the risk spread would support for a new entrant would be 975 times that quantity. Right. That is just for the very first few aspiring entrants.

So imagine going forward: we have arrived at a point where native v6 operators now represent 10 percent of the total. Again, I didn't adjust the population parameters up too much; it just slows the simulation down without changing it all that much, but you do have that option if it's something you want to pursue when you are using this independently. And again, tune up the incremental marginal risk of a connectivity problem for incumbents, and the incremental marginal opportunity of securing new transit customers; so basically, adjusting those upward, roughly in tandem with the change in population. At the 10 percent level, the equilibrium price would still be 31 times the notional average cost of renumbering a routable prefix. So it's reflecting the fact that v6 entrants at that point still face a near absolute requirement for v4 to do anything they might want to do.

OK. How about 20 percent? At the 20 percent mark there is still an over 10 times differential that the risk spread would support. Now, again, there are lots of different factors you might think about that could affect the actual risk and perceived risk, and if you want to take the time to download the source code and get the Mathematica player, you can tweak them yourself. This is basically a standard, transparent insurance model, and the only things which might be controversial are the parameters, which you can tweak yourself.
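To make the "multiples of one unit" pricing convention concrete, here is a minimal sketch that turns the multiples quoted in the talk into prices. It does not reproduce the simulation or its equilibrium calculation; it only applies the stated convention that a price is the quoted multiple, times the cost of unnumbering and renumbering one minimum routable prefix, times the number of such prefixes. The function name `supported_price` and the unit cost of 1,000 are hypothetical, chosen purely for illustration.

```python
# Multiples quoted in the talk, keyed by scenario.
# The 20 percent figure was given only as "over 10 times",
# so 10 is used here as a lower bound.
QUOTED_MULTIPLES = {
    "first entrants (50 entrants, 1,000 incumbents)": 975,
    "native v6 at 10 percent of the total": 31,
    "native v6 at 20 percent of the total (lower bound)": 10,
}


def supported_price(multiple: float, unit_cost: float, prefixes: int = 1) -> float:
    """Price the risk spread would support, in the same currency as unit_cost.

    unit_cost is the cost of unnumbering and renumbering one minimum
    routable prefix; prefixes is how many such units are involved.
    """
    return multiple * unit_cost * prefixes


# Hypothetical unit cost, for illustration only.
UNIT_COST = 1_000.0

for scenario, multiple in QUOTED_MULTIPLES.items():
    print(f"{scenario}: {supported_price(multiple, UNIT_COST):,.0f}")
```

The point of normalising to one unit is that the absolute renumbering cost drops out: whatever an operator believes that cost to be, the quoted multiples scale with it.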

So, assuming we get to some future in which, say, 3,000 or 4,000 new entrants that are native v6 are in operation, they would still need v4 mediation to reach most of the rest of the Internet, and in fact at that level the spread of risks and opportunities would support pricing which is, you know, more than 10 times the actual cost of unnumbering and renumbering. That is assuming that every single IPv4 address which is liberated is going directly to new entrants only, not to other incumbent v4-based holders. So, of course, that is not very realistic in the current environment, perhaps.

Anyway, just to summarise to this point: the real risks and opportunities, the measurable ones, don't really become relevant to the kind of rational incentives to participate in this transition until way, way, way after the point at which we tend to talk about them, people like those in the talks yesterday, and at the Google conference last month. And again, I think that people have been voting with their feet, on average roughly in line with this. Based purely on the distribution of risks and opportunities, I think it would be fair to say that if v4 transfers are actually priced at the levels that would be supported by that distribution of risk across the group, then most aspiring new entrants would in fact not emerge; there would not be supportable pricing which would enable a new company to enter the marketplace, so that would be a problem. And then of course, the more likely challenge that we face is that some non-trivial portion of the v4 which is liberated, and which is rendered available in whatever market, will actually be sought out also by other v4-based operators, which would further reduce the supply that would be available to support this stream of aspiring new entrants, and would raise the price further.

So again, just to show how the simulation platform can be used to evaluate some of these assumptions: assume that the incumbent v4 operators only sell v4, and still only sell it to new entrants, but decline to actually make themselves transparently accessible to native v6 traffic and services and operators. Then in effect you are making entry possible but not reducing the actual demand, the bottleneck requirement, for v4. So in some sense the real risk and perceived risk would not go down, and even at, maybe, 20 percent, the actual distribution of connectivity opportunities and risk would still support super crazy pricing.

So, that brings us to the original name of the talk, which is adverse selection, a common concept in economics where, because of asymmetries in information, the outcome which you would prefer not to happen is produced by market interactions. And I think that, given the fact that we have basically created an insurance pool kind of dynamic, this is a real risk we need to be thoughtful about.

There have been a bunch of other people that have written about IPv4 transfer markets. None of them have considered the impact of asymmetries in the information available to the two different parties that need to be involved, so I think in fact they are really not very relevant to the issue at all.

And again, demand, you know, the secular rate of demand for new entry is something which goes up and down but doesn't disappear, even in the blackest of times that we have experienced so far. So that is the kind of implied underlying demand for aspiring new institutions to enter, you know, this industry, the industry -- however you want to call it -- that requires that you are able to enjoy connectivity with the other incumbent operators and also to provide opportunity to customers. If the dynamics of whatever kind of market-based transfer mechanism are unable to keep pace with this sort of underlying level of demand, well, it's probably worth thinking about what is going to happen; what are those other people going to do? Probably not, you know -- maybe some of them will give up and go home but, you know, there are only so many hundreds or thousands of frustrated new entrants that will get turned away before they might start thinking about other options.

So, as I said, I am going to be uploading the source code to the conference site; the Mathematica reader is free and can be operated on any platform. Once you have a chance to look at it, I would be grateful if you could give me some feedback. It's possible to optimise this kind of analysis, this kind of tool, quite a lot to make it more useful if, in fact, there is any demand and you think it might be useful to the community. You know, a full model that would reflect all the ways that incumbent v4 operators and aspiring new entrants would interact is not impossible, if it's something that would be of value. And I guess I am done, unless there are any questions?

CHAIR: Thank you very much. Are there any questions? Just say who you are and where you are from when you are speaking.

Randy Bush: Randy Bush, from outer space. What I did understand from this was that you are concerned about new entrants, and you have a large and complex model about it, and I am just a stupid hacker. So Philip Smith, who is also here, probably, yes, and I put forth a proposal in the APNIC region which seems to have been accepted, and Philip has it in front of this region, and I suggest you take a look at it. It is about how the last IPv4 /8 should be used: it essentially takes it and chops it up into the minimum allocation and says, one per customer. Considering the size of the /8 and of the allocation, you are talking about something that will last quite a number of years and allow new entrants to have something in front of their IPv6 site or their RFC 1918 site, to join both the v4 and the v6 Internet. And yes, it's a disgusting little hacker solution, but this is an operators' group, and I wonder if you could comment on that, regarding this complex model and downloading Mathematica and so on and so forth.

TOM VEST: Sure. On the complexity of the model, I am sorry if it was complicated to describe. The model itself is dumb simple. Four parameters: it's population, the actual risks of the two groups and the perceived risks, and that is it. And it's just an equilibrium model over those things; I apologise that it's somewhat complicated to describe. As to your proposal, you may recall that I am acquainted with it, and I think it is a useful contribution to addressing this particular part of the transition challenge. If I remember correctly, that pool is available on a one-time-only basis, not only to aspiring new entrants but also to existing operators; is that still correct?

Randy Bush: That's correct. If you take the existing operators, it takes about one or two percent of that pool away, so it's not worth arguing about who is an operator, who is existing and who is new. As I said, I am just a humble hacker; it kind of works.

SPEAKER: I appreciate you repeating that. That is from your perspective. I guess the equivalent proposal which has been accepted in North America is a /10, and I think --

Randy Bush: I am not accountable.

SPEAKER: I will have to sit down and do the back of the napkin thing. Maybe at the APNIC level, where there is a higher level of concentration of resources and a smaller number of incumbents, it might only be one or two percent. If you were to apply the same ratio and the same dynamics in, particularly, the RIPE region, I think it would be much larger than one or two percent.

Randy Bush: I am sure somebody could come out with a nice Mathematica model to tell you how much to give to it.

SPEAKER: A times B is not all that complicated. I don't have the values right now.
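The back-of-the-napkin calculation discussed in this exchange can be sketched as below. It is an illustration only: the minimum allocation size is not stated in the session, so the /22 used here is an assumption, `10.0.0.0/8` is merely a stand-in for whichever /8 ends up being the last one, and the incumbent counts are hypothetical inputs rather than measured values. The helper names `allocations_in_block` and `incumbent_share` are likewise made up for this example.

```python
import ipaddress


def allocations_in_block(block: str, min_prefix: int) -> int:
    """Number of minimum-size allocations that fit in an address block."""
    net = ipaddress.ip_network(block)
    if min_prefix < net.prefixlen:
        raise ValueError("allocation is larger than the block")
    return 2 ** (min_prefix - net.prefixlen)


def incumbent_share(incumbents: int, total_allocations: int) -> float:
    """Fraction of the pool consumed if each incumbent takes one allocation."""
    return min(incumbents, total_allocations) / total_allocations


# Assumptions: a stand-in /8 and a /22 minimum allocation.
total = allocations_in_block("10.0.0.0/8", 22)
print(f"allocations available: {total}")

# Hypothetical incumbent counts for a small and a large region.
for incumbents in (250, 5_000):
    share = incumbent_share(incumbents, total)
    print(f"{incumbents} incumbents -> {share:.1%} of the pool")
```

With these inputs a /8 yields 16,384 allocations, so a few hundred existing operators taking one each would use between one and two percent of the pool, while a region with several thousand existing operators would use a much larger share. That is the sensitivity being debated above.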

Malcolm: Thanks. I am much more humble and certainly a lot more stupid than Randy, so I would have thought it would be helpful if you were a bit clearer about what it is that you were proposing to model, because, for me, you seemed to jump in a third of the way into the presentation, and I wasn't able to grasp what you were talking about. At some point you said we have got the problem of potentially topologically isolated pools of IPv6 and the goal is to get them connected. And then you spoke at some point in your talk about renumbering costs for me as an incumbent operator, and you also talked about release, and Randy was just bringing up there, ahead of me, another strategy, so I wanted to know what the strategies are for solving that topological thing. If I have got a big pool of IPv4 space, there is the potential business of providing a gateway or translation. Is your pricing the price I would charge on top of normal transit for that?

SPEAKER: First of all -- somewhere there is a footnote -- basically I am not talking about the ability to support customers, right? I would say almost everyone in this room enjoys a level of autonomy: basically you have a relationship with the RIR, you are not customers of somebody else for your primary address space. There is a fundamental categorical distinction between autonomous and not. I am talking about the minimum prerequisites to be autonomous, and I said the unit for that would be the equivalent of the sort of thing which is reflected in Randy's proposal, which is the minimum conventionally routable prefix. So that is the unit, one, and the input cost for releasing that is the cost of unnumbering that quantity of IPv4, which is in reserve or in production with an incumbent, and making those resources reachable by some other means, whether it's NAT, 1918 space or IPv6. Then basically you have that resource with the cost normalised to one, and based on the distribution of risks and opportunity in the marketplace, the question is: if you could give that thing away for a price of one and break even, well, this is a market, with market incentives, so what is the actual price that the relative risk spread supports for new entrants? What should they be willing to pay if they had an unlimited amount of money? And the numbers that I quoted were multiples of that number.

Malcolm: That is very helpful. What you are talking about is the cost of becoming autonomous, and not of becoming connected?

SPEAKER: That's right. I am talking about the cost of the industry which is the routing services industry remaining open. I am not talking about the ability for incumbent routing providers to attach new customers. Thank you.

Lorenzo: One question. It appears that there is another part of the story. This is looking at new entrants only. Have you thought about modelling the risk of, let's say, systemic failure for the existing incumbents if the transition does not occur? Because this is what seems to be spurring some of the existing incumbents to deploy IPv6. A couple of ideas: number one is, when do they hit the scaling boundary, and number two, is it possible to envisage a situation where somebody gains a competitive advantage by deploying IPv6 and somebody else does not? For example, by lowering operational costs? I am wondering how you would model that.

SPEAKER: It is possible to adjust this to do that. In order to make the risks and opportunities less abstract and more a reflection of the topological variety of the Internet, you develop a notion of how many locations there are, relate the size of different providers spread across the world's locations, and then think about what the reachability is for native IPv6-based operators from those locations, and then the probability of new entrants being close to one of those. The answer to your question is, just with the parameters that are in there now: say there are 15,000 actual management entities behind the 31,000 ASNs that are visible, that are actually in production out in the wild, and of those, maybe 10 percent are v6, and you can get a relative sense of their size by just looking at what they routed in v4 before. From that you can get a very rough probability, given your customers and your knowledge of their traffic profile and everything else, of the likelihood that you are going to hit a problem because you can't reach them. You can also look at the rate of new entry, the actual rate of people entering the industry, you know, after that last day when v4 is available from the RIRs, and use that to judge the point at which it would actually be a positive opportunity, an incentive that would justify the cost of reaching out to them. I can't tell you how many days away that is today, but this is the kind of model which would make it possible to roughly estimate.

CHAIR: Is there a last chance or is there a very quick last question?

AUDIENCE: Every time I listen to you, Tom, I get more questions than answers. I guess that is the point. One of the things that, is there a parameter in your model that basically describes external pressures that step in when a market doesn't function?

TOM VEST: There isn't at the moment. When you think about this, there is one important distinction, on first consideration, between a normal insurance model and this situation. In pricing insurance, of course, an insurance company's risk is that if they price too low, or they provide too much insurance to high-risk individuals, then the odds are that over time they are going to have to pay out so much money that they go broke. So, as an individual insurer, that is the structural risk that they face. I would say in some sense there is an equivalent for our own industry. Basically, when you tune these up with a different set of parameters, or if you say that a new entrant has no more than 10 or 20 or 50 times the equivalent cost of unnumbering and renumbering a /24, and anybody who has to pay a higher price than that will not enter the market, then basically there will be a back pressure of frustrated would-have-been entrants. As the rate of growth of new entrants at the edges of the industry declines, I would say that the industry itself faces a kind of systemic risk which is not unlike that facing the insurer that prices the wrong way, which is to say we will go out of business and somebody else will take over. Now, again, this is the kind of model that would make it possible, with some of the background data, for people to estimate what the frustration level is, and the cumulative frustration over time, as the actual rate of new entry in the industry starts to wind down relative to everything we know about every time period in the past, you know, from the peaks to the lowest levels.

If we start seeing, any time after the unallocated pool is gone, that, say, between Randy's proposals and the transfers which have been recorded, the sum of those things does not actually match at least the minimum level that you would have expected, given the long industry trend, then we probably should anticipate some kind of trouble.

AUDIENCE: Credit where due: Philip Smith's proposals.

AUDIENCE: The short of it, if we don't allow for a network that can change and that can grow, we die?

SPEAKER: If the industry becomes closed, OK, so a lot of thinking, I think, in the industry tends to concentrate around --

CHAIR: We are running quite a bit over.

TOM VEST: A lot of thinking focuses kind of on supporting future customers, right. We might solve the problem of supporting future customers by any number of means, if we don't also solve the openness of the industry to new entrants that is a separate and independent and more complicated problem, but if we don't solve it somebody else will, for us.

(Applause)

CHAIR: The second speaker this morning is Bob Bruen from KnuJOn, and he will be speaking about his own solution and the product and the project and all the rest that is included in all of that.

BOB BRUEN: Can you hear me OK? I am an American; if I talk too quickly or say something insane, just wave. I am from KnuJOn; it's "no junk" spelled backwards. When we put it out there, a lot of people thought there was something interesting there; there isn't. All this is, is my son and myself. We are not funded by anybody, and we have no agenda except that we try to stop SPAM. We decided that, in spite of the wonderful filters they make, it kept increasing. We look at it as a gateway into crime, so we started looking at other ways to handle this. We have been very successful, OK. We have been using what is called policy enforcement and sunshine, which means we publish our results, and it has been called the name and shame game. We started using Whois data accuracy because there is a rule in the registrar accreditation agreement between ICANN and the registrars, and this is North American centric, that if you have an inaccurate Whois record and I tell you, you have to fix it, or they take away the domain name. My son started this about five years ago with about ten people; we shut down a whole bunch of spammer sites and they went away and we got no more. Spammers have evolved, and we get better; they have increased my coding skills significantly. We believe the registrars are the key. Take away the domain name, and they have got nothing to work with.

And one thing I will say, and I say it every time: except for us, it's all about the money. ICANN, the registrars, you know, the resellers, ISPs and the criminals, they are all the same; all they care about is the money. What that means is, if they have a decision to make, they will make it first on the money, then on whether they are going to do something good or bad.

We have looked at policy. I don't know how many of you have written policy or thought about it, but really good policy can make a difference in how things run. When you hit a novel situation and you don't know what to do, a good policy will guide you into making a decent decision. Bad policy has lots of unintended consequences, and those can go on forever and they are really awful. I don't know what to do with bad policy you can't change. It can ruin your company and do all kinds of nasty things.

Now, this is how I see the world, and there is a reason why I put this up here. Up at the top is the US government; the joint project agreement is supposed to end in September 2009 and it goes to ICANN. As a sidelight, I don't believe it's going to terminate: Jay Rockefeller put a bill in place, and they are going to create a commission which would tell the president whether they should terminate or modify their contract, so I don't think it's going to happen. That is my personal opinion. From there is IANA, the NROs, the top-level domains; the ccTLDs are a problem, as they are not under the same contract. There are registries like .com, .net, etc. There are only two of us. My son is full of computers and 19 [unclear], but there are only two of us to do this. And there are the resellers, the registrars, the registrants and various hosting services. Now, from my perspective, the criminals, spammers, abusers, phishers, whatever you want to call them, find pathways in every single one of these. It's not a simple separate ecosystem where they do criminal things; they find pathways in all of the stuff and they do exactly the same thing as normal people do.

Now, the Whois data accuracy problem has got a serious history. And I know in Europe the privacy rules are different than they are in the US. All we care about are the commercial entities; I don't care about personal information. And all the pressure that we have put on has been on commercial entities, because they are selling SPAM services or doing criminal enterprises, and it is the exchange of money that makes them commercial; illegal, but still commercial.

You have been hearing, back five years ago, six years ago, promises of change by the GSA; it never happened. Nothing has changed. It's still controversial within ICANN. People say I don't want to see anything about this, I don't want people to know anything. They stopped it initially so SPAM wouldn't happen. Right now, the question is whether or not law enforcement or anybody else can get access and do something with it. If you take too many shots at getting Whois data they will slow you down. We have found ways around it but it still slows us down, at between 10,000 and 15,000 lookups a day.

Now, they do have to enforce this. What we do is we have clients: you can pay us 27 dollars a year or once, or you can sign up for free; if you register either way, you will get daily reports from us and you can check on the web. Or you can send stuff anonymously; we don't really care, we process it all the same way. So we get the stuff in from something like 37 countries where people send it to us; a couple of thousand people do it. Some people send one or two a week, or 10,000 a day; it varies considerably. We get a lot of zip files and P S D files etc. And we verify the Whois data for each site. Now, this has become very complicated, so we have taken the simple way out. If I send mail to the tech or admin contact and it bounces, your Whois data is inaccurate, and we will file a complaint with ICANN, who will check it and send it to the registrar; they will check it, and if you don't fix it, as a registrar they will take away the domain name. It started out with about a month's time-lag, it has gone to 45 days, and we started crashing their system after a while because we were sending too many of them. Look at that as policy enforcement. I would rather see policy written as a policy statement, but policy is written through the contract. So it's done really well. What we did: we republished the stuff, aggregated all the registrars where the SPAM transaction sites are, and we write up who has got the worst set and the best set. Again, I will repeat: we are not interested in personal information. We publish daily reports. I have a database going back three years of millions of SPAM mails and where they came from; I have written some stuff to go and search through it. We have image SPAM; we pull that out and send it to researchers at university of /A*BG /KA, and they are building a corpus of SPAM from SPAM images over the years, and we update on all the registrars.

We have tried to work with ICANN; it's not that easy. They have changed over the past few years. Initially we went out there to California and we talked to them for a couple of hours; they had just hired Stacy Burnette, the single -- we will help you do this, and they paid no attention to us, and after a while they started paying attention; Garth kept getting in contact with them. Now it's changed dramatically: they have nine staff members, the president quit in Mexico, and they have got a lot of people who want to see this done right. My son actually co-chaired an e-crime summit in Mexico City. In Cairo a small group of people were interested in crime; now a lot are. I know the ARIN people and the RIPE people have started to notice this is an important topic, and it really is.

Another thing is that over the last several months there were a lot of recommendations put forward to change the RAA, and this meant if you want to be a registrar, you had to sign this agreement or stop being a registrar. We put in about 12 different changes, and there are a couple more we want to put through, but we really believe the infrastructure that was on that picture has got to be strong, and all of the technical stuff has to be strong or it's not going to work. Criminals don't care what you do. If it's not technically strong, they get good code writers. If you have bureaucratic procedures that are not good, they will find a way around them. We contribute significantly to the WDPRS; this is the Whois Data Problem Report System. When we started putting in 3,000 a day, we did about 50,000 in total and the system would crash. They used to leave the whole databases. Since that time Garth has worked with them to get a new system so they can take bulk stuff; it's password protected, uses XML, and we can load 10,000 a day, which we are going to start doing, and if you have one-time-only stuff you can put it in a file, complain about this, fill the form out and let it go. So there will be two systems now, and we are very happy to see that one go forth.

And when I was in Cairo -- most of you should know who Steve Crocker is. By the way, my tie has significance, if you are old enough. If you can guess what it is, you get a bonus prize of ten points.

AUDIENCE: We don't wear ties here.

SPEAKER: I don't either, I just do it because I am representing America. Normally it's a T-shirt. But this is a special tie. People about new generations, and some of the newcomers meeting, well that is much older generation.

While I was there, Steve Crocker, who has been around a long time, was on the board of directors. It's just myself and my son doing this. He said, you guys are casting a really big shadow. We have pretty much terrified the entire registrar industry, including the Chinese. They sent messages through a guy in New Zealand that they were going to fire their entire abuse staff and it would be our fault. At that time they invited us to become an ALS; it's a terrible acronym. It means an at large -- we are allowed to help out, we get some money to go to the ICANN meetings, and Garth is a co-chair and did everything else.

What we did: we published last May the top ten registrars, who the worst registrars were, and we did this based on our criteria, which I will get into, and then we published another list in February. As you can see, Xin Net is still up there; the other eight people were not on there before. How did that happen? OK. EstDomains: they had a felon running it in Estonia; a very long, detailed, sordid history of his arrest records was published, translated from Estonian, and you can look it up in the Washington Post. They were deaccredited. They were working using privacy protection and they stopped giving them that. I think what has happened is they have moved over to Russia and are using their privacy protection. PDR is cooperating. KnuJOn from last year: a really big dust-up, them yelling at us and us yelling at them. ..behaving themselves. They ended up getting the domains that EstDomains gave up. One of the best things ICANN did was set up the escrow system. If any of you know of the RegisterFly problem: ICANN was not prepared when they deaccredited them, and that caused problems they are still suffering from today. They were not ready to do it. Cosmos and jokers all got breach notices and said I will behave myself, and the other ones didn't. And they were recently deaccredited. Garth did a really simple thing: he sent them a letter through the US postal mail and it got returned, no such person at this address. That means the Whois data record was wrong. They did not live where they said they lived, in Texas. So we published that on KnuJOn dotcom and eventually they were deaccredited. A couple of other sites: one of the other things we are trying to do is, if you have a registrar where you got your site that looks like they might be deaccredited, you should go to a registrar that is a good one. It's a business decision. Why should you risk your business? And that has been happening, the market loss. And Dynamic Dolphin: millions of dollars worth of lawsuits they haven't paid.
On top of that, we had background involvement in the Intercage stuff several months ago; HostExploit wrote the report, and the ISP stopped doing business with them. They never fully recovered. There was a range of about 120 billion spams going out last year. This year the highest I have seen is 80 billion; it's not gone but it has dropped down significantly. The McColo report: Garth was putting that together; Brian Krebs called up Hurricane Electric and asked, do they work with you guys, are they customers? They said wait a minute, hung up, checked it out and took them down. And then they went somewhere else and moved to Sweden, and they came up and went down again, and that caused a lot of problems for them. The Ukrainian takedown we had nothing to do with. SPAM dropped considerably. One of the things we do with the top ten is we have shown that the problem is actually tractable; it's not like you have got billions of Spam from everywhere in the world. We have found that if you deleted 10 to 20 registrars out of the 900 in the world, almost all the Spam would disappear. Everyone sends some. But some registrars send lots more than others, and if you take away their domains and their ability to do that, Spam will drop by huge numbers. And it's not that hard to find. We have found them; we will tell you who they are.

We try to look at how much Spam they send, how many different places send, and how much each one of those sends, and we try to do the stuff so it's repeatable, so you can look at it; it's all public. The subjective part: if a registrar is cooperating with us we are not going to call them really bad; we will give them a few more points, because they are trying to work it through. We have found that most registrars will not admit that they do not know what to do. Some don't care, but most of them have no clue how to handle the abuse stuff. And sending mail out automatically is to see if it bounces back or not. If someone has to investigate it, you saw how long the police take, a very long time, five day cycles.

We believe the policy works. It's been a long struggle against Spam. The past year has been the best so far. We know it can be done; it's a tractable problem.

For the registrars, they are not happy with us at all. A few are our friends but most think we are evil people. They deny responsibility. We have had success with fake pharmacies; we shut down half a million sites. We had a direct Reese list from EstDomains to see if we knew who they are. We can cross-reference and see if they know who these things are coming from. We believe the registrars are an entrepreneurial group; they don't want anybody to tell them what to do. I don't like regulation particularly, but they should behave themselves. Attacks on registrars: there was one recently, and I forget the details of it, but DomainTheNet, when Israel invaded Gaza, were attacked by a Moroccan group. They had access to newspapers and corporations by breaking into the registrars. Network Solutions, they had a problem. I mean, Comcast had a problem. It's not a new problem: the Security and Stability Advisory Committee at ICANN put out a report in 2005 on problems, a whole list of registrars who are not following the rules, and they didn't do anything about it. I think they took out the slide that said lessons learned: None. But wholesale registrars: registrars are moving towards simply using resellers and not actually selling domain names. They are providing a channel for someone from the Soviet Union to send in bad data to get domain names; they don't care if they get caught in a week, it's up five days anyway. There are real reasons to have resellers; it's not a bad thing to be a reseller or have resellers. It is bad to not check the channel to make sure things are OK, and they are not doing that. I see lots of attacks on registrars. The Chinese registrars: if you do a Whois lookup on a Chinese site, the name of the registrar is in Chinese. For most of you English is a second language, and Whois data was supposed to be in English. So they are just hiding it. I don't read Chinese, but I can match the characters up and figure out, from a friend who does, what they are.
When other registrars start doing that it's going to be a problem. We want transparency and cooperation; we want the Whois done at registration. Right now there is a choice: they can register and check, or once a year they can send out mail saying is your data correct, and all the criminals will go yeah, yeah it is, and then not change it. So we want to change an "or" to an "and" in the data. On fake on-line pharmacies, we are gathering a huge amount of data. In the US at least, if you don't have a pharmacy licence and the user doesn't have a prescription, you can't sell drugs. I don't care if steroids are legal in India and you don't need a prescription; you can't sell it on US soil, and all of the spammers are using US servers: they are faster, the network is stronger, and they are not having any kind of problems. It's not in the US. So there will be pressure this year. But support of illegal activity is against the RAA, and we have used that to get them shut down. We are going to do more this coming year. We are in talks with the pharmaceutical industry and the national boards of pharmacy; in Europe, a friend or partner spent time last week talking to people about this kind of stuff, so we are trying to have cooperation between the US and Europe to handle these problems. There is a company on the Internet, LegitScript; starting with steroids is one area, we are going to go into all the illegal stuff, and we have already shut down hundreds of thousands of sites.

Beyond fake pharmacies: we are seeing anything to make money. Want a new mortgage, because there is a mortgage crisis; you know, we had a hurricane, you want to give charity money to us, we will take it, we don't care. Malware, all kinds. The criminal ecosystem is better organised; they happily sacrifice low-level workers. How we view this: law enforcement put handcuffs on them and worry about bureaucracy; we don't care about the details. I think most criminals look like everybody else. If you can tell me which person is a pickpocket, I will be happy. Unless I watch, you can't really tell. We are looking for ways for the infrastructure to be able to identify such people. And that is me and that is Garth.

CHAIR: Just to say, Bob will also be present for the anti-abuse working group session on Thursday afternoon, so there will be more time for discussion on this topic then, but if there are any quick questions now.

AUDIENCE: I am Ronald Perry, asking this question in a personal capacity. I am very interested in international jurisdiction and all sorts of problems you get trying to apply one country's law in another country. What would your view be on buying a pharmaceutical which is available legally over-the-counter in the United States but which requires a prescription in Europe? Would you allow a European to buy that over-the-counter by mail order?

SPEAKER: Yes I would, and at the moment we are focused on the US because of the jurisdictional problem. We even have that between States; it's not just between countries. Something is legal in some places and not in others. In the US it is a very simple thing: you need to have a pharmacy licence, and I believe that is true across the world, and you need a prescription for certain drugs, and they are listed out and it's all public knowledge. So if you are selling it in the US, then it's illegal.

AUDIENCE: I am not quite sure whether that answered my question. My question was: say there is a drug which is legal to buy over-the-counter, you buy it in Walmart in the United States, but for some reason you need a pharmaceutical licence to sell it in the Netherlands. From the Netherlands, would you allow me to buy that pharmaceutical from the US and have it shipped to the Netherlands? Is that a possible thing to do for you or is that one of the illegal things to do?

SPEAKER: Personally, I think it is illegal to do that because each country has its own laws. And to go around it, pretend it's OK somewhere else or ship not legal, you are still purchasing from the territory. You know, if you live in the Netherlands, you can't buy it somewhere else and have it sent to you. Most customs agents will stop you. If I decide to sell illegal drugs on the street corner, in about ten minutes I will probably get arrested, but on the Internet nobody seems to care.

AUDIENCE: Tim Denton from ARIN and my question is, well, the one thing I wanted to find out all the time listening to you tell of your good works is how do you get paid?

SPEAKER: We don't

AUDIENCE: You don't.

SPEAKER: No, there is no money involved. If you want to give us spare change we are open to it. My son works a full-time job and does this at night. I work part-time teaching, and I have been doing this about three years, and I take care of myself. There is no money involved; about twice a month somebody gives us 27 bucks.

AUDIENCE: I don't know where to start.

CHAIR: Could you start quickly.

AUDIENCE: If you look in Whois data for /TPHO*PL larks you will find address record there and that address record happens to be in Dutch.

SPEAKER: Well, the thing is I believe that there is a common language of English it should not be in Dutch unless there is English record to go with it. I don't know -- I know it doesn't work.

AUDIENCE: Can I -- /KR*UFP land 19 is Dutch, there is no English variation of that. It happens to be written in English script, or ASCII script -- I should be very careful what I say, but it happens to be Latin script, which we share, we can read that, but the -- the thing with all this is that you have -- I am trying to up-level this, because what I have heard you say was a very American-centric presentation, and I wonder what this room, which is the European and Middle East and far into Russia area, can take away from it? What is the lesson that you want to spread, the message that you want to spread?

SPEAKER: We have actually got a KnuJOn site which is under construction; we would have to have a European site where European rules apply. I still think spammers are using their native language not to help their native speakers but to hide what they are doing from the rest of the world, and I think that is not the same thing. I know Geoff Schiller, I have worked with him for 17 years; I am going to check that quote.

RANDY BUSH: You do that. I understand your desire to have it all in English which I support and agree not just because I am an American but because I am a measurement kind of person, Internet measurement and the actual people who measure Spam, indeed the majority of it does come from the United States so having it in English kind of makes some sense. The Spam originates from the --

SPEAKER: I am talking about the Whois record being in English.

RANDY BUSH: That is nice and -- there is a real world out here; people will be kind to you and speak English, but they do have their own language, and there is a billion or so Chinese that would like laughing.

CHAIR: Thanks very much, Bob. And our last speaker this morning is Alexander Seewald from Seewald Solutions who will be talking about the detection and identification of bot-nets. We might run a little bit into the coffee break.

ALEXANDER SEEWALD: So, I will be talking about the different ways spammers operate, one of the things normal RIPE addresses, BotNets, just taking over machines of international Internet users and using them for their purposes as DNS servers, as web space, anything you could think of. So that is what I am talking about.

So, we have built what we have called an early warning system for BotNets, which in two words to summarise is: essentially we have been revisiting DarkNet research. The traffic we analyse is called DarkNet, although there is a second meaning, I believe, so that is not the second meaning. It's just unused IP addresses which have not been used for some time, and it's completely passive, and it relies on BotNet propagation, so we have a lot of machines logging to our DarkNet, seeing whether there are vulnerable machines there, something behind, just pinging parts of it, opening specific ports, all of these things. And we have a very simple analysis which is very efficient, so we can analyse about 300,000 IPs. Currently we have 256: from the University of Vienna, which is also involved, we got 256 IP addresses, distributed to four different nets, and it was all in the same class C net, so yes.

So how do we do this? One of the things is that we have reference data about known Bots and BotNets, we have feature construction at several levels, and we do machine learning. We learn the association between certain traffic patterns or content and specific Bots, which works well for some and doesn't for others. We did some validation tests and some further work on this.

So, our idea is to kind of fuse all the information which we can. One of the things which we can definitely use is to analyse the single packets, which we get a lot of, and this is what we have used for the Spambot identification system which is available on the web, which you can install yourself and test. The second is analysis of network traffic: it's the access patterns of Spambots, not only taking the single packets, which don't have that much information, but also, if we have the same IP address accessing multiple IP addresses on DarkNets, which direction they are going, is it linear, random, which patterns are there. And the third one is of course analysis of traffic content. So this is essentially a conceptual overview, and for each level we also did one short demonstration work where we demonstrated what can be gleaned from this level. In level 2 you only analyse the network traffic, like the traffic patterns, so you don't care about the contents; it's only who connects to which IP, basically. The third level would also include some information about the content which is transferred. There are of course legal issues with all of this but we are only addressing the technical problems here.

The reference data we got was from Marshal via their TRACE system; we got IP, timestamp and Spambot type and matched these to our logs, and we got about 6 percent overlap. For the data collection, we built the system to work from a single packet, and we didn't have any packets outgoing from our DarkNet, so no SYN/ACK, so very little information. The features we used are whether it's ICMP, TCP or UDP traffic, and basically the destination port, not the source port, which is a problem because if a Spambot uses the same source port and uses this to access as many IP addresses as it wants to, it would taint this analysis; if you add the source port you get better results in case it goes to a second IP and will use the same source port, a correlation which doesn't tell you anything. And we also have basically two-byte grams of the payload for ICMP and UDP -- and TCP doesn't have a payload because we don't send any acknowledgment packet back. That is basically what we use for our first system.
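A minimal sketch of the single-packet features described above, assuming each packet has already been parsed into a protocol name, a destination port and a raw payload. The function names and the feature-key format are hypothetical; the talk does not say how the features were actually encoded.

```python
from collections import Counter


def byte_bigrams(payload: bytes) -> Counter:
    """Count overlapping two-byte sequences in a payload."""
    return Counter(payload[i:i + 2] for i in range(len(payload) - 1))


def packet_features(protocol: str, dst_port, payload: bytes = b"") -> dict:
    """Build a flat feature dict for one inbound DarkNet packet.

    protocol is "icmp", "tcp" or "udp"; dst_port is None for ICMP.
    The source port is deliberately left out, as in the talk.
    TCP payloads are ignored: a passive DarkNet never answers the SYN,
    so no TCP payload is ever received.
    """
    features = {
        "is_icmp": protocol == "icmp",
        "is_tcp": protocol == "tcp",
        "is_udp": protocol == "udp",
        "dst_port": dst_port,
    }
    if protocol in ("icmp", "udp"):
        for gram, count in byte_bigrams(payload).items():
            # Hypothetical key format: "bigram_" plus the two bytes in hex.
            features["bigram_" + gram.hex()] = count
    return features
```

Each dict could then be handed to any standard classifier; which learner was used is not stated in the talk.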

And one of the problems which you have, and which is not very well addressed right now, is the difference between static and dynamic IPs. Most of the Spambots are obviously on dynamic IPs, so this always changes. So for example, if you use a DNS black list which keeps the IP addresses for 24 hours, that is far too long. There have been some detailed analyses of how long they stay the same, and the rough value is about plus/minus one hour. Most stay the same, and we also did our own model for this and found we had about 4 percent static IPs and 95 percent dynamic IPs, as you would expect from BotNets, which are machines on dial-up accounts, on remote accounts, on mobile Internet accounts, stuff like that. So that means you can only match to the data within plus/minus one hour. Against all the reference data we got from Marshal, which actually only contained the IP addresses, that we saw within one month, we had an overlap of about 5 percent with this data.
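The plus/minus one hour matching can be sketched as follows. This is an illustration only: the tuple layouts and the function name are assumptions, and picking the closest reference entry when several fall inside the window is a choice made here, not something stated in the talk.

```python
from datetime import datetime, timedelta


def match_reference(darknet_hits, reference, window=timedelta(hours=1)):
    """Label DarkNet hits with a Spambot type from reference data.

    darknet_hits: iterable of (ip, timestamp)
    reference:    iterable of (ip, timestamp, bot_type)
    A hit is matched only if the same IP appears in the reference data
    within +/- window, since dynamic IPs change hands quickly.
    """
    by_ip = {}
    for ip, ref_ts, bot_type in reference:
        by_ip.setdefault(ip, []).append((ref_ts, bot_type))

    matches = []
    for ip, ts in darknet_hits:
        candidates = [
            (abs(ts - ref_ts), bot_type)
            for ref_ts, bot_type in by_ip.get(ip, [])
            if abs(ts - ref_ts) <= window
        ]
        if candidates:
            # Keep the reference entry closest in time.
            matches.append((ip, ts, min(candidates)[1]))
    return matches
```

Widening `window` to 24 hours would reproduce the problem the speaker describes: an address could be labelled with a bot that has since moved to another machine.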

What we could of course do is check the black list overlap. We used the Spamhaus XBL, which lists, let's say, IP addresses which have something running on them, and at the time we got a packet from a Spambot we had an overlap of only 2.7 percent with the -- so they might have been added later, we didn't check that; we took the first time we got the IP address from our DarkNet and asked, is it in the black list or not. This means our system is possibly useful as a DNS black list. Of course the coverage is a problem for us because we only have 256 IP addresses.
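The overlap figure is a simple set computation. A rough sketch, assuming the black list has been exported as a plain collection of IP strings (a real check against a DNS black list would be a DNS query per address at the time of first sight, which is not shown here):

```python
def overlap_percent(seen_ips, blacklist):
    """Percentage of distinct seen IPs that also appear in the blacklist."""
    seen = set(seen_ips)
    if not seen:
        return 0.0
    return 100.0 * len(seen & set(blacklist)) / len(seen)
```

A low value means most of the addresses seen by the DarkNet were not yet listed when first observed.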

So this is the result from the first system, which just analyses single packets, ICMP, UDP, TCP, that we get from the DarkNet, compared with the reference data, matched to the reference data, with cross-validation. What you see is precision and recall, from information retrieval research, so the higher this is the better, and of course the F-measure is a combination of precision and recall with a linear trade-off at 0.5, so this is what it is. And as you see, we have some Spambots, Srizbi and stuff you will probably know, but also unknown: they have a pattern, they think it's a specific Spambot behind this, but they don't have the executable. What Marshal does to get this is active traffic analysis, so it's kind of getting back to the ports, logging into the IRC control centre, checking which IP addresses, and things like that.
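For readers who want the definitions behind the result table, here is a small sketch of per-class precision, recall and F-measure. The exact weighting the speaker used is not clear from the transcript, so `beta` is left as a parameter; `beta=1.0` gives the usual harmonic mean and is only a default chosen here.

```python
def precision_recall_f(true_labels, predicted_labels, target, beta=1.0):
    """Precision, recall and F-measure for one Spambot class.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    F = (1 + beta^2) * P * R / (beta^2 * P + R).
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure
```

A class with low precision and high recall, as reported for Rustock, is one where most true Rustock packets are found but many other packets are also labelled Rustock.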

So for this, they didn't have a specific name at the project, that is why they called it unknown; it's not that -- they probably have a name but they didn't get the executable to check. OK. What you see is some of them work very well; up to here the first three work very well; Rustock has low precision and pretty high recall; the others don't work that well. Here you have high precision; some of them don't have a pattern for the kind of analysis which you can see. And some of them seem to have, which is actually quite a surprising result, because why would this be the case? If you randomised ICMP payloads and UDP you would get essentially very little information. So the spammers could probably make this unusable but it seems they have not done so, probably because it's too much work, or maybe there is some /PA*RBG implementations, we don't know. This used to be an animation where you see the activity of a BotNet but now you have to go to the web page to see this. It didn't survive the conversion to PDF, obviously. But basically what we did is just take the 24 hours from our systems and we just did an animation on our page with the different Spambots, and it turns out that even for a small DarkNet we get accesses from all over the world, from America, from the US, even from Australia, from New Zealand, everything is on there, so visible effort is background of the IP localisation. You can see the animation. This is our home page so it's all on there. It's updated once a day. So, now to level 2. What we have done up to now is really to take a look at single packets. So for every packet you take a look and say OK, what kind of Spambot is this, based on local information, which is extremely hard to do. What we did in the second part, which we thought would be simpler, is to check the multiple accesses from the same IP. When we have the same IP accessing different IP addresses in the DarkNet, what is the pattern?
We did access patterns, which basically is just the sequence of the accesses from a specific IP, and what you see here is a visualisation of the similarity. We used edit distance to compare all the access patterns and you see some structures; sure, you have the short patterns which are usually quite similar because this is of course and it spreads out so you have some very -- access patterns, so this is basically edit distance transformed to two dimensions with a multidimensional scaling technique. So the nearer the points are here, the more similar the access patterns are. Here is a cluster very near and some which are very different, so there is some pattern; it doesn't look completely random, I would say.
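The comparison of access patterns by edit distance can be illustrated with a short sketch. It assumes an access pattern is the ordered list of DarkNet addresses (here small integers) that one source IP touched, and it uses plain Levenshtein distance with unit costs; the talk does not say which edit-distance variant or costs were used.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if item_a == item_b else 1)
            current.append(min(previous[j] + 1,      # delete from a
                               current[j - 1] + 1,   # insert into a
                               substitution))
        previous = current
    return previous[-1]


def distance_matrix(patterns):
    """Pairwise edit distances, the input to a 2-D embedding or clustering."""
    return [[edit_distance(p, q) for q in patterns] for p in patterns]
```

For example, a linear scan `[1, 2, 3, 4]` and the same scan with one address skipped, `[1, 2, 4]`, are one edit apart, while a scan in the opposite direction is far from both. The resulting matrix is what a two-dimensional projection such as the one shown on the slide would be computed from.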

And so what we did based on this result: we thought OK, maybe there is some pattern there, maybe we can use this. We took each of the access patterns and tried to find out what Spambot type it is, so we matched it to the reference data from Marshal, and we checked whether all the occurrences of the access patterns were correlated with the same Spambot type. We didn't have that much data, so it's about a dozen different patterns here. And I am not really sure if you can see this properly, because it's a bit of a problem, but you see some clusters; here, dark is very similar access patterns, and the columns and the rows are essentially the same access patterns. So on the diagonal you have the same pattern, so the lowest possible distance, and you have some clusters where you have here a group of access patterns which are related, and here also you have a small cluster, three patterns, and the last cluster, three access patterns, which are extremely similar. We didn't find Spambot-specific patterns. For all of these three clusters which I have shown, which consist only of three access patterns, we found that three different Spambots have generated these extremely similar patterns, so we thought this may indicate some line of control, a spammer who uses different kinds of Spambots and gives them the same command to -- which is why they are so similar. So this is of course a preliminary result. You can't really see that much but we found it interesting. What we also did, as an example of level 3: we checked Spam content vs. type. You can't do this with a DarkNet. We had a Spam trap running on a different machine and we did a correlation of the contents of the Spam mail with the Spambot type we get from the reference data of Marshal, and you see a lot of Spambots don't send out Spam, because you don't see them, or at least they don't send out Spam that we can correlate. This is a cluster which corresponds to Spam mail and this is sent out by Rustock, so you see distinct mails from the content.
The more similar the mail, the shorter the distances here, and we use the content of the mail as 6-grams and a distance, and of course a visualisation down to two dimensions. So that is basically it. And you see actually some sub-patterns here which look quite distinctive, and of course you can have a large difference, and some other Spambot types where we only have very, very little data. So that was also very surprising because there is no compelling reason why different Spambots send out different Spam mails, because these things are parametrised; you can use every Spambot to send out every Spam mail if you programme it correctly. As to how these things are used in practice.
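A rough sketch of comparing mail bodies by 6-grams. Two things here are assumptions, because the transcript does not specify them: the grams are taken over characters rather than words, and the distance is Jaccard distance on the gram sets.

```python
def char_ngrams(text: str, n: int = 6) -> set:
    """Set of overlapping character n-grams of a mail body."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_distance(mail_a: str, mail_b: str, n: int = 6) -> float:
    """Jaccard distance between the n-gram sets: 0.0 identical, 1.0 disjoint."""
    grams_a = char_ngrams(mail_a, n)
    grams_b = char_ngrams(mail_b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return 1.0 - len(grams_a & grams_b) / len(union)
```

Mails built from the same template with small variations share most of their 6-grams and end up close together, which is the kind of clustering described for the Rustock mails.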

So that is basically it, my talk. We already have a journal paper on this which has been submitted to a large journal, and what we have of course is a server with the capacity: up to 300,000 DarkNet IPs we can analyse. A working prototype. We have lots of ideas how to proceed with this but we are lacking sufficient reference data, especially for BotNets; there is of course very little information, so we don't have spot information which tells us these IP addresses belong to these BotNets which use for example this system. This is what we are currently lacking; this is probably the major thing we are lacking. Our DarkNet is -- so that is another thing which we are lacking. We also lack -- that is actually not the large problem. If the first things can be addressed, of course there will be no problem to get funding for this. So thanks for your attention. I am open to questions.

CHAIR: Thank you very much. Are there any questions? Or is everyone simply focused on the coffee break at this point in time. No. OK. Thank you very much.

(Applause)

CHAIR: That is the end of our plenary session for this morning, so the next plenary session is in here in half an hour's time which, yes would be 11:00. So thank you all very much.

(Coffee break)