Topic Table Structure #10

harnen · 2021-04-03T09:17:57Z

One of the most fundamental questions that we have is the structure of the topic table. We can assume that a topic table will have a large, fixed size (~50k). We should:

fully utilize the available space
provide fairness across multiple registrant and topic
protect against all kinds of spamming attacks

The main tool that we have to our disposal is the waiting time that we return to registrars. By returning longer or shorter waiting times, we sort incoming requests and decide on the ads that are coming in first. Our initial discussion was to decide between fixed and non-fixed per-topic limit (e.g., we won't accept more than N registration for specific topic), but it seems that there are more things to take into account. Fixed limit doesn't satisfy 1) and 3) if an attacker uses multiple topics. With non-fixed, we have to be careful to provide 2) and avoid starvation of less popular topics.

Another issue to solve is assigning the waiting time in a way that prevents all the registrant, who are waiting to get in, from coming at the same time. If all the registrants do come at the same time (e.g., when a first slot frees up), we can admit only one of them, while the rest came for nothing generating a lot of overhead. On the other head, if we schedule only one node to come at the first available slot (and the rest one by one at later slots), the node might disappear leaving the table underutilized. It also seems difficult to fill this "hole in the queue". Either we contact all the waiting nodes and tell them to come one slot earlier (probably unfeasible and requires keeping a lot of state on the registrar) or we admit without some node that comes to ask for the ticket for the first time (may be exploited by free raiders that will constantly ping registrars hoping to get in without waiting). Finally, one last solution is to schedule more nodes at slots opening at the end of the queue. This will safely fill "the whole", but with a delay (we have to wait until all the initial nodes get in).

Currently, we are leaning towards a 2 step approach for assigning waiting times:

determine the order by which we want to admit requests based on the diversity score - the more different the request (in terms of IP, topic, ID) is from what we already have in our table, the higher the score (see details here)
assign slots freeing up based on the order from above

If our table is full, but 3 ads are about to expire in 7s, 12s, and 18s; we'll order all the waiting requests by the diversity score and tell the first one to come in 7s, the second one in 12s and the third one in 18s. This should work, but has one major problem - what if there is a queue of 100 waiting registrants with already assigned slots, but suddenly, there are some new nodes coming, with high diversity score, that should be admitted before everyone else? This is exactly the same problem as filling the whole in the paragraph above.

One final question is "when should we start giving tickets with waiting times instead of admitting registrants right away?". Currently, we give tickets only when the queue is full. However, this leads to a straightforward attack, where a malicious nodes spams requests and will be admitted every single time without waiting. The diversity score will kick in when the table is full and put all the legitimate nodes at the beginning of the queue (pushing the attacker at the end), but at this time, the damage has already been done. It will take long time for all the malicious ads to expire and the attacker can repeat his actions as soon as there's no queue of waiting registrants. We probably have to switch to giving ticket even if the table is not full. It causes an additional round-trip, but it's probably worth it.

The text was updated successfully, but these errors were encountered:

harnen · 2021-04-06T13:43:54Z

Some comments after discussing with Etienne:

If we start giving waiting times before the table gets full, it give us a buffer to work with
There's a chance that new requests getting at the beginning of the queue and disappearing registrants will even up
The waiting time should be based on table occupancy + request diversity score
We might use the initial walk used to join the DHT to bootstrap our table with some registrations (e.g., by asking each encountered node for its topics)
The cost of entry for topic announcements from the same IP (or announcements for the same topic) be mathematically designed to increase polynomially (i.e., not linearly) with the number of entries currently in the table for the same IP/topic
We need to make sure that we don't fully block users who share the same data centre on AWS (and thus IP poll) from registering

harnen · 2021-04-21T10:36:40Z

We should prevent nodes from keep coming back hoping for better waiting time. I.e., we should guarantee that the same node, coming back multiple times, won't be admitted faster than initially indicated.

In the current approach this can happen: if a node comes initially it receives an initial waiting time based on the current state of the table. After a few second multiple ads present in the table expire thus lowering average waiting time for newcomers. So the node can come again and again hoping for some ads expire.

We can take into account expiring ads. However, there might be also other registrant that could get admitted faster than node in question (thus inflating the waiting time of the original registrant).

Another approach is to make sure that the initial node, presenting itself as a new node and asking for a new waiting time, looses more on abandoning his original cumulative waiting time then he can get from other ads expiring.

harnen assigned harnen, srene and oascigil Apr 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Topic Table Structure #10

Topic Table Structure #10

harnen commented Apr 3, 2021 •

edited

Loading

harnen commented Apr 6, 2021

harnen commented Apr 21, 2021

Topic Table Structure #10

Topic Table Structure #10

Comments

harnen commented Apr 3, 2021 • edited Loading

harnen commented Apr 6, 2021

harnen commented Apr 21, 2021

harnen commented Apr 3, 2021 •

edited

Loading