BEGIN:VCALENDAR
VERSION:2.0
PRODID:researchseminars.org
CALSCALE:GREGORIAN
X-WR-CALNAME:researchseminars.org
BEGIN:VEVENT
SUMMARY:William Kuszmaul (MIT)
DTSTART:20211201T180000Z
DTEND:20211201T190000Z
DTSTAMP:20260423T021014Z
UID:TCSPlus/33
DESCRIPTION:Title: <a href="https://researchseminars.org/talk/TCSPlus/33/"
 >Linear Probing Revisited: Tombstones Mark the Demise of Primary Clusterin
 g</a>\nby William Kuszmaul (MIT) as part of TCS+\n\n\nAbstract\nThe linear
 -probing hash table is one of the oldest and most widely used data structu
 res in computer science. However\, linear probing also famously comes with
  a major drawback: as soon as the hash table reaches a high memory utiliza
 tion\, elements within the hash table begin to cluster together\, causing 
 insertions to become slow. This phenomenon\, now known as "primary cluster
 ing"\, was first captured by Donald Knuth in 1963\; at a load factor of $1
  - 1/x$\, the expected time per insertion becomes $\\Theta(x^2)$\, rather 
 than the more desirable $\\Theta(x)$.\n\nWe show that there is more to the
  story than the classic analysis would seem to suggest. It turns out that 
 small design decisions in how deletions are implemented have dramatic effe
 cts on the asymptotic performance of insertions. If these design decisions
  are made correctly\, then even a hash table that is continuously at a loa
 d factor $1 - \\Theta(1/x)$ can achieve average insertion time $\\tilde{O}
 (x)$. A key insight is that the tombstones left behind by deletions cause 
 a surprisingly strong "anti-clustering" effect\, and that when insertions 
 and deletions are one-for-one\, the anti-clustering effects of deletions a
 ctually overpower the clustering effects of insertions.\n\nWe also present
  a new variant of linear probing\, which we call "graveyard hashing"\, tha
 t completely eliminates primary clustering on any sequence of operations. 
 If\, when an operation is performed\, the current load factor is $1 - 1/x$
  for some $x$\, then the expected cost of the operation is $O(x)$. One cor
 ollary is that\, in the external-memory model with a data block size of $B
 $\, graveyard hashing offers the following remarkable guarantee: at any lo
 ad factor $1-1/x$ satisfying $x = o(B)$\, graveyard hashing achieves $1 + 
 o(1)$ expected block transfers per operation. Past external-memory hash ta
 bles have only been able to offer a $1+o(1)$ guarantee when the block size
  $B$ is at least $\\Omega(x^2)$.\n\nBased on joint work with Michael A. Be
 nder and Bradley C. Kuszmaul. To appear in FOCS 2021.\n
LOCATION:https://researchseminars.org/talk/TCSPlus/33/
END:VEVENT
END:VCALENDAR
