Presentation on Roaring bitmaps for the Go Montreal meetup (Go 10th anniversary).
Roaring bitmaps are a standard indexing data structure. They are
widely used in search and database engines. For example, Lucene, the
search engine powering Wikipedia, relies on Roaring. The Go library
roaring implements Roaring bitmaps in Go. It is used in production in
several popular systems such as InfluxDB, Pilosa and Bleve, and it is
part of the Awesome Go collection. After presenting the library, we
will cover some advanced Go topics such as the use of assembly
language, unsafe mappings, and so forth.
4. tests: x ∈ S?
intersections: S₂ ∩ S₁, unions: S₂ ∪ S₁, differences: S₂ ∖ S₁
Similarity (Jaccard/Tanimoto): ∣S₁ ∩ S₂∣ / ∣S₁ ∪ S₂∣
Iteration
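The operations listed above can be sketched on a plain, uncompressed bitset. This is a minimal illustration using only the standard library, not the roaring API: the `bitset` type and every function name below are hypothetical, and returning 0 for the Jaccard index of two empty sets is an assumed convention.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a hypothetical uncompressed bitmap: value x is stored as bit x%64 of word x/64.
type bitset []uint64

// newBitset returns a bitset able to hold values in [0, n).
func newBitset(n int) bitset { return make(bitset, (n+63)/64) }

func (b bitset) add(x uint)           { b[x/64] |= 1 << (x % 64) }
func (b bitset) contains(x uint) bool { return b[x/64]&(1<<(x%64)) != 0 }

// cardinality counts the set bits, one population count per word.
func (b bitset) cardinality() int {
	n := 0
	for _, w := range b {
		n += bits.OnesCount64(w)
	}
	return n
}

// combine applies op word by word; both inputs must have the same length.
func combine(a, b bitset, op func(x, y uint64) uint64) bitset {
	out := make(bitset, len(a))
	for i := range a {
		out[i] = op(a[i], b[i])
	}
	return out
}

func intersect(a, b bitset) bitset {
	return combine(a, b, func(x, y uint64) uint64 { return x & y })
}

func union(a, b bitset) bitset {
	return combine(a, b, func(x, y uint64) uint64 { return x | y })
}

func difference(a, b bitset) bitset {
	return combine(a, b, func(x, y uint64) uint64 { return x &^ y })
}

// jaccard returns |a ∩ b| / |a ∪ b|, or 0 when both sets are empty (assumed convention).
func jaccard(a, b bitset) float64 {
	u := union(a, b).cardinality()
	if u == 0 {
		return 0
	}
	return float64(intersect(a, b).cardinality()) / float64(u)
}

// values iterates over the set in increasing order.
func (b bitset) values() []uint {
	var out []uint
	for i, w := range b {
		for w != 0 {
			out = append(out, uint(i*64+bits.TrailingZeros64(w)))
			w &= w - 1 // clear the lowest set bit
		}
	}
	return out
}

func main() {
	s1, s2 := newBitset(128), newBitset(128)
	for _, x := range []uint{1, 2, 3, 100} {
		s1.add(x)
	}
	for _, x := range []uint{2, 3, 4} {
		s2.add(x)
	}
	fmt.Println(s1.contains(100), intersect(s1, s2).values(), jaccard(s1, s2))
}
```

Every operation here touches all words of both inputs, so the cost depends on the size of the universe and not on the number of stored values.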
Fast indexes with roaring - Daniel Lemire #gomtl-10 November 19th.
11. How fast is it?
index = x / 64 -> a shift
mask = 1 << ( x % 64) -> a shift
array[ index ] |= mask -> an OR with memory
One bit every ≈ 1.65 cycles because of superscalarity
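The three steps above can be written out in Go with the shift and mask made explicit. This is only a sketch: the `setBit` name and the `uint32` value type are assumptions made for illustration, and the code demonstrates the technique, not the measured throughput.

```go
package main

import "fmt"

// setBit marks x as present in words. For unsigned x, x/64 is the same as
// x>>6 and x%64 is the same as x&63, so no division instruction is needed.
func setBit(words []uint64, x uint32) {
	index := x >> 6               // x / 64
	mask := uint64(1) << (x & 63) // 1 << (x % 64)
	words[index] |= mask          // OR with memory
}

func main() {
	words := make([]uint64, 4) // room for values in [0, 256)
	for _, x := range []uint32{0, 63, 64, 200} {
		setBit(words, x)
	}
	fmt.Printf("%x\n", words)
}
```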
29. Casting a slice is tricky
import (
	"reflect"
	"runtime"
	"unsafe"
)

// byteSliceAsUint16Slice reinterprets slice as a []uint16 without copying.
func byteSliceAsUint16Slice(slice []byte) (result []uint16) { // result is the new slice holder
	if len(slice)%2 != 0 {
		panic("Slice size should be divisible by 2")
	}
	// reference: https://go101.org/article/unsafe.html

	// view both slice headers through reflect.SliceHeader
	bHeader := (*reflect.SliceHeader)(unsafe.Pointer(&slice))
	rHeader := (*reflect.SliceHeader)(unsafe.Pointer(&result))

	// point result at the same data; length and capacity are halved
	// because each uint16 covers two bytes
	rHeader.Data = bHeader.Data
	rHeader.Len = bHeader.Len / 2
	rHeader.Cap = bHeader.Cap / 2

	// KeepAlive is still crucial: without it the GC could free the data
	// while only the uintptr in the header refers to it.
	runtime.KeepAlive(&slice)

	return
}
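On Go 1.17 and later, the same reinterpretation can be sketched with `unsafe.Slice`, which avoids editing slice headers by hand (`reflect.SliceHeader` has been deprecated since Go 1.20). This is a hedged alternative, not the code the library uses: the name `bytesAsUint16s` is hypothetical, the capacity of the result equals its length instead of half the input capacity, and the caller is assumed to pass a buffer that is aligned for `uint16`.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesAsUint16s is a hypothetical variant of byteSliceAsUint16Slice built on
// unsafe.Slice (Go 1.17+). The result aliases the input: no bytes are copied.
func bytesAsUint16s(b []byte) []uint16 {
	if len(b)%2 != 0 {
		panic("slice size should be divisible by 2")
	}
	if len(b) == 0 {
		return nil
	}
	// Assumption: &b[0] is 2-byte aligned; the caller must guarantee this.
	return unsafe.Slice((*uint16)(unsafe.Pointer(&b[0])), len(b)/2)
}

func main() {
	b := make([]byte, 8)
	u := bytesAsUint16s(b)
	u[1] = 0xFFFF // writes through to b[2] and b[3]
	fmt.Println(len(u), b)
}
```

Because the result holds an ordinary pointer into the original buffer, the garbage collector keeps the data alive without an explicit `runtime.KeepAlive`. The numeric value of each element still depends on the byte order of the machine.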
Roaring bitmaps