License ~~~~~~~ GPL v3 or (at your option) any later version Terminoloy ~~~~~~~~~~ base device: this file/device/database holds all macroblocks, our worst case threat-model assumes an adversary can make snapshots of it at will, each block has a number scubed3 partition: a block device whose contents are managed by a scubed3 deamon, writing to a scubed3 partition causes changes to the macroblocks in the base device macroblocks: the smallest bit of information to someone who doesn't have the cryptographic key corresponding to it, a macroblock will either change randomly (each bit changes with a probability of .5) or not at all mesoblock: in a macroblock a series of mesoblocks are stored, along with information on where the mesoblock belongs on the scubed3 partition. indexblock: the first mesoblock in each macroblock contains information about the other mesoblocks in said macroblock ESSIV is a function that turns an uint64_t and two uint32_t's in a 128 bit block for use as IV. indexblocks are encrypted with IV = ESSIV(0, macroblock_number, 0) mesoblocks are encrypted with IV = ESSIV(seqno, macroblock_number, index) the indexblock IV will repeat when the same disk block is written the mesoblock IV will never repeat (as long as the seqno is not reset) Threat model ~~~~~~~~~~~~ The adversary has full knowledge about the macroblocks (including history) and may even have write access to them. This adversary may have keys to some scubed3 partitions. The goal is to be able to plausibly deny the existance of partitions to which the adversary has no keys. Only paranoia level 3 protects completely against this. We assume the adversary cannot detect read access. Paranoia levels ~~~~~~~~~~~~~~~ When a new macroblock needs to be selected: Level 0: not paranoid (NOT IMPLEMENTED YET) it is selected from all allocated blocks based on it's emptyness, the most empty block gets selected. Level 1: moderately paranoid it is randomly selected from all allocated blocks. This hurts write performance. Level 2: just paranoid (NOT IMPLEMENTED YET) a random block of the device is selected. If it happens to be allocated to the current device, it is used. If it happens to be allocated to another device, it is updated with a new seqno (therefore every bit will change with a probalility of .5). If the block is unallocated, it will be added to the device, another block will be deleted. (this level of paranoia is required for flash media, this kind of media keeps an internal record about the order in whichs blocks are written to it) It is only safe if all scubed3 devices are known to and (if active) managed by the scubed3 deamon managing the current device. [can be implemented as: select device (weighted by size), and write a random block] Level 3: extremely paranoid (NOT IMPLEMENTED YET) while active, the daemon writes blocks to random locations at regular intervals to hide any real activity, this severely hurts performance. Considerations of level 2 apply. It will wear out your flash very efficiently. How to start ~~~~~~~~~~~~ You need to provide a set of macroblocks to scubed3, to which it has read/write access. Normally this is either a file or a block device, but it may be anything. (you would, however, need to write a custom blockio_init_* function and provide read, write and close methods to it if it is not a file or a block device) Control protocol ~~~~~~~~~~~~~~~~ /MOUNTPOINT/.control is a file to which commands can be written, and scubed3's response can be read. A command is a line of text terminated with a linefeed '\n'. A response starts with "OK\n", or "ERR\n". One or more lines follow. The last line must be ".\n". This terminates the message. The command scubed3ctl uses this interface to communicate with scubed3. Commands: - status shows the status - open NAME MODE KEY opens a scubed partition * NAME is the name, like root or swap, whatever, the name has no real meaning, you can open any partition under any name * MODE is the ciphermode, eg CBC_LARGE(AES) * KEY is the cipher key bas16 encoded (hex). - resize NAME MACROBLOCKS resizes a scubed partition * NAME .... * MACROBLOCKS, the amount of macroblocks owned by the partition Example calculation if indexblock size ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For this example we assume the following. Macroblock size 4MB (m). Mesoblock size 16kB (s). So, there are 256 mesoblocks per macroblock. Mesoblock 0 is the indexblock, the other ones are data blocks. Whatever values are used, the indexblock MUST fit in one mesoblock! Required size of the indexblock depends on the maximum amount of macroblocks scubed3 can support (the size of a backing device can change, so scubed3 should be able to deal with it, up to a maximum size). The useful size of the device is (1<<(macroblock_log - mesoblock_log) - 1)(no_macroblocks - reserved_blocks) Note that the usable size is limited by 2^32*(1< 0123450123451 0123450123451 ++ 4 -> 0123450123514 0123450123514 ++ 5 -> 012345023145 012345023145 ++ 1 -> 012345023451 012345012345 ++ 1451 -> 012345023451 012345023451 ++ 2512 -> 012345034512 012345034512 ++ 3123 -> 0123045123 0123045123 012345012345 SIMULTANEOUS_BLOCKS=0 -> all blocks are immediately safe SIMULTANEOUS_BLOCKS=1 -> a block is safe if it is followed by another block SIMULTANEOUS_BLOCKS=2 -> ? due to cascade effect, you are never safe ? for example: NO_BLOCKS=4, SIMULTANEOUS_BLOCKS=2 01230123 21 0123012321 2 0123012312 13 01230123213 1 01230123231 32 012301232132 3 01230123123 ----------------------------- NONSENSE, to delete: In this example, we use 6 macroblocks, each write a group of 3 blocks needs to be preserved (as can be seen in testing/, this is not possible (if you try it, every block will be marked as filler)), but it leads to a nice example that illustrates the recursive workings of the algorithm. Randomly select blocks to write is easy; just use an RNG. But how to select roles? We must do it in such a way, that after every write the group of 2 blocks (that contain the state of the scubed3 device) are preserved. Suppose we have the random numbers 20312015.. The algorithm works as follows. The the first random number, mark the block as used on de disk as a DATA block. Take the next random number: - if we overwrite an old data block, a filler block or an empty block, just continue - if we overwrite a data block that is not old enough, backtrack to that block, mark it as filler and then revisit all blocks between the old block and the current block (revisiting: because we lost a data block, we have to check that new data blocks do not overwrite data blocks that are too young, if we overwrite a block that is too young, backtrack and mark that block as filler etc....) step | disk | rng output | commentary | 012345 | 20312015.. | --------------------------------------------------------------------- 0 | 0 | 0 | use empty #2 as DATA0 1 | 1 0 | 01 | use empty #0 as DATA1 2 | 1 02 | 012 | use empty #3 as DATA2 3 | 1302 | 0123 | use empty #1 as DATA3 4 | 1342 | 01234 | overwrite DATA0 at #2 with DATA4 5 | 5342 | 012345 | overwrite DATA1 at #0 with DATA5 6 | 5*42 | 012345* | conflict at #1 with DATA3, rewind to 3 6.3 | 1-02 | 012- | mark #1 as FILLER 6.4 | 1-*2 | 012-* | conflict at #2 with DATA0, rewind to 0 6.4.0 | - | - | mark #1 as FILLER 6.4.1 | 0 - | -0 | use empty #0 as DATA0 6.4.2 | 0 -1 | -01 | use empty #3 as DATA1 6.4.3 | 0--1 | -01- | block #1 already marked as FILLER 6.4.4 | 0-21 | -01-2 | overwrite FILLER at #2 with DATA2 6.5 | *-21 | -01-2* | conflict at #0 with DATA0, rewind to 1 6.5.1 | - - | -- | mark #0 as FILLER 6.5.2 | - -0 | --0 | use empty #3 as DATA0 6.5.3 | ---0 | --0- | block #1 already marked as FILLER 6.5.4 | --10 | --0-1 | overwrite FILLER at #2 with DATA1 6.5.5 | 2-10 | --0-12 | overwrite FILLER at #0 with DATA2 6.6 | 2310 | --0-123 | overwrite FILLER at #1 with DATA3 7 | 2310 4 | --0-1234 | use empty #5 as DATA4 ETC | ETC | ETC | ETC step | disk | rng output | | 0123 | 0123012301231 | ------------------------------------------------ 0 | 0 | 0 | use empty #0 as DATA0 1 | 01 | 01 | use empty #1 as DATA1 2 | 012 | 012 | use empty #2 as DATA2 3 | 0123 | 0123 | use empty #3 as DATA3 4 | 4123 | 01234 | overwrite DATA0 at #0 with DATA4 5 | 4523 | 012345 | overwrite DATA1 at #1 with DATA5 6 | 4563 | 0123456 | overwrite DATA2 at #2 with DATA6 7 | 4567 | 01234567 | overwrite DATA3 at #3 with DATA7 8 | 8567 | 012345678 | overwrite DATA4 at #0 with DATA8 9 | 8967 | 0123456789 | overwrite DATA5 at #1 with DATA9 a | 89a7 | 0123456789a | overwrite DATA6 at #2 with DATAa b | 89ab | 0123456789ab | overwrite DATA7 at #3 with DATAb c | 8*ab | 0123456789ab* | conflict at #1 with DATA9, rewind to 9 c.9 | 8-67 | 012345678- | mark #1 as FILLER c.a | 8-* | 012345678-* | conflict at #2 with DATA6, rewind to 6 c.a.6 | 45-3 | 012345- | mark #2 as FILLER c.a.7 | 45-* | 012345-* | conflict at #3 with DATA3, rewind to 3 c.a.7.3 | 012- | 012- | mark #3 as FILLER c.a.7.4 | *12- | 012-* | conflict at #0 with DATA4, rewind to 0 c.a.7.4.0 | - | - | mark #0 as FILLER c.a.7.4.1 | -0 | -0 | use empty #1 as DATA0 c.a.7.4.2 | -01 | -01 | use empty #2 as DATA1 c.a.7.4.3 | -01- | -01- | block #3 already marked as FILLER c.a.7.4.4 | 201- | -01-2 | overwrite FILLER at #0 with DATA2 c.a.7.5 | 2*1- | -01-2* | conflict at #1 with DATA0, rewind to 1 c.a.7.5.1 | -- | -- | mark #1 as FILLER How can we be sure that te numbering of a data block is definitive? Well, just look at maximum backtrack the algorithm produces, if we are far away from the maximum backtrack, we assume everyhting is ok. If we would continue with the above example, all blocks will eventually be labeled as FILLER. [TODO: find criterium for definitive numbering] Suppose we have 4 macro blocks with the following roles: D1 D0 D2 D3 D1 D0 0 1 2 3 4 5 D1 D0 D2 D1 D0 D3 0 1 2 3 4 5 D1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 D0 D3 D1 0 1 2 3 4 D1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 D3 D1 D0 0 1 2 3 4 -1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 D0 D3 0 1 2 3 -1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 D0 D3 D2 0 1 2 3 4 -1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 -0 D3 D2 D0 0 1 2 3 4 -1 -2 -3 -0 -3 -2 -3 -3 -2 -3 -1 -3 -1 -0 -3 D1 -0 -3 D2 D0 D3 0 1 2 3 NO_BLOCKS=5, SIMULTANEOUS_BLOCKS=2 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D0 D2 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D0 D2 D1 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D2 D1 D0 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D1 D0 D2 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D0 D2 D1 D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D0 D2 D1 D2 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 2 1 1 2 3 4 5 1 2 3 - 5 2 4 1 2 3 4 5 1 3 2 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 2 1 2 3 4 5 1 2 3 - 5 2 4 1 2 3 4 5 1 - 3 - 4 5 2 3 1 2 3 4 5 1 - 3 - 4 - 2 3 5 1 2 3 4 5 1 3 4 5 2 3 1 2 3 4 5 1 3 4 5 2 1 2 3 4 5 1 3 4 5 2 3 1 2 3 4 5 1 3 4 5 2 1 2 3 4 5 1 3 4 5 2 1 1 2 3 4 5 1 2 5 3 4 1234512345 1234512354 12345125432 12345125423 123415234 y 512345 ++ 43234 -1---5--234 15234 ++ 134 152--134 NO_BLOCKS=3 SIMULTANEOUS_BLOCKS=2 012012012 ++ 1 0-2-1-0-21 32103213201 321032132012 32103213021 Maximum backtrack (in situations where not all blocks will be marked as FILLER eventually) is about equal to two times NO_BLOCKS for a run of 300000 blocks. So we wait 5 times NO_BLOCKS to be sure. How to restart the RNG after close/open or resize? Generate a random sequence, and assign roles according to the algorithm. Then permute the numbers so that the new sequence is consistent with the state of the disk. An example: Suppose the disk is at sequence 39: (the numbers here are sequence numbers, every block has a sequence number, in step n of the writing process, the block with sequence number n is written) 39: 0/35~ 1/39- 2/34~ 3/38- 4/37D 5/29~ 6/32~ 7/36D This means that: block 0 OLD data block established at seqno 35 block 1 FILLER block established at seqno 39 block 2 OLD data block established at seqno 34 block 3 FILLER block established at seqno 38 block 4 CURRENT data block established at seqno 37 block 5 OLD data block established at seqno 29 block 6 OLD data block established at seqno 32 block 7 CURRENT data block established at seqno 36 Now we want to sync the RNG to this harddisk state. Assume the RNG generated 464047060734627375331140... our algorithm gives -D-DDD-DDDDDDD-DDD-D-DDD... Interpret RNG output as symbols using this translation 01234567 stuvwxyz Note that this translation does not affect the roles assigned to the blocks by the algorithm. We get: wywswzsyszvwyuzvzxvvttws... This is the reconstructed timeline based on the pattern on disk and our new data. data block and prospective datablock turned to FILLER blocks ages, age is 0, 1 or '~' (meaning OLD), '.' overwritten or ' ' empty 3333333344444444 2345678901234567 ~.~~DD!!-D-DDD-D 6?207431wywswzsy --old--><--new-- 0 ...01~~~ 1 .......0 2 ..01~~~~ 3 ......01 4 .....000 5 ~~~~~~~~ 6 ~~~~~~~~ 7 ....0111 ' ' is an empty block on disk (MAY overwrite) (not shown in the example) '~' is an old data block on disk (MAY overwrite) '.' is a block that is not on disk anymore (ignore) 'D' is a current data block (MUST NOT overwrite until it's obsolete) '!' is an old FILLER block (MUST overwrite Now we must choose the values of stuvwxyz based on the constraints. Since each data block encodes the disk block that obsoletes it, we already know the value of y and s. Furthermore: - w or y must be equal to 3 to obsolete disk block 3 in time - w, y or s must be equal to 1 to obsolete disk block 1 in time