License
~~~~~~~

GPL v3 or (at your option) any later version


Terminoloy
~~~~~~~~~~

base device: this file/device/database holds all macroblocks, our worst case
threat-model assumes an adversary can make snapshots of it at will, each block
has a number

scubed3 partition: a block device whose contents are managed by a scubed3
deamon, writing to a scubed3 partition causes changes to the macroblocks in
the base device

macroblocks: the smallest bit of information to someone who doesn't have the
cryptographic key corresponding to it, a macroblock will either change randomly
(each bit changes with a probability of .5) or not at all

mesoblock: in a macroblock a series of mesoblocks are stored, along with
information on where the mesoblock belongs on the scubed3 partition.

indexblock: the first mesoblock in each macroblock contains information
about the other mesoblocks in said macroblock

ESSIV is a function that turns an uint64_t and two uint32_t's in a 128 bit
block for use as IV.

indexblocks are encrypted with IV = ESSIV(0, macroblock_number, 0)

mesoblocks are encrypted with IV = ESSIV(seqno, macroblock_number, index)

the indexblock IV will repeat when the same disk block is written

the mesoblock IV will never repeat (as long as the seqno is not reset)

Threat model
~~~~~~~~~~~~

The adversary has full knowledge about the macroblocks (including history) and
may even have write access to them. This adversary may have keys to some
scubed3 partitions. The goal is to be able to plausibly deny the existance of
partitions to which the adversary has no keys.  Only paranoia level 3 protects
completely against this. We assume the adversary cannot detect read access.


Paranoia levels
~~~~~~~~~~~~~~~

When a new macroblock needs to be selected:

Level 0: not paranoid (NOT IMPLEMENTED YET)

it is selected from all allocated blocks based on it's emptyness, the most
empty block gets selected.

Level 1: moderately paranoid

it is randomly selected from all allocated blocks. This hurts write
performance.

Level 2: just paranoid (NOT IMPLEMENTED YET)

a random block of the device is selected. If it happens to be allocated to the
current device, it is used. If it happens to be allocated to another device, it
is updated with a new seqno (therefore every bit will change with a probalility
of .5). If the block is unallocated, it will be added to the device, another
block will be deleted. (this level of paranoia is required for flash media,
this kind of media keeps an internal record about the order in whichs blocks
are written to it) It is only safe if all scubed3 devices are known to and
(if active) managed by the scubed3 deamon managing the current device.

[can be implemented as: select device (weighted by size), and write a random block]

Level 3: extremely paranoid (NOT IMPLEMENTED YET)

while active, the daemon writes blocks to random locations at regular intervals
to hide any real activity, this severely hurts performance. Considerations
of level 2 apply. It will wear out your flash very efficiently.


How to start
~~~~~~~~~~~~

You need to provide a set of macroblocks to scubed3, to which it has read/write
access. Normally this is either a file or a block device, but it may be
anything. (you would, however, need to write a custom blockio_init_* function
and provide read, write and close methods to it if it is not a file or a block
device)


Control protocol
~~~~~~~~~~~~~~~~

/MOUNTPOINT/.control is a file to which commands can be written, and
scubed3's response can be read. A command is a line of text terminated
with a linefeed '\n'.

A response starts with "OK\n", or "ERR\n". One or more lines follow.
The last line must be ".\n". This terminates the message.

The command scubed3ctl uses this interface to communicate with scubed3.

Commands:

- status

shows the status

- open NAME MODE KEY

opens a scubed partition

* NAME is the name, like root or swap, whatever, the name has no real meaning,
you can open any partition under any name

* MODE is the ciphermode, eg CBC_LARGE(AES)

* KEY is the cipher key bas16 encoded (hex).

- resize NAME MACROBLOCKS

resizes a scubed partition

* NAME ....

* MACROBLOCKS, the amount of macroblocks owned by the partition


Example calculation if indexblock size
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For this example we assume the following. Macroblock size 4MB (m). Mesoblock
size 16kB (s). So, there are 256 mesoblocks per macroblock. Mesoblock
0 is the indexblock, the other ones are data blocks.

Whatever values are used, the indexblock MUST fit in one mesoblock! Required
size of the indexblock depends on the maximum amount of macroblocks scubed3 can
support (the size of a backing device can change, so scubed3 should be able to
deal with it, up to a maximum size).

The useful size of the device is
	(1<<(macroblock_log - mesoblock_log) - 1)(no_macroblocks - reserved_blocks)

Note that the usable size is limited by 2^32*(1<<mesoblock_log) - 1, because the
device is partitioned in mesoblocks, and we have 32 bits available to index
the mesoblocks.

Layout of macroblock:

0x000000 SHA256_HASH_INDEXBLOCK hash of 0x000020 - 0x003FFF
0x000020 SHA256_HASH_DATA (hash of ciphertext)
0x000040 SHA256_HASH_SEQNOS  /* seqnos of used macroblocks */
0x000060 uint64_t seqno
0x000068 uint64_t next_seqno
0x000070 8byte literal "SSS3v0.1"
0x000078 uint32_t no_macroblocks
0x00007C uint32_t reserved_blocks
0x000080 uint8_t[128] reserved space, MUST be 0
0x000100 uint32_t no_indices
0x000104 uint32_t idx0x01
0x000108 uint32_t idx0x02
......
0x0004FC uint32_t idx0xff
0x000500 bitmap: each bit represents a macroblock of the base device
		 0 FREE
		   this block may be ours: if so, it has our signature
		   on disk (otherwise we wouldn't be able to know that
		   it is ours) and we don't care about the contents of
		   this block
		 1 USED
		   this block is ours: it may not have our signature yet
		   because it has never been written and we care about
		   the data
		(notice that it only makes sense not to care about the contents
 		of a block if there are no older revisions of that data available
		that we DO care about...)
0x003FFC last 32 bits of bitmap (the bitmap is stored in units of 32 bits)
0x004000 mesoblock1
0x008000 mesoblock2
0x00C000 mesoblock3
........
0x3FC000 mesoblock255
0x400000 end

There is space to index 0x4000 - 0x0500 = 0x3B00 = 15104 bytes,
which corresponds to a base device size of 15104*8 = 120832 macroblocks,
which correspond to 4MiB * 120832 = 472GiB

Maximum usable size of the device is 2^32*(1<<mesoblock_log) = 64TiB, which is
sufficient in the light of the maximum size of the backing device.

In general the indexblock requires:
- fixed size: 260 bytes (including 128 bytes of reserved space)
- 4 bytes for every mesoblock in a macroblock
- 1 bit for every macroblock

Random block selection
~~~~~~~~~~~~~~~~~~~~~~

How to randomly select blocks and assign roles to them?

In this example, we use 6 macroblocks, each write a group of 3 blocks needs to
be preserved. 

Suppose we have the random numbers 20312015..

The algorithm works as follows:
- assume the current list of random numbers is valid
- add a number
  * check for problems caused by this number, stop if no problems
  * mark problem causing number as unused, check for problems
    caused by this shift, of so continue here

step  | disk   | rng output | commentary
      | 012345 | 20312015.. |
---------------------------------------------------------------------
0     |   0    | 2          | check none, use empty #2 as DATA0
1     | 1 0    | 20         | check [0..0], use empty #0 as DATA1
2     | 1 02   | 203        | check [0..1], use empty #3 as DATA2
3     | 1302   | 2031 	    | check [0..2], use empty #1 as DATA3
4     | 1342   | 20312      | check [1..3], overwrite DATA0 at #2 with DATA4
5     | 5342   | 203120     | check [2..4], overwrite DATA1 at #0 with DATA5
6     | 5642   | 203_201    | check [3..5], conflict at #1 with DATA3, mark as unused
6.5   | 5642   | 2_3_201    | check [1..5], conflict at #0 with DATA1, mark as unused
6.4   | 5642   | __3_201    | check [0..4], conflict at #2 with DATA0, mark as unused
7     | 5642 7 | __3_2015   | check [], conflict at #2 with DATA0, mark as unused

So, how to know when a block designation as datablock is definitive (i.e.: will not be changed to _)?

Let's take for example (NO_BLOCKS=6, SIMULTANEOUS_BLOCKS=3)

012345012345

This looks pretty safe. Lets try to add numbers to break it

012345012345 ++ 145
0123450231451
012345023451 251
01234503425

012345012345  ++ 1 -> 0123450123451
0123450123451 ++ 4 -> 0123450123514
0123450123514 ++ 5 -> 012345023145
012345023145  ++ 1 -> 012345023451

012345012345 ++ 1451 -> 012345023451
012345023451 ++ 2512 -> 012345034512
012345034512 ++ 3123 -> 0123045123
0123045123

012345012345 

SIMULTANEOUS_BLOCKS=0 -> all blocks are immediately safe
SIMULTANEOUS_BLOCKS=1 -> a block is safe if it is followed by another block
SIMULTANEOUS_BLOCKS=2 -> ? due to cascade effect, you are never safe ?

for example: NO_BLOCKS=4, SIMULTANEOUS_BLOCKS=2
01230123 21
0123012321 2
0123012312 13
01230123213 1
01230123231 32
012301232132 3
01230123123

-----------------------------
NONSENSE, to delete:

In this example, we use 6 macroblocks, each write a group of 3 blocks needs to
be preserved (as can be seen in testing/, this is not possible (if you try it,
every block will be marked as filler)), but it leads to a nice example that
illustrates the recursive workings of the algorithm.

Randomly select blocks to write is easy; just use an RNG. But how to select
roles? We must do it in such a way, that after every write the group of 2
blocks (that contain the state of the scubed3 device) are preserved.

Suppose we have the random numbers 20312015..

The algorithm works as follows. The the first random number, mark the block as used
on de disk as a DATA block. Take the next random number:
- if we overwrite an old data block, a filler block or an empty block, just continue
- if we overwrite a data block that is not old enough, backtrack to that block,
  mark it as filler and then revisit all blocks between the old block and the current block
  (revisiting: because we lost a data block, we have to check that new data blocks
  do not overwrite data blocks that are too young, if we overwrite a block that is too young,
  backtrack and mark that block as filler etc....)


step  | disk   | rng output | commentary
      | 012345 | 20312015.. |
---------------------------------------------------------------------
0     |   0    | 0          | use empty #2 as DATA0
1     | 1 0    | 01         | use empty #0 as DATA1
2     | 1 02   | 012        | use empty #3 as DATA2
3     | 1302   | 0123 	    | use empty #1 as DATA3
4     | 1342   | 01234      | overwrite DATA0 at #2 with DATA4
5     | 5342   | 012345     | overwrite DATA1 at #0 with DATA5
6     | 5*42   | 012345*    | conflict at #1 with DATA3, rewind to 3
6.3   | 1-02   | 012-       |  mark #1 as FILLER
6.4   | 1-*2   | 012-*      |  conflict at #2 with DATA0, rewind to 0
6.4.0 |   -    | -          |   mark #1 as FILLER
6.4.1 | 0 -    | -0         |   use empty #0 as DATA0
6.4.2 | 0 -1   | -01        |   use empty #3 as DATA1
6.4.3 | 0--1   | -01-       |   block #1 already marked as FILLER
6.4.4 | 0-21   | -01-2      |   overwrite FILLER at #2 with DATA2
6.5   | *-21   | -01-2*     |  conflict at #0 with DATA0, rewind to 1
6.5.1 | - -    | --         |   mark #0 as FILLER
6.5.2 | - -0   | --0        |   use empty #3 as DATA0
6.5.3 | ---0   | --0-       |   block #1 already marked as FILLER
6.5.4 | --10   | --0-1      |   overwrite FILLER at #2 with DATA1
6.5.5 | 2-10   | --0-12     |   overwrite FILLER at #0 with DATA2
6.6   | 2310   | --0-123    |   overwrite FILLER at #1 with DATA3
7     | 2310 4 | --0-1234   | use empty #5 as DATA4
ETC   | ETC    | ETC        | ETC

step      | disk | rng output    |
          | 0123 | 0123012301231 |
------------------------------------------------
0         | 0    | 0             | use empty #0 as DATA0
1         | 01   | 01            | use empty #1 as DATA1
2         | 012  | 012           | use empty #2 as DATA2
3         | 0123 | 0123          | use empty #3 as DATA3
4         | 4123 | 01234         | overwrite DATA0 at #0 with DATA4
5         | 4523 | 012345        | overwrite DATA1 at #1 with DATA5
6         | 4563 | 0123456       | overwrite DATA2 at #2 with DATA6
7         | 4567 | 01234567      | overwrite DATA3 at #3 with DATA7
8         | 8567 | 012345678     | overwrite DATA4 at #0 with DATA8
9         | 8967 | 0123456789    | overwrite DATA5 at #1 with DATA9
a         | 89a7 | 0123456789a   | overwrite DATA6 at #2 with DATAa
b         | 89ab | 0123456789ab  | overwrite DATA7 at #3 with DATAb
c         | 8*ab | 0123456789ab* | conflict at #1 with DATA9, rewind to 9
c.9       | 8-67 | 012345678-    |  mark #1 as FILLER
c.a       | 8-*  | 012345678-*   |  conflict at #2 with DATA6, rewind to 6
c.a.6     | 45-3 | 012345-       |   mark #2 as FILLER
c.a.7     | 45-* | 012345-*      |   conflict at #3 with DATA3, rewind to 3
c.a.7.3   | 012- | 012-          |    mark #3 as FILLER
c.a.7.4   | *12- | 012-*         |    conflict at #0 with DATA4, rewind to 0
c.a.7.4.0 | -    | -             |     mark #0 as FILLER
c.a.7.4.1 | -0   | -0            |     use empty #1 as DATA0
c.a.7.4.2 | -01  | -01           |     use empty #2 as DATA1
c.a.7.4.3 | -01- | -01-          |     block #3 already marked as FILLER
c.a.7.4.4 | 201- | -01-2         |     overwrite FILLER at #0 with DATA2
c.a.7.5   | 2*1- | -01-2*        |     conflict at #1 with DATA0, rewind to 1
c.a.7.5.1 | --   | --            |      mark #1 as FILLER


How can we be sure that te numbering of a data block is definitive?

Well, just look at maximum backtrack the algorithm produces, if we are far away
from the maximum backtrack, we assume everyhting is ok. If we would continue
with the above example, all blocks will eventually be labeled as FILLER.

[TODO: find criterium for definitive numbering]

Suppose we have 4 macro blocks with the following roles:
D1 D0 D2 D3 D1 D0
 0  1  2  3  4  5

D1 D0 D2 D1 D0 D3
 0  1  2  3  4  5

D1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 D0 D3 D1
 0                       1              2  3  4

D1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 D3 D1 D0
 0                       1                 2  3  4

-1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 D0 D3
                         0                    1  2  3

-1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 D0 D3 D2
                         0                    1  2  3  4

-1 -2 -3 -0 -3 -2 -3 -3 D2 -3 -1 -3 -1 -0 -3 D1 -0 D3 D2 D0
                         0                    1     2  3  4

-1 -2 -3 -0 -3 -2 -3 -3 -2 -3 -1 -3 -1 -0 -3 D1 -0 -3 D2 D0 D3
                                              0        1  2  3 

NO_BLOCKS=5, SIMULTANEOUS_BLOCKS=2
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D0 D2
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D0 D2 D1
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D1 D3 D2 D1 D0
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D1 D0 D2
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D0 D2 D1
D4 D3 D0 D2 D1 D3 D0 D4 D2 D0 D1 D3 D2 D1 D0 D2 D4 D3 D0 D2 D1 D2

1 2 3 4 5 1 2 3 4 5
1 2 3 4 5 1 2 3 4 5 2 1
1 2 3 4 5 1 2 3 - 5 2 4
1 2 3 4 5 1 3 2 4 5

1 2 3 4 5 1 2 3 4 5
1 2 3 4 5 1 2 3 4 5 2
1 2 3 4 5 1 2 3 - 5 2 4
1 2 3 4 5 1 - 3 - 4 5 2 3
1 2 3 4 5 1 - 3 - 4 - 2 3 5
1 2 3 4 5 1 3 4 5 2 3
1 2 3 4 5 1 3 4 5 2
1 2 3 4 5 1 3 4 5 2 3
1 2 3 4 5 1 3 4 5 2
1 2 3 4 5 1 3 4 5 2 1

1 2 3 4 5 1 2 5 3 4

1234512345
1234512354
12345125432
12345125423
123415234
y

512345 ++ 43234
-1---5--234

15234 ++ 134
152--134

NO_BLOCKS=3 SIMULTANEOUS_BLOCKS=2
012012012 ++ 1
0-2-1-0-21

32103213201
321032132012
32103213021

Maximum backtrack (in situations where not all blocks will be marked as FILLER

eventually) is about equal to two times NO_BLOCKS for a run of 300000 blocks.
So we wait 5 times NO_BLOCKS to be sure.

How to restart the RNG after close/open or resize?

Generate a random sequence, and assign roles according to the algorithm. Then permute the numbers
so that the new sequence is consistent with the state of the disk.

An example:

Suppose the disk is at sequence 39:

(the numbers here are sequence numbers, every block has
a sequence number, in step n of the writing process, the block
with sequence number n is written)

39: 0/35~ 1/39- 2/34~ 3/38- 4/37D 5/29~ 6/32~ 7/36D

This means that:

block 0 OLD data block established at seqno 35
block 1 FILLER block established at seqno 39
block 2 OLD data block established at seqno 34
block 3 FILLER block established at seqno 38
block 4 CURRENT data block established at seqno 37
block 5 OLD data block established at seqno 29
block 6 OLD data block established at seqno 32
block 7 CURRENT data block established at seqno 36

Now we want to sync the RNG to this harddisk state.
Assume the RNG generated  464047060734627375331140...
our algorithm gives       -D-DDD-DDDDDDD-DDD-D-DDD...

Interpret RNG output as symbols using this translation
	01234567
	stuvwxyz

Note that this translation does not affect the roles assigned to the blocks
by the algorithm. We get: wywswzsyszvwyuzvzxvvttws...

This is the reconstructed timeline based on the pattern on disk
and our new data.

data block and prospective datablock turned to FILLER blocks ages,
age is 0, 1 or '~' (meaning OLD), '.' overwritten or ' ' empty

  3333333344444444
  2345678901234567
  ~.~~DD!!-D-DDD-D
  6?207431wywswzsy
  --old--><--new--
0 ...01~~~
1 .......0
2 ..01~~~~
3 ......01
4 .....000
5 ~~~~~~~~
6 ~~~~~~~~
7 ....0111

' ' is an empty block on disk (MAY overwrite) (not shown in the example)
'~' is an old data block on disk (MAY overwrite)
'.' is a block that is not on disk anymore (ignore)
'D' is a current data block (MUST NOT overwrite until it's obsolete)
'!' is an old FILLER block (MUST overwrite

Now we must choose the values of stuvwxyz based on the constraints. Since each data block
encodes the disk block that obsoletes it, we already know the value of y and s.

Furthermore:

- w or y must be equal to 3 to obsolete disk block 3 in time
- w, y or s must be equal to 1 to obsolete disk block 1 in time