# datalake-worker
- Data lake implementation integrated with S3
+ Data lake implementation integrated with AWS S3

# Supported features

- - Async download a chunk from S3
+ - Asynchronously download chunks from AWS S3
- Persist on-disk in a lock-less manner
- List all persisted chunks by ID from a cache
- Find and lock a chunk - Once locked, a chunk cannot be deleted
- Scheduled deletion - A chunk scheduled for deletion is removed once it is no longer in use.

- - Backend-agnostic datamanager. The RocksDB backend can be substituted with any in-process NoSQL or SQL storage engine.
+ - Backend-agnostic datamanager. The RocksDB backend can be substituted with any in-process NoSQL or SQL storage engine.
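
The backend-agnostic point above can be illustrated with a minimal Python sketch. It assumes the data manager only needs a small key-value interface from its storage engine. The names `StorageBackend`, `InMemoryBackend`, and `DataManager` are hypothetical and not taken from this repository, and a plain dict stands in for RocksDB.

```python
from typing import Dict, Iterator, List, Optional, Protocol


class StorageBackend(Protocol):
    """Hypothetical minimal interface a storage engine would need to provide."""

    def put(self, key: bytes, value: bytes) -> None: ...
    def get(self, key: bytes) -> Optional[bytes]: ...
    def delete(self, key: bytes) -> None: ...
    def keys(self) -> Iterator[bytes]: ...


class InMemoryBackend:
    """Dict-backed stand-in for RocksDB (illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: bytes) -> None:
        self._data.pop(key, None)

    def keys(self) -> Iterator[bytes]:
        return iter(sorted(self._data))


class DataManager:
    """Talks only to the StorageBackend interface, never to a concrete engine."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def persist_chunk(self, chunk_id: bytes, data: bytes) -> None:
        self._backend.put(chunk_id, data)

    def load_chunk(self, chunk_id: bytes) -> Optional[bytes]:
        return self._backend.get(chunk_id)

    def list_chunk_ids(self) -> List[bytes]:
        return list(self._backend.keys())

    def delete_chunk(self, chunk_id: bytes) -> None:
        self._backend.delete(chunk_id)


# Swapping the engine means passing a different backend object here.
manager = DataManager(InMemoryBackend())
manager.persist_chunk(bytes.fromhex("0a0b"), b"encoded chunk")
```

Under this assumption, any engine that can implement the four methods could replace the default backend without changes to `DataManager`.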

# Design

![image info](./design.png)
+
+ # Datasource structure
+
+ ### Cache - in-memory map of chunk IDs to a lock permit
+
+ | Chunk_ID | Permit |
+ | -------- | ------- |
+ | 0x0A0B | 0 |
+ | 0x0A0C | 1 |
+ | 0x0A0C | 0 |
+
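
As a rough illustration of how the cache could support "find and lock" and scheduled deletion, here is a minimal single-threaded Python sketch. It assumes the Permit column is a count of active locks and that a chunk scheduled for deletion can no longer be locked. The class `ChunkCache` and its method names are hypothetical, and a real implementation would also need synchronization.

```python
from typing import Dict, List, Set


class ChunkCache:
    """Hypothetical in-memory cache: chunk ID -> number of active lock permits."""

    def __init__(self) -> None:
        self._permits: Dict[int, int] = {}
        self._pending_deletion: Set[int] = set()

    def insert(self, chunk_id: int) -> None:
        # A newly persisted chunk starts with zero permits.
        self._permits.setdefault(chunk_id, 0)

    def list_chunks(self) -> List[int]:
        return sorted(self._permits)

    def find_and_lock(self, chunk_id: int) -> bool:
        # Assumption: unknown chunks and chunks pending deletion cannot be locked.
        if chunk_id not in self._permits or chunk_id in self._pending_deletion:
            return False
        self._permits[chunk_id] += 1
        return True

    def release(self, chunk_id: int) -> None:
        if self._permits.get(chunk_id, 0) == 0:
            raise ValueError("chunk is not locked")
        self._permits[chunk_id] -= 1
        self._remove_if_unused(chunk_id)

    def schedule_deletion(self, chunk_id: int) -> None:
        if chunk_id in self._permits:
            self._pending_deletion.add(chunk_id)
            self._remove_if_unused(chunk_id)

    def _remove_if_unused(self, chunk_id: int) -> None:
        # Delete only when the chunk is scheduled for deletion and nobody holds it.
        if chunk_id in self._pending_deletion and self._permits[chunk_id] == 0:
            del self._permits[chunk_id]
            self._pending_deletion.discard(chunk_id)


cache = ChunkCache()
cache.insert(0x0A0B)
cache.insert(0x0A0C)
cache.find_and_lock(0x0A0C)      # permit count for 0x0A0C becomes 1
cache.schedule_deletion(0x0A0C)  # still locked, so it stays in the cache for now
```

In this sketch the chunk disappears from the cache only when the last holder calls `release`, which matches the "removed once it is no longer in use" behavior listed under supported features.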
+ ### OnDisk Tables and Indexes
+
+ | Chunk_ID | Encoded Chunk Data |
+ | -------- | ------- |
+ | 0x0A0B | 0x.. |
+ | 0x0A0C | 0x... |
+ | 0x0A0C | 0x |
+
+ | DatasetID_BlockNum | Chunk_ID |
+ | -------- | ------- |
+ | 100_0 | 0x0A0B |
+ | 100_1 | 0x0A0B |
+ | 100_2 | 0x0A0B |
+
+ | Metadata | Value |
+ | ---------| -------- |
+ | 0x1 (Size_Key) | 2000000 |
+
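
The three on-disk tables can be mimicked with plain dicts to show how a lookup by dataset ID and block number might resolve to chunk data. This is only a sketch: the helper names are hypothetical, the dicts stand in for the real storage engine, and the meaning of `Size_Key` (assumed here to be the total number of stored bytes) is not confirmed by the source.

```python
from typing import Dict, Optional

SIZE_KEY = b"\x01"  # mirrors the 0x1 (Size_Key) row; its exact meaning is assumed

chunks: Dict[bytes, bytes] = {}     # Chunk_ID -> encoded chunk data
block_index: Dict[str, bytes] = {}  # "DatasetID_BlockNum" -> Chunk_ID
metadata: Dict[bytes, int] = {}     # metadata key -> value


def block_key(dataset_id: int, block_num: int) -> str:
    """Build an index key in the DatasetID_BlockNum form, e.g. "100_0"."""
    return f"{dataset_id}_{block_num}"


def store_chunk(chunk_id: bytes, data: bytes, dataset_id: int,
                first_block: int, last_block: int) -> None:
    """Write the chunk, index every block it covers, and update the size entry."""
    chunks[chunk_id] = data
    for block_num in range(first_block, last_block + 1):
        block_index[block_key(dataset_id, block_num)] = chunk_id
    # Assumption: Size_Key tracks the total number of bytes stored.
    metadata[SIZE_KEY] = metadata.get(SIZE_KEY, 0) + len(data)


def find_chunk(dataset_id: int, block_num: int) -> Optional[bytes]:
    """Resolve a block to its chunk data via the index, or None if not indexed."""
    chunk_id = block_index.get(block_key(dataset_id, block_num))
    return None if chunk_id is None else chunks.get(chunk_id)


# One chunk covering blocks 0..2 of dataset 100, as in the index table above.
store_chunk(bytes.fromhex("0a0b"), b"\x00" * 16,
            dataset_id=100, first_block=0, last_block=2)
```

Several index rows pointing at the same `Chunk_ID` is what lets a single chunk serve a range of blocks without duplicating its data.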