-
Notifications
You must be signed in to change notification settings - Fork 0
/
u.2.5.4 - nosql.txt
104 lines (99 loc) · 4.89 KB
/
u.2.5.4 - nosql.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
// LESOSN 1.0 - INTRO TO NOSQL
definition
- does not use a relational data modeling technique
- better at scaling out than rdbms because of its horizontal scalability
// LESSON 1.1 - FOUR TYPES OF NOSQL DATABASES
key-value store
definition
- data are stored in key value pairs, somehow like json
sample data
row | key | value
1 id1 {"name":"Renz","age":20}
2 id2 {"name":"Rene","age":21}
document-based store
definition
- datas are stored in a document like json, xml, etc.
- you could think of json as python's dictionary or javascript's object
sample data: json
"json":{
"name":"Renz",
"age":20
"items":{
"obj1": 1,
"obj2": 2,
}
}
wide-column-based store
definition
- pretty much like rdbms except much preferred for semi-structured/unstructured and big data
- stores information in columns, rather than in rows which is more usual with sql
- data is stored in columns, which are grouped into families, and these families are further grouped into more columns
sample data: column oriented
column 1 column 2 column 3
name | id | grade | id | gpa | id
john 001 | Senior 001 | 4.00 001
karen 002 | Junior 002 | 3.67 002
graph-based store
definition
- treat the relationship between any two pieces of information as being just as important as the information itself
- this type of data model is really made for any information that you'd usually represent on a graph
- it uses relationships and nodes, with the data being the information itself, and the relationship is formed between nodes
sample data
// LESSON 1.2 - SCHEMA DESIGN FOR NOSQL
- nosql doesn't really require designation of schema or how the data should look like (e.g. no need to define constraints and datatype unlike in rdbms)
// UNFINISHED - nosql's really just important for oltp and we won't need that in de so i might not finish this - LESSON 2.1 - NOSQL DATA MODELING TECHNIQUES
conceptual techniques
denormalization
aggregates
application side joins
general modeling techniques
enumerable keys
dimensionality reduction
index table
composite key index
inverted search
hierarchy modelign techniques
tree aggregation
adjacency lists
materialized paths
nested sets
nested documents flattening: numbered field names
nested documents flattening: proximity queries
batch graph processing
// LESSON X.1 - RELATIONAL DB VS NOSQL DB
quick definition to know
vertical scaling: keep existing infrastructure but add more computing power
horizontal scaling: add more insfrastructure
structured data
semi-structured data
unstructured data
main advantage of nosql
- simplicity of design
- horizontal scaling
- faster because it uses different data structure compared to relational databases
nosql vs rdbms
nosql
- supports a very simple query langauge
- has no fixed schema (although you could add one)
- only eventually consistent
- used to handle data coming in high speed
- can manage structured, unstructured and semi structured data (IMPORTANT)
- can handle big data (IMPORTANT)
- gives both read and write scalability
rdbms
- supports a powerful query language that can perform even more complex query
- has a fixed schema (you need to define format for how the table should look like)
- follows acid properties (atomicity, consistency, isolation, and durability)
- used to handle data coming in low speed
- manages only structured data (IMPORTANT)
- used to handle moderate volume of data (IMPORTANT)
- gives read scalibility only
conclusion: right time to use sql vs nosql | how to decide which database to use
sql
- data is very structured and acid compliance(all sql are acid compliant, in nosql only some database are acid compliant) is a must
- vertically scalable (small and medium data) only: must increase the power of your single computer
nosql
- semi structured or unstructured | although you could store structured data in here, its main advantage focus on semi structured and unstructured
- sql requires structured data meaning attributes, datatypes of column shall be defined explicitly, this isn't possible to some of the datas (those are unstructured data) thus we use nosql
- horizontally scalable (big data): could split data into multiple servers
- if the data is big and but is a structure data, it's still better to use nosql since nosql could scale much better